Harshitha's Personalized AI/ML Roadmap
AI Placement Sprint · Personalized Roadmap · Generated 25 Apr 2026


Built on the Five Laws of Learning. Tailored to an IIT Dharwad CSE final-year student interning at Dassault Systèmes. Depth over breadth — always. https://www.youtube.com/watch?v=YL0xu8Z73v4

Timeline: 6 Months
Current Role: AI Intern · Dassault Systèmes
Target: AI/ML Engineer — India or UK
Core Law: Emotion Creates Judgment
Your Superpower: Networking 9/10 · Public Building 10/10
Profile Analysis
The Brutally Honest Read

Before the roadmap, here is what's actually true about your situation.

  • 9/10 · Networking comfort — your biggest asset
  • 10/10 · Building in public — use this now
  • 4/10 · Deploying real projects — critical gap to close
  • IIT Dharwad credential + Dassault Systèmes proof
✓ Strength

IIT Credential + Live AI Internship

You're not a beginner. You have a premium institution badge AND active exposure to production AI at Dassault Systèmes. Most applicants have neither. This changes how you should position yourself entirely.

✓ Strength

Outrageous Networking Confidence

A 9/10 networking score is genuinely rare. Your biggest hiring friction — getting past resume screening — can be completely bypassed by reaching out directly to hiring managers and ML leads. You have the confidence. Use it.

✓ Strength

Strong Mathematical Foundation

Linear algebra, statistics, probability — already solid. This means you can go deep on ML theory immediately. You won't hit the math wall that stops most learners at Month 2.

✗ Weakness

Surface-Level Knowledge — No Real Depth

You said it yourself: "I know most topics but lack depth." Using ChatGPT to generate logistic regression code is not the same as understanding it. Interviewers will test depth. You need to own every line you write.

✗ Weakness

Nothing Deployed — Only Notebooks

4/10 on deploying real projects. Companies don't care about your Jupyter notebooks. They want to click a URL and see something running. Every project in this roadmap ends with a deployed URL or a GitHub repo that runs in one command.

→ Opportunity

Leverage Dassault Systèmes Now

You are inside an AI team at a major company. Whatever you're working on — document it, write about it (within NDA bounds), extract learnings. This is your proof. Hiring managers trust "worked on X at Dassault" over any personal project.

⚡ Critical Reframe

Your fear is: "I'll just consume videos and not build anything."

The real problem is: You've been learning topics instead of learning to solve problems. This roadmap has one rule: no topic is "done" until you have a deployed artifact, a GitHub commit, or a LinkedIn post with code. Emotion Creates Judgment — the law you know works for you — means every concept needs a real stake attached to it. You don't need more video time. You need more commit time.

Week 1–2 · Pre-Phase
The Diagnostic Sprint

Before learning anything new, find exactly where your depth breaks down.

⚠ Do This Before Anything Else

Most people skip this and waste months. You said you "know most topics but lack depth." This sprint reveals precisely which topics you understand vs which ones you just recognise. The roadmap then focuses only where depth is missing.

Law 1: Prediction Before Explanation

Before each test below, write your prediction: "I think I know X because..." Then test. The gap between prediction and test result is your actual learning target.

7-Day Depth Test Protocol

All challenges below are done without ChatGPT.

  • Day 1 · Implement logistic regression from scratch using only NumPy (no sklearn; a minimal sketch follows this list). Pass: gradient descent converges, loss decreases, you can explain every line.
  • Day 2 · Take a messy real CSV (Kaggle Housing dataset) and clean it with Pandas (no tutorials). Pass: missing values handled, types correct, GroupBy works, you wrote every line.
  • Day 3 · Explain backpropagation to a rubber duck (or voice recorder); draw the chain rule on paper for a 2-layer net. Pass: you can derive ∂L/∂W without looking it up.
  • Day 4 · Open any popular ML GitHub repo (e.g., huggingface/transformers), navigate it, and find where the attention mechanism is computed. Explain to yourself what one file does. Pass: you can identify 3 architectural decisions made in the code.
  • Day 5 · Build a FastAPI endpoint that loads a scikit-learn model and serves predictions. Run it locally. Pass: curl localhost:8000/predict returns a response and you understand every import.
  • Day 6 · Write a LinkedIn post: "I tested my ML depth this week. Here's what I actually knew vs what I thought I knew. [3 bullet points]" Pass: post published. You feel the discomfort — that's the emotion creating judgment.
  • Day 7 · Review your results and mark each area Solid / Needs Work / Broken; this becomes your Phase 1 priority list. Pass: you have a written list of 5–8 specific gaps to close.
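A minimal sketch of what the Day 1 exercise can look like, assuming a small binary-label dataset. The synthetic data and hyperparameters are illustrative; the point is that you can write and explain every line of something like this without help.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_logistic_regression(X, y, lr=0.1, epochs=500):
    """Batch gradient descent on binary cross-entropy. X: (n, d), y: (n,) of 0/1."""
    n, d = X.shape
    w = np.zeros(d)
    b = 0.0
    for epoch in range(epochs):
        p = sigmoid(X @ w + b)               # predicted probabilities
        loss = -np.mean(y * np.log(p + 1e-12) + (1 - y) * np.log(1 - p + 1e-12))
        grad_w = X.T @ (p - y) / n           # dL/dw for sigmoid + cross-entropy
        grad_b = np.mean(p - y)              # dL/db
        w -= lr * grad_w
        b -= lr * grad_b
        if epoch % 100 == 0:
            print(f"epoch {epoch}: loss {loss:.4f}")   # should decrease
    return w, b

# Tiny synthetic check: two separable clusters
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2, 1, (50, 2)), rng.normal(2, 1, (50, 2))])
y = np.array([0] * 50 + [1] * 50)
w, b = train_logistic_regression(X, y)
preds = (sigmoid(X @ w + b) > 0.5).astype(int)
print("train accuracy:", (preds == y).mean())
```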
Month 1–2 · Phase 1
PHASE 1 · Depth Over Breadth

Close the Depth Gap

You have breadth. Now you need to own each topic cold — without AI assistance. Focus: Python comprehension, classical ML from scratch, and your first deployed project.

Timeline: Weeks 3–8
🟡

Law 4: Emotion Creates Judgment — Your Anchor Law

You identified this as what used to work for you. Every topic in Phase 1 must have a stake: a deployed thing, a published post, or a benchmark. If there is no public output, the topic isn't done. This is non-negotiable for you specifically.

🟢

Law 3: Compression Beats Coverage — Skip What You Already Know

Do NOT restart from Python basics. Use your diagnostic results. If you passed the Day 1 logistic regression test, you skip to Month 2 content. This roadmap compresses to YOUR gaps — not to a generic beginner's curriculum.

P1.1

Python — Code Reading, Not Writing

Read real production Python: FastAPI source, scikit-learn internals, HuggingFace transformers. Goal: navigate any ML codebase and understand what each module does before running it.
Deliverable: Annotated walkthrough of one real open-source ML file posted on GitHub. LinkedIn post: "I read the sklearn LinearRegression source. Here's what I found."
● Critical
P1.2

NumPy + Pandas — Deep Mastery

Vectorisation, broadcasting, GroupBy+merge chains. If diagnostic Day 2 was hard — this is your Week 1 focus. Stop using .iterrows(). Master method chaining. Real messy data only.
Deliverable: Clean a real messy Kaggle dataset. Benchmark .iterrows() against vectorised operations. Post results.
● Critical
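A minimal sketch of the benchmark this deliverable asks for, assuming a synthetic stand-in for a messy table. Column names and sizes are placeholders; swap in the real Kaggle data.

```python
import time
import numpy as np
import pandas as pd

# Synthetic stand-in for a real dataset (column names are illustrative)
df = pd.DataFrame({
    "price": np.random.rand(100_000) * 500_000,
    "area": np.random.rand(100_000) * 3_000 + 300,
})

# Slow path: row-by-row Python loop
start = time.perf_counter()
slow = [row["price"] / row["area"] for _, row in df.iterrows()]
t_loop = time.perf_counter() - start

# Fast path: one vectorised expression over whole columns
start = time.perf_counter()
fast = df["price"] / df["area"]
t_vec = time.perf_counter() - start

assert np.allclose(slow, fast)
print(f"iterrows: {t_loop:.2f}s, vectorised: {t_vec:.4f}s, speedup ~{t_loop / t_vec:.0f}x")
```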
P1.3

Classical ML — From-Scratch Implementations

Logistic Regression, Decision Trees, Gradient Boosting — implemented without sklearn. Understand every gradient, every split criterion, every hyperparameter. You already know the API; now know the math behind it.
Deliverable: GitHub repo "ML From Scratch" with your own implementations. Each model has a README explaining failure modes.
● Critical
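One small piece of the from-scratch work as a sketch: finding the best decision-tree split by weighted Gini impurity. Function names and the toy data are illustrative; this is not a full tree implementation.

```python
import numpy as np

def gini(y):
    """Gini impurity of a label array y (binary or multiclass)."""
    _, counts = np.unique(y, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

def best_split(X, y):
    """Exhaustively search (feature, threshold) pairs for the lowest weighted Gini."""
    n, d = X.shape
    best = (None, None, np.inf)            # (feature, threshold, score)
    for j in range(d):
        for t in np.unique(X[:, j]):
            left, right = y[X[:, j] <= t], y[X[:, j] > t]
            if len(left) == 0 or len(right) == 0:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best[2]:
                best = (j, t, score)
    return best

X = np.array([[2.0], [3.0], [10.0], [11.0]])
y = np.array([0, 0, 1, 1])
print(best_split(X, y))   # (0, 3.0, 0.0): feature 0, threshold 3.0, a perfect split
```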
P1.4

PyTorch Fundamentals + Backprop by Hand

If Day 3 diagnostic was hard — this is your Month 1.5 focus. Implement a 2-layer net in NumPy first. Then replicate with PyTorch autograd. Compare gradients. No black boxes.
Deliverable: Notebook comparing manual backprop vs autograd. Post: "3 hours implementing backprop taught me more than 30 hours of video."
● High
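A sketch of the manual-vs-autograd comparison for a 2-layer ReLU net with MSE loss. Shapes and data are arbitrary; the check is that your hand-derived gradients match what PyTorch computes.

```python
import numpy as np
import torch

rng = np.random.default_rng(0)
X = rng.normal(size=(8, 3)).astype(np.float32)
y = rng.normal(size=(8, 1)).astype(np.float32)
W1 = rng.normal(size=(3, 4)).astype(np.float32)
W2 = rng.normal(size=(4, 1)).astype(np.float32)

# Manual forward + backward for MSE loss on a 2-layer ReLU net
h_pre = X @ W1
h = np.maximum(h_pre, 0)
y_hat = h @ W2
loss = np.mean((y_hat - y) ** 2)

d_yhat = 2 * (y_hat - y) / y.size          # dL/dy_hat
dW2 = h.T @ d_yhat                         # dL/dW2
dh = d_yhat @ W2.T                         # dL/dh
dh_pre = dh * (h_pre > 0)                  # ReLU gate
dW1 = X.T @ dh_pre                         # dL/dW1

# Same computation with autograd
tX, ty = torch.tensor(X), torch.tensor(y)
tW1 = torch.tensor(W1, requires_grad=True)
tW2 = torch.tensor(W2, requires_grad=True)
t_loss = torch.mean((torch.relu(tX @ tW1) @ tW2 - ty) ** 2)
t_loss.backward()

print(np.allclose(dW1, tW1.grad.numpy(), atol=1e-5))   # True if your derivation is right
print(np.allclose(dW2, tW2.grad.numpy(), atol=1e-5))
```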
P1.5

Git + FastAPI + Docker — First Deployment

Deploy your first ML model as a REST API with Docker. This is your anti-notebook move. It doesn't need to be impressive — it needs to be live. Target: running on Render or Railway for free.
Deliverable: Public URL that serves ML predictions. GitHub repo with Dockerfile. LinkedIn post with the link.
● Build
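A minimal sketch of the serving half of P1.5, assuming a scikit-learn model saved with joblib. The file name and feature list are placeholders; wrap this in a Dockerfile and deploy it to Render or Railway.

```python
# app.py — serve a pickled scikit-learn model over FastAPI (model path is a placeholder).
import joblib
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()
model = joblib.load("model.joblib")     # e.g. a fitted sklearn Pipeline you trained earlier

class PredictRequest(BaseModel):
    features: list[float]               # one row of input features

@app.post("/predict")
def predict(req: PredictRequest):
    prediction = model.predict([req.features])[0]
    return {"prediction": float(prediction)}

# Run locally:  uvicorn app:app --port 8000
# Test:         curl -X POST localhost:8000/predict -H "Content-Type: application/json" \
#                    -d '{"features": [1.2, 3.4, 5.6]}'
```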

Phase 1 Projects

▸ Project 1 · Deploy by Week 6

ML Performance Comparison — But Deployed

You already did a project comparing ML models. Rebuild it: clean code you wrote yourself, FastAPI endpoint, Docker container, live URL. Write a case study on what the models actually fail at — not just accuracy scores.

scikit-learn · FastAPI · Docker · Render
▸ Project 2 · Deploy by Week 8

Open-Source Contribution (Small)

Pick a Python ML repo with "good first issue" tags. Fix a bug or improve documentation. Submit a PR. Getting it merged is the goal — even one line. This proves you can read and change other people's code.

GitHub · PR submitted · Real codebase
Month 2–4 · Phase 2
PHASE 2 · GenAI Stack + Production Thinking

Build Hireable Skills

The market is hiring for GenAI and MLOps. This phase targets the exact skills that appear in job descriptions at AI-first companies in India and UK. Every topic ends with something you can show.

Timeline: Weeks 9–18
🔴

Law 2: Failure Modes Over Features

For every technology in Phase 2, learn it by breaking it first. RAG fails with chunking strategy. LLMs fail on hallucination. Vector databases fail on recall. Know exactly how it breaks BEFORE you know how to build it — that's what senior engineers actually know.

P2.1

Transformers — Deep Understanding

Self-attention mechanics, positional encoding, multi-head attention. Read the original "Attention is All You Need" paper. Implement a tiny Transformer in PyTorch. Not a library — from the paper.
Deliverable: GitHub repo with a tiny Transformer implemented from the paper. LinkedIn thread: "I read the original Transformer paper. 5 things that surprised me."
● Critical
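A sketch of the core operation from the paper: scaled dot-product attention for a single head in PyTorch. It is deliberately not the full tiny Transformer the deliverable asks for, just the piece interviewers probe most often.

```python
import math
import torch

def scaled_dot_product_attention(q, k, v, mask=None):
    """q, k, v: (batch, seq_len, d_k). Returns weighted values plus attention weights."""
    d_k = q.size(-1)
    scores = q @ k.transpose(-2, -1) / math.sqrt(d_k)        # (batch, seq, seq)
    if mask is not None:
        scores = scores.masked_fill(mask == 0, float("-inf"))
    weights = torch.softmax(scores, dim=-1)
    return weights @ v, weights

q = torch.randn(1, 5, 16)
k = torch.randn(1, 5, 16)
v = torch.randn(1, 5, 16)
out, attn = scaled_dot_product_attention(q, k, v)
print(out.shape, attn.shape)   # torch.Size([1, 5, 16]) torch.Size([1, 5, 5])
```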
P2.2

Embeddings + Vector Databases

Word2Vec, sentence embeddings, contextual embeddings (BERT). Similarity search, HNSW, Chroma/Qdrant. This is the foundation of every RAG system. Understand embedding geometry — why similar things cluster.
Deliverable: Notebook visualising the embedding space of a domain-specific dataset. Semantic search demo deployed.
● Critical
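A minimal sketch of embedding-based semantic search using sentence-transformers and plain NumPy cosine similarity. The model name is one common choice, not a requirement, and the documents are toy examples.

```python
import numpy as np
from sentence_transformers import SentenceTransformer  # pip install sentence-transformers

docs = [
    "Gradient boosting builds trees sequentially on residuals.",
    "FastAPI serves Python functions as HTTP endpoints.",
    "HNSW is an approximate nearest-neighbour index for vectors.",
]

model = SentenceTransformer("all-MiniLM-L6-v2")               # a common small embedding model
doc_emb = model.encode(docs, normalize_embeddings=True)       # (n_docs, dim), unit length

query_emb = model.encode(["how do I deploy a model as an API?"], normalize_embeddings=True)
scores = doc_emb @ query_emb.T                                # cosine similarity via dot product
best = int(np.argmax(scores))
print(docs[best], float(scores[best]))
```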
P2.3

RAG Systems — Build One End to End

Architecture, chunking strategies, dense vs sparse retrieval, re-ranking. This is the most in-demand GenAI skill in 2025–26. Not a tutorial RAG — a RAG that you stress-tested and found failure modes in.
Deliverable: Deployed RAG app on a topic you care about. Post: "I broke my RAG system 7 ways. Here's what I learned."
● Critical
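A toy sketch of the retrieve-then-prompt loop, reusing the embedding approach above. The chunking is deliberately naive (it is the first failure mode you should find), and call_llm is a placeholder for whichever LLM client you plug in.

```python
import numpy as np
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("all-MiniLM-L6-v2")

def chunk(text, size=300, overlap=50):
    """Naive fixed-size character chunking — the first thing to stress-test and replace."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

def build_index(documents):
    chunks = [c for doc in documents for c in chunk(doc)]
    vectors = embedder.encode(chunks, normalize_embeddings=True)
    return chunks, vectors

def retrieve(question, chunks, vectors, k=3):
    q = embedder.encode([question], normalize_embeddings=True)[0]
    top = np.argsort(vectors @ q)[::-1][:k]          # highest cosine similarity first
    return [chunks[i] for i in top]

def answer(question, chunks, vectors):
    context = "\n---\n".join(retrieve(question, chunks, vectors))
    prompt = (
        "Answer using only the context below. If the answer is not there, say so.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
    return call_llm(prompt)   # placeholder: plug in OpenAI, Anthropic, or a local model
```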
P2.4

Prompt Engineering + LLM Cost Maths

Chain-of-thought, few-shot prompting, system prompts, output formatting. ALSO: token pricing, request cost estimation. Companies care about cost. Being able to say "this system costs ₹X per 1000 queries" is a differentiator.
Deliverable: Case study — cost analysis of a real LLM system with before/after optimisation.
● High
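A sketch of the cost arithmetic. All prices and token counts here are hypothetical; plug in your provider's current per-token pricing and your measured token usage.

```python
def cost_per_1000_queries(prompt_tokens, completion_tokens,
                          input_usd_per_1k, output_usd_per_1k, usd_to_inr=84.0):
    """All prices are placeholders — use your provider's current per-token pricing."""
    per_query_usd = (prompt_tokens / 1000) * input_usd_per_1k \
                  + (completion_tokens / 1000) * output_usd_per_1k
    return per_query_usd * 1000 * usd_to_inr

# Hypothetical example: ~2,500 prompt tokens (question + retrieved chunks),
# ~400 completion tokens, at $0.003 / $0.015 per 1K tokens.
print(f"~₹{cost_per_1000_queries(2500, 400, 0.003, 0.015):.0f} per 1,000 queries")
```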
P2.5

MLOps Basics — Experiment Tracking + Model Registry

MLflow or Weights & Biases. Parameter logging, metric tracking, model versioning. At Dassault Systèmes, your team likely uses some version of this — connect the dots between what you see at work and these tools.
Deliverable: GitHub project with MLflow tracking integrated. Shows your model experiments, not just final results.
● High
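A minimal MLflow sketch, assuming a scikit-learn model. The experiment name and parameters are illustrative; the point is that every run's params, metrics, and model artifact get tracked instead of lost in a notebook.

```python
import mlflow
import mlflow.sklearn
from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

X, y = load_diabetes(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

mlflow.set_experiment("rf-baseline")               # experiment name is illustrative
with mlflow.start_run():
    params = {"n_estimators": 200, "max_depth": 5}
    model = RandomForestRegressor(**params, random_state=0).fit(X_tr, y_tr)
    mlflow.log_params(params)
    mlflow.log_metric("mae", mean_absolute_error(y_te, model.predict(X_te)))
    mlflow.sklearn.log_model(model, "model")       # versioned artifact, viewable via `mlflow ui`
```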
P2.6

System Design for ML — Basics

Latency vs throughput, batch vs real-time, caching, fallback chains. This is what senior engineers think about — and what interviewers ask about. Even a basic mental model here puts you ahead of 90% of candidates.
Deliverable: Written design document — "How I would build a RAG system for 10K users." Posted publicly.
● High
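A toy sketch of two of the design-document ideas: a cache in front of the model and a fallback chain when the primary path fails. The in-memory dict and the call_primary_llm / call_small_llm names are placeholders, not a production pattern.

```python
import hashlib
import time

cache = {}   # in production this would be Redis or similar; a dict illustrates the idea

def cached(key_text, ttl_seconds, compute):
    """Return a cached answer if still fresh, otherwise compute and store it."""
    key = hashlib.sha256(key_text.encode()).hexdigest()
    hit = cache.get(key)
    if hit and time.time() - hit[0] < ttl_seconds:
        return hit[1]
    value = compute()
    cache[key] = (time.time(), value)
    return value

def answer_with_fallback(question):
    """Fallback chain: big model -> small model -> canned response."""
    for attempt in (call_primary_llm, call_small_llm):   # placeholders for real clients
        try:
            return attempt(question)
        except Exception:
            continue
    return "Sorry, I can't answer that right now."

def handle(question):
    return cached(question, ttl_seconds=600, compute=lambda: answer_with_fallback(question))
```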

Phase 2 Flagship Project

▸ Flagship · Deploy by Week 16

Domain RAG System — Dassault-Adjacent

Build a RAG system over a public technical dataset related to your internship domain (engineering, CAD, manufacturing, 3D simulation — whatever you can reference without violating NDA). Deploy it. Write a full case study: architecture decisions, failure modes found, cost analysis, what you'd do differently.

LangChain / LlamaIndex · ChromaDB / Qdrant · FastAPI · Docker · Render / Railway
▸ Side Project · Week 12–16

Open-Source Contribution (Real)

Target: LangChain, LlamaIndex, or any AI/ML library with active issues. Fix a real bug or add a documented feature. The goal: your name in a merged PR on a repo with 1000+ GitHub stars. This is the resume equivalent of a gold badge.

GitHub · PR Merged · 1K+ stars repo
Month 4–5 · Phase 3
PHASE 3 · Open-Source Mastery

Read, Contribute, Tear Apart

Your stated goal: read difficult code, point out bugs, suggest improvements, and tear it apart. This phase makes that real. You move from user of libraries to contributor to critic.

Timeline: Weeks 19–24
🟣

Law 5: AI Accelerates, Humans Judge

Use AI to read code faster — Claude, Copilot, GPT-4. But the judgment — "this is a bad abstraction", "this will fail under load", "this is the wrong tradeoff" — must come from you. AI finds what; you decide so what.

Open-Source Target Repos

P3.1

Pick 2 Repos — Study Deeply

Recommended: scikit-learn (classical ML, Python, excellent docs), and one of: LangChain, LlamaIndex, or Haystack (GenAI, active community, good issues for newcomers). For each: read the architecture docs, trace one feature end-to-end through the code.
Deliverable: Public architecture breakdown post — "I read the LangChain source so you don't have to. Here's how RAG actually works under the hood."
● Critical
P3.2

Bug Hunt — Find Real Issues

Don't wait for "good first issue." Run the test suite. Find edge cases. Write a minimal reproduction script. Then either: open a GitHub Issue with full repro, or open a PR with a fix. Even rejected PRs with good discussion are proof.
Deliverable: GitHub Issue or PR opened with detailed analysis. Link on LinkedIn.
● High
P3.3

Write a Technical Deep-Dive Post

1500-word technical post on something you found while reading real code. Not a tutorial. An analysis: "I found a potential performance issue in X. Here's the profiling, the root cause, and what I'd change." This is what gets you noticed by engineers.
Deliverable: Published on LinkedIn or a blog. Tag the maintainers of the repo. Target 500+ impressions.
● Build
P3.4

Model Interpretability + SHAP

SHAP, LIME, feature importance. At Dassault Systèmes, explainability matters — industrial AI systems need to justify predictions. This is directly transferable and rare at the junior level.
Deliverable: Interpretability analysis of your Phase 1 or Phase 2 model. Post: "Why my model makes the decisions it does."
○ Good-to-have
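A minimal SHAP sketch, assuming a fitted tree-based model like the ones from Phase 1. The dataset is a stand-in; shap.Explainer picks an appropriate explainer for whatever model you pass it.

```python
import shap   # pip install shap
from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor

X, y = load_diabetes(return_X_y=True, as_frame=True)
model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)

explainer = shap.Explainer(model, X)        # chooses a suitable explainer for the model type
shap_values = explainer(X.iloc[:200])       # explain a sample of rows

shap.plots.beeswarm(shap_values)            # which features drive predictions, and in which direction
shap.plots.waterfall(shap_values[0])        # why the model made this one prediction
```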
Month 5–6 · Phase 4
PHASE 4 · Packaging & Job Hunt

Get Past the Resume. Every Time.

Your 9/10 networking score is your weapon. The Offer Framework says: make NOT hiring you feel like a risk. Here is exactly how to do that for India and UK markets.

Timeline: Weeks 23–26

Your Target Market (Offer Framework: Ex 1)

🇮🇳 India — Primary Market

  • AI-first product startups (Series A–B, Bangalore / Delhi / Mumbai / Pune)
  • Fintech companies with ML teams (Zerodha, Razorpay, Groww, and adjacent companies)
  • Mid-size tech companies building AI products (not IT services)
  • Defence / Industrial AI companies (your Dassault background is directly relevant)

🇬🇧 UK — Secondary Market

  • UK Skilled Worker visa route (formerly Tier 2) — requires employer sponsorship, so confirm eligibility early. Target Edinburgh, Manchester, London startups.
  • AI-first startups via Y Combinator UK / Seedcamp portfolio
  • Engineering-adjacent AI companies (manufacturing AI, simulation AI — your Dassault angle)
  • Note: get your visa eligibility confirmed before investing heavily here

📋 Your Value Proposition (Ex 2)

  • "Get production-ready AI features deployed in 60 days without hiring a ₹30L+ senior engineer"
  • IIT credentialed + Dassault Systèmes production AI experience = proof, not promise
  • Can navigate complex industrial codebases — most junior candidates cannot

🚫 Your Hiring Frictions (Ex 4) & Fixes

  • Friction: No deployed projects → Fix: 3 live URLs before Month 4
  • Friction: Surface-level knowledge → Fix: From-scratch implementations on GitHub
  • Friction: Resume rejected → Fix: Bypass with direct outreach (your 9/10 advantage)
  • Friction: Can't prove production thinking → Fix: Published architecture case studies

The Bypass Play — Using Your 9/10 Networking

Do not wait until Month 6 to start this. Start outreach in Month 2 as soon as you have one deployed project. This is your primary resume bypass strategy.

LinkedIn DM Template — ML Lead / CTO at Target Company:

"Hi [Name], I noticed [Company] is building [specific AI feature from their job description or product page]. I'm a final-year CSE student at IIT Dharwad, currently interning at Dassault Systèmes on their AI team working on [general area, no NDA details]. I built a [RAG system / deployed ML API / open-source contribution] recently — link: [URL]. I'm genuinely curious how your team handles [specific technical challenge they'd face]. Happy to share what I learned at Dassault if it's useful — or just to hear how you've solved it."

This works because: (1) you're not asking for a job, (2) you have a real link to click, (3) IIT + Dassault is a credible combination, (4) it shows specific technical curiosity about their work.

What To Apply To — A Decision Framework

⚠ Your Q38 Answer — "What should I apply to?"

You asked: "What jobs should I apply to and how do I know what the hiring manager wants me to showcase?" Here is the decision filter.

✅ Apply If the JD Contains

  • "Deploy ML models to production" — you can prove this
  • "RAG", "LLM integration", "GenAI" — you build this in Phase 2
  • "Python, FastAPI, Docker" — your stack
  • "Work independently with minimal supervision" — highlight Dassault
  • "Open-source contributions welcome" — you have these

❌ Skip (For Now)

  • FAANG / big tech — 12–24 months of competitive preparation needed
  • Pure research roles without production component — mismatch
  • Roles requiring 2+ years experience — you don't need these
  • IT services companies labelling ML work as "AI" — these aren't AI roles

Reading the Hiring Manager's Mind

Hiring managers at AI-first startups are asking exactly ONE question: "Can this person ship without me babysitting them?" Everything you build, write, and post must answer that question.

  • Deployed project (live URL) · 95%
  • From-scratch implementations · 90%
  • Open-source PR merged · 85%
  • Written case study (architecture) · 80%
  • Dassault Systèmes (credential) · 75%
  • LinkedIn post activity · 60%

Approximate hiring manager confidence boost from each proof asset. A live URL is your single most powerful credential.

Daily / Weekly Structure
Your Real Schedule (8AM–7PM at Work)

You're at Dassault Systèmes from 8 to 7. That leaves roughly 7PM–10PM on weekdays and full weekends. Here is how to use them without burning out.

Weekday Evening (7PM–10PM)

  • 7–7:30 · Decompress. No screen. Walk, eat. (You came from 11 hours at work.)
  • 7:30–9 · Deep work: coding, implementing, reading code. (Best focus window after rest.)
  • 9–9:30 · Write: LinkedIn post, doc update, README. (Public output = accountability.)
  • 9:30–10 · Read: one paper section or one file of a codebase. (Lower-energy but builds breadth.)

Weekend (Saturday + Sunday)

  • Sat AM · Build: 3-hour project session — implement and debug. Output: code committed to GitHub.
  • Sat PM · Deploy / test / benchmark — make something run. Output: a URL or script that works.
  • Sun AM · Write and publish: case study or tech post. Output: LinkedIn post published.
  • Sun PM · Review the week + plan next week. Rest. Output: written plan for Monday.

⚡ Your 7-Day Litmus Test

At the end of every week, ask: "What did I ship this week?"
If the answer is "I watched videos and read articles" — the week failed, regardless of how much you studied.
If the answer is "I committed X, published Y, or deployed Z" — the week succeeded. Ship something every week. No exceptions.

Your 6-Month North Star
Success Looks Like This

Concrete, measurable, undeniable.

Month 2

First Deployed URL

A real ML model served over a REST API. Docker container. Public GitHub. You understand every line.

Month 3

First PR Merged

Any contribution to an open-source ML repo with real users. Your name in someone else's codebase.

Month 4

RAG System Live

Full GenAI project deployed with documented failure modes and cost analysis. Case study written and published.

Month 5

10 Hiring Manager Conversations

Using your 9/10 networking score. Not applications — direct conversations. At least 3 that turn into interviews.

Month 6

First Offer

AI Engineer or ML Engineer role at an AI-first company. India or UK. You proved you can ship — not just talk.

Ongoing

Open Source Identity

You read production code and find bugs. You can say so publicly with receipts. This compounds forever.

💬 To Your Q40: "Will this work? Can I keep up?"

You scored 9/10 on following through for 6 months. You already said you'd jump straight in this week. You're at IIT Dharwad. You're inside an AI team at Dassault Systèmes. You network like a senior engineer. You build in public at 10/10. The only version of this that fails is if you keep consuming instead of building. The roadmap is designed around your real constraint: depth, not coverage. One topic at a time. One deployment at a time. You've done harder things. This is just consistent reps.
