# Roadmap

Progress tracking for every phase and lesson.

**Total estimated time: ~290 hours (at your own pace)**

**Legend:** ✅ Complete | 🚧 In Progress | ⬚ Planned

## Phase 0: Setup & Tooling — ✅ (~14 hours)

| # | Lesson | Status | Est. |
|---|--------|--------|------|
| 01 | Dev Environment | ✅ | ~75 min |
| 02 | Git & Collaboration | ✅ | ~45 min |
| 03 | GPU Setup & Cloud | ✅ | ~75 min |
| 04 | APIs & Keys | ✅ | ~75 min |
| 05 | Jupyter Notebooks | ✅ | ~75 min |
| 06 | Python Environments | ✅ | ~75 min |
| 07 | Docker for AI | ✅ | ~75 min |
| 08 | Editor Setup | ✅ | ~75 min |
| 09 | Data Management | ✅ | ~75 min |
| 10 | Terminal & Shell | ✅ | ~45 min |
| 11 | Linux for AI | ✅ | ~45 min |
| 12 | Debugging & Profiling | ✅ | ~75 min |

## Phase 1: Math Foundations — ✅ (~23 hours)

| # | Lesson | Status | Est. |
|---|--------|--------|------|
| 01 | Linear Algebra Intuition | ✅ | ~45 min |
| 02 | Vectors, Matrices & Operations | ✅ | ~75 min |
| 03 | Matrix Transformations & Eigenvalues | ✅ | ~75 min |
| 04 | Calculus for ML — Derivatives & Gradients | ✅ | ~45 min |
| 05 | Chain Rule & Automatic Differentiation | ✅ | ~75 min |
| 06 | Probability & Distributions | ✅ | ~45 min |
| 07 | Bayes' Theorem & Statistical Thinking | ✅ | ~75 min |
| 08 | Optimization — Gradient Descent Family | ✅ | ~75 min |
| 09 | Information Theory — Entropy, KL Divergence | ✅ | ~45 min |
| 10 | Dimensionality Reduction — PCA, t-SNE, UMAP | ✅ | ~75 min |
| 11 | Singular Value Decomposition | ✅ | ~75 min |
| 12 | Tensor Operations | ✅ | ~75 min |
| 13 | Numerical Stability | ✅ | ~45 min |
| 14 | Norms & Distances | ✅ | ~45 min |
| 15 | Statistics for ML | ✅ | ~45 min |
| 16 | Sampling Methods | ✅ | ~75 min |
| 17 | Linear Systems | ✅ | ~75 min |
| 18 | Convex Optimization | ✅ | ~75 min |
| 19 | Complex Numbers for AI | ✅ | ~45 min |
| 20 | The Fourier Transform | ✅ | ~75 min |
| 21 | Graph Theory for ML | ✅ | ~45 min |
| 22 | Stochastic Processes | ✅ | ~45 min |

## Phase 2: ML Fundamentals — ✅ (~21 hours)

| # | Lesson | Status | Est. |
|---|--------|--------|------|
| 01 | What Is Machine Learning — Types & Taxonomy | ✅ | ~45 min |
| 02 | Linear Regression from Scratch | ✅ | ~75 min |
| 03 | Logistic Regression & Classification | ✅ | ~75 min |
| 04 | Decision Trees & Random Forests | ✅ | ~75 min |
| 05 | Support Vector Machines | ✅ | ~75 min |
| 06 | K-Nearest Neighbors & Distance Metrics | ✅ | ~75 min |
| 07 | Unsupervised Learning — K-Means, DBSCAN | ✅ | ~75 min |
| 08 | Feature Engineering & Selection | ✅ | ~75 min |
| 09 | Model Evaluation — Metrics, Cross-Validation | ✅ | ~75 min |
| 10 | Bias, Variance & the Learning Curve | ✅ | ~45 min |
| 11 | Ensemble Methods — Boosting, Bagging, Stacking | ✅ | ~75 min |
| 12 | Hyperparameter Tuning & AutoML | ✅ | ~75 min |
| 13 | ML Pipelines & Experiment Tracking | ✅ | ~75 min |
| 14 | Naive Bayes — Multinomial, Gaussian, Bernoulli | ✅ | ~75 min |
| 15 | Time Series Fundamentals | ✅ | ~45 min |
| 16 | Anomaly Detection | ✅ | ~75 min |
| 17 | Handling Imbalanced Data | ✅ | ~75 min |
| 18 | Feature Selection | ✅ | ~75 min |

## Phase 3: Deep Learning Core — ✅ (~15 hours)

| # | Lesson | Status | Est. |
|---|--------|--------|------|
| 01 | The Perceptron — Where It All Started | ✅ | ~45 min |
| 02 | Multi-Layer Networks & Forward Pass | ✅ | ~75 min |
| 03 | Backpropagation from Scratch | ✅ | ~75 min |
| 04 | Activation Functions — ReLU, Sigmoid, GELU & Why | ✅ | ~45 min |
| 05 | Loss Functions — MSE, Cross-Entropy, Contrastive | ✅ | ~45 min |
| 06 | Optimizers — SGD, Momentum, Adam, AdamW | ✅ | ~75 min |
| 07 | Regularization — Dropout, Weight Decay, BatchNorm | ✅ | ~75 min |
| 08 | Weight Initialization & Training Stability | ✅ | ~45 min |
| 09 | Learning Rate Schedules & Warmup | ✅ | ~45 min |
| 10 | Build Your Own Mini Framework | ✅ | ~120 min |
| 11 | Introduction to PyTorch | ✅ | ~75 min |
| 12 | Introduction to JAX | ✅ | ~75 min |
| 13 | Debugging Neural Networks | ✅ | ~75 min |

## Phase 4: Computer Vision — ⬚ (~19 hours)

| # | Lesson | Status | Est. |
|---|--------|--------|------|
| 01 | Image Fundamentals — Pixels, Channels, Color Spaces | ⬚ | ~45 min |
| 02 | Convolutions from Scratch | ⬚ | ~75 min |
| 03 | CNNs — LeNet to ResNet | ⬚ | ~75 min |
| 04 | Image Classification | ⬚ | ~75 min |
| 05 | Transfer Learning & Fine-Tuning | ⬚ | ~75 min |
| 06 | Object Detection — YOLO from Scratch | ⬚ | ~75 min |
| 07 | Semantic Segmentation — U-Net | ⬚ | ~75 min |
| 08 | Instance Segmentation — Mask R-CNN | ⬚ | ~75 min |
| 09 | Image Generation — GANs | ⬚ | ~75 min |
| 10 | Image Generation — Diffusion Models | ⬚ | ~75 min |
| 11 | Stable Diffusion — Architecture & Fine-Tuning | ⬚ | ~75 min |
| 12 | Video Understanding — Temporal Modeling | ⬚ | ~45 min |
| 13 | 3D Vision — Point Clouds, NeRFs | ⬚ | ~45 min |
| 14 | Vision Transformers (ViT) | ⬚ | ~45 min |
| 15 | Real-Time Vision — Edge Deployment | ⬚ | ~75 min |
| 16 | Build a Complete Vision Pipeline | ⬚ | ~120 min |

## Phase 5: NLP — Foundations to Advanced — ⬚ (~19 hours)

| # | Lesson | Status | Est. |
|---|--------|--------|------|
| 01 | Text Processing — Tokenization, Stemming, Lemmatization | ⬚ | ~45 min |
| 02 | Bag of Words, TF-IDF & Text Representation | ⬚ | ~75 min |
| 03 | Word Embeddings — Word2Vec from Scratch | ⬚ | ~75 min |
| 04 | GloVe, FastText & Subword Embeddings | ⬚ | ~45 min |
| 05 | Sentiment Analysis | ⬚ | ~75 min |
| 06 | Named Entity Recognition (NER) | ⬚ | ~75 min |
| 07 | POS Tagging & Syntactic Parsing | ⬚ | ~45 min |
| 08 | Text Classification — CNNs & RNNs for Text | ⬚ | ~75 min |
| 09 | Sequence-to-Sequence Models | ⬚ | ~75 min |
| 10 | Attention Mechanism — The Breakthrough | ⬚ | ~45 min |
| 11 | Machine Translation | ⬚ | ~75 min |
| 12 | Text Summarization | ⬚ | ~75 min |
| 13 | Question Answering Systems | ⬚ | ~75 min |
| 14 | Information Retrieval & Search | ⬚ | ~75 min |
| 15 | Topic Modeling — LDA, BERTopic | ⬚ | ~45 min |
| 16 | Text Generation — Language Models Before Transformers | ⬚ | ~45 min |
| 17 | Chatbots — Rule-Based to Neural | ⬚ | ~75 min |
| 18 | Multilingual NLP | ⬚ | ~45 min |

## Phase 6: Speech & Audio — ⬚ (~13 hours)

| # | Lesson | Status | Est. |
|---|--------|--------|------|
| 01 | Audio Fundamentals — Waveforms, Sampling, Fourier Transform | ⬚ | ~45 min |
| 02 | Spectrograms, Mel Scale & Audio Features | ⬚ | ~45 min |
| 03 | Audio Classification | ⬚ | ~75 min |
| 04 | Speech Recognition (ASR) | ⬚ | ~45 min |
| 05 | Whisper — Architecture & Fine-Tuning | ⬚ | ~75 min |
| 06 | Speaker Recognition & Verification | ⬚ | ~45 min |
| 07 | Text-to-Speech (TTS) | ⬚ | ~75 min |
| 08 | Voice Cloning & Voice Conversion | ⬚ | ~75 min |
| 09 | Music Generation | ⬚ | ~75 min |
| 10 | Audio-Language Models | ⬚ | ~45 min |
| 11 | Real-Time Audio Processing | ⬚ | ~75 min |
| 12 | Build a Voice Assistant Pipeline | ⬚ | ~120 min |

## Phase 7: Transformers Deep Dive — 🚧 (~14 hours)

| # | Lesson | Status | Est. |
|---|--------|--------|------|
| 01 | Why Transformers — The Problems with RNNs | ⬚ | ~45 min |
| 02 | Self-Attention from Scratch | ✅ | ~75 min |
| 03 | Multi-Head Attention | ⬚ | ~75 min |
| 04 | Positional Encoding — Sinusoidal, RoPE, ALiBi | ⬚ | ~45 min |
| 05 | The Full Transformer — Encoder + Decoder | ⬚ | ~75 min |
| 06 | BERT — Masked Language Modeling | ⬚ | ~45 min |
| 07 | GPT — Causal Language Modeling | ⬚ | ~75 min |
| 08 | T5, BART — Encoder-Decoder Models | ⬚ | ~45 min |
| 09 | Vision Transformers (ViT) | ⬚ | ~45 min |
| 10 | Audio Transformers — Whisper Architecture | ⬚ | ~45 min |
| 11 | Mixture of Experts (MoE) | ⬚ | ~45 min |
| 12 | KV Cache, Flash Attention & Inference Optimization | ⬚ | ~75 min |
| 13 | Scaling Laws | ⬚ | ~45 min |
| 14 | Build a Transformer from Scratch — The Capstone | ⬚ | ~120 min |

## Phase 8: Generative AI — ⬚ (~14 hours)

| # | Lesson | Status | Est. |
|---|--------|--------|------|
| 01 | Generative Models — Taxonomy & History | ⬚ | ~45 min |
| 02 | Autoencoders & VAE | ⬚ | ~75 min |
| 03 | GANs — Generator vs Discriminator | ⬚ | ~75 min |
| 04 | Conditional GANs & Pix2Pix | ⬚ | ~75 min |
| 05 | StyleGAN | ⬚ | ~45 min |
| 06 | Diffusion Models — DDPM from Scratch | ⬚ | ~75 min |
| 07 | Latent Diffusion & Stable Diffusion | ⬚ | ~75 min |
| 08 | ControlNet, LoRA & Image Conditioning | ⬚ | ~75 min |
| 09 | Inpainting, Outpainting & Image Editing | ⬚ | ~75 min |
| 10 | Video Generation | ⬚ | ~45 min |
| 11 | Audio Generation | ⬚ | ~45 min |
| 12 | 3D Generation | ⬚ | ~45 min |
| 13 | Flow Matching & Rectified Flows | ⬚ | ~45 min |
| 14 | Evaluation — FID, CLIP Score, Human Preference | ⬚ | ~45 min |

## Phase 9: Reinforcement Learning — ⬚ (~13 hours)

| # | Lesson | Status | Est. |
|---|--------|--------|------|
| 01 | MDPs, States, Actions & Rewards | ⬚ | ~45 min |
| 02 | Dynamic Programming | ⬚ | ~75 min |
| 03 | Monte Carlo Methods | ⬚ | ~75 min |
| 04 | Temporal Difference — Q-Learning, SARSA | ⬚ | ~75 min |
| 05 | Deep Q-Networks (DQN) | ⬚ | ~75 min |
| 06 | Policy Gradient Methods — REINFORCE | ⬚ | ~75 min |
| 07 | Actor-Critic — A2C, A3C | ⬚ | ~75 min |
| 08 | Proximal Policy Optimization (PPO) | ⬚ | ~75 min |
| 09 | Reward Modeling & RLHF | ⬚ | ~45 min |
| 10 | Multi-Agent RL | ⬚ | ~45 min |
| 11 | Sim-to-Real Transfer | ⬚ | ~45 min |
| 12 | RL for Games | ⬚ | ~75 min |

## Phase 10: LLMs from Scratch — 🚧 (~18 hours)

| # | Lesson | Status | Est. |
|---|--------|--------|------|
| 01 | Tokenizers — BPE, WordPiece, SentencePiece | ✅ | ~45 min |
| 02 | Building a Tokenizer from Scratch | ✅ | ~75 min |
| 03 | Data Pipelines for Pre-Training | ✅ | ~75 min |
| 04 | Pre-Training a Mini GPT (124M) | ✅ | ~120 min |
| 05 | Scaling — Distributed Training, FSDP, DeepSpeed | ✅ | ~75 min |
| 06 | Instruction Tuning — SFT | ✅ | ~75 min |
| 07 | RLHF — Reward Model + PPO Training | ✅ | ~75 min |
| 08 | DPO — Direct Preference Optimization | ✅ | ~75 min |
| 09 | Constitutional AI & Self-Improvement | ⬚ | ~45 min |
| 10 | Evaluation — Benchmarks, Evals, LM Harness | ✅ | ~75 min |
| 11 | Quantization — INT8, GPTQ, AWQ, GGUF | ✅ | ~75 min |
| 12 | Inference Optimization | ✅ | ~75 min |
| 13 | Building a Complete LLM Pipeline | ⬚ | ~120 min |
| 14 | Open Models — Architecture Walkthroughs | ⬚ | ~45 min |

## Phase 11: LLM Engineering — ✅ (~15 hours)

| # | Lesson | Status | Est. |
|---|--------|--------|------|
| 01 | Prompt Engineering — Techniques & Patterns | ✅ | ~45 min |
| 02 | Few-Shot, Chain-of-Thought, Tree-of-Thought | ✅ | ~45 min |
| 03 | Structured Outputs | ✅ | ~75 min |
| 04 | Embeddings & Vector Representations | ✅ | ~75 min |
| 05 | Context Engineering | ✅ | ~75 min |
| 06 | RAG — Retrieval-Augmented Generation | ✅ | ~75 min |
| 07 | Advanced RAG | ✅ | ~75 min |
| 08 | Fine-Tuning with LoRA & QLoRA | ✅ | ~75 min |
| 09 | Function Calling & Tool Use | ✅ | ~75 min |
| 10 | Evaluation & Testing LLM Applications | ✅ | ~45 min |
| 11 | Caching, Rate Limiting & Cost Optimization | ✅ | ~45 min |
| 12 | Guardrails, Safety & Content Filtering | ✅ | ~45 min |
| 13 | Building a Production LLM Application | ✅ | ~120 min |

## Phase 12: Multimodal AI — ⬚ (~11 hours)

| # | Lesson | Status | Est. |
|---|--------|--------|------|
| 01 | Multimodal Representations | ⬚ | ~45 min |
| 02 | CLIP — Connecting Vision and Language | ⬚ | ~75 min |
| 03 | Vision-Language Models | ⬚ | ~45 min |
| 04 | Audio-Language Models | ⬚ | ~45 min |
| 05 | Document Understanding | ⬚ | ~75 min |
| 06 | Video-Language Models | ⬚ | ~45 min |
| 07 | Multimodal RAG | ⬚ | ~75 min |
| 08 | Multimodal Agents | ⬚ | ~75 min |
| 09 | Text-to-Image Pipelines | ⬚ | ~75 min |
| 10 | Text-to-Video Pipelines | ⬚ | ~75 min |
| 11 | Any-to-Any Models | ⬚ | ~45 min |

## Phase 13: Tools & Protocols — ⬚ (~11 hours)

| # | Lesson | Status | Est. |
|---|--------|--------|------|
| 01 | Function Calling Deep Dive | ⬚ | ~45 min |
| 02 | Tool Use Patterns | ⬚ | ~45 min |
| 03 | MCP — Model Context Protocol Fundamentals | ⬚ | ~45 min |
| 04 | Building MCP Servers | ⬚ | ~75 min |
| 05 | Building MCP Clients | ⬚ | ~75 min |
| 06 | MCP Resources, Prompts & Sampling | ⬚ | ~45 min |
| 07 | Structured Output Schemas | ⬚ | ~75 min |
| 08 | API Design for AI | ⬚ | ~75 min |
| 09 | Browser Automation & Web Agents | ⬚ | ~75 min |
| 10 | Build a Complete Tool Ecosystem | ⬚ | ~120 min |

## Phase 14: Agent Engineering — 🚧 (~17 hours)

| # | Lesson | Status | Est. |
|---|--------|--------|------|
| 01 | The Agent Loop | ✅ | ~45 min |
| 02 | Tool Dispatch & Registration | ⬚ | ~75 min |
| 03 | Planning — TodoWrite, DAGs, Goal Decomposition | ⬚ | ~75 min |
| 04 | Memory — Short-Term, Long-Term, Episodic | ⬚ | ~75 min |
| 05 | Context Window Management | ⬚ | ~45 min |
| 06 | Context Compression & Summarization | ⬚ | ~75 min |
| 07 | Subagents — Isolated Context, Delegation | ⬚ | ~75 min |
| 08 | Skills & Knowledge Loading | ⬚ | ~45 min |
| 09 | Permissions, Sandboxing & Safety | ⬚ | ~45 min |
| 10 | File-Based Task Systems | ⬚ | ~75 min |
| 11 | Background Task Execution | ⬚ | ~75 min |
| 12 | Error Recovery & Self-Healing | ⬚ | ~75 min |
| 13 | Hooks — PreToolUse, PostToolUse, SessionStart | ⬚ | ~45 min |
| 14 | Eval-Driven Agent Development | ⬚ | ~45 min |
| 15 | Build a Complete AI Agent from Scratch | ⬚ | ~120 min |

## Phase 15: Autonomous Systems — ⬚ (~11 hours)

| # | Lesson | Status | Est. |
|---|--------|--------|------|
| 01 | What Makes a System Autonomous | ⬚ | ~45 min |
| 02 | Autonomous Loops — The Core Pattern | ⬚ | ~75 min |
| 03 | Self-Healing Agents | ⬚ | ~75 min |
| 04 | AutoResearch — Autonomous Research Agents | ⬚ | ~75 min |
| 05 | Eval-Driven Loops | ⬚ | ~45 min |
| 06 | Human-in-the-Loop | ⬚ | ~45 min |
| 07 | Continuous Agents | ⬚ | ~45 min |
| 08 | Cost-Aware Autonomous Systems | ⬚ | ~45 min |
| 09 | Monitoring & Observability | ⬚ | ~45 min |
| 10 | Safety Boundaries — When to Stop | ⬚ | ~45 min |
| 11 | Build an Autonomous Coding Agent | ⬚ | ~120 min |

## Phase 16: Multi-Agent & Swarms — 🚧 (~15 hours)

| # | Lesson | Status | Est. |
|---|--------|--------|------|
| 01 | Why Multi-Agent | ✅ | ~45 min |
| 02 | Agent Teams — Roles, Specialization, Delegation | ⬚ | ~75 min |
| 03 | Communication Protocols | ✅ | ~45 min |
| 04 | Shared State & Coordination | ⬚ | ~75 min |
| 05 | Message Passing & Mailboxes | ⬚ | ~75 min |
| 06 | Task Markets — Agents Bidding for Work | ⬚ | ~45 min |
| 07 | Consensus Algorithms for Agents | ⬚ | ~75 min |
| 08 | Swarm Intelligence — Emergent Behavior | ⬚ | ~45 min |
| 09 | Agent Economies | ⬚ | ~45 min |
| 10 | Worktree Isolation | ⬚ | ~75 min |
| 11 | Hierarchical Swarms | ⬚ | ~45 min |
| 12 | Self-Organizing Systems | ⬚ | ~45 min |
| 13 | DAG-Based Orchestration | ⬚ | ~75 min |
| 14 | Build an Autonomous Agent Swarm | ⬚ | ~120 min |

## Phase 17: Infrastructure & Production — 🚧 (~13 hours)

| # | Lesson | Status | Est. |
|---|--------|--------|------|
| 01 | Model Serving | ✅ | ~75 min |
| 02 | Docker for AI Workloads | ✅ | ~75 min |
| 03 | Kubernetes for AI | ✅ | ~75 min |
| 04 | Edge Deployment | ⬚ | ~75 min |
| 05 | Observability | ⬚ | ~45 min |
| 06 | Cost Optimization | ⬚ | ~45 min |
| 07 | CI/CD for ML | ⬚ | ~75 min |
| 08 | A/B Testing & Feature Flags for AI | ⬚ | ~45 min |
| 09 | Data Pipelines | ⬚ | ~75 min |
| 10 | Security | ⬚ | ~45 min |
| 11 | Build a Production AI Platform | ⬚ | ~120 min |

## Phase 18: Ethics, Safety & Alignment — ⬚ (~5 hours)

| # | Lesson | Status | Est. |
|---|--------|--------|------|
| 01 | AI Ethics | ⬚ | ~45 min |
| 02 | Alignment | ⬚ | ~45 min |
| 03 | Red Teaming & Adversarial Testing | ⬚ | ~75 min |
| 04 | Responsible AI Frameworks | ⬚ | ~45 min |
| 05 | Privacy — Differential Privacy, Federated Learning | ⬚ | ~45 min |
| 06 | Interpretability | ⬚ | ~45 min |

## Phase 19: Capstone Projects — ⬚ (~10 hours)

| # | Project | Status | Est. |
|---|---------|--------|------|
| 01 | Build a Mini GPT & Chat Interface | ⬚ | ~120 min |
| 02 | Build a Multimodal RAG System | ⬚ | ~120 min |
| 03 | Build an Autonomous Research Agent | ⬚ | ~120 min |
| 04 | Build a Multi-Agent Development Team | ⬚ | ~120 min |
| 05 | Build a Production AI Platform | ⬚ | ~120 min |

---

**Total: 20 phases, 260+ lessons | 96 complete | ~290 hours estimated**

Want to help? Pick any ⬚ lesson and submit a PR. See [CONTRIBUTING.md](CONTRIBUTING.md).