---
id: ins_post-training-as-the-moat
operator: Asha Sharma
operator_role: CVP, Microsoft AI Platform
source_url: https://www.lennysnewsletter.com/p/how-80000-companies-build-with-ai-asha-sharma
source_type: podcast
source_title: Asha Sharma — Product as organism, post-training, agentic society — Lenny's Podcast
source_date: 2026-04-28
captured_date: 2026-05-01
domain: [ai-native, product, gtm]
lifecycle: [strategy-bets]
maturity: frontier
artifact_class: framework
score: { originality: 5, specificity: 4, evidence: 5, transferability: 5, source: 5 }
tier: A
related: [ins_product-as-organism, ins_seasons-not-roadmaps]
raw_ref: raw/podcasts/asha-sharma--product-as-organism--2026-04-28.md
---

# The economic moat in AI is post-training on proprietary data, not pre-training a base model

## Claim

Beyond ~30B parameters, the capex of pre-training your own base model no longer makes economic sense. The defensible asset shifts to post-training: fine-tuning, RAG, reward design, and proprietary feedback loops on data you uniquely capture. Cursor is the canonical proof: its $300M ARR moat is the record of which suggestions users accept and reject, not its IDE chrome.

## Mechanism

Pre-training capex scales with model size; revenue from a self-trained base model rarely justifies the spend once a frontier provider exists. Post-training is the cheaper, defensible layer: your fine-tuned model's behavior is a function of your accumulated user interactions. As long as you keep collecting interaction data, the model gap to competitors compounds. Pre-training is a one-time spend in a public market; post-training is a moat that compounds in your private one. (A minimal sketch of the data loop appears at the end of this note.)

## Conditions

Holds when:

- You have a UX surface that produces ranking-worthy interaction signals (accepts/rejects, ratings, follow-ups).
- You can run the post-training loop frequently: synthetic data generation, eval design, RAG updates.

Fails when:

- The product surface produces sparse or noisy signals. Garbage data degrades fine-tunes.
- The base model improves so quickly that your fine-tune becomes redundant. Watch for capability waves that obsolete your customizations.

## Evidence

Asha cites Nathan Lambert's research: "Once a model hits 30B parameters, the CapEx to pre-train doesn't make economic sense; you should fine-tune instead."

> "50% of developers are now fine-tuning. When you go through the full loop — synthetic data generation, rewards design, A/B testing rigorously, extract job-to-be-done — you get better results faster." · Asha Sharma on Lenny's Podcast, 2026-04-28

Cursor's moat is named explicitly: data from accepted vs. rejected suggestions, retrained on continuously.

## Signals

- The team's data flywheel produces monthly model updates with measurable improvement.
- Fine-tuning is on a normal release cadence, not a research project.
- The product gets better the more it's used, on measurable axes (suggestion acceptance rate, time-to-task, etc.).

## Counter-evidence

Benjamin Mann at Anthropic argues the foundation-model layer is still where the most compounding happens: "the model will eat your scaffolding for breakfast." Sherwin Wu echoes this: customer fine-tunes can be obsoleted by the next base model. The post-training moat holds for the current model generation; a pre-training disruption can collapse it overnight.

## Cross-references

- `ins_product-as-organism`, the broader framing this moat lives inside
- `ins_seasons-not-roadmaps`, the cadence implication
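
## Sketch: turning interaction signals into post-training data

A minimal sketch of the flywheel described in Mechanism, under stated assumptions: the event fields, function names, and the chosen/rejected preference-pair format are illustrative, not Cursor's or Microsoft's actual pipeline, which the source does not describe. The point is only that the accept/reject signals a product already emits can be converted into a preference dataset for a recurring fine-tune.

```python
"""Hypothetical post-training data loop: accept/reject events -> preference pairs."""
from dataclasses import dataclass
import json


@dataclass
class InteractionEvent:
    """One user decision on a model suggestion (the ranking-worthy signal)."""
    prompt: str       # context the model saw, e.g. surrounding code
    suggestion: str   # what the model proposed
    accepted: bool    # did the user keep it?
    final_text: str   # what actually shipped (the user's edit if rejected)


def to_preference_pairs(events: list[InteractionEvent]) -> list[dict]:
    """Turn raw accept/reject events into chosen/rejected preference pairs.

    A rejected suggestion is only usable when we also captured what the user
    wrote instead; that edit becomes the 'chosen' completion.
    """
    pairs = []
    for e in events:
        if e.accepted:
            continue  # accepted suggestions carry no counterfactual here
        if e.final_text and e.final_text != e.suggestion:
            pairs.append({
                "prompt": e.prompt,
                "chosen": e.final_text,    # what the user preferred
                "rejected": e.suggestion,  # what the model offered
            })
    return pairs


def write_jsonl(pairs: list[dict], path: str) -> None:
    """Persist pairs in the JSONL layout most preference-tuning tools accept."""
    with open(path, "w", encoding="utf-8") as f:
        for p in pairs:
            f.write(json.dumps(p, ensure_ascii=False) + "\n")


if __name__ == "__main__":
    # Toy events standing in for a day of product telemetry.
    events = [
        InteractionEvent("def add(a, b):", "return a - b", False, "return a + b"),
        InteractionEvent("def greet(name):", "return f'hi {name}'", True,
                         "return f'hi {name}'"),
    ]
    pairs = to_preference_pairs(events)
    write_jsonl(pairs, "preference_pairs.jsonl")
    print(f"{len(pairs)} preference pair(s) ready for the next fine-tune run")
```

On a real surface the loop is the same shape: log every accept and reject, pair rejections with the user's replacement, and feed the resulting JSONL into whatever preference-tuning method the current base model supports.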