---
id: ins_post-training-as-the-moat
operator: Asha Sharma
operator_role: CVP, Microsoft AI Platform
source_url: https://www.lennysnewsletter.com/p/how-80000-companies-build-with-ai-asha-sharma
source_type: podcast
source_title: Asha Sharma — Product as organism, post-training, agentic society — Lenny's Podcast
source_date: 2026-04-28
captured_date: 2026-05-01
domain: [ai-native, product, gtm]
lifecycle: [strategy-bets]
maturity: frontier
artifact_class: framework
score: { originality: 5, specificity: 4, evidence: 5, transferability: 5, source: 5 }
tier: A
related: [ins_product-as-organism, ins_seasons-not-roadmaps]
raw_ref: raw/podcasts/asha-sharma--product-as-organism--2026-04-28.md
---

# The economic moat in AI is post-training on proprietary data, not pre-training a base model

## Claim

Beyond ~30B parameters, the capex of pre-training your own base model no longer makes economic sense. The defensible asset shifts to post-training: fine-tuning, RAG, reward design, and proprietary feedback loops on data you uniquely capture. Cursor is the canonical proof: its $300M ARR moat is the record of which suggestions users accept and reject, not its IDE chrome.

## Mechanism

Pre-training capex scales with model size; revenue from a self-trained base model rarely justifies the spend once a frontier provider exists. Post-training is the cheaper, defensible layer: your fine-tuned model's behavior is a function of your accumulated user interactions. As long as you keep collecting interaction data, the model gap to competitors compounds. Pre-training is a one-time spend in a public market; post-training is a moat that compounds in your private one. (A minimal sketch of the data loop appears at the end of this note.)

## Conditions

Holds when:

- You have a UX surface that produces ranking-worthy interaction signals (accepts/rejects, ratings, follow-ups).
- You can run the post-training loop frequently: synthetic data generation, eval design, RAG updates.

Fails when:

- The product surface produces sparse or noisy signals. Garbage data degrades fine-tunes.
- The base model improves so quickly that your fine-tune becomes redundant. Watch for capability waves that obsolete your customizations.

## Evidence

Asha cites Nathan Lambert's research: "Once a model hits 30B parameters, the CapEx to pre-train doesn't make economic sense; you should fine-tune instead."

> "50% of developers are now fine-tuning. When you go through the full loop — synthetic data generation, rewards design, A/B testing rigorously, extract job-to-be-done — you get better results faster." · Asha Sharma on Lenny's Podcast, 2026-04-28

Cursor's moat is named explicitly: data from accepted vs. rejected suggestions, retrained on continuously.

## Signals

- The team's data flywheel produces monthly model updates with measurable improvement.
- Fine-tuning is on a normal release cadence, not a research project.
- The product gets better the more it's used, on measurable axes (suggestion acceptance rate, time-to-task, etc.).

## Counter-evidence

Benjamin Mann at Anthropic argues the foundation-model layer is still where the most compounding happens: "the model will eat your scaffolding for breakfast." Sherwin Wu echoes this: customer fine-tunes can be obsoleted by the next base model. The post-training moat holds for the current model generation; a pre-training disruption can collapse it overnight.

## Cross-references

- `ins_product-as-organism`, the broader framing this moat lives inside
- `ins_seasons-not-roadmaps`, the cadence implication
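
## Sketch: turning interaction signals into post-training data

A minimal sketch of the flywheel described in Mechanism, under stated assumptions: the event fields, function names, and the chosen/rejected preference-pair format are illustrative, not Cursor's or Microsoft's actual pipeline, which the source does not describe. The point is only that the accept/reject signals a product already emits can be converted into a preference dataset for a recurring fine-tune.

```python
"""Hypothetical post-training data loop: accept/reject events -> preference pairs."""
from dataclasses import dataclass
import json


@dataclass
class InteractionEvent:
    """One user decision on a model suggestion (the ranking-worthy signal)."""
    prompt: str       # context the model saw, e.g. surrounding code
    suggestion: str   # what the model proposed
    accepted: bool    # did the user keep it?
    final_text: str   # what actually shipped (the user's edit if rejected)


def to_preference_pairs(events: list[InteractionEvent]) -> list[dict]:
    """Turn raw accept/reject events into chosen/rejected preference pairs.

    A rejected suggestion is only usable when we also captured what the user
    wrote instead; that edit becomes the 'chosen' completion.
    """
    pairs = []
    for e in events:
        if e.accepted:
            continue  # accepted suggestions carry no counterfactual here
        if e.final_text and e.final_text != e.suggestion:
            pairs.append({
                "prompt": e.prompt,
                "chosen": e.final_text,    # what the user preferred
                "rejected": e.suggestion,  # what the model offered
            })
    return pairs


def write_jsonl(pairs: list[dict], path: str) -> None:
    """Persist pairs in the JSONL layout most preference-tuning tools accept."""
    with open(path, "w", encoding="utf-8") as f:
        for p in pairs:
            f.write(json.dumps(p, ensure_ascii=False) + "\n")


if __name__ == "__main__":
    # Toy events standing in for a day of product telemetry.
    events = [
        InteractionEvent("def add(a, b):", "return a - b", False, "return a + b"),
        InteractionEvent("def greet(name):", "return f'hi {name}'", True,
                         "return f'hi {name}'"),
    ]
    pairs = to_preference_pairs(events)
    write_jsonl(pairs, "preference_pairs.jsonl")
    print(f"{len(pairs)} preference pair(s) ready for the next fine-tune run")
```

On a real surface the loop is the same shape: log every accept and reject, pair rejections with the user's replacement, and feed the resulting JSONL into whatever preference-tuning method the current base model supports.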