Harshit Singh

Writing PRDs in the AI Era

Modern PRDs include prompts, evals, model choices, and failure modes alongside the classic user problem and success metric.

Why it matters

AI features require different documentation than traditional features. PRDs that don't capture model choice, prompts, evals, and failure modes create surprises in production.

The core idea

An AI PRD adds five sections to the classic PRD: (1) Model choice and rationale, (2) Prompt (full system prompt with rationale), (3) Eval plan with success thresholds, (4) Failure modes and UX recovery, (5) Cost projection. Together these capture the AI-specific design choices that traditional PRDs miss.

The AI PRD additions

On top of the classic PRD (problem, user, solution, metric), add:

Model choice. Which model, why. Frontier (Claude Opus, GPT-5) for quality. Mid-tier (Sonnet, mid-tier GPT) for cost. Open-source self-hosted for cost-at-scale. State the choice and the conditions that would change it.

The full system prompt. Versioned. Include it in the PRD. Engineers need to see it; future PMs need the context for why it's written this way.

Eval plan. What inputs, what scoring method, and what threshold qualifies as 'shipped.' For example: "100 evals, LLM-as-judge, 90% pass rate to ship; 85% acceptable for beta."
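The ship/beta thresholds from the example above can be written down as a small gate that turns an eval run into a release decision. This is a minimal sketch: the names `EvalPlan` and `release_decision` are hypothetical, and the numbers are simply the ones quoted in the example, not a recommendation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvalPlan:
    """Thresholds as they might appear in the PRD's eval table (illustrative values)."""
    total_cases: int = 100   # size of the eval set
    ship_pct: int = 90       # pass rate required for general release
    beta_pct: int = 85       # pass rate acceptable for a beta


def release_decision(passed: int, plan: EvalPlan = EvalPlan()) -> str:
    """Map the number of passing eval cases to 'ship', 'beta', or 'hold'."""
    if not 0 <= passed <= plan.total_cases:
        raise ValueError("passed must be between 0 and total_cases")
    # Compare in integer arithmetic to avoid floating-point edge cases at the threshold.
    score = passed * 100
    if score >= plan.ship_pct * plan.total_cases:
        return "ship"
    if score >= plan.beta_pct * plan.total_cases:
        return "beta"
    return "hold"
```

Writing the thresholds as data rather than prose makes the "what qualifies as shipped" question answerable by a script, not by a meeting.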

Failure modes + UX recovery. Hallucination → cite sources. Timeout → fall back to cached response. Refusal → escalate to human. List the modes and the design responses.
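One way to keep this list honest is to store it as a lookup table and check that every failure mode has a designed response. The sketch below is illustrative only: `FailureMode`, `RECOVERY`, and `unhandled_modes` are hypothetical names, and the three entries are just the ones listed above.

```python
from enum import Enum


class FailureMode(Enum):
    HALLUCINATION = "hallucination"
    TIMEOUT = "timeout"
    REFUSAL = "refusal"


# Failure mode -> UX recovery, mirroring the PRD section.
RECOVERY = {
    FailureMode.HALLUCINATION: "cite sources",
    FailureMode.TIMEOUT: "fall back to cached response",
    FailureMode.REFUSAL: "escalate to human",
}


def recovery_for(mode: FailureMode) -> str:
    """Return the designed UX response for a failure mode."""
    return RECOVERY[mode]


def unhandled_modes() -> list[FailureMode]:
    """List failure modes that have no recovery defined yet."""
    return [mode for mode in FailureMode if mode not in RECOVERY]
```

If a new member is added to `FailureMode` without a matching entry in `RECOVERY`, `unhandled_modes()` returns it, which is a cheap review check for the PRD.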

Cost projection. Tokens per call × price per token × calls per user per month gives a $/user/month number; multiply by users for the total bill. Compare to ARPU. Note: cost will likely drop about 2x in 12 months as model prices decrease.
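The arithmetic behind the cost projection can be sketched in a few lines. Everything numeric here is an assumption for illustration: the token counts, call volume, per-million-token prices, and ARPU are made-up inputs, not quotes from any provider's price list.

```python
def monthly_cost_per_user(
    input_tokens_per_call: int,
    output_tokens_per_call: int,
    calls_per_user_per_month: int,
    usd_per_million_input: float,
    usd_per_million_output: float,
) -> float:
    """Return the projected model cost in USD per user per month."""
    cost_per_call = (
        input_tokens_per_call * usd_per_million_input
        + output_tokens_per_call * usd_per_million_output
    ) / 1_000_000
    return cost_per_call * calls_per_user_per_month


def share_of_arpu(cost_per_user: float, arpu: float) -> float:
    """Return model cost as a fraction of average revenue per user."""
    if arpu <= 0:
        raise ValueError("arpu must be positive")
    return cost_per_user / arpu


# Hypothetical inputs: 2,000 input and 500 output tokens per call,
# 60 calls per user per month, $3 and $15 per million tokens, $20 ARPU.
cost = monthly_cost_per_user(2000, 500, 60, 3.0, 15.0)
share = share_of_arpu(cost, 20.0)
```

With these made-up inputs the cost comes to roughly $0.81 per user per month, about 4% of a $20 ARPU. The useful part in a PRD is the formula and the inputs, so the number can be recomputed when prices or usage change.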

The AI PRD anti-patterns

  • 'We'll use GPT-4.' Vague. Specify model, version, why.
  • Skipping the prompt. The prompt IS the feature. It belongs in the PRD.
  • No eval plan. Means you'll ship and hope.
  • No failure-mode UX. The model WILL hallucinate; design for it.
  • No cost model. You'll be surprised on launch day.

The shorter doc principle

AI PRDs can be short, often 1-2 pages, if the additions are tight. Don't pad. The model choice can be one sentence. The eval plan can be a table. The prompt is the prompt (probably the longest section).

ChatPRD and similar tools

Tools like ChatPRD use AI to draft PRDs from bullet points. They're useful for first drafts but they miss:

  • The 'why now' strategic context
  • The cost/quality/latency reasoning
  • The specific failure-mode UX choices

The PM still does the judgment work. AI just removes the typing.

Real-world examples

Anthropic
Prompts in version control

Anthropic's product teams version-control their prompts alongside code. The PRD references the prompt by file path. The discipline keeps prompts reviewable, testable, and rollback-able.
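As a rough illustration of what "reviewable, testable, and rollback-able" can mean in practice, a PRD could pin a short fingerprint of the prompt text next to the file path, and a test could fail when the two drift apart. This is a hypothetical sketch, not a description of any team's actual tooling; `prompt_fingerprint` and `prompt_matches_pin` are made-up names.

```python
import hashlib


def prompt_fingerprint(prompt_text: str) -> str:
    """Return a short, stable fingerprint of the prompt text (first 12 hex chars of SHA-256)."""
    return hashlib.sha256(prompt_text.encode("utf-8")).hexdigest()[:12]


def prompt_matches_pin(prompt_text: str, pinned_fingerprint: str) -> bool:
    """Check that the prompt in the repo is the version the PRD was written against."""
    return prompt_fingerprint(prompt_text) == pinned_fingerprint


# Usage: the PRD records the fingerprint alongside the prompt's file path.
v1 = "You are a support assistant. Cite a source for every factual claim."
pin = prompt_fingerprint(v1)
```

Any edit to the prompt changes the fingerprint, so a mismatch signals that either the PRD or the prompt needs a review.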

Interview questions (1)

Q1
What's different about a PRD for an AI feature vs a traditional feature?

Five additions on top of the classic PRD:

  1. Model choice and rationale. Which model, why. Sets the cost/quality bar.
  2. The full system prompt. Versioned. Engineers and future PMs need to see it.
  3. Eval plan with thresholds. What inputs, how scored, what % pass rate qualifies as shippable.
  4. Failure modes and UX recovery. Hallucination → cite sources. Timeout → fallback. Bad output → easy correction.
  5. Cost projection. Per-user-per-month at scale. Compare to ARPU.

The classic PRD answers what / why / how. The AI PRD adds what could go wrong and how we'll handle it. The failure-mode discipline is the biggest difference: AI features fail in ways traditional features don't, and the PRD has to design for that.
