The cached-example fallback pattern for live AI demos
A small UX trick that makes every public AI demo on this site work convincingly with zero API keys, zero rate-limit anxiety, and zero "is it broken?" moments.
Every lab on OperatorLab is gated by an ANTHROPIC_API_KEY and a per-IP rate limit. Without the cached-example fallback, both gates would produce an empty error state in exactly the cases a first-time visitor is most likely to hit: clicking "Run" before the deployment has a key configured, or being the 11th visitor that hour.
This is the pattern I landed on after one bad first run.
What I tried first
The naive version: the API route returns 503 when the key is missing, and the client shows a toast saying "configure your API key" (a rough sketch of this version follows the list). This was wrong for three reasons:
- The user has no API key to configure. They're on a public demo site. The "fix" requires them to leave and never come back.
- The empty error state looks broken. Anyone screenshotting the site for a write-up captures the broken state.
- The point of the lab is the output, not the interactivity. If they can see what a real run would produce, they understand the value even without running it.
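For reference, a minimal sketch of roughly what that first version looked like. The showToast helper and the route path are illustrative stand-ins, not the actual code:

```ts
// Naive version (since replaced): any failure becomes a dead-end toast.
declare function showToast(message: string): void; // illustrative toast helper

async function runLab(input: unknown): Promise<void> {
  const res = await fetch("/api/labs/sales-enablement", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(input),
  });
  if (!res.ok) {
    // Advice the visitor can't act on: they have no key to configure.
    showToast("Configure your API key to run this lab.");
    return;
  }
  // ...stream and render the live output...
}
```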
What I shipped
Every lab API route returns a 503 (no key) or 429 (rate limit) with a JSON body that includes a hand-written cached example run:
```ts
if (!hasAnthropicKey()) {
  return Response.json(
    {
      error: "ANTHROPIC_API_KEY is not configured on this deployment.",
      example: true,
      body: SALES_ENABLEMENT_EXAMPLE,
    },
    { status: 503 },
  );
}
```

The client hook (useLabRun) reads the JSON body in the error path and renders body into the streaming panel exactly as if it had streamed, with a yellow notice strip explaining why:
Live model unavailable — showing a cached example run.
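A condensed sketch of what that error path in useLabRun might look like, assuming the JSON shape shown above; the setter names and the 429 wording are assumptions, not the real hook:

```ts
type LabErrorBody = { error: string; example?: boolean; body?: string };

// On a non-OK response, look for a cached example before surfacing a raw
// error. Returns true when the caller should stream the live output.
async function handleLabResponse(
  res: Response,
  setOutput: (text: string) => void,
  setNotice: (text: string) => void,
  setError: (text: string) => void,
): Promise<boolean> {
  if (res.ok) return true;
  const data: LabErrorBody | null = await res.json().catch(() => null);
  if (data?.example && typeof data.body === "string") {
    setOutput(data.body); // render exactly as if it had streamed
    setNotice(
      res.status === 429
        ? "Rate limited right now; showing a cached example run." // stand-in wording
        : "Live model unavailable — showing a cached example run.",
    );
  } else {
    setError(data?.error ?? "Request failed."); // no cached example to fall back to
  }
  return false;
}
```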
The cached examples themselves are hand-written, scenario-specific, and live in lib/ai/prompts/<lab>.ts next to the system prompt. They model what the good output looks like for a representative input, which doubles as documentation for the system prompt itself.
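The file shape, sketched; the prompt and example text here are invented placeholders, not the real lab content:

```ts
// lib/ai/prompts/sales-enablement.ts (shape illustrative, content invented)
export const SALES_ENABLEMENT_SYSTEM_PROMPT = `
You are a sales-enablement assistant. Given an industry and a customer
profile, produce a concrete, skimmable battlecard.
`;

// Hand-written model of a good run for one representative scenario.
// Doubles as documentation of what the system prompt should produce.
export const SALES_ENABLEMENT_EXAMPLE = `
## Battlecard: mid-market logistics prospect
- Lead with the integration story; their current stack is fragmented.
- Anticipate the "we already have a CRM" objection early.
`;
```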
What I learned
- The cached example became the source of truth for the system prompt. When the live output drifted from the cached example, that meant the prompt had decayed and needed updating.
- The notice strip is load-bearing. Without it, sophisticated users assume the example IS the live output and dock the lab for being scripted. With it, they understand instantly.
- The 429 case felt more graceful than the 503 case. Rate-limited users see the same example with a slightly different message and don't feel punished for clicking (the route-side branch is sketched after this list).
- Total cost: ~150 lines across all 4 labs. Cheaper than the toast-and-link pattern would have been once you account for the abandoned-on-error UX cost.
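The 429 branch referenced above, sketched as it would sit in the route handler next to the 503 check; isRateLimited and the message wording are assumptions:

```ts
declare function isRateLimited(req: Request): Promise<boolean>; // illustrative limiter

// Same fallback body as the 503 branch; only the status and message differ,
// so rate-limited visitors still see a complete example run.
if (await isRateLimited(request)) {
  return Response.json(
    {
      error: "Hourly rate limit reached on this demo. Try again soon.", // stand-in wording
      example: true,
      body: SALES_ENABLEMENT_EXAMPLE,
    },
    { status: 429 },
  );
}
```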
When the pattern doesn't work
The pattern requires that the cached example be plausible — close enough to a real run that knowledgeable users don't catch the seams. If the lab's output is highly user-input-specific (e.g., "summarize this PDF the user uploaded"), there's no representative cached example to fall back to. In that case, the right move is probably to fail closed: render a clear empty state explaining what input is missing, not a fake-looking example.
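A sketch of failing closed for a hypothetical input-specific lab; every name and message here is invented for illustration:

```ts
// Inside a hypothetical upload-summarization route: no plausible cached
// example exists for user-specific input, so return an explicit empty state.
if (!hasAnthropicKey()) {
  return Response.json(
    {
      error: "Live summarization is unavailable on this deployment.",
      example: false, // tells the client not to render a fallback run
      hint: "This lab needs your uploaded PDF plus a configured key.",
    },
    { status: 503 },
  );
}
```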
For all four labs on this site, the inputs are scenario-shaped (industry, customer profile, etc.), so a representative cached scenario covers the demo case cleanly.
Code
The full pattern lives in:
- lib/labs/use-lab-run.ts — the client hook
- app/api/labs/sales-enablement/route.ts — the smallest example route
- lib/ai/prompts/sales-enablement.ts — the cached example next to the system prompt