operatorlab.ai
Developer Tooling

AI-Native Feature Delivery

A glass-box demo of shipping a non-trivial feature from idea to production using a PRD-first, agent-assisted workflow — including the wrong turns and the recovery.

This is the demo I run when someone asks "what does AI-native delivery actually look like, end to end?" — not the marketing answer, the honest one. The feature itself doesn't matter; the workflow does. The story is intentionally messy because the real version always is.

Problem

A small product team wants to ship a "scheduled re-runs" feature for their data pipeline tool. It's the kind of work that's not technically hard but touches a lot of surface area: API, scheduler, billing limits, UI, docs, and a notification path. Estimated by gut: 3–4 weeks. Estimated by the JIRA breakdown: 6 weeks.
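The surface area can be made concrete with a hypothetical data model for the feature. Everything here (class name, fields) is illustrative, not from the demo repo; each field stands in for one of the surfaces listed above.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

# Hypothetical shape of a "scheduled re-run". Each field maps to one surface
# the feature touches: API (the payload), scheduler (cron), billing
# (max_runs_per_month), notifications (notify_on_failure).
@dataclass
class RerunSchedule:
    pipeline_id: str
    cron: str                       # scheduler: when to re-run
    max_runs_per_month: int         # billing: plan-level cap
    notify_on_failure: bool         # notification path
    cancelled_at: Optional[datetime] = None  # cancellation semantics

    def is_active(self, now: datetime) -> bool:
        # A cancelled schedule stops producing runs but keeps its history.
        return self.cancelled_at is None or now < self.cancelled_at
```

One dataclass, five fields, five teams with opinions: that is why the estimate is weeks, not days.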

Traditional workflow

  • Product writes a doc. Engineering reads it, mostly. Two clarifying questions get asked in Slack and answered with emojis.
  • An engineer scaffolds the API change. Another picks up the UI. Sometime in week 2, they discover they assumed different things about how cancellation works.
  • Docs get written in the last 48 hours before launch by whoever lost the rock-paper-scissors.
  • The feature ships. Two weeks later, a customer hits an edge case nobody considered.

This works. Most software ships this way. The problem isn't quality — it's cycle time and consistency: the same team will execute the same shape of feature in 3 weeks or 8 weeks depending on which assumptions hold.

AI-native workflow

The structural change is putting the PRD-as-executable-spec at the center, and treating the agent as a teammate who reads it before doing anything.

Feature delivery, end to end
  1. PRD: tight, agent-readable
  2. Plan: agent drafts architecture
  3. Vertical slice: API + UI + docs together
  4. Review: human catches the taste calls
  5. Iterate: same loop, narrower scope

What the human does:

  1. Writes a one-page PRD with explicit non-goals (no recurring billing for v1; no per-step retry policy).
  2. Pastes it into a plan-mode conversation. The agent asks three clarifying questions — all real tradeoffs, not "what framework?"
  3. Reviews the proposed plan. Edits it directly. Approves.
  4. The agent ships a vertical slice: the simplest end-to-end path that exercises every layer. Not all the features. The thinnest one that proves the architecture.
  5. Human runs it. Notices the cancellation semantics are wrong. Updates the PRD. The agent fixes it everywhere, including the docs and the test.
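A "vertical slice" in this sense can be sketched as the thinnest code path that touches every layer. This is an illustrative sketch under assumed names (`create_schedule`, `next_run`, `cancel`), not the demo's actual code; the real layers are stubbed.

```python
from typing import Optional

def create_schedule(pipeline_id: str, cron: str) -> dict:
    # API layer: validate and persist the schedule (persistence stubbed as a dict).
    if not cron.strip():
        raise ValueError("cron expression required")
    return {"pipeline_id": pipeline_id, "cron": cron, "cancelled": False}

def next_run(schedule: dict) -> Optional[str]:
    # Scheduler layer: a cancelled schedule produces no further runs.
    # (Real cron parsing elided; returns a placeholder slot.)
    if schedule["cancelled"]:
        return None
    return f"next run for {schedule['pipeline_id']} per '{schedule['cron']}'"

def cancel(schedule: dict) -> None:
    # The semantics the team disagreed about: cancelling stops future runs
    # but does not delete the schedule or its run history.
    schedule["cancelled"] = True

# The thinnest end-to-end path: create, schedule, cancel, verify no new runs.
s = create_schedule("pipe-42", "0 6 * * *")
assert next_run(s) is not None
cancel(s)
assert next_run(s) is None
```

The point of the slice is that a wrong answer to "what does cancel mean?" fails this path on day one, not in week two.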

Technical breakdown

  • PRD: 600 words, one page, lives in the repo at docs/features/scheduled-reruns.md.
  • Plan: written by the agent into a tracked plan file, reviewed in-line, approved.
  • Execution: a single conversation handles the API change, the UI, the docs, and the tests in vertical-slice order. The agent runs the test suite after each layer and stops if anything breaks.
  • Review surface: a single PR with a generated description that links back to the PRD and the plan. The human reviewer focuses on the three or four taste calls the agent flagged.
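The run-tests-after-each-layer loop can be sketched as follows. `apply_layer` and `run_tests` are hypothetical stand-ins for the agent's tooling, not an API the demo exposes; the shape of the loop is the point.

```python
def deliver(layers, apply_layer, run_tests):
    """Apply layers in vertical-slice order; halt at the first red suite.

    Returns (completed_layers, failed_layer_or_None).
    """
    completed = []
    for layer in layers:
        apply_layer(layer)
        if not run_tests():
            return completed, layer  # stop: report what shipped and what broke
        completed.append(layer)
    return completed, None

# Usage with toy stand-ins: the suite goes red while the docs layer lands.
suite_results = [True, True, False]
done, failed = deliver(
    ["api", "ui", "docs", "tests"],
    apply_layer=lambda layer: None,
    run_tests=lambda: suite_results.pop(0),
)
assert done == ["api", "ui"] and failed == "docs"
```

Stopping at the first failure is what keeps the conversation reviewable: the human sees one broken layer, not four layers of compounded drift.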

Operational impact

  • Cycle time from idea to merged PR: 4 days, including review and the cancellation-bug iteration.
  • PR review time: 35 minutes (vs. ~2 hours for the same shape of feature pre-workflow). The reviewer's time is concentrated on the actually-interesting decisions.
  • Documentation quality at merge time: as good as anything in the codebase. It was written with the code, not after.
  • Test coverage of the new path: 100% of the happy path, 80% of the failure modes. The agent doesn't get bored writing test #14.
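The failure-mode coverage is where the "doesn't get bored" claim bites: table-driven tests like the sketch below are cheap for an agent to extend to case #14 and beyond. The validator and its rules are hypothetical, shaped to match the feature's surfaces.

```python
def validate_schedule(payload: dict) -> list:
    """Return the list of validation errors for a schedule-creation payload."""
    errors = []
    if not payload.get("pipeline_id"):
        errors.append("pipeline_id required")
    if not payload.get("cron"):
        errors.append("cron required")
    if payload.get("max_runs_per_month", 1) < 1:
        errors.append("max_runs_per_month must be positive")  # billing cap
    return errors

# Table-driven failure modes: (payload, expected number of errors).
FAILURE_CASES = [
    ({}, 2),                                          # missing everything
    ({"pipeline_id": "p1"}, 1),                       # missing cron
    ({"pipeline_id": "p1", "cron": "0 6 * * *",
      "max_runs_per_month": 0}, 1),                   # billing cap invalid
]

for payload, expected_error_count in FAILURE_CASES:
    assert len(validate_schedule(payload)) == expected_error_count
```

Adding the next edge case is one line in the table, which is exactly the kind of work humans skip and agents don't.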

Lessons learned

  • Vertical slices over horizontal layers. The agent's natural tendency is to "finish the API, then the UI, then the docs." Override it. Build the thinnest end-to-end path first. The bugs you find are the architecture bugs, not the integration bugs.
  • The PRD owns the truth, not the chat history. When something is wrong, fix the PRD first, then ask the agent to re-derive. Trying to patch the implementation around a broken brief is how you end up with code that works but nobody understands.
  • Reviewers should review plans, not just code. Most of the leverage is at the plan stage. By the time the PR is open, the architectural die is cast.
  • Document the wrong turn. The most valuable artifact from this kind of workflow is often the short post-mortem on what the agent got wrong and what we changed in the PRD as a result. That's the playbook for the next feature.