AI-Assisted Architecture Review
A repeatable two-pass workflow for reviewing system design docs with an agent in the loop: one pass for the writing, one for the structural critique.
9 min read
Architecture reviews fail in two distinct ways. The structural failure: the design is wrong and the reviewers miss it because they're busy noting that the diagrams are unlabeled. The writing failure: the design is fine but the doc is so dense nobody read past the first page, so it ships without real scrutiny. An agent does well on the second; humans should stay focused on the first.
This workflow separates the two passes deliberately.
When to run this
- The doc is more than 5 pages.
- At least two of the reviewers are senior enough that their time is the bottleneck.
- The change is reversible enough that "ship and iterate" is on the table, but irreversible enough that you'd regret a bad call.
If it's a one-page diagram in a Slack thread, skip this. The overhead beats the win.
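The checklist above can be sketched as a tiny triage function. This is a minimal sketch; the function name, parameters, and thresholds are illustrative, not part of any real tool:

```python
# Hypothetical triage helper mirroring the "when to run this" checklist.
# Thresholds (page count, senior-reviewer count) are illustrative.
def should_run_two_pass_review(pages: int,
                               senior_reviewers: int,
                               reversible: bool,
                               costly_if_wrong: bool) -> bool:
    """Return True when the two-pass workflow is worth its overhead."""
    if pages <= 5:
        return False  # a short doc doesn't need the machinery
    if senior_reviewers < 2:
        return False  # reviewer time isn't the bottleneck
    # Worth it only in the middle band: reversible enough that
    # "ship and iterate" is on the table, but costly enough that
    # you'd regret a bad call.
    return reversible and costly_if_wrong
```

A one-page diagram in a Slack thread fails the first check immediately, which is the point.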
Pass 1 — the agent does the writing review
- Step 1: Paste the doc. Whole doc + scope.
- Step 2: Structural check. TL;DR, sections, missing pieces.
- Step 3: Annotate. Inline questions a reader would have.
- Step 4: Rewrite the TL;DR. If it's not 5 lines, it's wrong.
The agent is doing the work a junior PM with infinite patience would do — checking for clarity, finding the missing TL;DR, calling out where assumptions are stated as facts, flagging diagrams without legends. This pass costs ~$0.20 and takes ~3 minutes.
The opening prompt that works:
Review this architecture doc for clarity and completeness, NOT for technical correctness. We're going to do the technical review with humans next. Specifically:
1. Is the TL;DR five lines or less? If not, rewrite it.
2. List every section where an assumption is stated as a fact.
3. List every diagram without a legend.
4. List every term used before it's defined.
5. End with: "If I were a reviewer who'd read this once, the three questions I'd ask in the meeting are..."
Be terse. Don't praise the doc.
The "three questions a reviewer would ask" output is the most valuable part. It surfaces the ambiguities that would otherwise eat the first 15 minutes of the meeting.
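If you run pass 1 more than once, it's worth scripting. Here's a minimal sketch: the prompt string reproduces the opening prompt above, while `build_pass1_request` and the local TL;DR pre-check are illustrative helpers I'm assuming, not part of any existing tool. The pre-check catches the cheapest finding (an overlong or missing TL;DR) before you spend agent tokens on it:

```python
# PASS1_PROMPT reproduces the opening prompt from the article.
PASS1_PROMPT = """Review this architecture doc for clarity and completeness, \
NOT for technical correctness. We're going to do the technical review with \
humans next. Specifically:
1. Is the TL;DR five lines or less? If not, rewrite it.
2. List every section where an assumption is stated as a fact.
3. List every diagram without a legend.
4. List every term used before it's defined.
5. End with: "If I were a reviewer who'd read this once, the three questions \
I'd ask in the meeting are..."
Be terse. Don't praise the doc."""


def build_pass1_request(doc: str, scope: str) -> str:
    """Assemble the full pass-1 message: prompt, then scope, then the doc."""
    return f"{PASS1_PROMPT}\n\nScope: {scope}\n\n---\n{doc}"


def tldr_too_long(doc: str, limit: int = 5) -> bool:
    """Cheap local pre-check: count the lines under a 'TL;DR' heading."""
    lines = doc.splitlines()
    try:
        start = next(i for i, line in enumerate(lines)
                     if line.strip().lower().startswith("tl;dr"))
    except StopIteration:
        return True  # no TL;DR at all is the worst case
    body = []
    for line in lines[start + 1:]:
        if not line.strip():
            break  # the TL;DR ends at the first blank line
        body.append(line)
    return len(body) > limit
```

Send `build_pass1_request(...)` to whatever agent you use; the point is that the prompt is versioned in one place instead of retyped per review.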
Pass 2 — humans do the structural review
Now the doc is readable. The reviewers can spend their hour on the question that actually matters: is this the right design?
Three structural questions for the human reviewer to keep in mind:
- Where does this fail under load you haven't seen yet? The author has tested the happy path. The reviewer is paid for the unhappy ones.
- What's the rollback plan if this is wrong? If there isn't one, that's a finding.
- Six months from now, who's the on-call engineer paged at 3am by something this design enables, and what do they wish you'd done differently? If the answer is "I don't know," the design isn't done.
What this workflow is NOT
- It's not "the agent does the architecture review." Agents don't have the org context, the customer empathy, or the political read that real reviews require.
- It's not "skip the meeting." The meeting still happens; it's just sharper.
- It's not "review every doc this way." For small designs, the overhead beats the win.
Adopting it
The first time a team uses this, the most common failure is reviewers reading the agent's output as a substitute for reading the doc. Don't. The agent annotates the doc; it doesn't replace it. The reviewer reads the doc and the agent's notes, then writes their own structural critique.
After 3-4 reviews, the pattern becomes invisible — people stop talking about "the AI-assisted review" and just call it the review.