Your PR Review Is Not an Architecture Review

Published in May 2026 · 7 minute read

The comment shows up in the thread. "This service shouldn't be calling the payment module directly. That belongs in the gateway layer." A brief back-and-forth follows. The author explains the deadline, the exception, the business reason. The reviewer approves with "let's track this as tech debt." The PR merges. No ticket gets opened.

This is not a story about a careless team or a weak reviewer. The reviewer saw the problem. The conversation happened. The outcome was the same as if no one had noticed.

The standard diagnosis is a discipline problem: reviewers need to hold the line harder, teams need to reject more, engineers need to internalize architectural boundaries more deeply. A more targeted version of that argument points to a specific role: the architect, the staff engineer, the technical lead. Someone whose explicit job is to hold architectural boundaries in review. That is a more reasonable position. But it does not change what the three sections below describe.

There is a second complication. When AI generates significant portions of the codebase, including sometimes the structural decisions, the person whose job is enforcement may no longer have a clear picture of what they are supposed to be enforcing. The rule felt intuitive because they participated in building it. That participation is thinner when architectural decisions happen inside a conversation between a developer and a model.

The view problem

A PR shows a diff. A diff shows what changed. Architecture is about what the system has become.

Previous posts in this series traced how implicit coupling accumulates silently, one PR at a time, each change localized, each addition passing every automated check. The reviewer of any individual PR faces exactly that situation: a bounded set of changes, a local context, a finite review window. What they cannot see is the ten other PRs that merged this month, each of which crossed the same boundary, each of which had its own reasonable explanation.

Architecture is a property of the aggregate. PR review is a tool for evaluating marginal changes. The mismatch is structural, not behavioral.

A reviewer can verify that the code does what it says. They can find bugs, improve names, enforce style. What they cannot verify from a diff is whether this change is the first exception to a rule or the fifteenth. That information is not in the PR. It lives in the history of everything that came before it, in the full state of the codebase, in the pattern of what has been accumulating over months.

Asking a reviewer to make an architectural judgment from a diff is asking them to detect a pattern from a single data point.
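To make the gap concrete, here is a minimal sketch of a question a diff cannot answer: how many modules already cross a given boundary. Everything in it is an assumption made for illustration. The module names (`payment`, `gateway`), the in-memory `codebase` mapping, and the function `direct_importers` are hypothetical, and a real check would walk the repository rather than a dictionary of strings.

```python
import ast


def direct_importers(codebase: dict[str, str], protected: str, allowed_prefix: str) -> list[str]:
    """Return files that import `protected` without living under `allowed_prefix`."""
    offenders = []
    for path, source in codebase.items():
        if path.startswith(allowed_prefix):
            continue  # the gateway layer is allowed to reach the protected module
        for node in ast.walk(ast.parse(source)):
            if isinstance(node, ast.Import):
                names = [alias.name for alias in node.names]
            elif isinstance(node, ast.ImportFrom):
                names = [node.module or ""]  # module is None for "from . import x"
            else:
                continue
            if any(n == protected or n.startswith(protected + ".") for n in names):
                offenders.append(path)
                break  # one hit per file is enough
    return sorted(offenders)


# Hypothetical snapshot of the whole codebase, not of a single diff.
codebase = {
    "gateway/payments.py": "import payment\n",
    "orders/service.py": "from payment import charge\n",
    "users/service.py": "import payment.client\n",
    "notifications/service.py": "import smtplib\n",
}

offenders = direct_importers(codebase, protected="payment", allowed_prefix="gateway/")
print(len(offenders), offenders)
```

The point of the sketch is the input, not the parsing: the function takes the full state of the codebase, which is exactly what a reviewer looking at one diff does not have.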

The cost asymmetry

Code review has a social contract, and that contract is asymmetric.

Rejecting a PR carries real cost: the author rewrites or argues, timelines shift, a conversation happens that might carry tension, and the reviewer owns the outcome if they turn out to be wrong. Approving with a comment costs almost nothing: no conflict today, the concern goes into a thread that no one will read again, work continues. Both are rational responses. One has much lower immediate cost.

This is not a character flaw. It is the predictable output of a system where enforcement falls entirely on the person who notices the violation, at the moment of maximum social friction, with no structural support. The reviewer who flags the problem correctly still has to convince the author, who has to convince their manager, who has to decide if the deadline moves. The reviewer who approves with a comment has done their job as far as the system is concerned.

Over time, teams develop a realistic sense of which violations get rejected and which get through. Architectural violations get through. That is not because reviewers don't know the rules; they often do. It is because the mechanism places a disproportionate burden on the individual at the worst possible moment and almost none on the system itself.

The parallel blindness problem

In an active codebase, multiple PRs are open at once. Each has a reviewer. None of the reviewers has a complete picture of what is merging this week.

PR A adds a direct call from the user service to the billing module. The reviewer asks why it isn't going through the gateway, gets an explanation about an ongoing migration, and approves. PR B, reviewed by someone else, adds the same kind of call from the notification service. Same justification: migration in progress. PR C follows. Each reviewer is operating in isolation, seeing their slice of the change surface.

By the time someone notices the pattern, if anyone does, the migration has been in progress for six months. There are seven direct calls that were each independently justified, and no one made a single decision to abandon the boundary. The boundary eroded through decisions that were individually defensible and collectively damaging.
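The erosion can be pictured as a simple tally that no individual reviewer ever sees. The sketch below is illustrative only: the `MergedException` record, its fields, the reviewer labels, and the three sample entries (loosely mirroring PR A, B, and C) are hypothetical, not data from a real repository.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class MergedException:
    pr: str             # label of the merged PR
    reviewer: str       # who approved it
    boundary: str       # which architectural boundary it crossed
    justification: str  # the reason given in the thread


def summarize(exceptions: list[MergedException]) -> dict[str, dict]:
    """Group approved exceptions by the boundary they crossed."""
    grouped = defaultdict(list)
    for exc in exceptions:
        grouped[exc.boundary].append(exc)
    return {
        boundary: {
            "count": len(items),
            "reviewers": sorted({e.reviewer for e in items}),
            "justifications": sorted({e.justification for e in items}),
        }
        for boundary, items in grouped.items()
    }


# Hypothetical records; each one looked reasonable in its own thread.
merged = [
    MergedException("PR A", "reviewer 1", "billing", "migration in progress"),
    MergedException("PR B", "reviewer 2", "billing", "migration in progress"),
    MergedException("PR C", "reviewer 3", "billing", "migration in progress"),
]

summary = summarize(merged)
for boundary, info in summary.items():
    print(boundary, info["count"], info["reviewers"], info["justifications"])
```

Each reviewer holds one row of this table. The grouped view, with one boundary, one repeated justification, and several different approvers, exists only if something outside the review thread records it.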

The previous post in this series described documentation that drifts from the codebase one PR at a time, without any single obvious turning point. The mechanism here is the same. Each PR is a local decision. Local decisions compound in ways no reviewer can track from their position in the process.

The intent does not change what the tool can see

The natural pushback is that catching architectural violations is exactly what reviewers are for. That is true. But the three problems above do not go away just because the intent is there.

A staff engineer whose explicit job is architectural enforcement is still looking at one diff. They still pay a cost every time they reject a PR, smaller than a junior reviewer pays but not zero. They still have no visibility into what else is merging this week. Authority does not change the information available. Making the right person responsible for enforcement does not change what that person can observe.

A reviewer who catches an architectural violation and says nothing has failed their responsibility. A reviewer who catches the same violation and says something has also, most of the time, failed to stop it. The difference is whether the system has any memory.

The common response is more process: architectural sign-off for certain changes, a checklist, an explicit review template. These move in the right direction but leave the core problem intact. They make the conversation more likely to happen without changing what happens after the conversation. The PR still merges. The exception is approved. The record exists in a thread nobody will find.

A more recent version of the same response reaches for AI: put the architectural rules into the model's context and violations stop happening. That gives the model information. Information is not a constraint. Two developers running separate sessions with the same rules in context can still produce two different architectural decisions. There is no memory between PRs, no mechanism that fails when a rule is broken, no consistency guarantee across sessions. A rule applied probabilistically is not a rule.
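For contrast, a rule that behaves as a constraint rather than as context might look something like the sketch below: a deterministic check that compares the boundary crossings found in the codebase against a recorded set of approved exceptions and fails on anything new. The names (`check_boundary`, `BoundaryViolation`, the file paths) are assumptions for illustration, and how the crossings are detected is deliberately left out.

```python
class BoundaryViolation(Exception):
    """Raised when a boundary crossing is not in the recorded baseline."""


def check_boundary(found: set[str], approved: set[str]) -> set[str]:
    """Fail on unapproved crossings; return approvals that are no longer needed."""
    unapproved = found - approved
    if unapproved:
        raise BoundaryViolation(f"unapproved direct calls: {sorted(unapproved)}")
    return approved - found  # stale entries that can be dropped from the baseline


# Hypothetical baseline: the exceptions someone explicitly decided to accept.
approved = {"users/service.py"}

# Hypothetical crossings detected in the current state of the codebase.
found = {"users/service.py", "notifications/service.py"}

try:
    check_boundary(found, approved)
    result = "pass"
except BoundaryViolation as err:
    result = f"fail: {err}"

print(result)
```

Given the same inputs, this check gives the same answer in every session, and the approved set serves as the memory that a review thread lacks. Adding an exception is still possible, but it requires a visible change to the baseline rather than a comment.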

The bit rot problem and the PR review problem have the same shape. Architecture as text decays because nothing connects the text to the code. Architecture as conversation decays because nothing connects the conversation to the state of the codebase over time. Both are enforcement gaps at different layers.

The gap is not in the people or the intent. It is in what the tool can observe. What enforcement actually needs is something that can see what individual PR review cannot: the aggregate, the history, the parallel changes. None of those live in a diff.