Six months ago I'd have laughed if you told me AI was writing all my code. I've written about this shift before, from rethinking my position on AI to building a process around plans, deviation logs, and targeted review. But there's a further step I wasn't expecting to come so soon.
I've started to question whether I'm even qualified to judge the code any more, because the code isn't written for me.
The moment it clicked
I have a side project I've worked on for years. No deadlines, no clients. Just code written for the pleasure of writing it. Some of the best code I've ever produced, by my own standards. I've always had a few side projects on the go as somewhere to develop the craft without compromise. The quality of this code mattered to me, and I thought it was good.
I showed it to Claude Code and asked "how easy would this codebase be for you to work with?" I told it not to hold back. Review it purely from an agentic coding perspective, ignore human aesthetics entirely.
The feedback wasn't great.
One thing it flagged: I'd replaced the router with a custom macro that let me define routes inline with their handlers. Elegant. Everything in one place. Open one file, see the route and its logic together. For me this was better than what it replaced, a nested router definition scattered across different files.
Claude's take: that pattern removed the router as a navigable index. An agent uses the router to get an overview of available endpoints and to target changes precisely. My clever colocation made the codebase harder for an agent to reason about.
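To make that trade concrete, here's a minimal sketch of the kind of explicit route table an agent can use as an index, in plain Rust with no web framework. The handler names are invented for illustration, not from the real project:

```rust
// A conventional, boilerplate-heavy route table. Repetitive to read,
// but every endpoint is visible and greppable in one place.
// Handler names are hypothetical.
fn dispatch(request: &str) -> Option<&'static str> {
    match request {
        "GET /users" => Some(list_users()),
        "POST /users" => Some(create_user()),
        "GET /health" => Some(health()),
        _ => None, // unknown route
    }
}

fn list_users() -> &'static str { "users: []" }
fn create_user() -> &'static str { "created" }
fn health() -> &'static str { "ok" }
```

My macro colocated each route with its handler, which read beautifully, but it dissolved this single overview. An agent, or an unfamiliar human, can read the match above and know the whole surface area of the API at a glance.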
You could argue a standard router would offer the same benefit to a human unfamiliar with the project, but for me as the main maintainer, the benefits outweighed the cost. The macro was certainly an improvement over what I had before: it removed a lot of boilerplate, and the router was only one part of it.
I realised that in making the code better for me, I'd removed something the agent expected to be there. The clever metaprogramming that helped a human reader had made the codebase harder for the agent to reason about.
That's a small thing. But it opened a bigger question.
Down the rabbit hole
I found a paper from April 2026, "Beyond Human-Readable: Rethinking Software Engineering Conventions for the Agentic Development Era", that was asking the same question. Their core argument: many practices we treat as anti-patterns may actually be virtues when agents are the primary consumers of the code. They even proposed a "program skeleton", a navigable high-level index of the codebase, which is essentially what my custom macro had removed.
That prompted me to run the same experiment on other codebases I work with. The router wasn't a one-off: similar patterns kept showing up across different projects.
Who is the code for?
Developers are starting to talk about what happens when we stop reading code. I've written about this myself: the scaling problem, the cognitive limits (400 lines an hour, 60 minutes before quality drops off a cliff), the increase in code volume that AI agents produce.
Most of that conversation focuses on how we maintain quality if we can't read everything.
And there's a compounding problem. Even when humans do review agent code, the quality of that review degrades. AI output follows similar patterns, and reviewers start to skim rather than properly analyse. Call it template blindness: the code all looks plausible, so subtle bugs slip through. So not only is human review failing to scale, it's getting less reliable on the code it does cover.
What if the things we've always valued in code aren't what matter any more?
We have decades of received wisdom about what makes code good. Clean abstractions. DRY. Colocation of related concerns. Patterns that make the codebase a pleasure to navigate, if you're a human holding the whole thing in your head.
But the code isn't primarily for humans any more. If agents are writing it and agents are working with it, then "good code" means something different.
Where human taste and agent needs diverge
Across these experiments, a few patterns kept recurring where things I'd instinctively do as a human developer worked against how agents navigate and reason about code.
Boilerplate as signal. Humans like removing boilerplate. But that boilerplate is often the structural signal an agent relies on to orient itself. A standard router definition is repetitive to read, but it's instantly parseable. My custom macro removed that repetition and, with it, the navigability.
Metaprogramming vs. common patterns. A custom DSL or macro is a delight once you learn it. But agents are trained on millions of examples of conventional code, and they handle the conventional pattern far more reliably than your bespoke one.
Implicit conventions vs. explicit structure. Relying on things "you just know" doesn't work for an agent that has to rebuild its context every turn.
DRY vs. tolerable duplication. Humans instinctively factor out repetition. But for agents, a repeated function self-contained in each file is easier to reason about than a shared abstraction they have to trace across the codebase. The indirection costs more than the duplication.
Global elegance vs. local self-explanation. Agents don't hold the whole codebase in their head. They reward code that makes sense locally, file by file.
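The DRY trade-off in particular can be sketched in a few lines. Both versions below are invented for illustration; neither is from a real codebase:

```rust
// Version A: DRY. The rule lives behind a shared helper, so reading
// validate_user means tracing validate_field to learn what it checks.
fn validate_field(value: &str, max_len: usize) -> Result<(), String> {
    if value.is_empty() {
        return Err("must not be empty".into());
    }
    if value.len() > max_len {
        return Err(format!("must be at most {max_len} bytes"));
    }
    Ok(())
}

fn validate_user(name: &str) -> Result<(), String> {
    validate_field(name, 64)
}

// Version B: tolerable duplication. The whole rule is visible locally;
// an agent can reason about this function without loading anything else.
fn validate_user_duplicated(name: &str) -> Result<(), String> {
    if name.is_empty() {
        return Err("must not be empty".into());
    }
    if name.len() > 64 {
        return Err("must be at most 64 bytes".into());
    }
    Ok(())
}
```

Version A is what I'd write by instinct. Version B is what fits in a single file of agent context, with no indirection to chase.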
What are humans actually reviewing for?
I still review, and sometimes it pays off.
I had the agent working on a Rust project recently. sqlx gives you compile-time SQL checking: the compiler validates your queries against the actual database schema. The agent hit a build error because the database hadn't had migrations applied. Rather than fix the migration issue, it quietly switched from the compile-time macros to the runtime query functions. The build passed, but I'd lost a compile-time guarantee, replaced with a runtime check that would only fail in production.
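sqlx itself needs a live database at build time, so here's the same trade-off in miniature with std-only Rust and invented function names: a guarantee the compiler enforces versus a check that only fires when bad data arrives:

```rust
// Compile-time guarantee: the fixed-size array is part of the type, so
// calling this with the wrong number of values doesn't build at all.
fn sum_exactly_three(values: [i64; 3]) -> i64 {
    values.iter().sum()
}

// Runtime check: compiles against any slice, and only fails when the
// bad data actually shows up. This is the weaker guarantee the agent
// quietly traded me down to.
fn sum_exactly_three_runtime(values: &[i64]) -> Result<i64, String> {
    if values.len() != 3 {
        return Err(format!("expected 3 values, got {}", values.len()));
    }
    Ok(values.iter().sum())
}
```

Both functions compute the same thing. The difference is when a mistake surfaces: at build time on my machine, or at runtime in production.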
I always instruct my agent to keep a deviation log, and luckily, that caught it. I didn't have to review every line because the agent flagged where it diverged from the plan and why. That's the kind of thing humans should be catching, and it's a different job from "does this code look clean to me?"
Looking back, that was a context failure. The agent's priority was to finish the task, and it did, by trading a compile-time guarantee for a runtime one. The fix here isn't more human review. It's making the constraints explicit in the context.
The uncomfortable conclusion
A year ago I wouldn't have believed this. But as I become more comfortable with the idea that I won't be reading all the code, I'm now making the uncomfortable assertion that perhaps humans aren't qualified to review agent code anyway.
If agents are the primary consumers of the codebase, then optimising for human readability might actively make things worse. Some of what we'd flag in code review (repetition instead of abstraction, explicit over implicit) might be exactly what an agent needs. And some of what we'd praise (elegant metaprogramming, terse abstractions, patterns that feel clever) might be actively hostile to agent workflows.
Quality hasn't gone away; it just means something different now. It's about whether the code is correct, verifiable, and productive for whoever is working with it. Right now, that's increasingly not us.
What next?
I'm still working this out. I still don't feel comfortable not reading the code, so I do still review it, but I know that will happen less and less as models and agents improve. When I review now, I'm trying to catch myself before applying my outdated human opinions about what good code looks like.
We need to let go of what "good code" used to mean to us.
What should we be doing?
- better specifications: making constraints explicit in the context so the agent doesn't have to guess your priorities
- stronger verification: type systems, compile-time checks, integration tests. If you're not reading every line, you need deterministic checks
- production-like testing: staging environments that stress the system under realistic conditions. Deterministic checks prove correctness in isolation, but you need to know the whole thing holds together before it ships
- observability: instrumentation to catch unexpected behaviour
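As one sketch of what "stronger verification" can look like, here's a constraint pushed into the type system. NonEmptyName is an invented example type: once a function demands it, an agent can't silently hand over an unvalidated string, whether or not anyone reads the diff.

```rust
// Hypothetical newtype: the constructor is the only way to get one,
// so the "name must not be blank" rule can't be quietly bypassed.
pub struct NonEmptyName(String);

impl NonEmptyName {
    pub fn new(raw: &str) -> Result<Self, String> {
        let trimmed = raw.trim();
        if trimmed.is_empty() {
            return Err("name must not be blank".into());
        }
        Ok(NonEmptyName(trimmed.to_string()))
    }

    pub fn as_str(&self) -> &str {
        &self.0
    }
}

// Any function that takes a NonEmptyName gets the invariant for free.
fn greet(name: &NonEmptyName) -> String {
    format!("hello, {}", name.as_str())
}
```

This is the same lesson as the sqlx incident from the other direction: encode the priority in something deterministic, and the constraint survives even when no human is watching.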
We don't actually know yet what makes code optimally readable for agents. Anything we do spot will be a property of today's models and tooling, not a deep truth.
The craft hasn't gone yet. It's moving from the aesthetics of the code to the quality of the specification and the strength of the verification. The hardest part isn't adopting new tools. It's unlearning the instincts that made you good at the old ones.