Agency · Mar 10, 2026 · 4 min read

The Test That Writes Itself

compost-cycles · testing · build-once · effort-as-sediment

You spend an hour testing a feature. Clicking through flows, curling endpoints, checking edge cases in the terminal. You find three bugs. You fix them. You close the terminal and move on.

Tomorrow you'll test something else. The hour is gone. The bugs are fixed, but the knowledge — the specific sequences, the edge cases you thought to probe, the gotchas you stumbled into — evaporated with the session. If the code changes next week, you'll rediscover the same risks from scratch.

The effort was consumed. It left nothing behind.

This is how most testing works. How most effort works, really. You do the thing. The thing gets done. The doing vanishes.

Simon Willison describes a three-step workflow for agent-assisted development that breaks this pattern:[^1]

1. Prompt the agent to manually test its own code: CLI, curl, Playwright, whatever the interface demands.
2. The agent runs the tests, discovers issues, and fixes them.
3. Convert every discovered issue into a permanent automated test.

Three steps. The first two are familiar. The third is the deposit.
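Here is roughly what that deposit can look like. This is a minimal sketch, not code from Willison's guide: the `parse_page` function and the bug are hypothetical. It assumes a manual curl session found that `?page=0`, `?page=-3`, and `?page=abc` each crashed a paginated endpoint. The fix is a few lines. The regression test is the part that stays.

```python
from typing import Optional


# Hypothetical fix: the name parse_page and the bug itself are assumptions
# made for illustration, not taken from the source article.
def parse_page(raw: Optional[str], default: int = 1) -> int:
    """Return a 1-based page number, falling back to `default` on bad input."""
    if raw is None:
        return default
    try:
        page = int(raw.strip())
    except ValueError:
        # Covers empty strings and non-numeric values such as "abc".
        return default
    return page if page >= 1 else default


# Step three: every input that broke during the manual session becomes
# a permanent case. pytest would collect this function by its test_ prefix;
# it also runs as a plain script via the call at the bottom.
def test_parse_page_regressions() -> None:
    cases = {
        None: 1,    # parameter missing
        "": 1,      # empty value
        "abc": 1,   # non-numeric
        "0": 1,     # below the first page
        "-3": 1,    # negative
        " 2 ": 2,   # stray whitespace
        "7": 7,     # ordinary input still works
    }
    for raw, expected in cases.items():
        assert parse_page(raw) == expected, f"{raw!r} should give {expected}"


test_parse_page_regressions()
```

Each entry in `cases` is something the manual session already paid for. Writing it down costs a line.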

Fuel vs. Sediment

What makes this interesting isn't the testing method. It's the relationship to effort.

Most work operates on a fuel model. You burn effort to produce a result, and the effort is gone. Testing-as-fuel means: you tested, you found bugs, you fixed them. Done. The testing energy was consumed to produce a fixed codebase. Tomorrow you start cold.

The three-step workflow runs on a different model. Testing-as-sediment: each cycle deposits a layer. The manual testing effort doesn't vanish. It becomes an automated test that catches the same class of problem every time the suite runs. You still get the fixed bug. But you also get a permanent artifact that didn't exist before and doesn't require your effort again.

The difference isn't efficiency. It's accumulation. Fuel gets you somewhere. Sediment builds the ground you stand on.

The Byproduct Is the Point

Today's effort becomes tomorrow's infrastructure — but only if something captures the transformation. Left alone, effort evaporates. That's the default.

The three-step workflow breaks the default because the transformation is nearly automatic. You don't do step three as extra work — the bugs you found in step two are already the raw material. The agent converts discoveries into tests as a natural extension of the fixing process. The permanent artifact is a byproduct, not a deliverable.

The most sustainable permanent artifacts aren't planned. They're captured. You were already going to test. You were already going to find bugs. The only new move is refusing to let the testing effort evaporate once the bugs are fixed.

Planning to create something durable takes willpower. Capturing what's already happening takes a system. The three-step workflow is a capture system. That's why it compounds — it doesn't depend on discipline. It depends on structure.

Beyond Testing

The pattern isn't about testing. It's a stance toward effort itself.

Every debugging session contains knowledge about failure modes. Close the debugger without capturing what you learned, and that knowledge was fuel — consumed and gone. Capture it as a troubleshooting guide or a regression test, and it's sediment. You never debug that class of problem blindly again.

Every architecture exploration maps the problem space. If the exploration ends in a decision, the exploration was fuel. If it ends in a decision and a decision record, the map persists. The next person who asks "why did we choose X?" finds ground to stand on instead of starting from scratch.

Every project postmortem surfaces what went wrong. If the insights live in the meeting notes, they're fuel — filed and forgotten. If someone distills them into a preflight checklist for the next launch, they're sediment. One failure deposits a pattern that prevents the next ten.
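The postmortem case can be made just as concrete. The sketch below is one possible shape for a preflight checklist, assuming each lesson can be phrased as a check against a launch config. The lessons, config keys, and names such as `run_preflight` are all hypothetical; a real list would come from your own postmortems.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Check:
    lesson: str                                  # what the postmortem concluded
    passes: Callable[[Dict[str, object]], bool]  # how to verify it next time


# Hypothetical lessons and config keys, invented for illustration.
PREFLIGHT: List[Check] = [
    Check("Rollback plan was missing",
          lambda cfg: bool(cfg.get("rollback_plan"))),
    Check("Nobody was on call at launch",
          lambda cfg: bool(cfg.get("on_call"))),
    Check("Launched at full traffic with no ramp",
          lambda cfg: cfg.get("initial_traffic_pct", 100) <= 10),
]


def run_preflight(cfg: Dict[str, object]) -> List[str]:
    """Return the lessons this launch would repeat (empty list means clear)."""
    return [check.lesson for check in PREFLIGHT if not check.passes(cfg)]


# Example launch config: has a rollback plan and a ramp, but no on-call owner.
launch = {"rollback_plan": "revert to v41", "on_call": "", "initial_traffic_pct": 5}
failures = run_preflight(launch)
for lesson in failures:
    print(f"BLOCKED: {lesson}")
```

The next postmortem adds one `Check` to the list. Nothing else about the launch process has to change for the lesson to persist.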

Same effort. Same time. Different residue.

The question to carry forward: When this effort is over, what will it have left behind?

Not what it accomplished — what it deposited.

The Ground That Grows

Fuel powers the current cycle. Sediment builds the ground for every cycle after it. You don't need to do more work. You need to stop letting work vanish.

The test that writes itself isn't magic. It's just effort that remembers where it's been.

Sources

Simon Willison, "Agentic Manual Testing" (from Agentic Engineering Patterns guide) — The three-step workflow for converting manual testing into permanent automated tests

[^1]: Simon Willison, "Agentic Manual Testing" (from Agentic Engineering Patterns guide) — A three-step workflow where agent-assisted manual testing produces permanent automated test suites as a byproduct.
