When Carol Coye Benson and I sat down to write Payments Systems in the U.S., one of the first problems we had to solve wasn't about payments. It was about history.
To understand why the ACH network works the way it does, or why checks persisted decades longer than anyone expected, you need the institutional sediment underneath: the regulatory decisions, the failed experiments, the path dependencies baked in by choices made in the 1970s that nobody thought would still matter in the 2000s. The history is the explanation. Strip it out and you have a description of current practice with no account of why it exists or what it cost to get there.
But history takes pages. And pages test a reader's patience. So you compress. You make judgment calls about what survives the cut and what gets left behind, and you make those calls knowing that every omission is a bet: a bet that the reader can follow without it, that the thread holds without that particular knot.
Writing it taught me something. The act of compressing, of finding the minimum sufficient version of a complex thing, forces a clarity that living inside the complexity never quite delivers. You don't fully know what you understand until you have to say it precisely enough for someone else to follow.
But compression is always a loss. You feel it as you write. The version in the book is thinner than the thing you know.
Garry Tan uses a term, "tokenmaxxing," that initially sounds like jargon from a performance optimization thread. The idea is simple: don't be stingy with context. Give the model everything. Every source document, every relevant article, every piece of background that a human reader would never sit still for. Let it synthesize rather than guess.
The instinct it runs against is deep. We have spent decades building information systems around compression: search engines that retrieve rather than ingest, executive summaries that stand in for reports, one-pagers that distill months of work into something a decision-maker can absorb in four minutes. All of it was a rational response to a real constraint: human attention is finite and expensive. You couldn't afford to read everything, so you built filters. The whole architecture of how organizations manage information was designed around that limit.
Tokenmaxxing is a bet that the limit has moved.
The model can read everything. The cost of giving it full context (the uncompressed history, the original sources, the institutional sediment) is low enough now that filtering before the model sees it may introduce more error than it prevents. You're potentially discarding signal when you summarize for the model the way you'd summarize for a human. The model doesn't need the one-pager. It can handle the report.
This doesn't dissolve the need for curation entirely. More context isn't always better; models can lose the thread in noise the same way humans do, just differently. The skill shifts from summarizing to selecting: not "what's the minimum version of this?" but "what's actually worth including?" Different judgment, still essential.
But the deeper change is upstream of any particular project. The compression we built into every research process, every briefing, every book was never the goal. It was the tax we paid for human cognitive limits. Part of the process doesn't pay that tax anymore.
When I think about writing that payments book today, I don't think the book itself would change much; it still has human readers with finite patience. But the map we drew before writing it, the synthesis work, the "what connects to what across fifty years of regulatory history" work: that could happen at a different depth now. The understanding you bring to the writing can be informed by everything, not just the subset you had time to read.
The payments book was written entirely for humans, with all the compression that implies. But Tyler Cowen just published what he calls a "generative book": 40,000 words released free online, paired on the same screen with a Claude interface so readers can discuss, interrogate, and extend it in real time. He's writing for both audiences simultaneously now. The human reader and the model that will help that reader go deeper. The text is optimized not just to be understood but to be used: as context, as a jumping-off point, as raw material for a conversation that the author won't be in.
That's a different kind of writing. Not better or worse. Different. The compression decisions change when one of your readers has no patience to protect.
Writing still clarifies thinking. That part hasn't changed. But what you're clarifying, and who you're clarifying it for, is quietly expanding.