Category: Anthropic

Breakout

Jack Clark doesn’t panic easily. He spent years at OpenAI watching capabilities inch upward, then left to co-found Anthropic, and has been writing his Import AI newsletter long enough to have developed — and been wrong about — many priors. So when he publishes an essay saying he has reluctantly arrived at a 60% probability that fully automated AI R&D happens by the end of 2028, the word “reluctantly” deserves some weight.

His essay, published last week and titled “Automating AI Research,” isn’t a press release or a fundraising pitch. It reads more like a man thinking out loud at the edge of something large. “I don’t know how to wrap my head around it,” he writes, which is a notable thing to say publicly when you are one of the architects of the thing you can’t wrap your head around.

The argument is built from benchmarks: not any single one, but a mosaic of them assembled to reveal a trend. SWE-bench, the test that measures an AI’s ability to resolve real GitHub issues, was at roughly 2% when it launched in late 2023. A recent Anthropic model sits at 93.9%, effectively saturating it. METR’s time-horizon plot tracks the length of task, measured by how long it would take a skilled human, that an AI can complete on its own about half the time: 30 seconds in 2022, 4 minutes in 2023, 40 minutes in 2024, 6 hours in 2025, 12 hours today. The trajectory, if it holds, suggests 100-hour autonomous work sessions by the end of this year.
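As a rough check on that extrapolation, here is a minimal sketch that fits a straight line to the logarithm of the four year-labeled readings quoted above. It rests on assumptions of mine, not Clark's or METR's: each reading is treated as exactly one year after the previous one, the fit is plain least squares, and the function names are only illustrative. The "12 hours today" figure carries no date here, so it is used only as a starting point for the projection.

```python
import math

# Time-horizon readings quoted above, in seconds, keyed by year.
# Assumption: each reading is treated as exactly one year after the last.
HORIZON_SECONDS = {2022: 30, 2023: 4 * 60, 2024: 40 * 60, 2025: 6 * 3600}


def doublings_per_year(points):
    """Least-squares slope of log2(horizon) against year."""
    years = list(points)
    logs = [math.log2(points[y]) for y in years]
    mean_x = sum(years) / len(years)
    mean_y = sum(logs) / len(logs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, logs))
    sxx = sum((x - mean_x) ** 2 for x in years)
    return sxy / sxx


def months_to_reach(current_hours, target_hours, rate):
    """Months needed to grow from current to target at `rate` doublings per year."""
    return 12 * math.log2(target_hours / current_hours) / rate


if __name__ == "__main__":
    rate = doublings_per_year(HORIZON_SECONDS)
    print(f"doubling time: {12 / rate:.1f} months")
    print(f"12h -> 100h: {months_to_reach(12, 100, rate):.1f} months")
```

On these four rounded numbers the fitted doubling time comes out a little under four months, which puts the step from 12 hours to 100 hours at roughly a year. That is a curve drawn through a handful of points, not a forecast, but it shows the arithmetic behind the claim.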

Clark marshals similar progressions across AI fine-tuning, kernel design, scientific paper replication, and even alignment research itself. His throughline is the same in each: AI is now genuinely competent at the unglamorous scaffolding of AI development — the debugging, the experiment runs, the parameter sweeps, the code reviews. And crucially, it can now do these things not just faster than humans, but for longer, with less supervision.

There’s a Thomas Edison quote at the center of the essay: “Genius is 1% inspiration and 99% perspiration.” Clark’s claim is that AI has become very good at the perspiration. The question of whether it can supply the inspiration — the paradigm-shifting insight, the Move 37 — remains open. But he argues it may not need to. Most of what has moved the AI field forward has been sustained, methodical work, not lone flashes of genius. If you can automate the 99%, you have something that compounds.

There’s a data point that makes Clark’s argument feel less like forecast and more like dispatch. Last month Boris Cherny, who runs Anthropic’s Claude Code, disclosed that he hasn’t written a line of code by hand in more than two months. Every pull request — 22 one day, 27 the next — written entirely by Claude. Company-wide, roughly 70–90% of Anthropic’s code is now AI-generated. Anthropic’s stated position: “We build Claude with Claude.” The loop Clark is describing as a probability by 2028 is already running, at least partially, today.

The word Clark uses for the threshold he’s describing is not “singularity” or “AGI.” It’s quieter than that. He calls it “automated AI R&D” — the point at which a frontier model can autonomously train its own successor. It’s a specific, falsifiable thing. And he puts a number on it: 60% by end of 2028, 30% by end of 2027.

I’ve been writing about the dark software factory and the 3D printer that prints better printers, finding metaphors for what seems like an inexorable process. Clark’s essay is a different kind of writing about the same thing — the primary source document, the engineer’s log, the inventory of evidence. Reading it is a little like watching someone carefully pack boxes before a move. Each individual item seems manageable. But there are a lot of boxes.

What he’s describing — if the trend holds — is not a feature or a product launch. It’s a breakout. The moment the loop closes and the system starts building itself. He’s not certain it happens. He just thinks it’s more likely than not, and he thought you should know.

Tags ai, AI R&D, AI research, anthropic, artificial intelligence, automated AI development, automation, benchmarks, Boris Cherny, claude, Claude Code, Future of Work, Import AI, Jack Clark, METR, Recursive Self-Improvement, Scott Loftesness, SWE-Bench, technology

AI Anthropic Business Google

The Weight of the Bill

Technicians monitor server performance and network data in a modern data center

Jordi Visser has been making the case for months — in his weekly YouTube commentary and on his Substack — that we are living through an exponential transition that most people are measuring with the wrong instruments. I think he’s right. I found two data points this week that suggest why.

I was somewhere in the middle of an Invest Like the Best episode when Dylan Patel said it — almost as an aside, the kind of thing you drop to establish context before moving on to the point you actually came to make. His firm, SemiAnalysis, analyzes the semiconductor and AI industries for a living. And their usage of Claude, he noted, has been growing. The costs have been growing too.

Exponentially.

He moved on. I didn’t.

I think Patel’s API bill might be one of the more honest documents in the current AI moment — more honest than the analyst reports his firm produces, more honest than the earnings calls where every public company performs its AI fluency for shareholders.

Surveys bend. When you ask someone whether they’re using AI in their work, you’re asking them to self-report on a technology that has become a proxy for relevance, for not being left behind. The incentive to say yes is enormous. And even when the yes is genuine, it tells you nothing about depth — whether AI has become load-bearing in how someone actually works, or whether it’s an impressive thing they do occasionally.

Nobody pays exponentially growing API costs for show. Money is the honest witness.

What makes Patel’s situation quietly strange is the recursion in it. SemiAnalysis exists to help sophisticated investors and technologists understand this industry — and they cannot predict their own consumption curve. They are inside the exponential the same way everyone else is. They just happen to be watching their bill.

Then this morning, a different number arrived. Google announced it will invest up to $40 billion in Anthropic — $10 billion committed now, another $30 billion contingent on performance milestones. This follows a separate $5 billion from Amazon, part of a broader arrangement under which Anthropic is expected to spend up to $100 billion on compute over time.

The temptation with numbers like these is to treat them as spectacle. Forty billion dollars is so large it becomes almost aesthetic — a statement about ambition, about the kind of bets that define eras. You feel the weight of the zeros and move on.

But I keep coming back to Patel’s API bill.

Because Google’s $40 billion and SemiAnalysis’s compounding monthly costs are saying the same thing, expressed at scales so different they almost don’t seem related. One is a research firm noticing that their tool usage has quietly escaped prediction. The other is one of the most sophisticated capital allocators on earth making a bet that strains comprehension. But both are pointing at the same reality: that this technology, wherever it takes hold, does not plateau. It compounds.

We have been waiting, I think, for the moment when AI adoption becomes legibly real — some threshold event that separates the signal from the noise, the press release from the actual change. The surveys were supposed to mark that moment. The enterprise announcements. The benchmark numbers.

Patel’s aside suggests we’ve been waiting for the wrong thing. You don’t arrive at the exponential. You just eventually notice you’re already in it — in an aside on a podcast, before moving on to the point you actually came to make.

Tags ai, Amazon, anthropic, artificial intelligence, dylan patel, exponential growth, google, innovation, jordi visser, Scott Loftesness, semianalysis

AI Anthropic Future

Escaping the Gravity of the Present

By Scott Loftesness
February 25, 2026

I was watching a YouTube conversation with Dario Amodei recently, and the comments he shared at the end got me thinking about how remarkably bad we all are at imagining the future.

Whenever I try to picture what the world will look like in ten or twenty years, I usually end up picturing today—just slightly shinier. If a prediction sounds too weird or disruptive, my brain automatically rejects it. It just feels too unmoored from the reality I woke up in this morning. We all have this instinct to retreat to the safety of incremental change.

But as Amodei points out, that comfort zone is exactly what blinds us. He notes that we are constantly tempted to dismiss massive shifts simply because they feel like they “can’t happen.”

“However, by extrapolating simple curves or reasoning from first principles, one often arrives at counterintuitive conclusions that surprisingly few people believe.”

It’s a strange feeling to look at a simple data curve, follow the math, and realize the logical endpoint sounds completely unhinged. The truest maps of tomorrow often look like bad science fiction to us today.

But there is a catch here, and it’s a mental trap I know I’ve fallen into before. You can’t just sit in a room and logic your way into the future. Pure logic, stripped of real-world friction, usually just leads you confidently in the wrong direction. Amodei suggests a much more grounded formula:

“The right combination of a few empirical observations and thinking from first principles can allow one to predict the future in ways that are publicly available but rarely adopted.”

This struck a chord with me. It’s easy to get swept up in purely theoretical thinking. But the better approach is to start with what is actually happening on the ground—the messy, undeniable data. From there, you strip it down to its most basic truths and follow the thread, no matter how strange the destination looks.

It takes a certain kind of intellectual courage to trust the math when your gut is screaming that things are getting too weird. But learning to decouple what is true from what feels normal might be the only real way to prepare for what is coming.

AI Anthropic Claude Cybersecurity

The End of Obscurity

There is a particular kind of silence that surrounds a zero-day vulnerability. It is the silence of something waiting—a flaw in the logic, a gap in the armor, sitting unnoticed in the codebase for years, perhaps decades. We have slept soundly while these digital fault lines ran beneath our feet, largely because we assumed that finding them required a brute force that no one possessed, or a level of human genius that is incredibly rare.

But the silence is breaking.

I was reading Anthropic’s Red Team report from earlier this week (prompted by Bruce Schneier’s amazed reaction to it), specifically its findings on the new Opus 4.6 model. The technical details are impressive, but the philosophical implication is what stopped me cold, as it did Bruce.

For years, digital security has relied on “fuzzers”—programs that throw millions of random inputs at a system, banging on the doors to see if one accidentally opens. It is a noisy, chaotic, brute-force approach.
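To make the brute-force idea concrete, here is a toy sketch of a random fuzzer. Everything in it is hypothetical: `toy_parser` is a stand-in target I made up with a planted bug, and real fuzzers are far more sophisticated than this loop. The point is only the shape of the approach: generate random bytes, feed them in, and record whatever makes the target fall over.

```python
import random


def toy_parser(data: bytes) -> int:
    """Hypothetical target with a planted bug, for illustration only."""
    if len(data) < 2:
        return 0
    if data[0] == 0xFF:
        # Planted bug: the second byte is trusted as an offset, unchecked.
        return data[2 + data[1]]
    return data[1]


def fuzz(target, tries, seed=0, max_len=16):
    """Throw random byte strings at `target` and collect the inputs that crash it."""
    rng = random.Random(seed)  # seeded so a run can be repeated
    crashes = []
    for _ in range(tries):
        length = rng.randint(0, max_len)
        data = bytes(rng.randrange(256) for _ in range(length))
        try:
            target(data)
        except Exception as exc:
            crashes.append((data, type(exc).__name__))
    return crashes


if __name__ == "__main__":
    found = fuzz(toy_parser, tries=20_000)
    print(f"{len(found)} crashing inputs out of 20,000 tries")
```

Notice what the loop does not do: it never reads `toy_parser`. It finds the bug only because one first byte in 256 happens to open the door, which is exactly the contrast with reasoning about the code itself.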

The new reality is different. As the report notes:

“Opus 4.6 reads and reasons about code the way a human researcher would—looking at past fixes to find similar bugs that weren’t addressed, spotting patterns that tend to cause problems.”

This is a fundamental phase shift. We are moving from the era of the Battering Ram to the era of the Jeweler’s Loupe. The machine is no longer guessing; it is understanding.

There is something deeply humbling, and slightly terrifying, about this. We have spent the last half-century building a digital civilization on top of code that we believed was “secure enough” because it had survived the test of time. We trusted the friction of complexity and the visibility of open source to keep us safe. We assumed that if a bug had existed in a core library for twenty years, surely it would have been found by now.

But the AI doesn’t care about time. It doesn’t get tired. It doesn’t have “developer bias” that assumes a certain function is safe because “that’s how we’ve always done it.” It simply looks at the structure, reasons through the logic, and points out the crack in the foundation that we’ve been walking over every day.

We are entering a period of forced transparency. The “security by obscurity” that held the internet together is evaporating. When intelligence becomes commoditized, vulnerabilities become commodities too. The question is no longer “is my code secure?” but rather, “what happens when the machine sees the flaws I cannot?”

It’s a reminder that complexity is a loan we take out against the future. Eventually, the bill comes due. We are just lucky that, for now, the entity collecting the debt is one we built ourselves, designed to tell us where the cracks are before the ceiling collapses. Let’s hope that we are out far enough in front of it.
