The Slipstream Strategy

Apple had a problem no amount of money could solve. An iPhone can’t draw the power or shed the heat of a data center, so ten different tasks can’t mean ten different models fighting for the same sliver of RAM. Apple’s answer was to freeze one small, efficient base model into the device and then swap tiny adapters in and out of it in milliseconds — a summarization adapter for your texts, a Siri adapter for on-screen actions, and a handoff to Private Cloud Compute for anything heavier. The phone behaves like it’s running many models. It’s running one model wearing many hats.
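The routing idea can be sketched in a few lines of Python. Everything named here is hypothetical: the task names, the `ADAPTERS` table, and the `route` function are illustrations, not Apple's API. The sketch only shows the shape of the design: one resident base model, a small adapter picked per task, and a cloud handoff for anything the device has no adapter for.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical names throughout; this is not Apple's API.
BASE_MODEL = "on-device-base"  # the single frozen model that stays resident

# Task -> adapter name. Adapters are small, so swapping them is cheap.
ADAPTERS = {
    "summarize_messages": "summarization-adapter",
    "on_screen_action": "assistant-adapter",
}


@dataclass(frozen=True)
class Route:
    target: str              # "device" or "cloud"
    model: str               # which model serves the request
    adapter: Optional[str]   # adapter applied on top of the base, if any


def route(task: str) -> Route:
    """Pick the adapter for a known task; hand anything else to the cloud."""
    adapter = ADAPTERS.get(task)
    if adapter is None:
        # No on-device adapter for this task: assume it is "heavier" work.
        return Route("cloud", "cloud-model", None)
    return Route("device", BASE_MODEL, adapter)
```

Note that both on-device routes return the same `model`; only the adapter changes. That is the "one model wearing many hats" part.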

That architecture — a frozen base plus swappable adapters — is quietly becoming the default way serious AI companies build, and it’s worth understanding why, because it inverts the assumption most people still carry into this industry.

The assumption is that winning means owning a frontier model. Sierra co-founder Clay Bavor pushed back on that on a recent 20VC episode: pouring capital into your own pre-training, he argued, tends to leave you holding a highly perishable bag of floating-point numbers. Open-weight models improve fast enough that yesterday’s frontier is next quarter’s commodity. The companies playing this well aren’t racing to out-spend the labs. They’re slipstreaming behind them — taking the free, state-of-the-art engine and putting all their effort into what sits on top of it.

What sits on top is LoRA — low-rank adaptation. The old failure mode was catastrophic forgetting: fine-tune a model hard enough on your own data and it forgets how to reason generally. LoRA sidesteps this by leaving the base model untouched and training a small set of additional parameters alongside it — a thin layer of expertise bolted onto a frozen foundation. You get real domain depth without touching the thing that makes the model work at all.
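For readers who want to see the mechanism, here is a minimal sketch of a single LoRA layer in plain Python. It assumes the standard formulation: a frozen weight matrix W, plus a low-rank update B·A scaled by alpha / rank, with B starting at zero so the adapter initially changes nothing. The class and helper names are made up for illustration, and real implementations use tensor libraries and train A and B with gradients, which this sketch leaves out.

```python
import random


def matvec(M, x):
    """Multiply matrix M (a list of rows) by vector x."""
    return [sum(m * v for m, v in zip(row, x)) for row in M]


class LoRALinear:
    """Frozen weight W plus a low-rank update (alpha / rank) * B @ A."""

    def __init__(self, W, rank, alpha=1.0, seed=0):
        rng = random.Random(seed)
        d_out, d_in = len(W), len(W[0])
        self.W = W                    # frozen: never modified by the adapter
        self.scale = alpha / rank
        # A starts small and random, B starts at zero, so the adapter
        # contributes nothing at first and base behavior is preserved.
        self.A = [[rng.gauss(0.0, 0.01) for _ in range(d_in)] for _ in range(rank)]
        self.B = [[0.0] * rank for _ in range(d_out)]

    def forward(self, x):
        base = matvec(self.W, x)                      # what the base model computes
        delta = matvec(self.B, matvec(self.A, x))     # the adapter's correction
        return [b + self.scale * d for b, d in zip(base, delta)]

    def trainable_params(self):
        """Count only A and B; W is frozen and not counted."""
        return len(self.A) * len(self.A[0]) + len(self.B) * len(self.B[0])
```

For a 4×4 weight matrix and rank 1, `trainable_params()` is 8 against 16 frozen weights. The gap widens quickly as the matrices grow, because the adapter scales with rank × (d_in + d_out) rather than d_in × d_out.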

The business logic that follows from this is the actual point, and it’s simpler than it looks:

You stop being hostage to any one model provider: if a better open-weight model ships next month, you retrain a small adapter against it (adapters are tied to the base they were trained on) rather than rebuilding your whole product. You can serve hundreds of differently customized clients off one base model on one piece of hardware, instead of running a separate giant model per customer. You can ship a fix in an afternoon, because an adapter is a few hundred megabytes, not a training run. And in regulated industries, your proprietary data can train an adapter that never leaves your own infrastructure.
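The hardware claim is easy to check with back-of-the-envelope arithmetic. The sketch below is an assumption-laden illustration, not a measurement: the function names are made up, and the example figures (a 7-billion-parameter base, 32 layers, four adapted 4096×4096 matrices per layer, rank 16, 2 bytes per parameter) are hypothetical. It also counts weights only and ignores per-request memory.

```python
def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """Parameters in one low-rank pair: A is rank x d_in, B is d_out x rank."""
    return rank * (d_in + d_out)


def footprint_gb(base_params: int, adapter_params: int, clients: int,
                 bytes_per_param: int = 2) -> dict:
    """Compare one shared base plus per-client adapters with a full model per client."""
    gb = 1e9
    shared = (base_params + clients * adapter_params) * bytes_per_param / gb
    separate = clients * base_params * bytes_per_param / gb
    return {"shared_base_gb": shared, "model_per_client_gb": separate}


# Hypothetical configuration: 32 layers, 4 adapted matrices per layer, rank 16.
adapter_params = 32 * 4 * lora_params(4096, 4096, 16)
result = footprint_gb(base_params=7_000_000_000,
                      adapter_params=adapter_params,
                      clients=100)
print(adapter_params, result)
```

Under these assumed numbers each adapter is about 16.8 million parameters, and a hundred clients fit in roughly 17 GB of weights on a shared base versus 1,400 GB with one full model each. Real adapter sizes depend on the rank and on which layers are adapted.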

None of this is really a story about model architecture. It’s a story about where the moat moved. For a while the moat was raw capability — whoever had the best model won. Apple and Sierra are betting the moat is now somewhere else entirely: in how tightly you can weave a commodity intelligence into a specific workflow, a specific dataset, a specific customer relationship. The engine is free. The adapter is the business.

The Best Lathe in the Shop

Part 3 of 3…

There is a version of this story where Apple is the Wright Brothers.

It is not an unreasonable version. Apple has done the safety bicycle move more times than almost any company in history — taken a technology the engineers built for engineers and brought it down to earth, made it a machine for everyone. The Mac. The iPod. The iPhone. Each one was a wheel coming down. Each one arrived after a period of apparent slowness, of critics saying Apple had lost its edge, of the industry having already moved on to the next thing. Each one was, in retrospect, obvious. Apple had been in the bicycle shop the whole time. You just couldn’t see what they were building.

So when Apple showed its hand at WWDC this week — a rebuilt Siri operating at the OS level, accessing your messages and mail and photos in real time, understanding context across apps, doing things the old Siri could only approximate — it is tempting to read it as Kitty Hawk. The long preparation made visible. The brothers finally leaving the shop.

It might be. It also might not be. That is the only honest thing to say.

What Apple showed was real. The new Siri, built on Apple’s own Foundation Models with help from Google’s Gemini, is not the Siri that became a punchline. It holds context. It moves across apps without being asked. It knows what you were doing five minutes ago and connects it to what you are doing now. It can surface a photo without opening Photos, build a navigation route from an image, draft a message in the tone of the conversation it is joining. These are not features. They are the beginning of an operating system that understands you, which is a different thing from an operating system that executes your commands.

The structure of the keynote said more than the words did. Apple led with fixes before features. iOS 27 is a Snow Leopard update — performance, reliability, the underlying machinery — and Siri AI was presented as one item on a long list rather than the main event. This is Apple’s tell. When they are doing something foundational they tend to understate it, the way a craftsman doesn’t announce the quality of his work but simply does it and lets you find it. The penny-farthing riders called their machine the ordinary. They didn’t think they needed to explain.

But here is the thing about the bicycle shop analogy that the optimistic version leaves out. The Wright Brothers knew what they were trying to build. They had been thinking about flight for years before Kitty Hawk. The bicycle shop gave them the craft knowledge, the physical intuition, the hands-on education in how machines move through space. What it did not give them was the destination. They brought the destination themselves.

The question Apple has not answered for me, the one this week’s keynote raised rather than resolved, is whether they know where they are going, or whether this was only a partial reveal with much more behind the curtain.

The OS-level integration is the chain drive. Decoupling AI from the app, letting it run through the substrate the way a chain runs through a drivetrain, is exactly the kind of architectural insight that changes what a machine can do. It is not a feature you add. It is a rethinking of what the machine is for. Every previous AI assistant lived above the operating system, looking down at your data from a remove. Apple’s new architecture lives inside it, which is a different relationship entirely — the difference between a mechanic who reads about your car and one who has driven it for a year.

That is the Coventry precision. The tight tolerances. The discipline of making things that have to work at the level where failure is not an option.

What nobody knows, including Apple, is what you build with it.

There is also this: Tim Cook will not be driving this evolution. He announced that John Ternus takes over in September, which means this WWDC — this particular showing of the hand — is the last one Cook owns. Ternus is a hardware engineer, the man who built the Apple Silicon transition, the person most responsible for the Neural Engine that makes on-device inference possible. He is, in the bicycle shop metaphor, the craftsman who built the lathe. Whether he knows how to use it to make something that flies is the question the next several years will answer.

History is patient about these things. It lets the work speak.

In 1892, two brothers opened a shop on West Third Street in Dayton and started fixing bicycles. They were not trying to change the world. They were trying to make a living, to learn a machine, to understand in their hands what the books couldn’t teach them. The flying came later, and it came because of the shop, not despite it. The shop was the point. They just didn’t know it yet.

Apple has the best lathe in the bicycle shop. They have the chain drive architecture, the on-device precision, the installed base of two billion devices that will carry whatever they build into more hands than any other platform on earth. They have a new set of hands on the wheel starting in September, hands that know the metal intimately, that built the engine the whole thing runs on.

What they do not have yet — or if they have it, they are not showing it — is the image of what they are flying toward.

Maybe that’s the ordinary part. Maybe that’s always been the ordinary part. You don’t know what you’re building until you’ve built it, and by then the world has already changed, and everyone says it was obvious, and they are right, and they are also completely wrong about when the decision was made.

The shop is open. The lathe is running. Work is underway.

What happens when someone finally knows what to make?

The Gravity of Compute

We are witnessing what may be the single largest deployment of capital in human history. The “Hyperscalers,” the titans of our digital age, are pouring hundreds of billions of dollars into the ground, turning cash into concrete, copper, and silicon.

The prevailing narrative is one of unceasing, exponential growth: bigger models require bigger clusters, which require more power plants, which require more land. It relies on the assumption that the demand for centralized intelligence is insatiable and that the current architecture is the only way to feed it.

But history suggests that technology rarely moves in a straight line; it swings like a pendulum. Two forces are emerging from the periphery that could impact the ROI of this massive infrastructure build-out. One is hiding in your pocket, and the other is waiting in the sky.

A recent conversation with Gavin Baker outlines a potential “bear case” for datacenter compute demand: the rise of Edge AI.

We often assume we need the “God models”—the omniscient, trillion-parameter giants hosted in the cloud—for every interaction. But do we?

Baker suggests that within three years, our phones will possess the DRAM and battery density to run pruned versions of advanced models (like a Gemini 5 or Grok 4) locally. He paints a picture of a device capable of delivering 30 to 60 tokens per second at an “IQ of 115.”

“If that happens, if like 30 to 60 tokens at… a 115 IQ is good enough. I think that’s a bear case.” — Gavin Baker

Consider the implications of that specific number. An IQ of 115 isn’t omniscient, but it is competent. It is capable, nuanced, and helpful.

If Apple’s strategy succeeds—making the phone the primary distributor of privacy-safe, free, local intelligence—the vast majority of our daily queries will never leave the device. We will only reach for the cloud’s “God models” when we are truly stumped, much like we might consult a specialist only after our general practitioner has reached their limit. If 80% of inference happens on the edge for free, the economic model of the trillion-dollar data center begins to look fragile.
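The 80% figure can be turned into a small calculation. This is a sketch under stated assumptions, not a forecast: the function name and the `escalated_cost` parameter are mine, and the latter is the assumed cloud compute per escalated query relative to an average query.

```python
def cloud_load(total_queries: float, edge_share: float,
               escalated_cost: float = 1.0) -> dict:
    """Split queries between device and cloud.

    edge_share: fraction of queries answered on-device (0.0 to 1.0).
    escalated_cost: assumed cloud compute per escalated query, relative to
        an average query (1.0 means hard queries cost no more than average).
    """
    if not 0.0 <= edge_share <= 1.0:
        raise ValueError("edge_share must be between 0 and 1")
    cloud_queries = total_queries * (1.0 - edge_share)
    return {
        "cloud_queries": cloud_queries,
        "cloud_compute": cloud_queries * escalated_cost,
        # Baseline: every query served in the cloud at average cost.
        "baseline_compute": total_queries,
    }
```

With `edge_share=0.8`, roughly one query in five still reaches the cloud. If those escalated queries cost about the same as an average one, cloud compute falls to about a fifth of the baseline. If they are the hard ones and cost five times as much (`escalated_cost=5.0`), cloud compute ends up roughly where it started. The bear case depends on that second number as much as on the first.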

Then there is the second threat, one that attacks the terrestrial constraints of the data center itself: the Orbital Data Center. Elon Musk and SpaceX, along with Google’s Project Suncatcher, envision a future where the heavy lifting isn’t done on land, but in orbit. Space offers two things that are scarce and expensive on Earth: abundant, near-continuous solar energy and the cold of deep space to radiate waste heat into. If Starship can reliably loft “server racks” into orbit, the terrestrial moat of land and power grid access (currently the Hyperscalers’ greatest defensive asset) evaporates.

We are left with a fascinating juxtaposition. On one hand, we have the “Edge,” pulling intelligence down from the clouds and putting it into our hands, making it personal, private, and free. On the other, we have “Orbit,” threatening to lift the remaining heavy compute off the planet entirely to bypass the energy bottleneck.

There are hundreds of billions of dollars betting on a future of heavy, centralized gravity. But if the edge gets smart enough, and orbit gets cheap enough, that gravity may shift.