Tag: open-weight models

The Slipstream Strategy

Apple had a problem no amount of money could solve. An iPhone can’t draw the power or shed the heat of a data center, so ten different tasks can’t mean ten different models fighting for the same sliver of RAM. Apple’s answer was to freeze one small, efficient base model into the device and then swap tiny adapters in and out of it in milliseconds — a summarization adapter for your texts, a Siri adapter for on-screen actions, and a handoff to Private Cloud Compute for anything heavier. The phone behaves like it’s running many models. It’s running one model wearing many hats.

That architecture — a frozen base plus swappable adapters — is quietly becoming the default way serious AI companies build, and it’s worth understanding why, because it inverts the assumption most people still carry into this industry.

The assumption is that winning means owning a frontier model. Sierra co-founder Clay Bavor pushed back on that on a recent 20VC episode: pouring capital into your own pre-training, he argued, tends to leave you holding a highly perishable bag of floating-point numbers. Open-weight models improve fast enough that yesterday’s frontier is next quarter’s commodity. The companies playing this well aren’t racing to out-spend the labs. They’re slipstreaming behind them — taking the free, state-of-the-art engine and putting all their effort into what sits on top of it.

What sits on top is LoRA, or low-rank adaptation. The old failure mode was catastrophic forgetting: fine-tune a model hard enough on your own data and it forgets how to reason generally. LoRA largely sidesteps this by leaving the base model’s weights frozen and training a small set of additional parameters alongside them: pairs of low-rank matrices whose product is added to the frozen weights of the layers they adapt, a thin layer of expertise bolted onto a frozen foundation. You get real domain depth without touching the thing that makes the model work at all.
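To make the idea concrete, here is a minimal sketch of a single LoRA-style layer in plain Python. It is an illustration under stated assumptions, not how any particular library or company implements it: the class name `LoRALinear`, the helper `matvec`, and the `alpha / rank` scaling convention are all choices made for this example. The frozen weight `W` is never modified; only the two small matrices `A` and `B` would be trained.

```python
import random


def matvec(M, v):
    """Multiply matrix M (a list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def lora_param_counts(d_out, d_in, rank):
    """Return (frozen, trainable) parameter counts for one adapted layer."""
    return d_out * d_in, rank * (d_out + d_in)


class LoRALinear:
    """Frozen weight W (d_out x d_in) plus a low-rank update B @ A."""

    def __init__(self, W, rank, alpha=1.0, seed=0):
        rng = random.Random(seed)
        d_out, d_in = len(W), len(W[0])
        self.W = W                   # frozen base weight, never updated
        self.scale = alpha / rank    # assumed scaling convention
        # A starts small and random, B starts at zero, so the adapter is
        # initially a no-op and the layer reproduces the base model exactly.
        self.A = [[rng.gauss(0.0, 0.01) for _ in range(d_in)]
                  for _ in range(rank)]
        self.B = [[0.0] * rank for _ in range(d_out)]

    def forward(self, x):
        base = matvec(self.W, x)                    # frozen path
        delta = matvec(self.B, matvec(self.A, x))   # low-rank path
        return [b + self.scale * d for b, d in zip(base, delta)]


W = [[1.0, 2.0, 0.0],
     [0.0, 1.0, 3.0]]
layer = LoRALinear(W, rank=1)
x = [1.0, 1.0, 1.0]

# Before any training, the adapted layer matches the base layer.
print(layer.forward(x) == matvec(W, x))  # True

# For a hypothetical 4096 x 4096 layer at rank 8:
frozen, trainable = lora_param_counts(4096, 4096, 8)
print(frozen, trainable)  # 16777216 65536
```

In that last example the trainable parameters come to 1/256 of the frozen ones, which is the sense in which the adapter is thin. The layer size and rank are illustrative numbers, not figures from Apple or Sierra.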

The business logic that follows from this is the actual point, and it’s simpler than it looks:

You stop being hostage to any one model provider — if a better open-weight model ships next month, you port your adapter, not your whole product. You can serve hundreds of differently-customized clients off one base model on one piece of hardware, instead of running a separate giant model per customer. You can ship a fix in an afternoon, because an adapter is a few hundred megabytes, not a training run. And in regulated industries, your proprietary data can train an adapter that never leaves your own infrastructure.
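The "many clients, one base model" point can be sketched the same way. This is a toy under loud assumptions: `AdapterServer`, `Adapter`, and the client names are hypothetical, the "model" is a single weight matrix, and real serving stacks do far more (batching, caching, memory management). What it shows is only the shape of the idea: the base weights are loaded once and shared, and each request picks up its client's small adapter at inference time.

```python
from dataclasses import dataclass


def matvec(M, v):
    """Multiply matrix M (a list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


@dataclass(frozen=True)
class Adapter:
    """A low-rank update: B is d_out x r, A is r x d_in."""
    B: tuple
    A: tuple


class AdapterServer:
    """One shared, frozen base weight plus a per-client adapter table."""

    def __init__(self, base_weight):
        self.base_weight = base_weight   # loaded once, shared by everyone
        self.adapters = {}               # client_id -> Adapter

    def register(self, client_id, adapter):
        # Shipping a fix is just replacing one small entry in this table.
        self.adapters[client_id] = adapter

    def infer(self, client_id, x):
        y = matvec(self.base_weight, x)
        adapter = self.adapters.get(client_id)
        if adapter is None:
            return y                     # no adapter: plain base model
        delta = matvec(adapter.B, matvec(adapter.A, x))
        return [a + b for a, b in zip(y, delta)]


# Hypothetical base model: a 2 x 2 identity matrix.
server = AdapterServer([[1.0, 0.0], [0.0, 1.0]])
server.register("acme", Adapter(B=((1.0,), (0.0,)), A=((0.0, 1.0),)))
server.register("globex", Adapter(B=((0.0,), (2.0,)), A=((1.0, 0.0),)))

x = [2.0, 3.0]
print(server.infer("acme", x))      # [5.0, 3.0]
print(server.infer("globex", x))    # [2.0, 7.0]
print(server.infer("unknown", x))   # [2.0, 3.0]
```

Swapping `base_weight` for a newer model while keeping the adapter table is the toy version of porting your adapter rather than your whole product, though in practice an adapter usually has to be retrained or at least re-validated against the new base.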

None of this is really a story about model architecture. It’s a story about where the moat moved. For a while the moat was raw capability — whoever had the best model won. Apple and Sierra are betting the moat is now somewhere else entirely: in how tightly you can weave a commodity intelligence into a specific workflow, a specific dataset, a specific customer relationship. The engine is free. The adapter is the business.


Cranes on the Horizon

In 2005, during my first trip to Shanghai and Beijing, the most striking feature of the skyline wasn’t the architecture—it was the cranes. More than I could possibly count, perched atop half-finished skyscrapers like a mechanical forest. Entire districts seemed to be mid-construction simultaneously, as if someone had pressed a button and the whole country decided to build everything at once. Dan Wang in his book “Breakneck” described China as the “engineering state” that approaches national problems with physical solutions. Back in 2005, coming from Silicon Valley, I thought I understood what growth looked like. I didn’t.

I’ve been thinking about that trip while reading Nathan Lambert’s recent piece, “Notes from Inside China’s AI Labs.” Lambert — who runs the Interconnects newsletter and does serious work tracking the open-weight LLM ecosystem — just returned from visiting essentially every major AI lab in China. Moonshot, Zhipu, Meituan, Xiaomi, Qwen, Ant Ling, 01.ai. He went in with genuine curiosity and came back with humility. That combination is rarer than it should be.

What he found was the cranes. Different domain, same energy.

Lambert’s central observation is about culture, not capability. The Chinese labs aren’t winning on any single technical breakthrough; they’re winning on execution discipline. He describes researchers, many of them active students, who bring no ego to the work. They absorb context fast, drop assumptions faster, and seem genuinely unbothered by the philosophical debates that swirl constantly through the American AI community. When he tried to engage Chinese researchers on the long-term social risks of models or the ethics of AI behavior, those questions “hung in the air with a simple confusion. It’s a category error to them.” Their role is to build the best model. Full stop. To them, an LLM isn’t a philosophical entity to be interrogated; it’s a piece of infrastructure to be optimized.

That description landed for me. Not as a criticism of American research culture, but as a real observation about what the moment demands. Building good LLMs today is, as Lambert puts it, meticulous work across the entire stack — “all points of the model can give some improvements, and fitting them in together is a complex process.”

The work that matters most right now isn’t the 0-to-1 creative leap; it’s the thousand unglamorous decisions executed without complaint. Students who haven’t yet learned to lobby for their own ideas turn out to be well-suited for exactly this.

Lambert ends on a note that’s hard to shake, and he draws the same connection I did, though from the inside. Riding a high-speed train, he writes: “When I look up from my laptop and always see bunches of cranes on the horizon, it obviously fits in with the broader culture and energy around building in China.”

Twenty years after my first visit, the cranes are still there. They’ve just moved indoors — into server rooms and training runs and model releases that land every few months with quiet confidence. In 2005, what China was building was obvious: you could see the steel frames going up. What’s being built now is harder to see, which may be exactly why it keeps surprising us.

Check out Lambert’s essay – it’s remarkable. If the 20th century was defined by who could move the most earth, the 21st will be defined by who can move the most tokens. And right now, the cranes are moving faster than we think.

Tags ai, china, engineering, Future of Technology, Global Economy, infrastructure, innovation, large language models, llm, Nathan Lambert, open-weight models, Research and Development, Scott Loftesness, Silicon Valley, Software Engineering, technology culture


What’s new in AI from China?

By Scott Loftesness
February 17, 2026

February 2026 is a busy time for AI coming out of China, with a wave of releases coinciding with the Chinese New Year.

I asked Grok this morning for an update and double-checked it with Gemini.

Here’s what Grok reported:

Chinese AI is in a very active phase right now (mid-February 2026), with a massive wave of new model releases, upgrades, and applications timed around the Lunar New Year (Spring Festival) holiday. This echoes the “DeepSeek shock” from exactly one year ago, when DeepSeek’s low-cost, high-performance models disrupted global expectations.

Major Recent Releases and Upgrades (mostly this month)

Chinese companies are shipping frontier-level models at a furious pace, often open-source or very low-cost, focusing on efficiency, reasoning, coding, multimodality, agents, and real-world applications.

Alibaba → Just launched Qwen 3.5 (with agentic features, multimodal inputs like text + photo + video, up to 2-hour video analysis). They also released RynnBrain (specialized for physical AI/robotics, helping robots understand and interact with the real world).
ByteDance (TikTok parent) → Released Doubao 2.0 (claims to match GPT-5.2 / Gemini 3 Pro level reasoning and multi-step tasks). Their video gen model Seedance 2.0 went viral for high-quality deepfakes and creative uses, sparking huge youth/creator interest.
Zhipu AI → Dropped GLM-5 (open weights, strong in coding, long tasks, agent capabilities; one of the highest open-weight intelligence scores).
MiniMax → M2.5 and M2.5 Lightning (near state-of-the-art at ~1/20th the cost of top Western models like Claude Opus; very strong on coding/agent benchmarks).
Kuaishou → Kling 3.0 (advanced AI video generation).
Moonshot AI → Kimi K2.5 (very close to top proprietary models at 1/7th the price; excellent reasoning).
DeepSeek → Widely expected to drop V4 soon (focused on coding, potentially 1M+ context, efficient MoE architecture). They’ve been the pace-setter since early 2025.

Many of these are open-weight or open-source, leading to Alibaba’s Qwen family overtaking Meta’s Llama in cumulative Hugging Face downloads.

Broader Themes Right Now

Low-cost + open ecosystems → China is winning on price/performance and rapid iteration. Models often cost 5–20× less than Western equivalents while matching or nearing performance in many areas.
Two different races → Analysts say the US focuses on perfecting frontier models (bigger, more compute-heavy), while China emphasizes adoption — deploying cheap, efficient AI at massive scale in manufacturing, robotics, consumer apps, and the real economy.
Physical & embodied AI → Huge push here. Robotics demos (backflips, kung fu with nunchucks synced across many bots on live TV) show fast progress in “physical intelligence.” Alibaba’s RynnBrain targets robotics directly.
Other notable advances → In-orbit AI computing constellation (“Three-Body”) successfully tested (running LLMs on satellites for low-latency edge processing). Military applications (e.g., bio-inspired drone swarms using hawk/wolf behaviors) are advancing rapidly.
Challenges & realism → Some top Chinese AI leaders (e.g., Alibaba’s Qwen lead) estimate <20% chance of fully overtaking US frontier models in 3–5 years due to compute gaps and chip sanctions. But open collaboration and talent pipelines are closing the gap fast in many practical areas.

Overall, February 2026 feels like China’s AI industry hitting a coordinated sprint — not just catching up, but pulling ahead in accessible, deployable, cost-effective AI that reaches everyday users, creators, and industries quickly. The US still leads in raw frontier capability and monetization scale, but the gap is shrinking noticeably in deployment and iteration speed.

Gemini’s review included the following:

The report’s observation about the “Two Different Races” is the most insightful part:

• US Strategy: Scaling laws for maximum “Frontier Intelligence.”

• China Strategy: Scaling for “Deployment Efficiency.” By making models like MiniMax M2.5 and Kimi K2.5 available at a fraction of the cost ($0.15–$0.60 per 1M tokens), China is prioritizing the “AI Agent” economy, where reliability and low cost per task matter more than raw benchmark scores.
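As a rough illustration of what "cost per task" means at those prices, here is a small back-of-the-envelope calculation. The per-million-token prices are the ones quoted above; the 200,000-token workload is a hypothetical figure chosen for the example, and the calculation assumes one flat price for all tokens, which glosses over the separate input and output rates that providers typically charge.

```python
def cost_per_task(tokens_per_task, usd_per_million_tokens):
    """Dollar cost of one task at a flat per-token price."""
    return tokens_per_task * usd_per_million_tokens / 1_000_000


# Prices quoted above, in USD per 1M tokens.
LOW_PRICE, HIGH_PRICE = 0.15, 0.60

# Hypothetical workload: an agent task that consumes 200k tokens in total.
TOKENS_PER_TASK = 200_000

low = cost_per_task(TOKENS_PER_TASK, LOW_PRICE)
high = cost_per_task(TOKENS_PER_TASK, HIGH_PRICE)
print(f"${low:.2f} to ${high:.2f} per task")  # $0.03 to $0.12 per task
```

At that scale a multi-step agent run costs cents rather than dollars, which is the sense in which low cost per task can matter more than a few benchmark points.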

Tags ai, Chinese ai, deepseek, doubao 2.0, glm-5, Kimi k2.5, low-cost ai, open-weight models, qwen 3.5, Scott Loftesness, seedance 2.0
