Categories
AI AI: Transformers

The State You Never See

The transaction arrives in milliseconds. A purchase attempt โ€” a gas station in Phoenix, a grocery store in suburban Atlanta, a wire transfer at 2 a.m. โ€” and somewhere in the authorization chain, a system has to decide. Not later. Now. The clock is already running.

When I led the fraud detection team at Visa, this was the problem that lived in your chest. You couldnโ€™t see what you needed to see. You couldnโ€™t know whether the person presenting that card was the person who owned it, whether the account had been compromised six hours ago in a breach you hadnโ€™t yet detected, whether the behavioral signature of these transactions was the legitimate cardholder running errands or a fraudster working methodically through a stolen number before the window closed. You could only see what the transactions said. You could never see the state underneath.

That distinction โ€” between what you can observe and what is actually true โ€” turns out to be one of the organizing problems of our time. It has a name, a formal structure, and a history that runs from mid-century mathematics through the trading floors of quantitative hedge funds to the frontier of artificial intelligence. The name is the hidden Markov model. But the problem it addresses is older than the math, and more human than the jargon suggests.

Categories
AI Consulting

The Judgment Layer

An analyst’s note about the CEO of one of the largest consulting companies making comments at an investor conference includes a line that deserves more attention than it got: “token volume used on a project isn’t a proxy for AI maturity.”

Translation โ€” clients are burning money on frontier models for problems that don’t need frontier models, and they’re not getting the outcomes they expected.

This firm’s CEO offered this as a business opportunity. I read it as a confession.

The old consulting model was simple: client has a technology problem, firm deploys humans to solve it. Billing followed effort. The new problem is different in kind โ€” clients have an AI strategy problem. They know they’re supposed to be using AI. They’ve heard the word “frontier.” They’re spending accordingly. They just don’t know why, and the outcomes are showing it.

So the CEO is right that there’s an opportunity here. The value proposition shifts from implementation to judgment โ€” not deploying AI, but knowing when not to deploy the expensive one. Matching capability to problem. Being trusted enough to tell a client that their $50M frontier model contract is solving a $500K problem.

Here’s the irony that the comment skates past: that advice is structurally difficult for a large consultancy to give.

The business model that built consulting firms was billing for doing. The more you deploy, the more you bill. Helping a client spend less, or choose the cheaper model, or run a narrower project, is genuinely good advice that the incentive structure actively works against. You don’t grow a $70 billion professional services firm by talking clients out of scope.

The judgment layer, if it becomes the real value, requires something closer to a doctor’s relationship with a patient than a contractor’s relationship with a client. Doctors get paid whether they prescribe or not. The value of the visit is the diagnosis โ€” including the diagnosis that says you don’t need the expensive intervention. Consultants, historically, get paid to prescribe, and paid more when the prescription is larger.

There’s a reason we trust doctors with that asymmetry and not contractors. Licensing, malpractice, professional norms built over centuries โ€” all of it exists to align the incentive. Consulting has none of that infrastructure. What it has instead is reputation, which is slower-acting and easier to game.

Whether the large firms can actually make the shift โ€” rather than just reframe the same billable-hours model in the language of AI optimization โ€” is the real question the market is wrestling with. The CEO’s comment is genuinely perceptive about where client value lies. It’s less clear that consulting firms are currently built to capture it honestly.

Categories
AI AI: Large Language Models China

Cranes on the Horizon

In 2005, during my first trip to Shanghai and Beijing, the most striking feature of the skyline wasn’t the architectureโ€”it was the cranes. More than I could possibly count, perched atop half-finished skyscrapers like a mechanical forest. Entire districts seemed to be mid-construction simultaneously, as if someone had pressed a button and the whole country decided to build everything at once. Dan Wang in his book “Breakneck” described China as the “engineering state” that approaches national problems with physical solutions. Back in 2005, coming from Silicon Valley, I thought I understood what growth looked like. I didn’t.

I’ve been thinking about that trip while reading Nathan Lambert’s recent piece, “Notes from Inside China’s AI Labs.” Lambert โ€” who runs the Interconnects newsletter and does serious work tracking the open-weight LLM ecosystem โ€” just returned from visiting essentially every major AI lab in China. Moonshot, Zhipu, Meituan, Xiaomi, Qwen, Ant Ling, 01.ai. He went in with genuine curiosity and came back with humility. That combination is rarer than it should be.

What he found was the cranes. Different domain, same energy.

Lambert’s central observation is about culture, not capability. The Chinese labs aren’t winning on any single technical breakthrough โ€” they’re winning on execution discipline. He describes researchers, many of them active students, who bring no ego to the work. They absorb context fast, drop assumptions faster, and seem genuinely unbothered by the philosophical debates that seem to swirl constantly in the American AI community. When he tried to engage Chinese researchers on the long-term social risks of models or the ethics of AI behavior, those questions “hung in the air with a simple confusion. It’s a category error to them.” Their role is to build the best model. Full stop. To them, an LLM isn’t a philosophical entity to be interrogated; it’s a piece of infrastructure to be optimized.

That description landed for me. Not as a criticism of American research culture, but as a real observation about what the moment demands. Building good LLMs today is, as Lambert puts it, meticulous work across the entire stack โ€” “all points of the model can give some improvements, and fitting them in together is a complex process.”

The work that matters most right now isn’t the 0-to-1 creative leap; it’s the thousand unglamorous decisions executed without complaint. Students who haven’t yet learned to lobby for their own ideas turn out to be well-suited for exactly this.

Lambert ends on a note that’s hard to shake. Looking up from his laptop on a high-speed train, he keeps seeing cranes on the horizon. He draws the same connection I did, though from the inside: the construction everywhere fits the broader culture and energy around building. “When I look up from my laptop and always see bunches of cranes on the horizon, it obviously fits in with the broader culture and energy around building in China.”

Twenty years after my first visit, the cranes are still there. They’ve just moved indoors โ€” into server rooms and training runs and model releases that land every few months with quiet confidence. In 2005, what China was building was obvious: you could see the steel frames going up. What’s being built now is harder to see, which may be exactly why it keeps surprising us.

Check out Lambert’s essay – it’s remarkable. If the 20th century was defined by who could move the most earth, the 21st will be defined by who can move the most tokens. And right now, the cranes are moving faster than we think.

Categories
AI Thinking Tools

Outsourcing Thinking but not Understanding

Thereโ€™s a line mentioned in a recent discussion by Andrej Karpathy that I keep turning over: You can outsource your thinking but you canโ€™t outsource your understanding.

It sounds like a warning. Maybe it is. But the more I sit with it, the more it feels like something older โ€” a distinction philosophers have been trying to draw for centuries, suddenly made urgent by the fact that we now have a tool that makes outsourcing thinking almost frictionless.

Hereโ€™s what I notice when I use AI well: I get the answer, and I feel satisfied. Thereโ€™s a small dopamine tick. Task closed. But if someone asks me an hour later to explain the reasoning, I often canโ€™t. The thinking happened โ€” somewhere โ€” but not in me. I was a conduit. A confident one, too, which is the dangerous part.

This is different from looking something up. When I Google a fact and paste it into a document, I know Iโ€™m borrowing. The seam is visible. But when I ask an AI to reason through a problem with me, the output arrives in first person, in fluent prose that matches my own register, and something in my brain says I worked this out. The seam disappears. Thatโ€™s new. Thatโ€™s the thing we donโ€™t yet have good instincts for.

Karpathyโ€™s deeper point is about construction. Heโ€™s a builder by temperament โ€” his mantra, which he traces to Feynman, is that if you canโ€™t build it, you donโ€™t understand it. What you canโ€™t yet construct, you merely think you understand. There are always micro-gaps in your knowledge, invisible until you try to arrange the pieces yourself and find they donโ€™t quite fit. The AI doesnโ€™t change that equation. It just makes it easier to mistake the map for the territory โ€” and to feel strangely proud of a map you didnโ€™t draw.

Hesse understood this, in a different century and a different idiom. In Siddhartha, the young seeker travels to meet the Buddha himself โ€” the most perfectly articulated wisdom in the world, delivered by the man who actually found it. Siddhartha listens, acknowledges that the teaching is flawless, internally consistent, the most complete account of liberation ever assembled. And then walks away. Not from arrogance, but from recognition: even the Illustrious One cannot hand you his liberation. The path was his. He walked it. That walking is not transferable, no matter how perfect the description of the destination. Received knowledge, however exquisite, is not the same as earned knowledge. The gap between them is exactly the size of your own unlived experience.

Thatโ€™s the same argument, made across two and a half millennia. Feynman says you have to build it. Hesse says you have to live it. Karpathy says the AI can do neither for you.

Heโ€™s also made a related observation about educational video โ€” that a lot of content on YouTube gives the appearance of learning but is really just entertainment, convenient for everyone involved. Nobody has to do the hard part. AI-assisted thinking has the same shape, just more intimate. Youโ€™re not passively watching โ€” youโ€™re actively typing, prompting, engaging. It feels like cognition. But engagement isnโ€™t understanding. Typing a question is not the same as wrestling with it.

I donโ€™t think the answer is to use AI less. Thatโ€™s not Karpathyโ€™s argument either โ€” heโ€™s spent the last year building a school premised on AI tutors expanding what people can learn. The lesson is about custody. When I hand a problem to an AI, I need to stay in the loop as a learner, not just as a reviewer. Thereโ€™s a real difference between asking give me an answer and asking help me build the reasoning. The first outsources thinking. The second โ€” if you insist on it, if you refuse to be a passenger โ€” can still leave the understanding in you, where it belongs.

But insisting is the work. And the work is now easier to skip than it has ever been.

Understanding isnโ€™t a product you receive. Itโ€™s a residue โ€” what settles in you after genuine struggle, after the confusion and the dead ends and the small hard-won moments of clarity. Siddhartha couldnโ€™t get it from the Buddha. You canโ€™t get it from the AI. Karpathyโ€™s line is a custody argument: the thinking can travel, but the understanding has to stay home.

What unsettles me is that weโ€™re building tools that make the borrowing invisible โ€” that dress outsourced reasoning in the first person, that let us feel like weโ€™ve understood something weโ€™ve only processed. Siddhartha at least knew he was walking away from the teaching. He felt the gap. We might not even notice ours.

Categories
AI

Beyond the Summary: Using AI to Find the “Friction” in Your Thinking

Weโ€™ve reached the “Summary Plateau.”

You see it everywhere. Every browser extension, every note-taking app, and every enterprise LLM now offers a “Summarize” button. Itโ€™s the ultimate promise of the efficiency era: Give us the 2,000-word essay, and weโ€™ll give you the three bullet points. But thereโ€™s a hidden tax on this kind of efficiency. When we ask an AI to summarize, we are asking it to smooth out the edges. We are asking it to remove the “noise.” The problem is, in the world of ideas, the noise is often where the signal lives. The frictionโ€”the parts of an argument that make us uncomfortable or that we don’t quite understandโ€”is where the actual learning happens.

If we only consume the summaries, we aren’t thinking; weโ€™re just acknowledging.

The Mirror, Not the Maker

Iโ€™ve been experimenting with a different approach. Instead of asking the model to make the content shorter, Iโ€™ve been asking it to make my engagement with the content harder.

I don’t want a “Maker” to write my thoughts for me. I want a “Mirror” to show me where my thoughts are thin.

When Iโ€™m wrestling with a complex pieceโ€”perhaps a deep dive on the future of venture capital or a philosophical treatise on Areteโ€”Iโ€™ve stopped clicking “summarize.” Instead, I feed the text into the LLM and use these “Friction Prompts” to find the sand in the gears:

The Essential Toolkit

  • The “Steel Man” Challenge: “I am inclined to agree with this authorโ€™s conclusion. Find the three strongest counter-arguments that this text ignores, and explain why a reasonable person would hold them.”
  • The “Recursive Logic” Audit: “Identify the three most critical ‘logical leaps’ the author makesโ€”points where a conclusion is reached without sufficient evidence. If those leaps are wrong, how does the entire argument collapse?”
  • The “Blind Spot” Audit: “What are the underlying cultural or economic assumptions this author is making that they haven’t explicitly stated?”
  • The “Cross-Pollination” Filter: “Connect the central thesis of this article to a seemingly unrelated field (e.g., Stoic philosophy or biological ecosystems). How does the logic of this text hold upโ€”or failโ€”when applied to that different domain?”
  • The “Analog Translation” Test: “If I had to explain the core mechanism of this abstract concept using only physical, analog metaphors (like plumbing or woodworking), how would I do it? Where does the metaphor break down?”
  • The “Socratic Sharpening”: “Don’t summarize this. Instead, ask me three probing questions that force me to apply the core logic of this essay to a completely different industry.”

Sharpening the Blade

Summary is about completion (getting it done). Friction is about cognition (getting it right).

When the AI points out a blind spot in an article I loved, it creates a moment of cognitive dissonance. That “click” of discomfort is the sound of a mental model being updated. Itโ€™s the digital equivalent of using a whetstone on a bladeโ€”you need the friction to get the edge.

As we move further into this age of “Flash-Frozen Cognition,” the temptation to automate our understanding will only grow. But discernmentโ€”that uniquely human trait weโ€™ve discussed here beforeโ€”cannot be outsourced to a bulleted list.

The next time youโ€™re faced with a daunting PDF or a dense long-read, resist the “Summarize” button. Ask the machine to challenge you instead. You might find that the most valuable thing the AI can give you isn’t an answer, but a better version of your own question.


A Deep Dive (Further Reading from the Archive)

If you resonated with this piece on cultivating discernment, you might find these earlier synthesis experiments worth a revisit:

  • On Flash-Frozen Cognition: A foundational post discussing how LLMs are freezing the current consensus, and how we must resist it.
  • The Harvest and the Algorithm: Comparing 1920s ice harvesting to 2020s cognitionโ€”the critical shift from scarcity to abundance.
  • The Arete of Attention: A look at the Stoic concept of virtue as the intentional direction of our most scarce resource: focus.
  • Longhand Thinking: Why the physical act of writing is the ultimate antidote to digital velocity.
Categories
AI AI: Large Language Models

The Echo Effect: Why Prompt Repetition is AI’s Best Kept Secret

In our relentless pursuit of complexity, we often overlook the elegant simplicity of a fundamental human habit: repeating ourselves.

We build colossal architectures, weave intricate neural networks, and throw mountains of computational power at our artificial intelligence systems, hoping to squeeze out a few more drops of reasoning and logic. Yet, sometimes the most profound breakthroughs require no new code, no additional latency, and no extra training data.

Sometimes, you just have to say it twice.

In a fascinating December 2025 paper titled Prompt Repetition Improves Non-Reasoning LLMs,” researchers Yaniv Leviathan, Matan Kalman, and Yossi Matias uncovered an almost absurdly simple “free lunch” in AI optimization.

Their premise is straightforward: when you aren’t using a heavy reasoning model, simply copying and pasting your input prompt multiple times significantly boosts the model’s performance.

“When not using reasoning, repeating the input prompt improves performance for popular models (Gemini, GPT, Claude, and Deepseek) without increasing the number of generated tokens or latency.”

The mechanics behind this are elegantly pragmatic.

By repeating the prompt, you are moving the heavy computational lifting to the parallelizable “pre-fill” stage of the model’s processing. The AI’s causal attention mechanism gets to process the same tokens again, allowing the later iterations of the prompt to attend to the earlier ones. It effectively acts as a hack to simulate bidirectional attention in a decoder-only architecture.

What’s even more telling is the paper’s observation on why this works so well.

The researchers noted that models trained with Reinforcement Learning (like OpenAI’s deep-thinking variants) naturally learn to “restate the problem” in their internal monologue. They figured out on their own what these researchers are suggesting we do manually: repeat the question to focus the mind.

Reading this paper, I couldn’t help but draw a parallel to the human condition and the nature of listening.

How often do we assume that because we have articulated a thought once, it has been fully absorbed? We fire off a single, dense instruction to a colleague, a partner, or a friend, and then marvel when the nuance is lost in translation.

We suffer from our own attention bottlenecks.

Like a non-reasoning LLM trying to parse a complex query in a single pass, we are constantly bombarded with a stream of tokensโ€”emails, notifications, conversations, fleeting thoughts. To truly understand, to truly digest and synthesize information, we need the grace of repetition.

There is a strange poetry in the fact that to make our most advanced digital minds smarter, we have to talk to them the way we talk to a distracted child or a busy spouse. The “microscope effect” highlighted in the studyโ€”where repeating a prompt drastically improved extraction tasksโ€”shows that the failure wasn’t in the model’s capacity to know, but in its capacity to focus. Repetition forces focus. It creates a resonant echo in the context window, a digital highlighter that screams, โ€œThis matters. Look here again.โ€

As we continue to navigate a world increasingly augmented by artificial intelligence, this paper serves as a humbling reminder. The bleeding edge of technology isn’t always found in the most complex equation; sometimes, it’s hidden in the most basic principles of communication.

Whether you’re prompting a billion-parameter language model or trying to connect with the human sitting across from you, the lesson is clear.

Clarity isn’t just about the words you choose. It’s about giving those words the space, the resonance, and the repetition they need to be truly understood.

Say it once to be heard; say it twice to be understood.

Categories
AI

The Jagged Mind

There is a peculiar kind of genius that has always made us uneasy โ€” the savant who can calculate the day of the week for any date in history but cannot tie his own shoes. We admire the capability. We are troubled by the gap.

Demis Hassabis, speaking at this weekโ€™s India AI Impact Summit in Delhi, gave that unease a name. He called todayโ€™s most powerful AI systems โ€œjagged intelligences.โ€

It is a phrase worth sitting with.

A jagged intelligence can win a gold medal at the International Mathematics Olympiad โ€” solving problems that would humble most PhD mathematicians โ€” and then, in the very next breath, stumble on elementary arithmetic if the question is phrased in an unfamiliar way.

The peaks are extraordinary. The valleys are bewildering. And crucially, you never quite know which terrain youโ€™re standing on.

Hassabis identified three specific gaps between where we are and what he called โ€œa kind of general intelligence.โ€

The first is continual learning โ€” todayโ€™s models are trained, then frozen. They are, in a sense, educated and then released into a world they can no longer learn from.

The second is long-term planning. Current systems can reason tactically, but they lack the capacity to hold a coherent thread of intention across months or years the way a human architect, scientist, or entrepreneur does.

The third โ€” and perhaps the most philosophically interesting โ€” is that jaggedness itself: the wild inconsistency that makes todayโ€™s AI feel more like a force of nature than a reliable mind.

โ€œA true general intelligence system shouldnโ€™t have that kind of jaggedness.โ€

What strikes me about Hassabisโ€™s framing is how it reorients the conversation.

We have spent years debating whether AI is โ€œintelligent.โ€ His point is more subtle: intelligence without consistency is not yet wisdom. A system that is brilliant and brittle in equal measure is something genuinely new in the world โ€” not human, not the robots of science fiction, but a third thing we donโ€™t yet have good language for.

The road from jagged to coherent is, I suspect, the central engineering and philosophical challenge of the next decade.

Continual learning means systems that grow with us. Long-term planning means systems that can be trusted with consequential goals. Consistency means systems whose judgment we can actually rely on.

Until then, we are working with something that resembles a prodigy โ€” dazzling, occasionally humbling, and not yet quite whole.

Questions to Consider

  1. The Consistency Problem: If you knew an AI system could solve a problem brilliantly 90% of the time but fail unpredictably the other 10%, how would that change the decisions youโ€™d trust it to make?
  2. Frozen in Time: What does it mean that the systems we rely on most are, at their core, educated in the past and unable to learn from the present? What human analog does that bring to mind?
  3. Jagged vs. General: Hassabis draws a line between โ€œjagged intelligenceโ€ and โ€œgeneral intelligence.โ€ Do you think general intelligence is the right destination โ€” or is there value in systems that are deeply specialized, even if inconsistent?
  4. The Savant Question: Weโ€™ve always had a complicated relationship with uneven genius in humans. Does the โ€œjagged AIโ€ problem feel categorically different to you, or just a new version of an old puzzle?
Categories
AI

Research Prompts

I recently came across a prompt in a post on X which has proven to be quite useful in brainstorming.

Hereโ€™s the prompt, tailored in this example to research the area of AI super intelligence:

You are a professional ghostwriter. Generate 15 high-signal content ideas on how AI labs will reach artificial super intelligence.

For each idea:
- Give me a hook line (<= 15 words, curiosity-driven)
- Outline the structure in 3 parts (hook, point, action)
- Include an example or analogy that will resonate with an audience of college graduates.

Make them practical, non-generic, and designed to spark discussion.

To see how it works, just copy it and put it into your favorite large language model. I think youโ€™ll be surprised and pleased with what results you obtain.

After the first pass, you can try this next:

Pick the best one and draft it.

Youโ€™ll get back a draft article about the modelโ€™s choice for the best among the 15 it first produced.

You can then steer the model using a prompt like this:

Tune the draft for the vc/tech founder voice. Also speculate that Google is most likely the winner. 

Itโ€™s fun to see how the article evolves further with that voice and more speculation about the possible winner.

You can then ask the model to redraft it further:

Sketch two spicy counterarguments to the main thesis. 

And so on. Itโ€™s fun to do a deep dive on a topic using this approach. The wide range of the first fifteen results narrows and deepens as you ask the model to refine the draft it has produced.

Iโ€™ve lost track of time exploring a topic of interest to me as I got back and forth with the model evolving my understanding. Some models will even assist you in that process by suggesting next steps along the way.

Let me know of your experiences using this kind of approach!

Categories
AI AI: Large Language Models

LLM Learning / Daydreaming

Following up on my earlier post about Dwarkesh Patelโ€™s lament about LLMs not really learning, Gwern writes LLM Daydreaming.

I propose a day-dreaming loop (DDL): a background process that continuously samples pairs of concepts from memory. A generator model explores non-obvious links between them, and a critic model filters the results for genuinely valuable ideas. These discoveries are fed back into the systemโ€™s memory, creating a compounding feedback loop where new ideas themselves become seeds for future combinations.

Categories
AI AI: Large Language Models Apple

Why AI Works

Based upon his own personal explorations of why AI large language models work so well, former Apple exec Bertrand Serlet has created an excellent 30 minute video introduction to them. He introduces the notion of the “curse of dimensionality” – how the scale of LLMs increase so dramatically – and then the “blessing of dimensionality” as helping to explain some of the “magic” of neural networks. Worth watching!