Categories
AI AI: Large Language Models

The Echo Effect: Why Prompt Repetition is AI’s Best Kept Secret

In our relentless pursuit of complexity, we often overlook the elegant simplicity of a fundamental human habit: repeating ourselves.

We build colossal architectures, weave intricate neural networks, and throw mountains of computational power at our artificial intelligence systems, hoping to squeeze out a few more drops of reasoning and logic. Yet, sometimes the most profound breakthroughs require no new code, no additional latency, and no extra training data.

Sometimes, you just have to say it twice.

In a fascinating December 2025 paper titled “Prompt Repetition Improves Non-Reasoning LLMs,” researchers Yaniv Leviathan, Matan Kalman, and Yossi Matias uncovered an almost absurdly simple “free lunch” in AI optimization.

Their premise is straightforward: when you aren’t using a heavy reasoning model, simply copying and pasting your input prompt multiple times significantly boosts the model’s performance.

“When not using reasoning, repeating the input prompt improves performance for popular models (Gemini, GPT, Claude, and Deepseek) without increasing the number of generated tokens or latency.”
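For the curious, here is roughly what “saying it twice” looks like in practice. This is only a minimal sketch: the helper name `repeat_prompt`, the blank-line separator, and the sample question are my own illustrative assumptions, not the exact template the authors used.

```python
def repeat_prompt(prompt: str, times: int = 2, separator: str = "\n\n") -> str:
    """Return the prompt repeated `times` times, joined by `separator`.

    Hypothetical helper for illustration; the separator is an assumption.
    """
    if times < 1:
        raise ValueError("times must be at least 1")
    return separator.join([prompt.strip()] * times)


# Sample question invented for this example.
question = "Which of these is a prime number? (A) 21 (B) 27 (C) 29 (D) 33"
doubled_prompt = repeat_prompt(question)
print(doubled_prompt)
```

The resulting string is what you would send to the model in place of the original prompt. Nothing else about the request changes.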

The mechanics behind this are elegantly pragmatic.

By repeating the prompt, you are moving the heavy computational lifting to the parallelizable “pre-fill” stage of the model’s processing. The AI’s causal attention mechanism gets to process the same tokens again, allowing the later iterations of the prompt to attend to the earlier ones. It effectively acts as a hack to simulate bidirectional attention in a decoder-only architecture.
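To make that intuition concrete, here is a toy sketch of a causal mask, where each position can only look at itself and what came before. It is not how any real model computes attention; the function names are hypothetical and the four-token prompt is an arbitrary choice. It simply counts, for each token in the final copy of the prompt, how many distinct prompt positions it can see.

```python
def causal_mask(n: int) -> list[list[bool]]:
    """mask[i][j] is True when position i may attend to position j (j <= i)."""
    return [[j <= i for j in range(n)] for i in range(n)]


def positions_seen(n_tokens: int, repeats: int) -> list[int]:
    """For each token in the final copy of the prompt, count how many distinct
    prompt positions it can attend to (in any copy) under a causal mask."""
    total = n_tokens * repeats
    mask = causal_mask(total)
    last_copy_start = (repeats - 1) * n_tokens
    counts = []
    for p in range(n_tokens):
        i = last_copy_start + p
        # Map every visible index back to its position within the prompt.
        seen = {j % n_tokens for j in range(total) if mask[i][j]}
        counts.append(len(seen))
    return counts


once = positions_seen(4, repeats=1)
twice = positions_seen(4, repeats=2)
print(once)   # [1, 2, 3, 4]
print(twice)  # [4, 4, 4, 4]
```

With a single copy, the first token sees only itself. With two copies, every token in the second copy can see the whole prompt, which is the sense in which repetition imitates bidirectional attention. Only the later copy gets this benefit; the first copy is still strictly left to right.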

What’s even more telling is the paper’s observation on why this works so well.

The researchers noted that models trained with Reinforcement Learning (like OpenAI’s deep-thinking variants) naturally learn to “restate the problem” in their internal monologue. They figured out on their own what these researchers are suggesting we do manually: repeat the question to focus the mind.

Reading this paper, I couldn’t help but draw a parallel to the human condition and the nature of listening.

How often do we assume that because we have articulated a thought once, it has been fully absorbed? We fire off a single, dense instruction to a colleague, a partner, or a friend, and then marvel when the nuance is lost in translation.

We suffer from our own attention bottlenecks.

Like a non-reasoning LLM trying to parse a complex query in a single pass, we are constantly bombarded with a stream of tokens—emails, notifications, conversations, fleeting thoughts. To truly understand, to truly digest and synthesize information, we need the grace of repetition.

There is a strange poetry in the fact that to make our most advanced digital minds smarter, we have to talk to them the way we talk to a distracted child or a busy spouse. The “microscope effect” highlighted in the study—where repeating a prompt drastically improved extraction tasks—shows that the failure wasn’t in the model’s capacity to know, but in its capacity to focus. Repetition forces focus. It creates a resonant echo in the context window, a digital highlighter that screams, “This matters. Look here again.”

As we continue to navigate a world increasingly augmented by artificial intelligence, this paper serves as a humbling reminder. The bleeding edge of technology isn’t always found in the most complex equation; sometimes, it’s hidden in the most basic principles of communication.

Whether you’re prompting a billion-parameter language model or trying to connect with the human sitting across from you, the lesson is clear.

Clarity isn’t just about the words you choose. It’s about giving those words the space, the resonance, and the repetition they need to be truly understood.

Say it once to be heard; say it twice to be understood.

Categories
AI India

Intelligence as a Public Good: India’s “AI ka UPI” Revolution

There is a recurring rhythm to human progress: a breakthrough is born as a luxury, matures into a commodity, and ultimately solidifies into infrastructure.

We saw it with electricity, we saw it with the internet, and in 2016, we saw India do it with money through the Unified Payments Interface (UPI). UPI took the friction out of digital finance, transforming it from a walled garden guarded by private banks into a digital public good.

Now, it appears India is attempting to do for intelligence what it did for payments.

The global narrative around Artificial Intelligence is currently dominated at one end by massive private moats. At the other end are various open source/open weight efforts.

Silicon Valley primarily approaches AI as a capital-intensive arms race. Trillion-dollar tech players ramp up huge compute, train very large models, and rent out intelligence via by-the-drink APIs. This intelligence is a proprietary and monetized luxury.

Enter the “AI ka UPI” initiative and the IndiaAI Mission discussed by Ashwini Vaishnaw at this week’s India AI Impact Summit.

Instead of treating AI as a product to be sold, India is architecting it as a Digital Public Infrastructure (DPI). The government is doing the heavy lifting—subsidizing the compute, curating population-scale datasets, and building foundational models.

Currently, they are making over 38,000 GPUs available to startups and researchers at around ₹65 (less than a dollar) an hour, a mere fraction of the global cost. They are rolling out sovereign stacks like BharatGen and conversational models fluent in 22 regional languages.
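A quick back-of-the-envelope sketch shows what that rate could mean for a small team. The ₹65 figure is the one quoted above, and I am assuming it applies per GPU per hour; the job size (8 GPUs for 24 hours) is a hypothetical example, not something from the announcement.

```python
RUPEES_PER_GPU_HOUR = 65  # rate quoted above; assumed here to be per GPU


def job_cost_rupees(gpus: int, hours: float, rate: float = RUPEES_PER_GPU_HOUR) -> float:
    """Cost of a job as GPUs x hours x hourly rate (illustrative only)."""
    return gpus * hours * rate


# Hypothetical job: 8 GPUs running for one full day.
cost = job_cost_rupees(gpus=8, hours=24)
print(f"₹{cost:,.0f}")  # ₹12,480
```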

“They are building an ‘orchestration layer’ for cognition.”

If a developer wants to build a voice-agent to help a rural farmer diagnose a crop disease, they don’t have to worry about the backend compute, the dataset acquisition, or paying a premium to a tech giant. They just plug into the public rails.

As I watch this unfold, I am struck by the philosophical shift it represents. We have become deeply conditioned to view AI through the lens of scarcity and subscription. But what happens when intelligence becomes a public utility?

It shifts the center of gravity of innovation. It becomes about who can solve the most acute, localized, human problems. The friction of creation drops to near zero. A bootstrapped team in a tier-two city can suddenly wield the same computational reasoning as a VC-funded Silicon Valley startup.

There is also an element of sovereignty here. In the 21st century, relying on foreign infrastructure for your population’s cognitive processing seems akin to relying on a foreign nation for your electricity. True technological independence requires sovereign AI—models trained on indigenous data, reflecting local culture, nuances, and values, rather than the implicit biases of others.

The implications could be staggering. We are moving from an era where AI is an elite tool to an era where it is the invisible, ubiquitous fabric of daily life for over a billion people.

The true measure of AI’s ultimate impact won’t be found in benchmark scores on a server farm. It will be found in the quiet dignity of a citizen accessing global markets through a vernacular voice assistant, or a rural clinic predicting patient outcomes with public compute.

I look forward to following India’s AI efforts as this and other AI initiatives become more clearly defined.

Questions to consider

1. The Value of Human Capital: If artificial intelligence becomes as ubiquitous, reliable, and cheap as public electricity, what uniquely human skills will become the new premium in a hyper-automated society?

2. Cognitive Sovereignty: How will the geopolitical landscape shift when emerging economies no longer need to import their “cognitive infrastructure” and inherent cultural biases from Western tech players?

3. The Centralization of Truth: When a government builds and curates the foundational AI models for over a billion people, where is the line between providing a democratized public good and engineering a centralized cultural narrative?

What else???

Categories
AI Anthropic Claude Cybersecurity

The End of Obscurity

There is a particular kind of silence that surrounds a zero-day vulnerability. It is the silence of something waiting—a flaw in the logic, a gap in the armor, sitting unnoticed in the codebase for years, perhaps decades. We have slept soundly while these digital fault lines ran beneath our feet, largely because we assumed that finding them required a brute force that no one possessed, or a level of human genius that is incredibly rare.

But the silence is breaking.

I was reading Anthropic’s Red Team report from earlier this week (prompted by Bruce Schneier’s amazement), specifically their findings on the new Opus 4.6 model. The technical details are impressive, but the philosophical implication is what stopped me cold, as it did Bruce.

For years, digital security has relied on “fuzzers”—programs that throw millions of random inputs at a system, banging on the doors to see if one accidentally opens. It is a noisy, chaotic, brute-force approach.

The new reality is different. As the report notes:

“Opus 4.6 reads and reasons about code the way a human researcher would—looking at past fixes to find similar bugs that weren’t addressed, spotting patterns that tend to cause problems.”

This is a fundamental phase shift. We are moving from the era of the Battering Ram to the era of the Jeweler’s Loupe. The machine is no longer guessing; it is understanding.

There is something deeply humbling, and slightly terrifying, about this. We have spent the last half-century building a digital civilization on top of code that we believed was “secure enough” because it had survived the test of time. We trusted the friction of complexity and the visibility of open source to keep us safe. We assumed that if a bug had existed in a core library for twenty years, surely it would have been found by now.

But the AI doesn’t care about time. It doesn’t get tired. It doesn’t have “developer bias” that assumes a certain function is safe because “that’s how we’ve always done it.” It simply looks at the structure, reasons through the logic, and points out the crack in the foundation that we’ve been walking over every day.

We are entering a period of forced transparency. The “security by obscurity” that held the internet together is evaporating. When intelligence becomes commoditized, vulnerabilities become commodities too. The question is no longer “is my code secure?” but rather, “what happens when the machine sees the flaws I cannot?”

It’s a reminder that complexity is a loan we take out against the future. Eventually, the bill comes due. We are just lucky that, for now, the entity collecting the debt is one we built ourselves, designed to tell us where the cracks are before the ceiling collapses. Let’s hope that we are out far enough in front of it.

Categories
AI Software

The Thermodynamics of Thought

For the last two decades, we have lived in the era of zero marginal cost. The defining characteristic of the internet age was that once software was written, distributing it to the billionth user cost virtually the same as distributing it to the first. We grew accustomed to the economics of abundance—infinite copies, infinite reach, lightweight infrastructure.

But the recent commentary regarding the true nature of Artificial Intelligence forces a jarring mental correction:

“AI is not software riding on old infrastructure. It is a new industrial system that converts energy into intelligence – requiring a capital stack measured in trillions, not billions.”

This distinction is not merely semantic; it is physical.

When we view AI through the lens of traditional SaaS (Software as a Service), we miss the magnitude of the shift. We are looking for an app; what is being built is a refinery. We are witnessing a return to heavy industry, but the commodity being refined isn’t crude oil—it is information, and the byproduct is reasoning.

This requires us to think less in terms of code and more in terms of thermodynamics. In this new industrial system, intelligence is an energy-intensive output. Every token generated, every inference drawn, requires a specific, measurable conversion of electricity into heat and computation. Unlike the static code of a website, an AI model is a furnace. It must be fueled constantly.

This explains the capital stack. We are seeing numbers that seem irrational in the context of venture capital—trillions, not billions. But if you view a data center not as a server farm, but as a power plant that generates intelligence, the numbers align with historical precedents. We are not funding startups; we are funding the modern equivalent of the electric grid, the transcontinental railroad, or the petrochemical complex.

We are pouring concrete, smelting copper, and manufacturing silicon on a planetary scale. The “cloud” was always a misleading metaphor—it sounded fluffy and ethereal. The reality of the AI transition is heavy, hot, and incredibly expensive.

We are moving from an era where we organized the world’s information (low energy) to an era where we synthesize new reasoning (high energy). We are building a machine that eats electricity and excretes intelligence. That isn’t a software update; that is a new industrial revolution.

Categories
AI

The Alien in the Silicon

I recently found myself listening to Anna Goldie and Azalia Mirhoseini, the founders of Ricursive Intelligence, discuss the future of chip design. Here’s the video.

On the surface, it’s a conversation about efficiency—about breaking the bottleneck between how fast we build AI models and how slow we build the chips that run them.

But as I listened, I felt that prickly sensation of standing on the edge of a paradigm shift that is both exhilarating and slightly terrifying.

We are witnessing the transition from “Fabless” to “Designless.” Just as TSMC allowed companies to build chips without owning a factory, Ricursive wants to allow companies to build chips without employing a single chip designer.

They call it a “Cambrian explosion” of custom silicon—chips for hearing aids, chips for space data centers, chips for specific neural networks. This democratization is fascinating. It promises a world where hardware is as fluid and adaptable as software.

“The straight line is a human invention. The future of silicon is curved, chaotic, and completely alien.”

But here is what disturbs me, and perhaps what should give us pause.

Goldie and Mirhoseini talk about the designs their AI agents create. When humans design chips, we think in Manhattan geometry: straight lines, neat blocks, logical order. We crave readability and structure. When their AI, originally born from the AlphaChip project at Google, designs a chip, it creates “alien” structures. It draws curves. It makes donut shapes. It creates layouts that look less like engineering diagrams and more like organic, biological growths.

The engineers’ initial reaction was displeasure. They looked at these chaotic, curved designs and rejected them. It wasn’t until later data proved undeniably that these “alien” layouts were faster, smaller, and more efficient that the humans conceded.

This seems like the “Move 37” moment for hardware. We are handing over the architecture of our physical reality to an intelligence that optimizes for physics, not for human comprehension. Some additional quick thoughts…

What should we be surprised by?

We should be surprised by the geometry of efficiency. It turns out that the rigid, orthogonal logic we humans (and our EDA software tools to date) have imposed on silicon for decades was a human constraint. The AI is showing us that the “natural” state of high-performance compute looks … weird. It looks biological.

What should we be afraid of?

We should be wary of the recursive loop itself. The company is named “Ricursive” for a reason: AI designs better chips, which train better AI, which designs even better chips. It is a closed loop of self-improvement. As we move to a “design-less” world, we are effectively stepping out of that loop. We become the requesters, the “vibe coders,” while the actual logic of the machine infrastructure becomes increasingly opaque to us. Seems like we’ve been evolving that way anyway in chip design – but this feels like an earthquake really shaking things up.

We seem to be building a foundation for our civilization that we may soon be unable to read, optimize, or fully understand. We are trading interpretability for performance.

And while the speed and performance are intoxicating, it is disturbing to realize yet again that the engine driving our future is becoming a black box—not just in its software, but in its very atoms.

Ricursive said they’re planning to release their initial product within a year. I’ll be watching from the sidelines – anxious and excited!