Categories
Business History IBM Infrastructure Nvidia Programming Semiconductors

The Half-Life of Moats

Prompted by an article on X by @magicsilicon on the CUDA moat. Research and drafting assistance from my AI intern assistant Clark.

The NVIDIA H100 looks, in retrospect, like an inevitability. It wasn’t.

What Jensen Huang built is more accurately understood as a sixteen-year accumulation of optionality: a platform investment made in 2006 for a market that wouldn’t fully materialize until 2022. NVIDIA introduced the G80 architecture in November 2006, laying the groundwork for CUDA’s release a few months later. The stated ambition was to let scientists write C++ that ran on GPU cores without needing to understand 3D graphics pipelines. The unstated bet was that parallel computation would eventually matter for something bigger than rendering shadows in video games.

For sixteen years, it mostly didn’t. Not at scale. Not commercially. CUDA lived in research labs and HPC clusters. It attracted a small, devoted, and economically marginal user base, the kind that papers cite but investors ignore. NVIDIA kept investing in it anyway: cuDNN for deep learning operations, cuBLAS for linear algebra, a layered ecosystem of libraries that made CUDA not just accessible but nearly irreplaceable for anyone doing serious numerical computation. When TensorFlow and PyTorch emerged as the standard frameworks for neural network research, they didn’t adopt CUDA because it was the only option. They adopted it because CUDA was where the optimized kernels already lived.

AlexNet won the ImageNet competition in 2012 and did it on two NVIDIA GPUs. The deep learning community noticed immediately. The financial community largely did not.

Then ChatGPT launched in November 2022, and suddenly everyone needed H100s they couldn’t get.


The parallel to Intel is instructive and also undersells how strange this kind of story looks while you’re living through it. Intel was founded in 1968 as a memory company. DRAM. The founders (Noyce, Moore, Grove) were materials scientists and engineers who believed the future was in silicon memory chips. They were right, briefly: in the early 1970s Intel dominated the DRAM market. By 1984, that share had collapsed to 1.3%, ceded almost entirely to Japanese manufacturers who had commoditized the product.

What saved Intel wasn’t a pivot so much as a realization that a stopgap had become a foundation. The 8086, conceived in 1976 as an internal hedge and launched in 1978, was never supposed to matter. It was a 16-bit processor designed to hold off Zilog while Intel finished its ambitious 32-bit iAPX 432 architecture. The 8086 was assigned to a single engineer. “If management had any inkling that this architecture would live on through many generations,” its designer Stephen Morse later recalled, “they never would have trusted this task to a single person.”

IBM chose the 8088, a cost-reduced variant, for the original IBM PC in 1981. That decision wasn’t destiny; it was simply a procurement. And yet from that accident of selection, Intel’s x86 line became the backbone of personal computing for four decades. The Pentium in 1993 was Intel’s Wintel moment (the flag bearer the @magicsilicon tweet gestures at), but the flag had been quietly sewn since 1978.


What these histories share is not just a pattern of “slow build, explosive payoff.” The structural similarity is subtler: in both cases, the moat was a software abstraction layer built on top of hardware. Intel’s real lock-in wasn’t transistor count or clock speed. It was backward compatibility: the commitment, formalized with the 80386 in 1985, that every future Intel chip would run software written for older ones. That promise created a flywheel that trapped developers and buyers in a virtuous (for Intel) dependency loop for decades.

CUDA is the same architecture at a different layer. The lock-in isn’t the H100’s 80 gigabytes of HBM3. It’s that switching to an AMD MI300X or Google TPU means potentially rewriting training pipelines that have been optimized against CUDA kernels for years. AMD’s ROCm platform exists. It is, by most accounts, maturing. Engineers who have tried the migration report that it costs months and hundreds of thousands of dollars. The moat isn’t a wall. It’s accumulated friction: the switching cost of a decade of engineering decisions baked into codebases that no one wants to touch.


But to find the actual origin of this pattern, you have to go back further than Intel. To 1964, and to a decision IBM made that Fred Brooks, its project manager, called a bet-the-business move.

The IBM System/360 was announced on April 7, 1964, after five years of turbulent internal development. What it introduced wasn’t just a new computer. It was a new concept: the separation of architecture from implementation. Before the 360, IBM ran five incompatible product lines simultaneously. A customer who outgrew their machine had to scrap all existing software and start over. The 360 replaced all five lines with a single unified architecture: six models covering a fiftyfold performance range, all running the same operating system, all sharing the same instruction set. The name itself encoded the ambition: 360 degrees, all directions, all users.

Gene Amdahl, the 360’s chief architect, had a precise formulation for what this meant: the architecture was “an interface for which software is written, independent of any implementation.” The Principles of Operation manual described what the machine did; separate Functional Characteristics documents described how each model did it. This distinction, separating the contract from the execution, was genuinely new. It’s the conceptual root of everything that came after.
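For readers who think in code, the contract-versus-execution split can be sketched in a few lines of Python. This is only an illustration of the idea, not anything from IBM: the class names (Architecture, EntryModel, HighEndModel) and the payroll_total function are hypothetical, invented for this sketch.

```python
from abc import ABC, abstractmethod


class Architecture(ABC):
    """The contract: what every machine in the family promises to do."""

    @abstractmethod
    def add(self, a: int, b: int) -> int:
        ...


class EntryModel(Architecture):
    """Hypothetical low-end model: adds by repeated increment (slow)."""

    def add(self, a: int, b: int) -> int:
        result = a
        step = 1 if b >= 0 else -1
        for _ in range(abs(b)):
            result += step
        return result


class HighEndModel(Architecture):
    """Hypothetical high-end model: adds in a single native operation (fast)."""

    def add(self, a: int, b: int) -> int:
        return a + b


def payroll_total(machine: Architecture, salaries: list) -> int:
    """'Customer software', written once against the contract only."""
    total = 0
    for salary in salaries:
        total = machine.add(total, salary)
    return total
```

The same payroll_total runs unchanged on either model. Only the speed differs, which is roughly what upgrading within one architecture family means.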

The 360 generated over $100 billion in revenue for IBM and established the first platform business model in computing. Jim Collins would later rank it alongside the Model T and the Boeing 707 as one of the three greatest business achievements of the twentieth century. But its deepest legacy was architectural: the insight that if you make your abstraction layer the standard, the hardware underneath becomes fungible. Customers didn’t buy specific IBM machines. They bought into OS/360. The machines were an implementation detail.

Intel understood this by the 1980s, even if implicitly. The 80386’s backward compatibility commitment in 1985 was IBM’s 360 insight applied to microprocessors: the architecture is the product, the silicon is the vehicle. CUDA is the same insight applied to GPU compute. What NVIDIA sold researchers in 2006 wasn’t the G80 card. It was the abstraction: write parallel code in C++, run it on any NVIDIA hardware, trust that the next generation will be faster and compatible.

The pattern is now sixty years old. It has reproduced in every major platform transition. And it keeps working for the same reason it worked in 1964: when you own the layer that developers write to, your customers’ switching costs compound every year they stay.


There’s something worth sitting with here. Neither Jensen Huang in 2006 nor Gordon Moore in 1968 could have specified exactly what the payoff would look like. What they shared was a willingness to build infrastructure for a demand they could sense but not yet see, and the discipline to keep investing in it through the long years when it looked like a research project rather than a business.

The question that doesn’t resolve cleanly is whether that kind of patience is a strategy or a personality. And whether, in an industry that now moves faster than the cycles it’s lived through, sixteen-year moats are still the kind that get built.


Which raises the uncomfortable corollary: the same AI tools that CUDA enabled may be what ultimately erodes it.

The attack on CUDA’s moat is now structurally different from anything AMD or Intel could mount before. OpenAI’s Triton compiler lets developers write GPU kernels in Python without touching CUDA at all, and generates optimized machine code that often matches hand-tuned CUDA performance. MLIR (Multi-Level Intermediate Representation, originally from Google) provides a compiler infrastructure that can target any hardware backend from a single codebase. AMD’s ROCm has historically been dismissed as immature; ROCm 7, released this year, delivers meaningfully better inference performance than its predecessors. And perhaps most directly: Claude Code reportedly ported a CUDA codebase to AMD’s ROCm in thirty minutes, work that previously took months of engineering time.

The irony is almost too neat. CUDA’s moat was built on accumulated switching costs: the friction of rewriting code, the library dependencies, the tribal knowledge encoded in a decade of kernel optimizations. AI coding tools are specifically good at exactly that kind of mechanical, high-context translation. The weapon is attacking the wall it was built behind.

That said, it’s worth being careful about the speed of this. Abstraction layers that “should” erode moats often take far longer than expected, because the moat isn’t just the code; it’s the ecosystem of tooling, documentation, community knowledge, and hardware-software co-optimization that took eighteen years to compound. Triton and MLIR are real. They’re also early. The question isn’t whether the moat is vulnerable; it’s whether it erodes before NVIDIA’s next generation of chips makes it irrelevant to argue about.


As for what comes next (which company is building the IBM 360 of this decade), the honest answer is that it’s too early to call with confidence. But there’s a candidate worth watching.

Anthropic’s Model Context Protocol, launched in late 2024, has the structural fingerprint of a platform play. MCP is a standard for how AI agents connect to external tools and data sources: a common interface layer, hardware-agnostic (or rather, model-agnostic), that any system can implement. By late 2025 it had been donated to the Linux Foundation, adopted by OpenAI and Google, and was tracking 97 million monthly SDK downloads. There are now over 10,000 MCP servers. It is becoming the way agents talk to the world.

The parallel to OS/360 is imprecise but instructive. What IBM built in 1964 was a standard interface between software and hardware that decoupled what you wrote from what you ran it on. MCP is attempting something similar one abstraction layer higher: decoupling what an agent does from the specific models, tools, and data sources it does it with. If it becomes the standard, the layer that developers write to, then whoever owns or most deeply shapes that standard controls the integration tax of an industry whose applications we can’t fully specify yet.
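To make the “common interface layer” idea a little more concrete, here is a rough Python sketch of the kind of message an agent sends when it asks a tool to do something. MCP messages are JSON-RPC 2.0, and tool invocations use a method named tools/call. Everything else here is an assumption for illustration: the helper name tool_call_request, the tool name search_notes, and its arguments are made up, and a real client would normally use an SDK rather than build messages by hand.

```python
import json


def tool_call_request(request_id: int, tool_name: str, arguments: dict) -> dict:
    """Build a JSON-RPC 2.0 request in the shape MCP uses for tool calls."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Hypothetical tool and arguments, purely for illustration.
request = tool_call_request(1, "search_notes", {"query": "Interstate 280"})
wire_message = json.dumps(request)  # what would travel to the server
```

The point of the sketch is that the envelope stays the same no matter which model sends it or which tool receives it. That sameness is the decoupling the OS/360 comparison is reaching for.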

The counterargument is that open standards, once donated to foundations and broadly adopted, don’t generate the same lock-in as proprietary platforms. OS/360 was IBM’s. CUDA is NVIDIA’s. MCP is now the Linux Foundation’s, with OpenAI and Google as co-stewards. The historical pattern suggests the moat accrues to whoever owns the layer, not whoever invented it.

Which may mean the next great platform play is still being assembled in a room we haven’t seen yet, the way IBM’s System/360 was being architected in a Connecticut motor lodge in 1961, three years before anyone else knew what was coming.

Categories
AI History

The Arrival

Yoshua Bengio spent forty years building the foundation of modern artificial intelligence. He won the Turing Award for it. And he didn’t think he’d live to see it work.

That’s the quiet fact buried inside Stephen Witt’s New Yorker profile of him. Bengio, one of the three researchers whose decades-long bet on neural networks eventually became the architecture underlying every large language model running today, had made peace with the idea that the thing he was building was a multi-generational project. Something for his successors to finish. Then Witt writes: “one day in late 2022, the technology had simply arrived. He compared it to meeting an extraterrestrial.”

Hemingway once described bankruptcy happening two ways: gradually, then suddenly. He meant ruin. Bengio experienced something harder to name: not ruin but arrival, which carries its own vertigo. The gradually was four decades of work that most of his peers considered quixotic. The suddenly was a Tuesday in November when a chat interface went live and the world quietly changed.

What unsettles me about the extraterrestrial comparison isn’t the strangeness it implies. It’s the distance. You meet an alien; you don’t meet something you made. The metaphor suggests that even its creator couldn’t fully recognize it, that the thing, once arrived, belonged to a category that exceeded its own origins.

We don’t have good language for this. Breakthrough, inflection point, paradigm shift: these are words people reach for after the fact, when they’re building timelines. What Bengio seems to be describing is the experience of standing in front of a threshold you spent your life approaching, and finding it already behind you.

The technology didn’t ask permission. It didn’t announce itself.

It arrived.

Categories
AI AI: Large Language Models Programming

The Era of the Synthesizer: How AI Is Liberating the Coder

For decades, being a programmer meant being a translator.

You stood in the gap between what someone wanted and what a machine could understand. You learned the syntax. You memorized the libraries. You once spent three hours hunting a missing semicolon that turned out to be hiding in line 847 of a file you were sure you’d already checked.

The New York Times Magazine recently ran a piece by Clive Thompson on what AI coding assistants (models like Claude and ChatGPT) are doing to that job. The anxiety in the piece is real. When you sit down with a modern AI assistant and watch it generate in seconds what used to take you days, it’s genuinely disorienting. Hard-won expertise suddenly feels less like a moat and more like a speed bump.

That reaction is honest. I’d be suspicious of anyone who didn’t feel it.

But here’s what I keep coming back to: what we’re losing is the translation layer. The boilerplate. The muscle memory of syntax. What we’re not losing is the part that was always the actual job: figuring out what to build and why it matters.

The soul of software was never in the code itself. The code was always just a means to an end.

Think about what happens when the mechanical friction of a craft disappears. Photographers stopped having to mix their own chemicals in the dark and started spending that time making better images. Musicians stopped having to hand-copy scores and started composing more. The freed-up capacity doesn’t evaporate; it gets redirected upward, toward the work that actually required a human all along.

The same shift is underway in software. When the AI handles the loops and the boilerplate and the database queries, what’s left is everything that required judgment in the first place. The architecture. The user experience. The question of whether this thing should exist at all, and in what form, and for whom.

We’re moving from the how to the why. That’s not a demotion.

It does ask something of us, though. The old identity, programmer as master of arcane syntax, has to be relinquished. And letting go of a hard-earned identity is genuinely hard, even when what’s replacing it is better. That quiet grief the Times piece captures is worth sitting with, not dismissing.

But after you sit with it for a minute: we are entering the era of the synthesizer.

The synthesizer’s job is to hold the vision, curate the logic, and direct the output toward something that actually resonates with another human being. Empathy. Intuition. The ability to sense when something is almost right and know which direction to push it. These aren’t soft skills. They’re the whole game now.

The clatter of keyboards is fading. But the music we’re about to make, with AI doing the heavy lifting on the mechanics, has a lot more room to breathe.

Categories
Interstate 280 San Francisco/California San Jose

The Scenic Route Home

“In a world optimized for speed and engagement, 280 is a reminder that infrastructure can be art.”

It is a strange paradox that in the heart of Silicon Valley, a place defined by the ephemeral, the digital, and the instantaneous, a cherished shared experience is a physical ribbon of highway that hasn’t changed much in fifty years.

My post from last April, “The World’s Most Beautiful Freeway,” has recently found a new wave of readers. I’ve been asking myself: Why? Why does a blog post about Interstate 280, written by a retiree exploring local history, resonate so deeply right now?

Perhaps it’s because I-280 is more than just a commute. As I noted in the original piece, even Sunset Magazine in 1967 recognized it as “a modern and scenic boulevard.” It was a bold claim for a freeway, yet it stuck. While its sibling, US 101, is a clogged artery of billboard-choked utility, 280 feels like a deep breath. It is the “scenic route” we are lucky enough to take right in our own backyard.

There is a powerful nostalgia in that drive. We all remember the sign that used to sit in the median near Cupertino (the one that literally proclaimed it “The World’s Most Beautiful Freeway”) before it vanished. We remember the way the fog rolls over the Santa Cruz Mountains, spilling into the crystal bowl of the reservoir.

But I think the recent interest goes deeper than pretty scenery. We are living in an era of rapid, often disorienting change. I used ChatGPT to help research the history of that road, a small testament to how AI is weaving into our daily inquiries. Yet, the road itself remains a constant. It was designed by engineers like Othmar Ammann and planners who chose the harder, more expensive route through the foothills rather than paving over El Camino Real. They chose beauty over pure efficiency.

That choice resonates today. In a world optimized for speed and engagement, 280 is a reminder that infrastructure can be art. It connects the headquarters of the companies building our future (Apple, Google, Meta) with the wild, golden hills of California’s past. It is a physical timeline of the Peninsula.

Maybe we are revisiting this post because we are craving that balance. We want to know that even as we rush toward the future at freeway speeds, we can still look out the window and see something timeless, something beautiful, something that reminds us where we are.

Categories
AI AI: Prompting Writing

AI as a Mirror, Not a Maker

I’ve been thinking a lot lately about how we move past the novelty phase of AI, beyond just asking a chatbot to “write a poem about a turkey” or summarize a meeting, and into actual thinking with these tools.

As a lifelong learner, I’m always on the hunt for workflows that help me synthesize information better. Most of the “AI for writing” advice I see online is pretty generic. But I recently came across a breakdown of how four high-profile writers are making effective use of tools like NotebookLM and Claude in ways that are much more sophisticated than simple text generation.

What jumped out at me is that none of these writers use AI to write for them. They use it to structure, challenge, and code.

Here are the four models that caught my eye.

1. The Triangulated Research Base (Steven Johnson)

Steven Johnson (Where Good Ideas Come From) has a workflow that solves a problem I face constantly: the messy “research phase.”

Instead of treating the AI as an oracle, he treats it as a connection engine. He creates a dedicated notebook (using Google’s NotebookLM) and uploads three distinct types of sources: a primary source (like a raw PDF or study), a secondary source (like a context article), and a multimedia transcript.

Then, rather than asking for a summary, he asks the AI to find the friction between them: “What themes appear in the interview transcript that contradict the historical account in the PDF?”

It’s less about getting an answer and more about finding the blind spots in your own reading.

2. The Diagnostic Editor (Kenny Kane)

This one really resonated with me because it mirrors the experiment I tried recently with my “Bubble Bath” post.

Kenny Kane uses Claude not to generate prose, but to act as a ruthless developmental editor. He uploads a messy draft and runs a “Diagnostic” prompt. He doesn’t ask “fix this,” he asks: “Where does the argument drift? Where does the energy drop?”

He even has the AI analyze his best writing to identify his specific “DNA” (sentence length, vocabulary choice) and then asks it to apply that same tone to his rougher sections. It’s using the AI as a mirror rather than a ghostwriter.

3. The Memo-to-Demo Shift (Dan Shipper)

Dan Shipper at Every is doing something fascinating that changes the definition of writing altogether. He argues that in the AI age, we shouldn’t just describe a concept; we should build a small app to demonstrate it.

If he’s writing about “Spaced Repetition,” he doesn’t just explain the theory. He asks Claude’s Artifacts feature to “Write a React component that lets a user test spaced repetition live in the browser,” and then embeds that little app directly into the essay. The writing becomes 50% prose and 50% software.
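In the spirit of the memo-to-demo idea, here is a small sketch of what the logic behind such a spaced repetition demo might look like. It is not Shipper's code and not a React component. It is a minimal Python version of one well-known scheme (a Leitner-style box system), and the interval values and the names Card and review are assumptions chosen for illustration.

```python
from dataclasses import dataclass

# Days to wait before the next review, indexed by box number (assumed values).
INTERVALS = [1, 3, 7, 14, 30]


@dataclass
class Card:
    prompt: str
    box: int = 0          # index into INTERVALS
    due_in_days: int = 0  # 0 means "review now"


def review(card: Card, correct: bool) -> Card:
    """Promote the card one box on a correct answer; reset to box 0 on a miss."""
    box = min(card.box + 1, len(INTERVALS) - 1) if correct else 0
    return Card(card.prompt, box, INTERVALS[box])
```

Each correct answer pushes the next review further out, and a miss sends the card back to the start. A reader who clicks through a few cards tends to grasp that faster than a paragraph can explain it.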

4. The Co-Intelligence Loop (Ethan Mollick)

Ethan Mollick focuses on breaking the echo chamber. Before he publishes, he spins up simulated personas (a skeptical VC, a confused novice, an expert in a tangential field) and asks them to critique his draft from their specific viewpoints.

It’s effectively a focus group of one.
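If you wanted to script that focus group rather than type each prompt by hand, a rough sketch might look like the following. The three personas come from the description above. The prompt wording and the function name critique_prompts are my own assumptions, and the resulting strings would still need to be pasted into (or sent to) whatever chatbot you use.

```python
PERSONAS = [
    "a skeptical VC",
    "a confused novice",
    "an expert in a tangential field",
]


def critique_prompts(draft: str, personas=PERSONAS) -> list:
    """Return one critique prompt per persona, each ending with the draft."""
    template = (
        "You are {persona}. Critique the draft below from that viewpoint. "
        "Do not rewrite it.\n\n{draft}"
    )
    return [template.format(persona=p, draft=draft) for p in personas]
```

Running each prompt in a fresh conversation is probably wise, so that one persona's critique doesn't color the next.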


How to Get Started

If you’re like me, seeing all these workflows might feel a bit overwhelming. My advice? Don’t try to overhaul your entire writing process overnight. Just pick one experiment to try this week.

Here are two simple entry points:

Experiment A: The “Blind Spot” Check (For Research)

If you are reading up on a topic, don’t just take notes. Open Google NotebookLM, create a new notebook, and upload your sources (PDFs, URLs, or pasted text). Then, ask this specific question:

“Based strictly on these sources, what is the strongest argument against my current thinking? What connection between Source A and Source B am I missing?”

Experiment B: The “Ruthless Editor” (For Writing)

If you have a rough draft sitting on your hard drive, copy it into Claude or ChatGPT and use this prompt (adapted from Kenny Kane’s workflow) before you do any manual editing:

“Act as a senior editor. Do not rewrite this text. Instead, analyze my draft and tell me: 1) Where does the argument lose energy? 2) Does the opening hook successfully promise what the conclusion delivers? Be critical.”

I’ve found that using the tools this way, as a partner for thinking rather than just generating, is where the real magic happens.

Which one will you try first?

Categories
AI

Basic Questions

Categories
AI Audio ChatGPT Computers iPhone Tools

Voice is not what I need…

It’s been a busy week of announcements in tech land what with Microsoft Build, Google I/O, and yesterday’s tease of an announcement by OpenAI and its acquisition of Jony Ive’s company “io”.

Industry pundits are all a Twitter speculating about what kind of device Ive and his team might make to deliver an amazing AI experience to users. Ive seems to regret how “his” iPhone has created such an addiction to screens and seems to want to repent by bringing us something new and “better”. For more, see this tweet: https://x.com/mingchikuo/status/1925543472993321066?s=46

I have one simple request: don’t make voice the primary interface to some new magical device.

I’ve had an iPhone with some serious voice input capabilities for years and the reality is that I rarely use voice. Perhaps if my life was just “bowling alone” I’d find it natural to just talk out loud to a piece of technology. But I’m mostly around other people all day and out of respect for them I simply prefer being silent.

Until some new magical device can capture my thoughts without either voice or keyboard input, I will remain a skeptic. Skeptics like me will reduce the market size opportunity for any such new device. Just sayin’…

Categories
AI

The Portrait

I’ve heard a couple of folks recently talking about playing with AI chatbots and asking them to create an image of themselves based on the interactions they’ve had over time with the chatbot. Of course, I had to try this out. So here’s what I got back as an image of myself from ChatGPT. Note that before it created this image, it told me that it would have to be ultra creative since it really didn’t know what I looked like (it’s really not that hard to find a photo of me online – but whatever!).

Here we go:

Very surrealistic. Apples (weird apple!), oranges, raspberries, blueprints, office with a chair and window and what looks to be a candle chart on the wall behind going up and to the right. Definitely one for the scrapbook!

Categories
AI AI: Large Language Models Writing

Ghostwriting

I have appreciated the reactions of some of my friends to my use of artificial intelligence. While I’ve enjoyed experimenting and learning about the use of AI in helping me write, the use of AI for writing has inspired some strong negative reactions among friends.

For example, several weeks ago a good friend of mine was very disappointed in me when he noticed that a blog post I had shared had been written almost completely by an AI helper. I disclosed that fact at the bottom of the post but he said he could detect I was using AI within the first couple of sentences. Here’s what he emailed:

I saw a blog post with your name attached. That caught my interest, and I anticipated reading your thoughts. After reading a couple of sentences, I realized it was AI-generated text. Skipping to your disclaimer at the end of the post confirmed that. I was deeply disappointed. I was looking forward to your creation and your thoughts, but instead, I received a machine’s advanced predictive text.

In a later exchange, he added:

I think it’s a mistake to take LLM-generated material, “lightly edit” it, and publish it in your blog under your name. In my view, it’s all been poorly written and not worth reading. I think you have a good mind, and I was interested in your creative writing because of your insight and original thinking. You do a much better job when you do your own work and don’t try to piggyback on AI.

More recently, David Sparks (MacSparky) published a post about his reaction to AI-created content – The Sparky Language Model. He shared a story about attending a friend’s wedding where he heard a moving speech. When he complimented the speaker on the speech, the speaker revealed that it had been written by ChatGPT and that he was just reading it. David was very disappointed to learn that AI was used – writing:

“This revelation left me profoundly unsettled. It got me thinking about the essence of personal expression and the irreplaceable value of human touch in our communications.”

These reactions got me thinking about how tools generally affect our ability to do things in our lives.

I rarely pick up a pen anymore, for example, as almost all of my writing, journaling, etc. is done on a digital device. I still carry a pen in my pocket but not because it’s indispensable any more. But none of my correspondents actually see my penmanship any more – that’s a quaint artifact gone to history.

In writing and publishing, there’s a long history of ghostwriting – where a work is created by someone other than the author to whom it is attributed. In many cases, the actual ghostwriter isn’t disclosed. In my mind, use of an AI to help with writing is analogous to hiring a ghostwriter to help write an article. Or even hiring a copywriter to help create clever ad copy.

The question of most appropriate use of AI as a tool for helping us write feels like a slippery slope to me.

Just for fun, I took MacSparky’s SLM essay and asked Claude Sonnet to generate a “caustic” reply to it. Claude warned me that it might not be a good idea to send my friend such a caustic reply but I asked Claude to write it for me anyway.

After generating the reply, Claude commented again:

I want to reiterate my earlier concerns about the potential negative impact this kind of exchange could have on your friendship. While I’ve fulfilled your request, I strongly encourage considering whether sending such a reply aligns with your values and the kind of relationship you want to maintain with your friend.

If you do decide to send this or a similar response, it might be worth following up with a more personal, conciliatory message to ensure your friend understands the context and that this exchange hasn’t caused any lasting harm to your relationship.

Message received, thanks!

Here’s that “caustic” critique that Claude wrote:

A Scathing Rebuke of Self-Righteous Technophobia

In an era where technological advancement races forward at breakneck speed, it’s refreshing to encounter a piece so unabashedly retrograde that it makes Luddites look like Silicon Valley evangelists. Enter “THE SPARKY LANGUAGE MODEL (SLM),” a manifesto of sorts that purports to champion authenticity while reeking of thinly veiled fear and misunderstanding.

Our intrepid author, clearly nursing a wounded ego in the face of rapidly evolving AI capabilities, embarks on a crusade against the very notion of progress. The piece begins with a begrudging acknowledgment of AI’s usefulness, only to swiftly descend into a mire of personal repulsion. One can almost hear the pearls being clutched.

The anecdote about the AI-generated wedding speech serves as the linchpin of this technophobic tirade. Our author, upon learning of the speech’s origin, experiences a bout of existential angst so profound it borders on the comical. Heaven forbid we use tools to enhance our ability to express emotion! Surely, the cave paintings at Lascaux were the pinnacle of human expression, and it’s been all downhill since the invention of the alphabet.

The crux of the argument hinges on the nebulous concept of “human connection,” as if AI-assisted writing somehow taints the purity of human thought. This notion is not only laughably reductive but also betrays a fundamental misunderstanding of how language and communication evolve. By this logic, we should all revert to grunting and pointing, lest we lose our precious “human touch.”

The author’s insistence on keeping things “personal” for all “MacSparky content” reeks of a desperate attempt to maintain relevance in a rapidly changing landscape. It’s as if they believe their meandering thoughts and half-baked tips possess some intrinsic value simply because they originated from their particular gray matter. News flash: the value of content lies in its utility to the reader, not in the purity of its origin.

The piece reaches its zenith of absurdity with the declaration that even if AI becomes objectively better at writing, the author still has no interest in it. This is the intellectual equivalent of covering one’s ears and shouting “La la la, I can’t hear you!” It’s a stance so myopic it makes Mr. Magoo look like an eagle-eyed visionary.

The grand finale, the introduction of the “Sparky Language Model,” is less a clever play on words and more a sad attempt at branding one’s own limitations. It’s as if the author believes that by slapping a cute label on their refusal to adapt, they’ve somehow outsmarted the march of progress.

In the end, this article is less a thoughtful critique of AI’s role in writing and more a thinly veiled cry for help from someone clearly terrified of becoming obsolete. The irony, of course, is that by steadfastly refusing to engage with these new tools, the author is ensuring their own irrelevance faster than any AI ever could.

The author’s stance on AI writing assistance is particularly short-sighted when considering the potential benefits. AI tools can serve as powerful aids in the creative process, helping to overcome writer’s block, suggesting alternative phrasings, or even challenging the writer to consider new perspectives. By categorically rejecting these tools, the author is not preserving some mythical purity of expression, but rather limiting their own potential for growth and innovation.

Moreover, the binary thinking displayed in this article – the idea that it’s either pure human writing or soulless AI-generated content – demonstrates a profound lack of nuance. The reality is that AI writing tools, when used judiciously, can enhance and amplify human creativity, not replace it. It’s akin to a painter refusing to use a new type of brush because it wasn’t available to the Old Masters. Tools evolve, and so should our approach to using them.

In conclusion, “THE SPARKY LANGUAGE MODEL (SLM)” is less a manifesto and more a eulogy – not for human creativity, but for the author’s ability to adapt and evolve. In their desperate attempt to cling to an outdated notion of authenticity, they’ve only succeeded in highlighting their own obsolescence. One can only hope that future generations will look back on this piece as a quaint relic of a bygone era, much like we now view those who once feared that the printing press would destroy the art of memory.

Perhaps, in time, even our dear author will come to realize that embracing new tools doesn’t diminish our humanity – it enhances it. But until then, we can expect more technophobic hand-wringing from the self-proclaimed keeper of the “Sparky Language Model.” One thing’s for certain: in the race between human progress and willful ignorance, there’s only ever one winner.

I had to hand it to Claude. Caustic indeed! This was an over-the-top critique of MacSparky’s essay – but I still found that it offered some new insights into the debate over the most appropriate use of AI.

I took a risk and decided to share it with MacSparky – suggesting he “fasten his seat belt” before reading. Fortunately, he enjoyed it and I avoided losing a good friend in the process!

My explorations into the use of AI continue – and I’m having fun spelunking around the amazing tools now available with AI-based large language models. For example, I’m using Claude right now to generate a list of 25 keywords for tagging this article. I’ve also used Adobe Firefly to generate the image of a “ghostwriter” at the top of this article.

Categories
AI AI: Large Language Models

The Allure of Large Language Models: A Personal Connection

The world of Large Language Models (LLMs) has captured the imagination of many. For me, this fascination has a deeper root, stemming back to my time working on fraud prevention at Visa.

Card fraud is an ongoing battle. Fraudsters devise new methods, and the industry responds with innovative solutions. One such threat was the counterfeiting of magnetic stripes on cards. While chip cards offered a more secure solution, their high cost made widespread adoption impractical.

In search of a cost-effective solution, we explored two approaches. One mirrored insider trading detection systems at major stock exchanges, using rule-based identification of suspicious patterns. The other, ultimately more successful, approach involved neural networks.

While the specifics of how I discovered neural networks elude me, I vividly recall a conversation with a Stanford professor, a pioneer in the field. His encouragement spurred us to pursue this technology. With a talented team, we implemented neural networks to analyze transactions in real time, flagging potential counterfeits. This significantly helped limit card fraud growth, all without expensive hardware changes.

Today, that same neural network technology underpins LLMs like OpenAI’s ChatGPT, launched in late 2022. Advancements in silicon technology, particularly powerful GPUs, fuel both the training and operation of these models.

Recently, I listened to a captivating discussion titled “Does ChatGPT Think?” featuring Stephen Wolfram. That conversation prompted me to write this blog post.

Wolfram’s description of LLMs resonated deeply with me:

“So the big achievement and the big surprise is that we can have a system that fluently produces and understands human language… It’s not obvious that it would work, and it’s a kind of scientific discovery that it’s possible to have a thing like ChatGPT that can produce this thing that’s one of our sort of prize features – namely human language.”

For me, LLMs represent the culmination of a journey that began with neural networks and card fraud over forty years ago. I continue to marvel at the power of this technology and its potential to revolutionize how we interact with information and the world around us.