
Large Language Models and Transformers

This morning I noticed an article titled “Understanding AI” in the Financial Times about an upcoming two-week series on artificial intelligence. That prompted me to ask ChatGPT to write a 1,000-word blog post describing large language models (LLMs) and transformers. Below is what I got back from ChatGPT: a pretty useful description of both! By the way, today’s Wall St. Journal has an article titled “Sports Illustrated Publisher Uses AI to Write Articles” that discusses how some publications are beginning to use AI chatbots to generate stories for publication.

Also earlier this week I watched a YouTube video from Stanford with Professor Christopher Potts titled GPT-3 and Beyond – a bit technical for me but he provided a fascinating introduction to the progress in AI.

Large Language Models and Transformers

Large language models and transformers are artificial intelligence technologies that have been revolutionizing the field of natural language processing. These models are capable of performing complex language tasks such as text classification, question answering, and machine translation. In this post, we will provide a comprehensive overview of large language models and transformers, their history, how they work, and their current and future applications.

A brief history of language models

The study of language models has a long history dating back to the 1950s when the first attempts to build machine translation systems were made. At that time, the focus was on building rule-based systems that relied on pre-defined grammar and syntax rules to translate between languages. However, these systems were limited in their ability to handle ambiguity and variability in natural language.

In the late 1980s, the field of machine learning emerged, and language models started to use statistical techniques to improve their performance. Over time, researchers have developed more sophisticated models that can learn from large amounts of data and achieve better performance on various language tasks.

Introduction to large language models

Large language models are deep neural networks that are trained on massive amounts of text data to perform a wide range of natural language processing tasks. The key innovation behind these models is the ability to process and understand the context of words and phrases in a sentence. This allows them to generate more accurate predictions and perform more complex language tasks than traditional rule-based systems.

Large language models are trained on billions of words from a vast variety of text sources, including books, news articles, websites, and social media. This enables them to learn the patterns and relationships between words and sentences, as well as the meaning of words and phrases in different contexts.

Introduction to transformers

Transformers are a type of large language model that have been specifically designed for natural language processing. They were introduced in 2017 by Vaswani et al. in a paper called “Attention is All You Need.” The key innovation behind transformers is the attention mechanism, which allows the model to focus on specific parts of the input sequence when making predictions.

Traditional language models process sequential data by breaking it down into smaller pieces and processing each piece in a linear fashion. This makes it difficult for the model to consider the relationships between words that are far apart in the input sequence. Transformers, on the other hand, use attention mechanisms to allow the model to weigh the importance of different parts of the input sequence when making predictions.

How transformers work

Transformers work by first encoding the input sequence into a set of continuous vectors. These vectors are then processed by a number of self-attention layers, which use the attention mechanism to focus on specific parts of the input sequence when making predictions.

The attention mechanism in transformers works by calculating the relationships between all pairs of words in the input sequence. The model then uses these relationships to weigh the importance of different parts of the input sequence when making predictions. This allows the model to consider the context of words and phrases in the input sequence, leading to more accurate predictions.
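To make the "all pairs of words" idea a little more concrete, here is a minimal sketch of scaled dot-product self-attention in plain Python. It is an illustration only and not part of ChatGPT's reply. The function names and the tiny two-dimensional word vectors are hypothetical, and as a simplification the same vectors are used as queries, keys, and values, whereas a real transformer would first pass them through learned projections.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def self_attention(queries, keys, values):
    """Scaled dot-product attention over a short sequence.

    Each argument is a list of equal-length vectors, one per word.
    Returns the blended output vectors and the attention weights.
    """
    scale = math.sqrt(len(keys[0]))
    outputs, all_weights = [], []
    for q in queries:
        # Score this word against every word in the sequence (all pairs).
        scores = [dot(q, k) / scale for k in keys]
        weights = softmax(scores)
        # Blend the value vectors according to those weights.
        mixed = [sum(w * v[i] for w, v in zip(weights, values))
                 for i in range(len(values[0]))]
        outputs.append(mixed)
        all_weights.append(weights)
    return outputs, all_weights

# Hypothetical 2-dimensional vectors standing in for a three-word input.
vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = self_attention(vectors, vectors, vectors)
```

Each row of `weights` says how much one word "looks at" every other word. In this toy input the first word attends more to the first and third words, which point in a similar direction, than to the second.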

After the self-attention layers, the encoded vectors are processed by a feed-forward neural network to make the final predictions. The feed-forward network includes a number of dense layers, which are fully connected layers that allow the model to make more complex predictions.
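The feed-forward step described above can be sketched in a few lines of plain Python. This is an illustration under stated assumptions and not part of ChatGPT's reply: the weights and biases are made-up numbers rather than trained values, the layer sizes are tiny, and ReLU is assumed as the activation between the two dense layers.

```python
def dense(vector, weights, biases):
    """One fully connected layer: every output depends on every input."""
    return [sum(w * x for w, x in zip(row, vector)) + b
            for row, b in zip(weights, biases)]

def relu(vector):
    """Keep positive values and replace negative ones with zero."""
    return [max(0.0, x) for x in vector]

def feed_forward(vector, w1, b1, w2, b2):
    """Two dense layers with a ReLU in between."""
    hidden = relu(dense(vector, w1, b1))
    return dense(hidden, w2, b2)

# Hypothetical weights: the first layer maps 2 inputs to 3 hidden units,
# the second maps those 3 hidden units back to 2 outputs.
w1 = [[1.0, -1.0], [0.5, 0.5], [-1.0, 1.0]]
b1 = [0.0, 0.0, 0.0]
w2 = [[1.0, 1.0, 1.0], [0.0, 2.0, 0.0]]
b2 = [0.1, -0.1]

result = feed_forward([2.0, 1.0], w1, b1, w2, b2)
```

In a trained model these numbers would be learned from data, and the same small network would be applied to the vector for each word position.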

Applications of large language models and transformers

Large language models and transformers have a wide range of applications in the field of natural language processing. Some of the most notable applications include:

  • Text classification: Large language models can be used to classify text into different categories, such as sentiment analysis, topic classification, and spam detection.
  • Question answering: Large language models can be used to answer questions by extracting relevant information from a text corpus and providing a concise answer.
  • Machine translation: Transformers have been shown to be highly effective for machine translation, as they are able to handle the complexity of translating between languages and capture the relationships between words and phrases in different languages.
  • Summarization: Large language models can be used to summarize long documents into shorter versions that retain the main information.
  • Dialogue systems: Large language models can be used to build dialogue systems that can understand and respond to human language in a natural way.
  • Generative models: Large language models can also be used for generative tasks, such as writing creative fiction or composing music.

Current and future developments

In recent years, there has been rapid progress in the development of large language models and transformers. This is due in part to the availability of massive amounts of text data and the advancement of computational resources, which have allowed researchers to train models that are larger and more complex than ever before.

One of the current challenges in the field is reducing the computational cost of training and using large language models. This is important because large language models require vast amounts of computational resources, and their deployment is limited by the availability of these resources.

There are also ongoing efforts to improve the ethical and social implications of large language models and transformers. For example, there is concern about the potential for these models to reinforce biases and perpetuate harmful stereotypes. Researchers are working to develop methods for mitigating these risks and ensuring that large language models are used in a responsible and ethical manner.


Large language models and transformers have been making a significant impact on the field of natural language processing. Their ability to process and understand the context of words and phrases in a sentence has led to advances in a wide range of applications, including text classification, question answering, and machine translation. As these models continue to evolve, we can expect to see even more exciting developments in the future.


Magic in the Air

I’m not sure what provoked my interest earlier this week, but I enjoyed watching the documentary from a few years ago on the history of General Magic. Living in Silicon Valley, I remember hearing stories about the company – and the talented people working at the company led by CEO Marc Porat.

Watching the documentary provided some great background into how the company innovated in so many ways yet wasn’t able to be a commercial success. Lessons learned included how great ideas fail when they’re too early for the technology to support them, the implications of trying to partner with other large companies, and the dangers of competition that “fast follows” on your ideas.

Last night, I stumbled across another fun video produced by the Computer History Museum about the history of MacPaint, QuickDraw, and more as told by Bill Atkinson and Andy Hertzfeld. Both were early Apple employees and very important software developers on the Mac platform who left Apple to become founders of General Magic. Listening to them talk about the early days of development at Apple was very interesting.


Ode to a Queen

A few days ago, Boeing rolled out the last production model of its 747 line of wide body aircraft. Wikipedia noted:

The first flight took place on February 9, 1969, and the 747 was certified in December of that year. It entered service with Pan Am on January 22, 1970. … The final 747 was delivered in January 2023 after a 54-year production run, with 1,574 aircraft built.

I have many fond memories of this great airplane – from seeing it for the very first time flying in to land at San Francisco International Airport (which must have been in 1970) to my first flight which – to the best of my recollection – was on TWA flying from San Francisco to New York. My last 747 flight was on a Lufthansa 747-8 flying out of Bangalore to Frankfurt in 2016.

I had a few flights in a seat on the 747’s upper deck – a special treat! The upper deck was also where the cockpit was located – it always seemed to be so high up off the runway that landing a 747 seemed to require some special skill and depth perception! On the other hand, the airplane’s landing gear really smoothed out landings, as it had a swing mechanism where the rear wheels on the main landing gear touched down first and then pivoted to smoothly bring down the front wheels. The early models of the 747 had a circular staircase to the upper deck, where there was a lounge instead of seating. This upper deck cockpit design also facilitated adding a nose door that pivots upward in the freighter version of the 747. In fact, the last 747 delivered was a freighter to Atlas Air.

Speaking of Atlas Air and that last 747, after delivery it flew from Seattle to Cincinnati as it was put into service. The pilots on the first flight of that last 747 had some fun on their flight – tracing a lovely tribute to the “Queen of the Skies” on their radar track.

NASA used a special version of the 747 as the transporter for the space shuttle. In 2012, NASA flew a final flight of its 747 carrying the shuttle Endeavour to its new home in a museum in Los Angeles. Along the way, the NASA 747 toured the San Francisco Bay Area and I was fortunate to be able to make a few images of that flight.

While the last 747 has been delivered by Boeing, it will continue to be flown for many years ahead. Most airlines have removed 747s from their fleets but a few (including Lufthansa) continue flying them. I’d enjoy taking another trip on one of these beautiful aircraft. In the meantime, I’ve got lots of good memories about trips and sightings of the 747.

Update: February 03, 2023 – a few additional thoughts on the 747.

  • For many years, two 747s have provided air transport for US presidents – and the current 747s are soon (?) to be replaced by two new models that have been undergoing customization for some time.
  • Wired added an article about the negatives of the 747 – saying it should have been retired “many years ago”. The article also adds more interesting details about the history of the 747. Fuel economy is the biggest detractor: “A Boeing 747-400, which was manufactured between 1989 and 2009, costs around $26,635 an hour to run. A Boeing 787-8, which is still produced today, costs $14,465 an hour to operate—45 percent cheaper.”

Update: February 04, 2023 – The New Yorker also has a new article about the 747 titled “The World the 747 Didn’t Predict”.

Because the 747 could now seat more travellers on a single flight, airlines were able to sell more tickets at lower prices. Suddenly, travel, particularly intercontinental travel, was accessible to people who had rarely, if ever, been in the air. The 747, in a sense, taught the world to fly.


What’s love got to do with it?

Like many of you I’m sure, for many, many years my go-to purchasing behavior has started with doing a search on Amazon to look for whatever it is I might be needing to buy. Rather than wasting time trying to shop locally or doing online searches for items, it was just easier and simpler to do that quick Amazon search for an item.

With Prime, Amazon had biased my behavior in favor of doing a quick purchase from them rather than looking elsewhere. I had confidence that I was getting a fair price and that the item would be in my hands within a day or two. Maybe it’s too strong a word for something as mundane as shopping, but I did love the experience of buying from Amazon – because it was so simple and easy.

But that’s changed – slowly over time – and I’m not the only one who’s noticed it. A couple of recent stories (among others) highlight this:

  • New York Magazine: The junkification of Amazon – “Late last year, The Wall Street Journal reported that Amazon’s customer satisfaction had fallen sharply in a range of recent surveys, which cited COVID-related delivery interruptions but also poor search results and “low-quality” items. More products are junk. The interface itself is full of junk. The various systems on which customers depend (reviews, search results, recommendations) feel like junk.”
  • Barry Ritholtz: How Amazon became ordinary – “Generally speaking, I have been a satisfied Amazon consumer – at least up until the pandemic. That was where cracks in the Amazon armor began to show.”

This unfortunate evolution in how Amazon feels to me, a long-time satisfied customer, makes me sad and disappointed. Clearly, I’m not alone. I wish the senior management team at Amazon would sit up and pay better attention to the impacts their “optimization efforts” have been having on the satisfaction of long-time customers. They must understand that an erosion of satisfaction leads directly to switching behavior.

More recently, Amazon announced the elimination of another benefit (Amazon Smile) which, in the overall scheme of things, wasn’t a big deal – but which did make me feel better about my spending at Amazon. Amazon Smile benefited a local non-profit by donating 0.5% of my spending on Amazon at no additional cost to me. This really feels like a “penny wise, pound foolish” move – one that doesn’t lift my opinion of the company. Another brand-diminishing move.

Meanwhile, this morning an Amazon blog post trumpeted how it’s ranked as a widely admired company in a recent Fortune survey: “Amazon ranks No. 2 for the seventh year in a row.”

There is so much to admire about Amazon and everything it provides. Yet it feels to me like it’s slipping and leaving me disappointed. I’d like to get back to feeling delighted with the Amazon shopping experience instead.

Let’s hope the company’s management realizes I’m not alone.


The Ghost in the Box

I happened to hear a discussion on a YouTube video yesterday about Niklas Luhmann’s mention of how his “Zettelkasten” seemed to act like a “ghost in the box”. I did a quick Google search for that phrase but didn’t find anything particularly helpful.

Earlier I had watched Nick Milo’s video about how he’s been using OpenAI’s ChatGPT for research and other things. He talks about a near future where he expects to have it nearby whenever he’s working on his computer.

So I decided to ask ChatGPT to help me understand more about the “ghost in the box” by writing a 1,000-word article about it. Here’s what I got back from ChatGPT:

The Ghost in the Box

The concept of “ghost in the machine” was popularized by philosopher Gilbert Ryle in his 1949 book “The Concept of Mind.” The phrase refers to the idea that human consciousness and mental states cannot be reduced to mere physical processes in the brain.

German sociologist and systems theorist Niklas Luhmann took this concept a step further in his theory of social systems. He proposed that society itself can be seen as a “ghost in the box,” a complex system that is made up of smaller, interconnected parts that cannot be fully understood or controlled by any single individual or group.

According to Luhmann, society is made up of various “function systems,” such as the legal system, the economic system, and the political system. These systems operate independently of one another, but are also interconnected and dependent on each other. Each system has its own “code,” or set of rules and norms that govern its operation.

Luhmann argues that society as a whole can be seen as a “communication system,” which is constantly producing and processing information. This information is then used by the various function systems to make decisions and take action. However, the complexity and sheer volume of this information make it impossible for any one individual or group to fully understand or control the system.

This idea of society as a “ghost in the box” has important implications for how we understand and interact with the world around us. For example, it suggests that we can never fully understand the causes of social phenomena or predict their outcomes. It also implies that we should be cautious about attempts to manipulate or control the system, as these efforts are likely to have unintended consequences.

Furthermore, Luhmann’s theory also highlights the importance of communication and understanding the code of the different function system in order to function in the society. He also emphasizes on the importance of social differentiation, as it allows different function system to operate independently while still being interconnected.

In conclusion, Luhmann’s concept of society as a “ghost in the box” offers a unique perspective on the complexity and unpredictability of social systems. It reminds us that we can never fully understand or control the world around us, and that our actions and decisions are likely to have unintended consequences. Thus, it is important to be aware of the limitations of our understanding, and to approach the world with humility and caution.

Note: This was generated using ChatGPT Jan 9 Version.


Back to a Better Future

Half Moon Bay – 2006

Last year, in my first blog post titled “Life is a Contact Sheet”, I said:

Happy New Year! Let’s work towards better outcomes in 2022 than we had in 2021! Like most I’m looking forward to leaving 2021 behind and excited about what the future could bring! Now onward to my first post of this new year!

While I was excited for what 2022 might bring, I now look back on this past year with much sadness – the hopes that I and many others held just weren’t realized, and way too much sadness and gloom came into our world instead.

Late last month, our daughter Tracy passed away – the victim of a very aggressive cancer that was identified only weeks before and which failed to yield to treatment. Tracy was the true adventurer in our family – and her spirit lives on to inspire us towards a better future. We have really appreciated all of the kind words and shared memories from so many who knew Tracy.

New Year’s Eve each year is our Dad’s birthday – bringing back so many memories of our times with him. 2022 was his 101st birthday. He also loved Tracy very much.

Of course there were some bright spots in 2022 too – although it’s difficult at times to remember those and to keep them in perspective. The challenge – and ultimately our reward, of course – is to remember that life is full of these ups and downs.

So here’s hoping for a better future in 2023 – we’ll hug our loved ones tighter, try to do a better job being good friends, and savor those good moments and memories when they come along. Onward!

In November 2010, Tracy and I took a cooking class at Village Pub in Woodside – learning how to cook a Thanksgiving feast. We had such a good time that day – a great memory.


Thinking about the future…

Virtually everyone thinks in first person when they imagine their recent past, present, or near future. Likewise, almost everyone switches to third person when they think about their far past or far future, usually defined in the scientific literature as ten years in either direction from today. This shift in mental perspective is why you can often look back at emotionally charged moments in your life, after enough time has passed, and see things from a more detached, clearer point of view. Your brain is literally processing them from a more insightful vantage point. Likewise, this is why taking a mental time trip ten years to the future can help you feel “unstuck” emotionally. You momentarily get a break from your normal mode of thinking and feeling and get to float above it all, like a satellite looking down from space.

Jane McGonigal – Imaginable


Haiku with ChatGPT

Like over a million others, I’ve been enjoying playing with the new ChatGPT tool from OpenAI. Among many other things, ChatGPT can help generate haiku – here are some examples from a recent session of mine:

Write a haiku about film photography

In moments captured
Light on film, forever frozen
Art of the past lives

Write a haiku about a quiet morning in a boat on a lake

Quiet morning calm
Ripples gently lap the shore
Birds in harmony

Write a haiku about snow falling on cedars

Snowflakes fall like tears
Upon the cedars’ green boughs
Winter’s silent grace

Write a haiku about stillness in Kyoto

In the still of night
Kyoto’s ancient temples bask
In the moon’s soft glow

Write a haiku about watching the first rain of the year

First rain of the year
Falling gently from the sky
Refreshing the earth

Write a haiku about geese flying overhead

Geese fly in formation
Honking as they pass overhead
Autumn’s wild symphony

Write a haiku about farming with a John Deere tractor

Green John Deere tractor
Plowing fields under bright blue sky
Farming’s steadfast work

Write a haiku about writing haiku

Ink flows from my pen
Expressing thoughts in simple verse
Haiku, my meditation

Try ChatGPT yourself with a free account.

Image created using DiffusionBee with the phrase: “a painting of an poet in a red beret writing haiku on a paper tablet”


An AI Thanksgiving

I asked my new friend GPT-3 to help tell a memorable Thanksgiving story – and complemented it with a painting of a family Thanksgiving dinner generated by DALL-E.

Here was the request I made to my AI friend:

write a long story about a memorable family thanksgiving with an odd uncle on a snowy stormy day and his wife my aunt sally

Here’s the story:

Uncle Bob’s Thanksgiving

It was Thanksgiving day and the snow was falling gently outside. The family was gathered around the table, enjoying a delicious Thanksgiving feast. Uncle Bob was there, as always, with his wife Aunt Sally. But there was something different about Uncle Bob this year. He seemed odd, almost like he was in a trance. His eyes were glassy and he barely said a word.

Aunt Sally was chatty as always, but even she seemed a bit off. The family tried to make conversation with Uncle Bob, but it was like talking to a wall. Finally, after an awkward silence, Uncle Bob spoke.

“I had a dream last night,” he said, “a dream about a turkey.”

Everyone stared at Uncle Bob, not knowing what to say.

“In my dream, the turkey was alive and it was talking to me. It said, ‘Uncle Bob, you have to save me. The farmer is going to kill me and turn me into Thanksgiving dinner.'”

Uncle Bob paused and looked around the table. Everyone was still staring at him, not knowing what to say.

“So, I went to the farmer’s house and I begged him to spare the turkey’s life. He laughed at me and said, ‘No one is going to eat a talking turkey. I’m going to kill it and sell it to the grocery store.'”

Uncle Bob paused again and looked around the table. No one said a word.

“So, I went to the grocery store and I begged the manager to spare the turkey’s life. He laughed at me and said, ‘No one is going to eat a talking turkey. We’re going to kill it and sell it to the customers.'”

Uncle Bob paused again and looked around the table. Everyone was still staring at him, not knowing what to say.

“So, I went to the customers’ houses and I begged them to spare the turkey’s life. But they all just laughed at me and said, ‘No one is going to eat a talking turkey. We’re going to kill it and eat it for Thanksgiving dinner.'”

Uncle Bob paused one last time and looked around the table. Everyone was still staring at him, not knowing what to say.

“And that’s when I woke up,” Uncle Bob said.

There was a long silence around the table. Finally, Aunt Sally spoke up.

“That was a very strange dream, Bob,” she said. “Are you sure you’re feeling all right?”

Uncle Bob just stared at her and didn’t say a word.

The family finished their Thanksgiving feast in silence. Uncle Bob never said another word and Aunt Sally just seemed to be in a daze. It was a very strange and memorable Thanksgiving for everyone.

I hope you and your family have a lovely Thanksgiving – and share lots of good stories too!


Using AI to Help Write

I’ve been observing the evolution of AI-based tools from a distance. I’m not very experienced with using these tools myself, but I thought I’d share some of my excursions into this fascinating domain of AI-assisted or generative AI tools.

One of the text-based AI tools is GPT-3 from OpenAI. Late last year, OpenAI opened up a playground where one can experiment with using GPT-3 to assist with text – helping create text, edit text, summarize text, and more. One of the illustration AI tools is called DALL-E and it’s also provided by OpenAI. I’ve been experimenting and learning with both of these tools.

Here’s an example of using GPT-3 to help write. This morning I happened to see a tweet where the author was talking about an article he’d pulled together about how writers should focus more on niches – it’s titled 7 Simple Ways to Niche Down. It’s a good article worth reading. As I was reading it, I began wondering how GPT-3 might help with writing an article on this subject – so I asked it:

write an article about how to niche down your writing

Here’s what I got back from GPT-3 just using the default settings:

If you’re a writer, chances are you wear many hats. You’re a wordsmith, a storyteller, and a communicator. But in order to be a successful writer, you need to be able to niche down your writing.

Niche writing is a specific type of writing that appeals to a specific audience. It’s about writing for a particular market or genre, and it requires you to know your audience inside and out.

The first step to niching down your writing is to identify your target audience. Who are you writing for? What are their needs and wants? What do they like to read?

Once you know your audience, you can start to focus your writing. What topics will appeal to them? What type of writing will they respond to?

If you’re not sure where to start, consider writing guest posts for blogs or magazines that your target audience reads. This is a great way to get your name and your work in front of them.

Another option is to create a blog or website of your own and focus on writing articles, tips, or tutorials that will help your target audience. This will not only help you attract readers, but it will also show them that you’re an expert in your field.

Finally, don’t forget to promote your work! Make sure your target audience knows about your writing by sharing it on social media, submitting it to directories or list-serves, or even just telling people about it in person.

Niche writing can be a great way to build a successful writing career. By targeting a specific audience, you can become known as an expert in your field and attract a loyal following of readers. So if you’re ready to niche down your writing, start by identifying your target audience and then focus your writing to appeal to them.

Not bad. I had a subject and was looking for help putting together an article on it. I did minimal work but was left with the bones of something I might consider fleshing out if I actually wanted to write an article on that subject. I also asked GPT-3 to help me clean up the opening paragraph above – which it rewrote for me.

To add to this post, I wanted an illustration that captured a writing theme so I asked DALL-E to generate one for me – asking it to generate:

an Edward hopper painting showing a writer at work in front of an old typewriter

That’s the image you see above.

The power of this kind of AI-based technology is mind-blowing to a layman like me. And its capabilities seem to be accelerating rapidly. I’ve got so much more to learn – and I’m having fun just beginning to wade in. If you’re doing any experimentation similar to this, let me know what you’re learning!