You are sitting in a comfortable chair by the fire, on a cold winter’s night. Perhaps you have a mug of tea in hand, perhaps something stronger. You open a magazine to an article you’ve been meaning to read. The title suggested a story about a promising — but also potentially dangerous — new technology on the cusp of becoming mainstream, and after reading only a few sentences, you find yourself pulled into the story. A revolution is coming in machine intelligence, the author argues, and we need, as a society, to get better at anticipating its consequences. But then the strangest thing happens: You notice that the writer has, seemingly deliberately, omitted the very last word of the first .
The missing word jumps into your consciousness almost unbidden: ‘‘the very last word of the first paragraph.’’ There’s no sense of an internal search query in your mind; the word ‘‘paragraph’’ just pops out. It might seem like second nature, this filling-in-the-blank exercise, but doing it makes you think of the embedded layers of knowledge behind the thought. You need a command of the spelling and syntactic patterns of English; you need to understand not just the dictionary definitions of words but also the ways they relate to one another; you have to be familiar enough with the high standards of magazine publishing to assume that the missing word is not just a typo, and that editors are generally loath to omit key words in published pieces unless the author is trying to be clever — perhaps trying to use the missing word to make a point about your cleverness, how swiftly a human speaker of English can conjure just the right word.
Before you can pursue that idea further, you’re back into the article, where you find the author has taken you to a building complex in suburban Iowa. Inside one of the buildings lies a wonder of modern technology: 285,000 CPU cores yoked together into one giant supercomputer, powered by solar arrays and cooled by industrial fans. The machines never sleep: Every second of every day, they churn through innumerable calculations, using state-of-the-art techniques in machine intelligence that go by names like ‘‘stochastic gradient descent’’ and ‘‘convolutional neural networks.’’ The whole system is believed to be one of the most powerful supercomputers on the planet.
And what, you may ask, is this computational dynamo doing with all these prodigious resources? Mostly, it is playing a kind of game, over and over again, billions of times a second. And the game is called: Guess what the missing word is.
The supercomputer complex in Iowa is running a program created by OpenAI, an organization established in late 2015 by a handful of Silicon Valley luminaries, including Elon Musk; Greg Brockman, who until recently had been chief technology officer of the e-payment juggernaut Stripe; and Sam Altman, at the time the president of the start-up incubator Y Combinator. In its first few years, as it built up its programming brain trust, OpenAI’s technical achievements were mostly overshadowed by the star power of its founders. But that changed in summer 2020, when OpenAI began offering limited access to a new program called Generative Pre-Trained Transformer 3, colloquially referred to as GPT-3. Though the platform was initially available to only a handful of developers, examples of GPT-3’s uncanny prowess with language — and at least the illusion of cognition — began to circulate across the web and through social media. Siri and Alexa had popularized the experience of conversing with machines, but this was on the next level, approaching a fluency that resembled creations from science fiction like HAL 9000 from ‘‘2001’’: a computer program that can answer open-ended complex questions in perfectly composed sentences.
As a field, A.I. is currently fragmented among a number of different approaches, targeting different kinds of problems. Some systems are optimized for problems that involve moving through physical space, as in self-driving cars or robotics; others categorize photos for you, identifying familiar faces or pets or vacation activities. Some forms of A.I. — like AlphaFold, a project of the Alphabet (formerly Google) subsidiary DeepMind — are starting to tackle complex scientific problems, like predicting the structure of proteins, which is central to drug design and discovery. Many of these experiments share an underlying approach known as ‘‘deep learning,’’ in which a neural net vaguely modeled after the structure of the human brain learns to identify patterns or solve problems through endlessly repeated cycles of trial and error, strengthening some neural connections and weakening others through a process known as training. The ‘‘depth’’ of deep learning refers to multiple layers of artificial neurons in the neural net, layers that correspond to higher and higher levels of abstraction: In a vision-based model, for instance, a layer of neurons might detect vertical lines, which would then feed into a layer detecting edges of physical structures, which would then report to a layer that identified houses as opposed to apartment buildings.
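To make the idea of training by trial and error concrete, here is a minimal sketch in Python. It is a toy, not anything resembling the systems described in this article: it trains a single artificial neuron rather than a deep, many-layered net, and the names `train_neuron`, `predict` and `OR_EXAMPLES` are invented for illustration. The neuron guesses, checks its guess against the right answer, and nudges its connection weights up or down whenever it is wrong.

```python
# Toy illustration only: one artificial neuron, not a deep network.
# All names here are hypothetical and chosen for this sketch.

def train_neuron(examples, epochs=20, learning_rate=0.1):
    """Adjust two connection weights and a bias by repeated trial and error."""
    weights = [0.0, 0.0]
    bias = 0.0
    for _ in range(epochs):
        for inputs, target in examples:
            # The neuron makes a guess from its current weights.
            activation = sum(w * x for w, x in zip(weights, inputs)) + bias
            guess = 1 if activation > 0 else 0
            # error is +1, 0 or -1: it says which way to nudge the weights.
            error = target - guess
            weights = [w + learning_rate * error * x
                       for w, x in zip(weights, inputs)]
            bias += learning_rate * error
    return weights, bias


def predict(weights, bias, inputs):
    """Return the neuron's 0-or-1 answer for one pair of inputs."""
    activation = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if activation > 0 else 0


# A tiny pattern to learn: output 1 if either input is 1 (logical OR).
OR_EXAMPLES = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]

weights, bias = train_neuron(OR_EXAMPLES)
for inputs, target in OR_EXAMPLES:
    print(inputs, "->", predict(weights, bias, inputs), "(expected", target, ")")
```

A deep-learning system stacks many layers of such units and uses more sophisticated update rules, but the basic loop of guessing, measuring the error and adjusting the connections is the same in spirit.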
GPT-3 belongs to a category of deep learning known as a large language model, a complex neural net that has been trained on a titanic data set of text: in GPT-3’s case, roughly 700 gigabytes of data drawn from across the web, including Wikipedia, supplemented with a large collection of text from digitized books. GPT-3 is the most celebrated of the large language models, and the most publicly available, but Google, Meta (formerly known as Facebook) and DeepMind have all developed their own L.L.M.s in recent years. Advances in computational power — and new mathematical techniques — have enabled L.L.M.s of GPT-3’s vintage to ingest far larger data sets than their predecessors, and employ much deeper layers of artificial neurons for their training.
Chances are you have already interacted with a large language model if you’ve ever used an application — like Gmail — that includes an autocomplete feature, gently prompting you with the word ‘‘attend’’ after you type the sentence ‘‘Sadly I won’t be able to….’’ But autocomplete is only the most rudimentary expression of what software like GPT-3 is capable of. It turns out that with enough training data and sufficiently deep neural nets, large language models can display remarkable skill if you ask them not just to fill in the missing word, but also to continue writing whole paragraphs in the style of the initial prompt. For example, when I gave GPT-3 the prompt:
In 1958, Frank Rosenblatt introduced the Perceptron, an early version of a neural net that could only detect the location of a square on a printed card. Fifty years later, neural nets are capable of . . .
The software dutifully completed the thought, and then continued, picking up on the historical framing of the initial text:
. . . detecting and classifying faces, finding patterns in data, and emulating the human brain. While the neural net matured in academia, it also found its way into the tech industry. In the late 1980s and early 90s, neural nets were used in speech recognition and character recognition applications. The growth in computing power and the ability to collect larger data sets allowed other neural net applications to emerge.
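The simplest version of this guess-the-next-word behavior can be sketched in a few lines of Python. What follows is a deliberately crude stand-in, not how GPT-3 works: it counts which word most often follows another in a tiny made-up corpus, where a large language model uses a deep neural net trained on vastly more text. The corpus and the names `build_bigram_model` and `guess_next_word` are assumptions invented for this example.

```python
from collections import Counter, defaultdict

# Toy illustration only: a word-pair counter, not a neural network.
# The corpus and function names are hypothetical.

def build_bigram_model(text):
    """Count, for every word, which words follow it and how often."""
    words = text.lower().split()
    following = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        following[current][nxt] += 1
    return following


def guess_next_word(model, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    candidates = model.get(word.lower())
    if not candidates:
        return None
    return candidates.most_common(1)[0][0]


TOY_CORPUS = (
    "sadly i won't be able to attend the meeting . "
    "i hope to attend the dinner . "
    "we plan to attend the concert . "
    "she wants to visit the museum ."
)

model = build_bigram_model(TOY_CORPUS)
print(guess_next_word(model, "to"))
```

Because this sketch looks only at the single preceding word, it cannot pick up on something like the historical framing of a prompt; that ability to use long stretches of context is part of what separates a large language model from simple autocomplete.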
Since GPT-3’s release, the internet has been awash with examples of the software’s eerie facility with language — along with its blind spots and foibles and other more sinister tendencies. GPT-3 has been trained to write Hollywood scripts and compose nonfiction in the style of Gay Talese’s New Journalism classic ‘‘Frank Sinatra Has a Cold.’’ You can employ GPT-3 as a simulated dungeon master, conducting elaborate text-based adventures through worlds that are invented on the fly by the neural net. Others have fed the software prompts that generate patently offensive or delusional responses, showcasing the limitations of the model and its potential for harm if adopted widely in its current state.
So far, the experiments with large language models have been mostly that: experiments probing the model for signs of true intelligence, exploring its creative uses, exposing its biases. But the ultimate commercial potential is enormous. If the existing trajectory continues, software like GPT-3 could revolutionize how we search for information in the next few years. Today, if you have a complicated question about something — how to set up your home theater system, say, or what the options are for creating a 529 education fund for your children — you most likely type a few keywords into Google and then scan through a list of links or suggested videos on YouTube, skimming through everything to get to the exact information you seek. (Needless to say, you wouldn’t even think of asking Siri or Alexa to walk you through something this complex.) But if the GPT-3 true believers are correct, in the near future you’ll just ask an L.L.M. the question and get the answer fed back to you, cogently and accurately. Customer service could be utterly transformed: Any company with a product that currently requires a human tech-support team might be able to train an L.L.M. to replace them.