If the artificial intelligence craze feels strangely familiar, it should, because we’ve been dealing with versions of it for a long time now under different names. We’ve been introduced to products that supposedly get smarter the more we use them.
Our entire world has evolved into a giant software testing environment, and we are the guinea pigs.
Meanwhile, there are unique aspects to what is now called AI, and they’re worth knowing about. One of the best explainers to date is in Ars Technica. Here is my summary in the form of excerpts:
To understand how language models work, you first need to understand how they represent words. Humans represent English words with a sequence of letters, like C-A-T for "cat." Language models use a long list of numbers called a "word vector." The full vector for cat is 300 numbers long.
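To make that concrete, here is a toy sketch of my own, not from the article: a lookup table mapping words to short vectors. Real embeddings run to hundreds or thousands of dimensions; the four numbers per word below are invented for illustration.

```python
# Toy word-vector table. Real models (e.g., word2vec's 300 numbers per
# word) learn these values from data; the numbers here are made up.
word_vectors = {
    "cat":    [0.007, 0.025, -0.011, 0.009],
    "dog":    [0.008, 0.021, -0.010, 0.011],
    "banana": [-0.040, 0.002, 0.055, -0.031],
}

# To the model, "cat" is nothing more than this list of numbers.
print(word_vectors["cat"])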
Each word vector represents a point in an imaginary “word space,” and words with more similar meanings are placed closer together. For example, the words closest to cat in vector space include dog, kitten, and pet.
Word vectors are a useful building block for language models because they encode subtle but important information about the relationships between words. If a language model learns something about a cat (for example, it sometimes goes to the vet), the same thing is likely to be true of a kitten or a dog.
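Closeness in that word space is measurable. A minimal sketch, reusing the invented toy vectors from above: cosine similarity scores the angle between two vectors, and "cat" and "dog" come out far more alike than "cat" and "banana."

```python
import math

def cosine_similarity(a, b):
    """Score how closely two vectors point the same way (1.0 = identical)."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) *
                  math.sqrt(sum(x * x for x in b)))

# Same invented toy vectors as in the earlier sketch.
cat = [0.007, 0.025, -0.011, 0.009]
dog = [0.008, 0.021, -0.010, 0.011]
banana = [-0.040, 0.002, 0.055, -0.031]

print(cosine_similarity(cat, dog))     # ~0.99: near neighbors in word space
print(cosine_similarity(cat, banana))  # negative: far apart
```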
(But) words often have multiple meanings. And meaning depends on context. To transform word vectors into word predictions, large language models (LLMs) use layers that act as transformers. Each layer adds information to help clarify the meaning of that word and better predict which word might come next.
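Here is a rough sketch of the core move a layer makes, again mine rather than the article's: plain self-attention with the learned weight matrices left out, operating on three invented "word" vectors.

```python
import numpy as np

def toy_attention(H):
    """Each word's vector becomes a weighted blend of every word's
    vector, so context flows between positions. Real transformer layers
    add learned projections and other machinery, omitted here."""
    scores = H @ H.T                               # how much each word attends to each other
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)  # softmax over positions
    return weights @ H                             # context-enriched vectors

# Three "words," each a 4-number vector (invented values).
H = np.array([[0.1, 0.3, 0.0, 0.2],
              [0.2, 0.1, 0.1, 0.0],
              [0.0, 0.4, 0.3, 0.1]])

for _ in range(3):         # stack layers; GPT-3 stacks 96 of them
    H = toy_attention(H)   # each pass refines every word's vector
print(H.round(3))
```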
Researchers don’t understand exactly how LLMs keep track of this information, but logically speaking, the model must be doing it by modifying the hidden state vectors as they get passed from one layer to the next. It helps that in modern LLMs, these vectors are extremely large. The most powerful version of GPT-3, for example, has 96 layers and uses word vectors with 12,288 dimensions—that is, each word is represented by a list of 12,288 numbers!
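The arithmetic behind those figures is worth pausing on. Assuming a 1,000-word prompt (my number, purely for illustration):

```python
n_layers, d_model = 96, 12_288  # GPT-3 figures quoted above
prompt_len = 1_000              # assumed prompt length, for illustration

numbers_per_layer = prompt_len * d_model
print(f"{numbers_per_layer:,} numbers carried per layer")               # 12,288,000
print(f"{numbers_per_layer * n_layers:,} values rewritten end to end")  # 1,179,648,000
```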
A key innovation of LLMs is that they don’t need explicitly labeled data. Instead, they learn by trying to predict the next word in ordinary passages of text. Almost any written material—from Wikipedia pages to news articles to computer code—is suitable for training these models.
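That is the whole trick behind "no labeled data": any run of ordinary text generates its own training examples. A minimal sketch:

```python
text = "the cat sat on the mat"
words = text.split()

# Every prefix of the text is a context; the word that follows is the label.
training_pairs = [(words[:i], words[i]) for i in range(1, len(words))]

for context, target in training_pairs:
    print(context, "->", target)
# ['the'] -> cat
# ['the', 'cat'] -> sat
# ... and so on; no human annotation required.
```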
You might find it surprising that the training process works as well as it does. ChatGPT can perform all sorts of complex tasks—composing essays, drawing analogies, and even writing computer code. So how does such a simple learning mechanism produce such a powerful model?
One reason is scale. It’s hard to overstate the sheer number of examples that a model like GPT-3 sees. GPT-3 was trained on a corpus of approximately 500 billion words. For comparison, a typical human child encounters roughly 100 million words by age 10.
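Put those two numbers side by side and the gap is stark:

```python
gpt3_corpus = 500_000_000_000  # ~500 billion training words
child_by_ten = 100_000_000     # ~100 million words by age 10

# GPT-3 saw roughly five thousand childhoods' worth of language.
print(gpt3_corpus // child_by_ten)  # 5000
```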
At the moment, we don’t have any real insight into how LLMs accomplish feats like this. Some people argue that such examples demonstrate that the models are starting to truly understand the meanings of the words in their training set. Others insist that language models are “stochastic parrots” that merely repeat increasingly complex word sequences without truly understanding them.
Traditionally, a major challenge for building language models was figuring out the most useful way of representing different words—especially because the meanings of many words depend heavily on context. The next-word prediction approach allows researchers to sidestep this thorny theoretical puzzle by turning it into an empirical problem. It turns out that if we provide enough data and computing power, language models end up learning a lot about how human language works simply by figuring out how to best predict the next word. The downside is that we wind up with systems whose inner workings we don’t fully understand.
I recommend that you bookmark the entire Ars Technica article.
LINKS:
'We're ready to go': Fulton County DA says work is done in Trump probe (CNN)
Judge rejects Trump bid to upend Georgia probe (Atlanta Journal-Constitution)
Mar-a-Lago manager De Oliveira makes his first court appearance in Trump’s classified documents case (AP)
Donald Trump's Legal Problems Could Be About to Get a Whole Lot Worse (Newsweek)
Trump Crushing DeSantis and G.O.P. Rivals, Times/Siena Poll Finds (NYT)
Republican presidential hopefuls blast DeSantis over slavery standards (WP)
Seven candidates say they have met qualifications for a spot on the Aug. 23 GOP debate stage in Milwaukee. About half of the GOP 2024 presidential candidates, including former Vice President Mike Pence, have not yet made the cut. (AP)
Ukraine war: Russian strike on Zelensky's home city kills six (BBC)
Amid the Counterattack’s Deadly Slog, a Glimmer of Success for Ukraine (NYT)
A senior Ukrainian official reported heavy fighting in the northeast, with Kyiv's forces holding their lines and making gains. Former Russian President Dmitry Medvedev said Moscow would have to use a nuclear weapon if Kyiv's ongoing counter-offensive was a success. (Reuters)
ISIL claims responsibility for Pakistan bombing that killed 54 people (Al Jazeera)
Heat Is Costing the U.S. Economy Billions in Lost Productivity (NYT)
Dell is all in on generative AI (Verge)
How AI Is Used to Scam You (and What You Can Do About It) (LifeHacker)
A.I. is on a collision course with white-collar, high-paid jobs — and with unknown impact (CNBC)
A jargon-free explanation of how AI large language models work (ArsTechnica)
Meta Has A.I. Google Has A.I. Microsoft Has A.I. Amazon Has a Plan. (Slate)
Outcry Against AI Companies Grows Over Who Controls Internet’s Content (WSJ)
How Silicon Valley is helping the Pentagon in the AI arms race (Financial Times)
Hollywood’s Fight Against A.I. Will Affect Us All (New Republic)
News Corp using AI to produce 3,000 Australian local news stories a week (Guardian)
AI's scariest mystery (Axios)
How AI is fundamentally altering the business landscape (VentureBeat)
What ‘Oppenheimer’ Doesn’t Tell You About the Trinity Test (NYT)
Nation Gathers Around Picky Eater To Make Him Try Things He Doesn’t Like (The Onion)