Large Language Models’ Emergent Abilities Are a Mirage
The original version of this story appeared in Quanta Magazine.

Two years ago, in a project called the Beyond the Imitation Game benchmark, or BIG-bench, 450 researchers compiled a list of 204 tasks designed to test the capabilities of large language models, which power chatbots like ChatGPT. Large language models train by analyzing enormous data sets of text—words from online sources including books, web searches, and Wikipedia—and finding links between words that often appear together. These models have grown rapidly in scale, and that rapid growth has brought an astonishing surge in performance and efficacy; no one disputes that large enough LLMs can complete tasks that smaller models can't, including ones for which they weren't trained. But a new paper by a trio of researchers at Stanford University posits that the sudden appearance of these abilities is just a consequence of the way researchers measure an LLM's performance.