
Large Language Models’ Emergent Abilities Are a Mirage

The original version of this story appeared in Quanta Magazine.

Two years ago, in a project called the Beyond the Imitation Game benchmark, or BIG-bench, 450 researchers compiled a list of 204 tasks designed to test the capabilities of large language models, which power chatbots like ChatGPT. On some of these tasks, a model's performance stayed near zero until the model reached a certain size, then jumped, a pattern researchers have described as emergent abilities. A new paper by a trio of researchers at Stanford University posits that the sudden appearance of these abilities is just a consequence of the way researchers measure an LLM's performance.

Large language models train by analyzing enormous data sets of text, words from online sources including books, web searches, and Wikipedia, and finding links between words that often appear together. The rapid growth of these models has brought an astonishing surge in performance and efficacy, and no one disputes that large enough LLMs can complete tasks that smaller models can't, including ones for which they weren't trained.
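To see how a metric alone can manufacture a "jump," consider a minimal sketch in Python. It is not the Stanford team's code; it simply assumes a per-token accuracy that improves smoothly with scale (a sigmoid over log model size, centered at an arbitrary 1 billion parameters) and scores the same model two ways: per token, and with an all-or-nothing exact match on a hypothetical 10-token answer.

```python
import numpy as np

# Assumed, illustrative numbers only: model sizes from 1M to 100B parameters.
scales = np.logspace(6, 11, 50)

# Per-token accuracy rises smoothly with log model size (sigmoid centered at 1e9).
per_token_acc = 1.0 / (1.0 + np.exp(-2.0 * (np.log10(scales) - 9.0)))

# Exact-match on a 10-token answer: every token must be correct, so the smooth
# curve is raised to the 10th power and looks like a sudden breakthrough.
answer_length = 10
exact_match = per_token_acc ** answer_length

for size, p, e in zip(scales[::10], per_token_acc[::10], exact_match[::10]):
    print(f"{size:14.0f} params   per-token={p:.3f}   exact-match={e:.5f}")
```

Under these assumptions the per-token curve climbs gradually, while the exact-match column sits near zero for most model sizes and then shoots up, the kind of apparent discontinuity the paper argues is an artifact of the scoring rule rather than of the model itself.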
