Mass event will let hackers test limits of AI technology

2 weeks, 4 days ago

This Tool Probes Frontier AI Models for Lapses in Intelligence

Executives at artificial intelligence companies may like to tell us that AGI is almost here, but the latest models still need some additional tutoring to help them be as clever as they can. Scale AI, a company that’s played a key role in helping frontier AI firms build advanced models, has developed a platform that can automatically test a model across thousands of benchmarks and tasks, pinpoint weaknesses, and flag additional training data that ought to help enhance its skills. The new tool “is a way for [model makers] to go through results and slice and dice them to understand where a model is not performing well,” Berrios says, “then use that to target the data campaigns for improvement.” Berrios says that several frontier AI model companies are using the tool already. Jonathan Frankle, chief AI scientist at Databricks, a company that builds large AI models, says that being able to test one foundation model against another sounds useful in principle. Scale says its new tool offers a more comprehensive picture by combining many different benchmarks and can be used to devise custom tests of a model’s abilities, like probing its reasoning in different languages.

Wired

Discover Related

7 months ago

Humanity's last exam: Experts ready toughest questions to pose to AI

Launched by the Center for AI Safety (CAIS) and startup Scale AI, this initiative aims to determine when AI reaches expert-level capabilities and to remain relevant as AI technology advances. …

IndiaToday

1 year, 11 months ago

Mass event will let hackers test limits of AI technology

No sooner did ChatGPT get unleashed than hackers started “jailbreaking” the artificial intelligence chatbot — trying to override its safeguards so it could blurt out something unhinged or obscene. “This …
