Databricks Has a Trick That Lets AI Models Improve Themselves
Databricks, a company that helps big businesses build custom artificial intelligence models, has developed a machine-learning trick that can boost the performance of an AI model without the need for clean labeled data. The method leverages ideas that have helped produce advanced reasoning models by combining reinforcement learning, a way for AI models to improve through practice, with “synthetic,” or AI-generated, training data. The Databricks reward model, or DBRM, can then be used to improve the performance of other models without the need for further labeled data. Reinforcement learning and synthetic data are already widely used, but combining them to improve language models is a relatively new and technically challenging technique.