Rethinking how we measure AI intelligence
Google DeepMind releases Game Arena, an open-source platform for head-to-head evaluation of frontier AI models in environments with clear winning conditions.
Excerpt
Game Arena is a new open-source platform for rigorous evaluation of AI models. It supports head-to-head comparison of frontier systems in environments with clear winning conditions.
Read at source: https://deepmind.google/blog/rethinking-how-we-measure-ai-intelligence/