Measuring the performance of our models on real-world tasks

OpenAI Blog ·

OpenAI introduces GDPval, a new evaluation framework measuring model performance on real-world economically valuable tasks across 44 occupations.

Categories: Research

Excerpt

OpenAI introduces GDPval, a new evaluation that measures model performance on real-world economically valuable tasks across 44 occupations.