Building better AI benchmarks: How many raters are enough?

Google Research Blog

Google presented benchmark-methodology research on how many human raters are needed for reliable AI evaluation.

Categories: Research

Algorithms & Theory