Building better AI benchmarks: How many raters are enough?
Google presented benchmark-methodology research on how many human raters are needed for reliable AI evaluation.
Algorithms & Theory
Read at source: https://research.google/blog/building-better-ai-benchmarks-how-many-raters-are-enough/