Scaling laws for reward model overoptimization

OpenAI Blog

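OpenAI published empirical scaling laws for reward model overoptimization in RLHF, quantifying how the true ("gold") reward first improves and then degrades as a policy is optimized against a learned proxy reward model, and how that relationship scales with reward model size under both reinforcement learning and best-of-n sampling.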

Categories: Research