Scaling laws for reward model overoptimization
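OpenAI published empirical scaling laws for reward model overoptimization in RLHF, quantifying how gold-standard reward degrades as a policy is optimized against a proxy reward model, via either reinforcement learning or best-of-n sampling, and how that effect scales with reward model size.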
Read at source: https://openai.com/index/scaling-laws-for-reward-model-overoptimization