Learning to summarize with human feedback

OpenAI Blog

OpenAI applied reinforcement learning from human feedback (RLHF) to train language models for summarization, demonstrating that training on human feedback produces more faithful and coherent summaries than supervised learning alone.

Categories: Research

Excerpt

We’ve applied reinforcement learning from human feedback to train language models that are better at summarization.