Evolution strategies as a scalable alternative to reinforcement learning

OpenAI Blog

Evolution strategies rival standard RL on Atari/MuJoCo benchmarks while avoiding replay buffers and reducing hyperparameter sensitivity, which makes them a scalable alternative for training agents.

Categories: Research

Excerpt

We’ve discovered that evolution strategies (ES), an optimization technique that’s been known for decades, rivals the performance of standard reinforcement learning (RL) techniques on modern RL benchmarks (e.g. Atari/MuJoCo), while overcoming many of RL’s inconveniences.
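To make the idea concrete, here is a minimal sketch of one common form of ES: perturb the parameters with Gaussian noise, score each perturbed copy, and move the parameters toward the perturbations that scored well. This is an illustration under stated assumptions, not the implementation behind the results above. The toy quadratic reward, the `target` vector, the function name `evolution_strategy`, and the values of `sigma`, `alpha`, `npop`, and `iters` are all hypothetical choices made for the example.

```python
import numpy as np

def evolution_strategy(f, theta0, sigma=0.1, alpha=0.001, npop=50, iters=300, seed=0):
    """Maximize f with a basic Gaussian-perturbation evolution strategy.

    All defaults are illustrative assumptions, not tuned values.
    """
    rng = np.random.default_rng(seed)
    theta = np.asarray(theta0, dtype=float).copy()
    for _ in range(iters):
        # Sample one noise vector per population member.
        eps = rng.standard_normal((npop, theta.size))
        # Score each perturbed parameter vector; no gradients of f are needed.
        rewards = np.array([f(theta + sigma * e) for e in eps])
        std = rewards.std()
        if std < 1e-12:
            continue  # every perturbation scored the same, so there is no signal
        # Standardize rewards so the step size does not depend on reward scale.
        adv = (rewards - rewards.mean()) / std
        # Step toward perturbations with above-average reward.
        theta += alpha / (npop * sigma) * (eps.T @ adv)
    return theta

# Toy black-box objective (an assumption for this example): the reward is
# highest when the parameters equal `target`.
target = np.array([0.5, 0.1, -0.3])

def reward(w):
    return -np.sum((w - target) ** 2)

theta = evolution_strategy(reward, np.zeros(3))
print("distance to target:", np.linalg.norm(theta - target))
```

Each reward evaluation in the inner list comprehension is independent of the others, so in this sketch the population could be scored in parallel, which is the property that makes the approach attractive at scale.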