OpenAI Baselines: ACKTR & A2C

OpenAI Blog · Aug 18, 2017

OpenAI released Baselines implementations of ACKTR and A2C; ACKTR is more sample-efficient than both TRPO and A2C.

Excerpt

We’re releasing two new OpenAI Baselines implementations: ACKTR and A2C. A2C is a synchronous, deterministic variant of Asynchronous Advantage Actor Critic (A3C), which we’ve found performs as well as A3C. ACKTR is a more sample-efficient reinforcement learning algorithm than TRPO and A2C, and requires only slightly more computation than A2C per update.
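
To make "synchronous" concrete, here is a minimal sketch of the batching step, assuming each worker hands back a short rollout of rewards, episode-end flags, and critic value estimates. It is not the Baselines code: the function names, the dictionary layout, and the discount factor are all hypothetical, chosen for illustration. The idea it shows is that the update waits for every worker's rollout and then forms one batch of advantages, rather than letting each worker apply its own update asynchronously.

```python
# Illustrative sketch only: names, data layout, and gamma are assumptions,
# not taken from the OpenAI Baselines implementation.

def nstep_returns(rewards, dones, bootstrap_value, gamma=0.99):
    """Discounted returns for one worker's rollout, bootstrapped from the critic."""
    returns = []
    running = bootstrap_value
    for reward, done in zip(reversed(rewards), reversed(dones)):
        # An episode boundary stops value from later steps leaking backwards.
        running = reward + gamma * running * (0.0 if done else 1.0)
        returns.append(running)
    returns.reverse()
    return returns


def synchronous_batch(rollouts, gamma=0.99):
    """Combine all workers' rollouts into one batch of advantage estimates."""
    batch = []
    for rollout in rollouts:
        returns = nstep_returns(
            rollout["rewards"], rollout["dones"], rollout["bootstrap_value"], gamma
        )
        # Advantage = n-step return minus the critic's value estimate.
        batch.extend(ret - value for ret, value in zip(returns, rollout["values"]))
    return batch
```

In a full training loop the resulting batch would weight the policy-gradient term of a single update shared by all workers; that step is omitted here because it depends on the network library in use.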

Read at source: https://openai.com/index/openai-baselines-acktr-a2c