Wismut Labs

Cognition the Wismut Labs engineering blog

Recent Advances in Game Reinforcement Learning: OpenAI Five's DotA 2

Reinforcement learning is an area of machine learning in which an agent learns by interacting with an environment and receiving positive or negative rewards for the actions it performs.

One can think of it as a goal-oriented approach: over many steps, the agent learns how to attain a complex objective (e.g., winning a game) or how to maximize along a particular dimension (e.g., staying balanced on a tightrope for as long as possible).
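To make the agent, environment, and reward loop concrete, here is a minimal sketch in Python. It uses tabular Q-learning, one of the simplest reinforcement learning algorithms, on a toy corridor that I made up for illustration. The environment, the reward values, and the function names (`step`, `train`, `greedy_policy`) are all assumptions and have nothing to do with OpenAI Five's actual training setup.

```python
import random

# Hypothetical toy environment: a corridor of 5 cells. The agent starts in
# cell 0, earns +1 for reaching cell 4, and pays -0.01 for every other step.
N_STATES = 5
ACTIONS = (-1, +1)  # move left, move right


def step(state, action):
    """Apply an action and return (next_state, reward, done)."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else -0.01
    return next_state, reward, done


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning: estimate action values purely from interaction."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action_index]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Explore with probability epsilon, otherwise act greedily.
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))
            else:
                a = 0 if q[state][0] > q[state][1] else 1
            next_state, reward, done = step(state, ACTIONS[a])
            # Move the estimate toward reward + discounted best future value.
            future = 0.0 if done else gamma * max(q[next_state])
            q[state][a] += alpha * (reward + future - q[state][a])
            state = next_state
    return q


def greedy_policy(q):
    """Best known action for every non-terminal cell."""
    return [ACTIONS[0] if row[0] > row[1] else ACTIONS[1] for row in q[:-1]]


if __name__ == "__main__":
    print(greedy_policy(train()))
```

The agent is never told that "right" is the correct direction. It only sees rewards, and the learned values gradually propagate the payoff at the end of the corridor back to the starting cell.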

Deep reinforcement learning combines deep learning with reinforcement learning principles to create efficient algorithms of this kind. One of the most prominent early applications of deep reinforcement learning is AlphaGo Zero, developed by Google DeepMind, which can beat the best Go players in the world.

Over time, deep reinforcement learning has proven highly relevant and beneficial to areas including robotics, video games, finance, and healthcare.

To gain a better understanding of the foundations, the state of research, and the current applications of deep reinforcement learning, I joined a free public class that followed Sergey Levine’s UC Berkeley Deep Reinforcement Learning course:

During the course I completed a deep reinforcement learning group project on stock price prediction with RL.

Deep learning can be used to learn patterns in stock prices and volumes, and also to understand news data surrounding stocks. We followed an outline by Boris Banushev that uses reinforcement learning to control the hyperparameters of the GANs and LSTMs used for predicting stock price movements.
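As a rough illustration of what "using RL to control hyperparameters" can mean, the sketch below treats each candidate learning rate as an arm of an epsilon-greedy bandit and uses the negative validation loss as the reward. This is a deliberately simplified stand-in, not the method from the outline we followed: the candidate values, the `validation_loss` function (which fakes a training run with a noisy formula), and the `tune` helper are all hypothetical.

```python
import math
import random

CANDIDATE_LRS = [0.001, 0.01, 0.1]  # hypothetical learning rates to choose from


def validation_loss(lr, rng):
    """Hypothetical stand-in for training a model and measuring its loss.

    The loss is lowest near lr = 0.01, with a little noise added.
    """
    return (math.log10(lr) + 2.0) ** 2 + rng.gauss(0.0, 0.1)


def tune(trials=300, epsilon=0.2, seed=0):
    """Epsilon-greedy bandit over the candidate learning rates."""
    rng = random.Random(seed)
    n_arms = len(CANDIDATE_LRS)
    counts = [0] * n_arms
    values = [0.0] * n_arms  # running mean reward per arm
    for _ in range(trials):
        # Explore a random arm sometimes, otherwise exploit the best estimate.
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)
        else:
            arm = max(range(n_arms), key=values.__getitem__)
        reward = -validation_loss(CANDIDATE_LRS[arm], rng)
        counts[arm] += 1
        values[arm] += (reward - values[arm]) / counts[arm]  # incremental mean
    best = max(range(n_arms), key=values.__getitem__)
    return CANDIDATE_LRS[best], counts


if __name__ == "__main__":
    best_lr, counts = tune()
    print(best_lr, counts)
```

In a real project each reward would come from an actual training run, which is far more expensive, so the number of trials and the amount of exploration would need to be chosen with that cost in mind.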

I also prepared lecture materials and gave a two-hour lecture on Recent Advances in Game Reinforcement Learning: OpenAI Five’s DotA 2.

The timing of the lecture was rather fortunate: on the same day, OpenAI launched OpenAI Five Arena, a four-day public experiment in which anyone could play OpenAI Five in both competitive and cooperative modes.

The event ended with a convincing victory for OpenAI Five and offered valuable insights into the recent advances made by the OpenAI team, whose system consistently beats not only amateur players but also professional DotA 2 teams: