AlphaGo Zero isn’t the only AI learning to kick butt by playing against itself. Back in August, OpenAI announced they’d used a similar strategy in the video game Dota 2:
We’ve created a bot which beats the world’s top professionals at 1v1 matches of Dota 2 under standard tournament rules. The bot learned the game from scratch by self-play, and does not use imitation learning or tree search. This is a step towards building AI systems which accomplish well-defined goals in messy, complicated situations involving real humans.
Today we played Dendi on mainstage at The International, winning a best-of-three match. Over the past week, our bot was undefeated against many top professionals including SumaiL (top 1v1 player in the world) and Arteezy (top overall player in the world).
Dota 1v1 is a complex game with hidden information. Agents must learn to plan, attack, trick, and deceive their opponents. The correlation between player skill and actions-per-minute is not strong, and in fact, our AI’s actions-per-minute are comparable to that of an average human player.
What’s equally impressive is how fast their AI improved. By the beginning of March, they had a “classical reinforcement learning” system that could play a bit of Dota 2. By early June it could beat a tester in the bottom 15% of ranked players. By the end of June it could beat a tester better than roughly 60% of all players. By mid-July it could barely eke out a win against a tester in the 99th percentile. And then:
In the span of a month, our system went from barely matching a high-ranked player to beating the top pros and has continued to improve since then.
So far the AI can only handle one-on-one Dota 2 matches. OpenAI’s next challenge: competing in five-on-five matches.
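OpenAI hasn’t published the bot’s training code, but the core idea the announcement describes — an agent improving by playing copies of itself, with no human demonstrations — can be sketched on a toy game. Below, two copies of the same simple learner (multiplicative weights, a stand-in for the unstated algorithm; all names and hyperparameters here are illustrative, not OpenAI’s) play repeated rock-paper-scissors against each other. Their time-averaged strategies drift toward the game’s equilibrium, the uniform mix, purely from self-play:

```python
import math

# Rock-paper-scissors payoff for the row player: +1 win, -1 loss, 0 tie.
PAYOFF = [[0, -1, 1],
          [1, 0, -1],
          [-1, 1, 0]]

def normalize(w):
    s = sum(w)
    return [x / s for x in w]

def expected_payoffs(opponent_mix):
    """Expected payoff of each action against an opponent's mixed strategy."""
    return [sum(PAYOFF[a][b] * opponent_mix[b] for b in range(3))
            for a in range(3)]

def self_play(rounds=100_000, eta=0.01):
    """Two copies of one learner train against each other (no human data).

    Each round, both sides nudge their strategy toward actions that score
    well against the other side's current mix (multiplicative weights).
    Returns player A's time-averaged strategy.
    """
    p_a = normalize([1.0, 1.0, 1.0])   # starts uniform
    p_b = normalize([8.0, 1.0, 1.0])   # starts biased toward rock
    avg_a = [0.0, 0.0, 0.0]
    for _ in range(rounds):
        for i in range(3):
            avg_a[i] += p_a[i] / rounds
        u_a = expected_payoffs(p_b)
        u_b = expected_payoffs(p_a)
        p_a = normalize([p * math.exp(eta * u) for p, u in zip(p_a, u_a)])
        p_b = normalize([p * math.exp(eta * u) for p, u in zip(p_b, u_b)])
    return avg_a

if __name__ == "__main__":
    # Each entry approaches 1/3: no pure strategy survives self-play.
    print(self_play())
```

The toy captures the key property the quote emphasizes: neither player is ever shown expert moves or a search tree; all improvement pressure comes from the opponent, which is just another copy of the learner. Dota 2 swaps this three-action game for a vastly larger state space and a neural-network policy, but the training loop has the same shape.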