It’s Not Just Go: OpenAI’s AI Competes Against Top Dota 2 Gamers

AlphaGo Zero isn’t the only AI learning to kick butt by playing against itself. Back in August, OpenAI announced they’d used a similar strategy in the video game Dota 2:

We’ve created a bot which beats the world’s top professionals at 1v1 matches of Dota 2 under standard tournament rules. The bot learned the game from scratch by self-play, and does not use imitation learning or tree search. This is a step towards building AI systems which accomplish well-defined goals in messy, complicated situations involving real humans.

Today we played Dendi on mainstage at The International, winning a best-of-three match. Over the past week, our bot was undefeated against many top professionals including SumaiL (top 1v1 player in the world) and Arteezy (top overall player in the world).

Dota 1v1 is a complex game with hidden information. Agents must learn to plan, attack, trick, and deceive their opponents. The correlation between player skill and actions-per-minute is not strong, and in fact, our AI’s actions-per-minute are comparable to that of an average human player.

What’s equally impressive is how fast their AI improved its game. By the beginning of March, they had a “classical reinforcement learning” system able to play a bit of Dota 2. By early June it was able to beat a tester who was in the bottom 15% of players by skill rank. By the end of June it could beat a tester who was as good as around 60% of all players. By mid-July it could barely eke out a win against a tester in the 99th percentile. And then:

In the span of a month, our system went from barely matching a high-ranked player to beating the top pros and has continued to improve since then.

So far the AI can only handle one-on-one Dota 2 matches. Their next challenge: competing in five-on-five matches.

With No More Human Opponents, AlphaGo Kicks Its Own Ass 

This week, Google announced a pretty remarkable breakthrough. According to Slate:

The newest version of [Google’s] Go-playing algorithm, dubbed AlphaGo Zero, was not only better than the original AlphaGo, which defeated the world’s best human player in May. This version had taught itself how to play the game. All on its own, given only the basic rules of the game. (The original, by comparison, learned from a database of 100,000 Go games.) According to Google’s researchers, AlphaGo Zero has achieved superhuman-level performance: It won 100–0 against its champion predecessor, AlphaGo.

Almost as impressive, AlphaGo Zero did it using fewer computer chips, aka TPUs:

Early AlphaGo versions operated on 48 Google-built TPUs. AlphaGo Zero works on only four. It’s far more efficient and practical than its predecessors.

Maybe estimates of mass unemployment in 20 years are feeling like a pretty reasonable forecast (even though nobody really knows for sure).

UPDATE: So this is a little creepy. From the DeepMind blog:

After just three days of self-play training, AlphaGo Zero emphatically defeated the previously published version of AlphaGo – which had itself defeated 18-time world champion Lee Sedol – by 100 games to 0. After 40 days of self training, AlphaGo Zero became even stronger, outperforming the version of AlphaGo known as “Master”, which has defeated the world’s best players and world number one Ke Jie.

Three days to achieve a higher level of mastery of a complex game than most humans have, without any tactical or strategic advice, just being told the rules? That’s pretty wild. What this means is that if you can create an accurate enough simulation of a real-world problem, an AI has a good chance of being able to master it on its own in a matter of days or months (the AI needs a simulation so it can learn by playing a bazillion games against itself).

This won’t be true for all problems. For example, Go is far simpler than, say, driving: you only move one piece at a time, and the environment is much less complex. An AI working in the real world may also run into hardware problems (e.g., sensors that don’t work well in the snow, or not being able to process data fast enough to react to unexpected problems) that limit what it can do. But it’s still a pretty impressive/creepy accomplishment.

… And Now Facebook

Yesterday, Facebook made a big splash with its announcement of Oculus Go, a lightweight, standalone virtual reality headset that doesn’t require a cable and a PC or a smartphone.

It ships early next year, starting at $199 USD. It’s awesome for watching movies or concerts, playing games, or just hanging out with your friends in VR…. The high-resolution fast-switch LCD screen dramatically improves visual clarity and reduces screen door effect…. Oculus Go also ships with integrated spatial audio. The speakers are built right into the headset, transporting you straight into VR and making the headset easy to share with someone else.

According to the Oculus Developer blog, you can use either the Unity or Unreal gaming engine to develop for the Oculus Go. All of the big players in augmented reality support at least Unity, which means you should be able to develop in Unity and, with some tweaking, get it to work across both augmented reality and the Facebook virtual reality devices.

If the hype around Oculus Go is anywhere near reality, it should end up seriously cranking up the competition over virtual and augmented reality, creating even more room for communities that work together to start claiming a seat at the table.

Quick and Dirty Version of Revised Framework

Turns out I didn’t need to drop Make Creativity Work or Make Community Work. Once I wrote up the problems I was having with them, I realized that with some changes to the setup and some changes to both parts of the model, most of the problems went away. Below is my quick and dirty, short version of my revised framework. I’ll post a more fleshed out version in the coming weeks.

Many experts believe that between 2025 and 2040, 25-75% of all jobs will be replaced by robots/AI. Given that the rules of our economy already concentrate wealth and power at the top, this crisis could end up devastating the middle class and the poor and destroying our democracy. But not all experts agree that robots/AI could bring about mass unemployment, and there’s no way to know who’s right. So, we need a strategy for building a more just, prosperous economy for all regardless of whether robots/AI create an unemployment crisis.

The solution: Makers All.

To understand why Makers All is structured the way it is, we need to understand the economic backdrop that shapes it:

  • Not only robots and AI but also augmented and virtual reality, digital fabrication, wearables, and other emerging technologies are going to become the core of this new economy. The most valuable part of this economy won’t be physical products but the creative works that power them: a robot’s software that lets it cook, the recipe that tells it how to make tomato soup, the patent behind the sensors that let it know when a chicken breast is fully cooked.
  • The Internet has given us a sneak preview of what an economy dominated by creative works might look like. With websites, YouTube, and open source software, we have an unprecedented bounty of creative works at our fingertips. But the financial benefits have largely gone to the top: musicians, newspaper reporters, and other makers of creative works have a harder time paying the bills, income inequality has soared, and communities from Compton to Harlan County have essentially been written off.
  • But as the postwar consumer economy showed us, technology isn’t destiny. Through their unions, millions of white Americans built grassroots power at the heart of that economy, which ensured its prosperity was more broadly shared. The challenge we face today is how to build grassroots power at the heart of the creative works economy, and to do it so that this time every community benefits.

Makers All consists of two strategies:

Make Creativity Work. How do we ensure that in every community, as many people as possible can earn some or all of their income from creative works?

  • Knowledge. In every community, ensure that as many individuals as possible can learn the skills they need to participate in this economy both as tool users and tool creators — and rewire tech culture to support it. This knowledge is critical not only so people can benefit financially but also for the second part of Make Creativity Work: building power.
  • Power. Organize within and across communities to build grassroots power at the heart of this new economy. The goal: to ensure that everyone, not just a handful of corporations and the wealthy, gets a seat at the table where decisions get made about how the rules governing creative works are structured and who benefits (e.g., “YouTube Done Right”).

Make Community Work. Although every community should benefit from Make Creativity Work, not every individual will benefit enough to pay the bills and not everything we value will pay — and the harder hit our society is by job losses, the more critical this problem will become. So we need another way to make sure nobody starves and everyone has a shot at making it:

  • Fill the Gaps. Guarantee everyone security through a mix of a smaller Universal Basic Income and other opportunities that anyone could do and that reinforce our society’s values (e.g., “volunteer bucks”). Provide resources for work that’s critical to our communities but that the market doesn’t value, such as taking care of children and the elderly. And reduce our need for income through personal digital fabrication, creating more affordable housing, etc.
  • Make Communities Whole. Create Marshall Plan-style transitions for communities that are hit hard by new robot-driven job crises or were previously left behind.