Playing Smart by Julian Togelius

Playing Smart by Julian Togelius covers the field of game AI. It is not a book for those who want specifics; it is for people who want a broad overview of the topic, and it serves as a great starting point for anyone looking to learn about the field or even get into it. Because it is a popular science book, it gives readers a top-down view of the field, letting them find what interests them and follow the citations. In addition, the end of the book has a further reading section for those looking to learn more about game AI and AI in general.

My favorite part of the book was the section on game AI competitions. With DeepMind’s bot having just defeated Liquid TLO and Liquid Mana, I have been thinking a lot about the purpose of these competitions. In addition, I had just been playing around with the Halite III competition, in which competitors build a bot that plays a multiplayer resource-gathering game where all players submit their moves simultaneously each turn. The competition attracts thousands of participants, and the hand-coded bots always win. This year it looks like one machine learning approach placed ninth. From a brief skim of the repo, it appears the bot was built with imitation learning, training on the play of top competitors.
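In this context, imitation learning just means treating the game as a supervised learning problem: collect (game state, move) pairs from strong players’ replays and train a model to predict the move. Here is a minimal sketch in Python; the feature encoding, the toy random “replay” data, and every name in it are my own illustrative assumptions, not anything from the actual repo:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

MOVES = ["N", "S", "E", "W", "STAY"]

def featurize(state):
    """Flatten a toy game state (a small halite grid plus a ship
    position) into a fixed-length feature vector."""
    grid, ship_pos = state
    return np.concatenate([grid.ravel(), ship_pos])

# Stand-in for parsing real replays: random states paired with the move
# a "top player" supposedly made in each. A real pipeline would read
# these from downloaded replay files instead.
rng = np.random.default_rng(0)
states = [(rng.integers(0, 1000, size=(8, 8)), rng.integers(0, 8, size=2))
          for _ in range(500)]
expert_moves = rng.choice(MOVES, size=500)

X = np.array([featurize(s) for s in states])
y = expert_moves

# Imitation learning as plain supervised learning: predict the expert's
# move from the current game state.
policy = RandomForestClassifier(n_estimators=100, random_state=0)
policy.fit(X, y)

def choose_move(state):
    """At play time, do whatever the model thinks a top player would do."""
    return policy.predict([featurize(state)])[0]

print(choose_move(states[0]))
```

The obvious ceiling of this approach is that the bot can, at best, mimic the players it trained on.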

The issue with the Halite competition and the StarCraft bot is a lack of emphasis on generalization. The ultimate goal of all AI research, I believe, is artificial general intelligence (AGI), and as Togelius points out in an article from Wired, “To get to the G in AGI we need to move beyond individual games.” This is perhaps my primary issue with competitions like Halite and the ongoing Terminal competition. I’m excited that so many people are participating in them, but I struggle to see how they are moving us toward AGI.

Take, for example, the Mario AI competition Togelius helped put together with his student Sergey Karakovskiy (130). It was meant to be a challenge that people could apply general algorithms to, but instead an agent using A* search dominated the competition. Even earlier, in his competition that had competitors create an agent to play a racing game, he found that the winning solutions were less general (132). My hope is that any game AI competition has the ultimate goal of advancing the field, and competitions like Halite and Terminal are not incentivizing this. That’s one of the reasons why the GVG-AI Competition excites me. It is also one of the reasons the recent DeepMind StarCraft 2 bot disappoints me. I don’t mean to take away from the accomplishment; it is a feat, and it deserves all the attention it is getting. However, I wish they had done a few things differently.

First, I wish they had the players face a single agent in a best-of-five. Instead, each pro player faced a different version of the bot in each game, drawn from the pool of agents produced during training. This is problematic because a key aspect of how humans play a series is adaptation. For example, in one game Liquid TLO fought a bot that built 18 disruptors (a ground-based unit with a powerful ground-only attack). If he had played against that bot again, he likely would have gone for an air-based strategy, which would have hard-countered that version of the bot. If the bot were versatile, this wouldn’t have been an issue. But as we saw in the Liquid Mana live match, which I’ll discuss at the end of this post, the bot does not appear to be versatile.

Second, I wish they had emphasized that training on a single map is an important limitation. We can look at Togelius’ racing game as an example. He trained bots on a single track and got ideal performance, but when those same bots were put on a new track they were awful (68). The network, in this case, learned how to optimally navigate one particular track, not tracks in general. I imagine we would see something similar if the AlphaStar agent attempted to play on another map in StarCraft 2. Performance would likely be even worse if the agent played on a four-player map instead of a two-player map.

Third, I wish they had emphasized that not being able to play against the other two races in StarCraft 2 is a serious limitation. This is similar to the point above on generalization. Any player who tries to play StarCraft 2 must learn three different matchups: vs. Zerg, vs. Protoss, and vs. Terran. Each matchup plays incredibly differently, which is one of the reasons StarCraft 2 is seen as a frontier in AI research. AlphaStar could play only one of the matchups in the game. Now, I grant you that the researchers could have trained three different sets of agents, one for each matchup, and then claimed they solved the game with a simple switch statement deciding which network to use in a given game (see the sketch below). But again, that is not the point of these benchmarks. It’s impressive that DeepMind has made a bot that can beat humans. Incredibly impressive. But the really impressive moment will be when they make a single bot, not multiple versions, that can play against any race and win a series on multiple maps against a professional player. A further layer of complication, one that would blow me away, would be a bot that could play as any race.
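That “switch statement” would be nothing more than a lookup keyed on the opponent’s race. A minimal sketch, where the checkpoint names and the load_model helper are hypothetical stand-ins:

```python
# One separately trained network per matchup, selected by the
# opponent's race. The file names and load_model are hypothetical.

def load_model(path):
    """Stand-in for loading a trained network from disk."""
    return lambda observation: f"action chosen by {path}"

MATCHUP_MODELS = {
    "protoss": load_model("pvp_agent.ckpt"),  # Protoss vs. Protoss
    "terran":  load_model("pvt_agent.ckpt"),  # Protoss vs. Terran
    "zerg":    load_model("pvz_agent.ckpt"),  # Protoss vs. Zerg
}

def choose_action(opponent_race, observation):
    # The "switch": pick which network plays this game. This dispatch
    # adds no generality; each underlying agent still knows one matchup.
    return MATCHUP_MODELS[opponent_race](observation)

print(choose_action("zerg", {"minerals": 50}))
```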

These are just three things I wish DeepMind had handled differently during the three-hour event. The moment in the stream that showed just how far DeepMind has to go came during the live match. Liquid Mana played against AlphaStar, and things started off looking really bad for him. He was being harassed and attacked from every direction and struggled to deal with the pressure. In response, he sent out a warp prism (a drop ship) carrying two immortals (strong ground units) to attack AlphaStar’s base. At the base, Mana dropped the immortals and started attacking. AlphaStar had a few options here:

  1. Send a portion of its army back to handle the harassment
  2. Ignore the drop and send all units to Mana’s base to attack, i.e. all in
  3. Send its entire army back to handle the harassment
  4. Build a unit to counter the dropship and accept the damage

AlphaStar opted for option three, sending its entire army back to deal with the harassment. This isn’t a horrible move, because the warp prism can not only drop units but also warp in new ones. However, Mana also had an observer (an invisible flying unit) at AlphaStar’s base that saw AlphaStar’s army coming back. So Mana picked up his two immortals and retreated out of sight. AlphaStar moved its army back out of its base, and Mana returned and dropped the immortals again. This process repeated several times until Mana was back in the game, and he eventually won.

AlphaStar could have left a portion of its army home in preparation for Mana dropping again. It could have used the stargate it built to produce a flying unit to chase down the warp prism. Instead, it repeated the process over and over and lost the game. The bot showed no ability to adapt, which is the main reason I think a single version of the bot would not hold up in a series against a human.
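To make that concrete, here is a toy simulation of the interaction, with made-up numbers and the simplifying assumption that AlphaStar always chooses option three. A fixed reactive policy wastes a round trip on every cycle, while a policy that adapts after being baited once breaks the loop immediately:

```python
# Toy model of the live-game exploit. Everything here is illustrative:
# the point is only that a fixed "recall the whole army" response can
# be kited forever, while a single adaptation ends the exploit.

def fixed_policy(drop_active, times_baited):
    # AlphaStar's observed behavior: always option three.
    return "recall entire army" if drop_active else "push out"

def adaptive_policy(drop_active, times_baited):
    # After being baited once, leave defenders behind instead.
    if drop_active and times_baited >= 1:
        return "leave defenders, keep pushing"
    return "recall entire army" if drop_active else "push out"

def simulate(policy, cycles=5):
    wasted_round_trips = 0
    for i in range(cycles):
        response = policy(drop_active=True, times_baited=i)
        if response == "recall entire army":
            wasted_round_trips += 1  # Mana retreats unharmed and resets
        else:
            break  # the exploit stops working
    return wasted_round_trips

print("fixed policy:", simulate(fixed_policy))        # 5 wasted trips
print("adaptive policy:", simulate(adaptive_policy))  # 1 wasted trip
```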

Going back to Playing Smart, it’s an interesting book that I highly recommend. I went on a tangent here because of the DeepMind event that just happened and because I already knew most of what the book covers. The perspective was interesting, but I didn’t have a ton to write about, as I did with the Genghis Khan post, where everything was new to me.

. . .


I hope you enjoyed this post. For my next book, I am reading A Conjuring of Light by V.E. Schwab.