Human V Machine: DeepMind & AlphaGo Explained

Google made history earlier this month, in what some would argue is a landmark moment for artificial intelligence. When its AlphaGo program defeated champion Go player Lee Sedol, it removed one of the last vestiges of human superiority over the machine.

The Competitors:

AlphaGo is a computer program developed by London-based Artificial Intelligence firm Google DeepMind to play the board game Go. In October 2015, it became the first computer Go program to beat a professional human Go player without handicaps on a full-sized 19×19 board.

Lee Se-dol (born 2 March 1983) is a South Korean professional Go player of 9-dan rank. As of February 2016, he ranks second in international titles behind only Lee Chang-ho and over the past decade is regarded as the top player in the world.

In the best-of-five match, Sedol managed to beat the Google program only once, in game four, just halting a clean sweep. With AlphaGo’s victory, Go joins checkers, which fell in 1994, chess in 1997 and Jeopardy! in 2011. For those who didn’t know, here is some context as to what this all means: up until 2014, experts believed that it was impossible for computers to beat top human players at Go because of the near-infinite number of possibilities.

The game gets complex, and we mean fast. After the first round of moves (one by each player) there are around 400 possible positions in chess; in Go there are a whopping 129,960. There are about 35 possible moves on any turn in a chess game, and about 250 in Go. A typical game between experts lasts for around 150 moves, and the number of legal positions on the 19×19 board alone is: 208168199381979984699478633344862770286522453884530548425639456820927419612738015378525648451698519643907259916015628128546089888314427129715319317557736620397247064840935.
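The opening counts and the rough scale of the game can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: it assumes the same number of choices on every turn (the 250 moves per turn and 150 moves per game quoted above), and approx_game_sequences is a made-up helper name, not anything taken from AlphaGo.

```python
# Back-of-the-envelope arithmetic for the move counts quoted in the article.

# Chess: each side has 20 legal first moves (16 pawn moves + 4 knight moves).
chess_after_first_round = 20 * 20

# Go: Black can play on any of the 361 points, White on any of the other 360.
go_after_first_round = 361 * 360


def approx_game_sequences(branching_factor: int, game_length: int) -> int:
    """Crude estimate: assume the same number of choices on every turn."""
    return branching_factor ** game_length


# Assumption: roughly 250 choices per turn over a 150-move game.
go_sequences = approx_game_sequences(250, 150)

print(chess_after_first_round)
print(go_after_first_round)
print(len(str(go_sequences)), "digits in the rough count of Go move sequences")
```

Python integers have arbitrary precision, so the estimate is computed exactly rather than overflowing. It is a different quantity from the number of board positions, but both are far too large to enumerate.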

Which, check this, means that there are fewer atoms in the observable universe than possible combinations in the game of Go. To put this into something we can understand: there are about 32 million seconds in a year, so playing 16 hours a day at one move per second, it would take over two years to play just 47 million moves. That is nothing next to a figure like 10^48, and no computer is projected to come anywhere close to the trillion teraflops (yes, that is a number) required to win by brute force.
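The playing-time arithmetic is easy to reproduce. The sketch below assumes a 365-day year and exactly one move per second during the 16 hours of play; the constant and function names are illustrative only.

```python
# Reproducing the "two years of playing" arithmetic from the paragraph above.
SECONDS_PER_HOUR = 3600
DAYS_PER_YEAR = 365            # assumption: ignore leap years
HOURS_PLAYED_PER_DAY = 16

# Seconds in a full calendar year (the "32 million" figure, rounded up).
seconds_per_year = 24 * SECONDS_PER_HOUR * DAYS_PER_YEAR

# At one move per second, moves played in a year of 16-hour days.
moves_per_year = HOURS_PLAYED_PER_DAY * SECONDS_PER_HOUR * DAYS_PER_YEAR


def years_to_play(moves: int) -> float:
    """Years needed to play the given number of moves at this pace."""
    return moves / moves_per_year


print(seconds_per_year)
print(moves_per_year)
print(round(years_to_play(47_000_000), 2))
```

At this pace a player gets through roughly 21 million moves a year, which is why 47 million moves takes a little over two years.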

This was the crux of the problem for all the predecessors of AlphaGo: there wasn’t enough computing power to work through all of the moves, and it would take an almost infinite amount of time to process every single one. In order to beat a world-class human player, the AlphaGo program would actually have to learn how to play, and then play adaptively in games against humans, which is pretty similar to thinking and problem solving.

To do this AlphaGo uses two types of AI technology:

  • Monte Carlo tree search: This involves choosing moves at random and then simulating the game to the very end, many times over, to find which moves tend to lead to a win.

  • Deep neural networks: 12-layer networks of neuron-like connections, consisting of a "policy network" that selects the next move and a "value network" that predicts the winner of the game.
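The "choose moves at random and play to the end" idea can be illustrated on a much smaller game. The sketch below is not AlphaGo's code. It is plain Monte Carlo move selection with random playouts, a simplified relative of full Monte Carlo tree search, with no search tree and no neural networks. The toy game (take 1 to 3 stones from a pile, and whoever takes the last stone wins) and all of the function names are assumptions made for illustration.

```python
import random


def legal_moves(stones: int) -> list[int]:
    """Toy game standing in for Go: a player may take 1, 2 or 3 stones."""
    return [take for take in (1, 2, 3) if take <= stones]


def random_playout(stones: int, my_turn: bool, rng: random.Random) -> bool:
    """Play random moves to the very end; True if 'we' took the last stone."""
    while stones > 0:
        stones -= rng.choice(legal_moves(stones))
        if stones == 0:
            return my_turn      # whoever just moved took the last stone
        my_turn = not my_turn
    return not my_turn          # pile already empty: the previous mover won


def choose_move(stones: int, playouts: int = 2000, seed: int = 0) -> int:
    """Pick the move whose random playouts win most often (needs stones >= 1)."""
    rng = random.Random(seed)   # fixed seed so runs are repeatable
    best_move, best_rate = None, -1.0
    for take in legal_moves(stones):
        # After our move it is the opponent's turn, hence my_turn=False.
        wins = sum(
            random_playout(stones - take, my_turn=False, rng=rng)
            for _ in range(playouts)
        )
        rate = wins / playouts
        if rate > best_rate:
            best_move, best_rate = take, rate
    return best_move


print(choose_move(5))
```

With five stones the playouts should favour taking one stone, which leaves the opponent the awkward pile of four. In a game the size of Go, random playouts alone are far too noisy, which is where the policy and value networks come in: they suggest which moves are worth simulating and how promising a position looks.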

Created in ancient China 2,500 years ago, Go is a pastime beloved by emperors and generals. It appears simple at first: a board, two players and two colours of stones. Player one uses black stones, while player two uses white, and the players alternate turns trying to grab territory. Like chess, it is a deterministic perfect-information game: there is nothing hidden, nor any element of chance, so the result comes down entirely to how well each player plays.

To show you how fast AI has developed: when Kasparov was defeated by IBM’s Deep Blue in 1997, Go programs couldn’t even beat a solid amateur.

AlphaGo wasn’t programmed with good and bad moves. Instead, it studied a database of online Go matches, which gave it the equivalent experience of doing nothing but playing Go for 80 years straight.

AlphaGo’s engineers fed the program around 30 million moves and then had it play thousands of games against itself. This means that instead of having to contemplate millions of moves every time, AlphaGo can narrow down on the optimal move very quickly.

In fact, in the fourth game Sedol used this to his advantage by playing an awkward move called “the wedge”, which left so many possibilities that it confused AlphaGo and allowed Sedol to win. From the initial exchanges it also appeared that Sedol was winning the fifth and final game, but although he played valiantly, he eventually succumbed to the superior technical strength of AlphaGo.

This also means there is an interesting future for the DeepMind program, as it is still far from perfect.

The game of Go is similar to something like Solitaire.