AlphaGo: using machine learning to master the ancient game of Go
January 27, 2016
The game of Go originated in China more than 2,500 years ago. Confucius wrote about the game, and it is considered one of the four essential arts required of any true Chinese scholar. Played by more than 40 million people worldwide, the rules of the game are simple: players take turns to place black or white stones on a board, trying to capture the opponent's stones or surround empty space to make points of territory. The game is played primarily through intuition and feel, and because of its beauty, subtlety and intellectual depth it has captured the human imagination for centuries.
But as simple as the rules are, Go is a game of profound complexity. There are 1,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000 possible positions—that’s more than the number of atoms in the universe, and more than a googol times larger than chess.
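As a rough sanity check on that comparison, here is a back-of-the-envelope calculation. The estimates below are commonly cited figures, not numbers from this post:

```latex
% Back-of-the-envelope, using commonly cited estimates (assumptions):
%   legal Go positions:    ~ 2 \times 10^{170}
%   chess positions:       ~ 10^{47} (a generous upper bound)
%   atoms in the universe: ~ 10^{80}
\frac{\text{Go positions}}{\text{chess positions}}
  \approx \frac{10^{170}}{10^{47}} = 10^{123} \gg 10^{100} = \text{one googol}
```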
This complexity is what makes Go hard for computers to play, and therefore an irresistible challenge to artificial intelligence (AI) researchers, who use games as a testing ground to invent smart, flexible algorithms that can tackle problems, sometimes in ways similar to humans. The first game mastered by a computer was noughts and crosses (also known as tic-tac-toe) in 1952. Then fell checkers in 1994. In 1997, Deep Blue famously beat Garry Kasparov at chess. It's not limited to board games either: IBM's Watson bested two champions at Jeopardy! in 2011, and in 2014 our own algorithms learned to play dozens of Atari games just from the raw pixel inputs. But to date, Go has thwarted AI researchers; computers still only play Go as well as amateurs.
Traditional AI methods, which construct a search tree over all possible positions, don't have a chance in Go. So when we set out to crack Go, we took a different approach. We built a system, AlphaGo, that combines an advanced tree search with deep neural networks. These neural networks take a description of the Go board as an input and process it through 12 different network layers containing millions of neuron-like connections. One neural network, the “policy network,” selects the next move to play. The other neural network, the “value network,” predicts the winner of the game.
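To make those two roles concrete, here is a minimal sketch in Python/NumPy. It illustrates a policy head and a value head, and is not AlphaGo's actual architecture: the real networks are 12-layer convolutional networks with millions of trained weights, while the board encoding, layer sizes, and random weights below are placeholder assumptions.

```python
# Minimal sketch of the two network roles (NOT AlphaGo's real architecture).
import numpy as np

BOARD = 19                  # standard Go board size
N_POINTS = BOARD * BOARD    # 361 candidate moves
HIDDEN = 128                # placeholder hidden-layer width

rng = np.random.default_rng(0)

# Untrained placeholder weights for two separate networks.
W1_policy = rng.normal(scale=0.01, size=(HIDDEN, N_POINTS))
W2_policy = rng.normal(scale=0.01, size=(N_POINTS, HIDDEN))
W1_value = rng.normal(scale=0.01, size=(HIDDEN, N_POINTS))
W2_value = rng.normal(scale=0.01, size=(1, HIDDEN))

def encode(stones):
    """Flatten a 19x19 array of {-1 white, 0 empty, +1 black} stones."""
    return stones.reshape(-1).astype(np.float64)

def policy_network(stones):
    """Return a probability for every board point: P(next move | position)."""
    h = np.maximum(W1_policy @ encode(stones), 0.0)   # one ReLU layer
    logits = W2_policy @ h
    p = np.exp(logits - logits.max())
    return p / p.sum()                                # softmax over 361 moves

def value_network(stones):
    """Return a scalar in [-1, 1] predicting the winner from this position."""
    h = np.maximum(W1_value @ encode(stones), 0.0)
    return float(np.tanh(W2_value @ h)[0])

empty = np.zeros((BOARD, BOARD), dtype=int)
print(policy_network(empty).argmax(), value_network(empty))
```

In the full system these outputs guide the tree search: the policy network narrows the moves worth exploring, and the value network scores a position without having to play it out to the end.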
We trained the neural networks on 30 million moves from games played by human experts, until they could predict the human move 57 percent of the time (the previous record before AlphaGo was 44 percent). But our goal is to beat the best human players, not just mimic them. To do this, AlphaGo learned to discover new strategies for itself, by playing thousands of games between its neural networks and adjusting the connections using a trial-and-error process known as reinforcement learning. Of course, all of this requires a huge amount of computing power, so we made extensive use of Google Cloud Platform.
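The self-play idea can be sketched as a toy policy-gradient (REINFORCE-style) loop. Everything below is an assumption for illustration: in particular, a random win/loss stands in for the result of a real self-play game of Go, so this toy cannot actually improve; the point is the shape of the update.

```python
# Toy REINFORCE-style self-play update (a sketch, not DeepMind's code).
import numpy as np

N_POINTS = 19 * 19
LEARNING_RATE = 0.01
rng = np.random.default_rng(0)
theta = np.zeros(N_POINTS)   # toy policy parameters: one logit per board point

def softmax(z):
    p = np.exp(z - z.max())
    return p / p.sum()

for game in range(1000):
    # Play one (toy) game, remembering which moves the policy chose.
    chosen = []
    for _ in range(30):
        probs = softmax(theta)
        move = rng.choice(N_POINTS, p=probs)
        chosen.append((move, probs))
    outcome = rng.choice([1.0, -1.0])   # placeholder result: +1 win, -1 loss

    # Policy gradient: nudge the policy toward moves played in wins and
    # away from moves played in losses (gradient of log-softmax).
    for move, probs in chosen:
        grad = -probs
        grad[move] += 1.0
        theta += LEARNING_RATE * outcome * grad
```

In AlphaGo the outcome comes from playing the game to completion, so this style of update gradually strengthens the connections behind winning moves, which is the trial-and-error signal reinforcement learning relies on.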
After all that training, it was time to put AlphaGo to the test. First, we held a tournament between AlphaGo and the other top programs at the forefront of computer Go. AlphaGo won all but one of its 500 games against these programs. So the next step was to invite the reigning three-time European Go champion Fan Hui, an elite professional player who has devoted his life to Go since the age of 12, to our London office for a challenge match. In a match played behind closed doors last October, AlphaGo won by 5 games to 0. It was the first time a computer program had ever beaten a professional Go player. You can find out more in our paper, which was published in Nature today.
What's next? In March, AlphaGo will face its ultimate challenge: a five-game match in Seoul against the legendary Lee Sedol, the top Go player in the world over the past decade.
We are thrilled to have mastered Go and thus achieved one of the grand challenges of AI. However, the most significant aspect of all this for us is that AlphaGo isn't just an “expert” system built with hand-crafted rules; instead, it uses general machine learning techniques to figure out for itself how to win at Go. While games are the perfect platform for developing and testing AI algorithms quickly and efficiently, ultimately we want to apply these techniques to important real-world problems. Because the methods we've used are general-purpose, our hope is that one day they could be extended to help us address some of society's toughest and most pressing problems, from climate modelling to complex disease analysis. We're excited to see what we can use this technology to tackle next!
Posted by Demis Hassabis, Google DeepMind