Google's deep Q-network proves a quick study in classic Atari 2600 games
In an old-school gaming party to end all parties, Google's new deep Q-network (DQN) algorithm is likely to mop the floor with you at Breakout or Space Invaders, but maybe take a licking at Centipede. Provided with only the same inputs as a human player and no previous real-world knowledge, DQN uses reinforcement learning to learn new games and, in some cases, develop new strategies. Its designers argue that this kind of general learning algorithm can cross over into making discoveries in other fields.
The only inputs that the team from Google DeepMind, London, gave the DQN agent were the raw screen pixels, the set of available actions, and the game score, before letting it loose on 49 Atari 2600 games to see how it fared. These included well-known favorites like Breakout, Pong, Space Invaders and Q*bert, side-scrolling shooters such as River Raid, sports sims like Boxing and Tennis, and 3D car racer Enduro.
The researchers say that DQN performed at more than 75 percent of the level of a professional games tester in more than half of the games, and that in 43 cases it surpassed any existing linear algorithm for learning that game. It performed best in Breakout, Video Pinball, Star Gunner and Crazy Climber, while DQN's worst games included the likes of Asteroids, Gravitar, Montezuma's Revenge, and Private Eye – but really, who was ever good at Gravitar?
A key feature of the DQN algorithm is what the research team likens to humans revisiting experiences and learning from them during rest periods, like sleep. In "experience replay," DQN reviewed stored games during its training phase. The researchers say this function was critical to DQN's success, with the algorithm's performance dropping significantly when it was disabled.
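For readers curious what experience replay might look like in code, here is a minimal Python sketch of the general idea: keep a fixed-size memory of past moves and draw random batches from it for training. This is not DeepMind's code. The names `Transition` and `ReplayBuffer`, the tiny capacity, and the choice to sample uniformly at random are all assumptions made for illustration.

```python
import random
from collections import deque, namedtuple

# Hypothetical record of one step of play: what the agent saw, what it did,
# the score change it received, what it saw next, and whether the game ended.
Transition = namedtuple(
    "Transition", ["state", "action", "reward", "next_state", "done"]
)


class ReplayBuffer:
    """Fixed-size memory of past transitions (illustrative sketch only)."""

    def __init__(self, capacity):
        # A deque with maxlen silently discards the oldest entry once full.
        self.memory = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        """Store one step of play."""
        self.memory.append(Transition(state, action, reward, next_state, done))

    def sample(self, batch_size):
        """Return a random batch of stored steps (uniform sampling assumed)."""
        return random.sample(list(self.memory), batch_size)

    def __len__(self):
        return len(self.memory)


# Usage with placeholder values standing in for real screen frames.
buffer = ReplayBuffer(capacity=3)
for step in range(5):
    buffer.push(state=step, action=0, reward=1.0, next_state=step + 1, done=False)

batch = buffer.sample(batch_size=2)  # two remembered steps, in random order
```

Drawing batches at random rather than replaying moves in the order they happened is one way to let an agent revisit older experience while it keeps playing.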
DQN is distinctly different from previous notable game-playing agents, such as IBM's Deep Blue, as this new algorithm represents machine learning from a blank slate, with no prior definitions, rules, or models. Its creators say such algorithms could help make sense of complex large-scale data and be used in a wide variety of fields, including climate science, physics, medicine and genomics, and could potentially even provide insights into how humans learn.
Of course, it could also help Google to create new products and improve on existing ones, like taking an "OK Google" request much more complex than a query about the weather, and developing actionable results.
Google's blog entry, linked below, contains a video, originally published in Nature, that follows DQN from its yawningly slow first 100 games of Breakout, in which it fails to even return the ball, to learning how to tunnel through to the top of the bricks.
The research was published this week in Nature.