Landmark AI system beats poker pros in multi-player Texas hold 'em
A team of computer scientists from Carnegie Mellon University and the Facebook AI Research lab has created an AI system that, for the first time, has defeated several poker professionals in six-player Texas hold 'em. Unlike earlier iterations of the system, the researchers will not publicly release this algorithm's code, for fear it could decimate the online poker world.
Back in early 2017, a team of Carnegie Mellon researchers demonstrated a new AI poker system called Libratus. More than a decade's worth of work culminated in an impressive 20-day event in which Libratus beat four poker professionals across 120,000 hands of no-limit Texas hold 'em.
Libratus was not perfect though. As well as only functioning effectively in two-player, head-to-head versions of the game, it relied on an extraordinary amount of supercomputer power to work. Libratus needed 15 million CPU core hours to just develop a blueprint strategy, and during live gameplay still relied on 1,400 CPU cores to function.
Now, in 2019, the researchers have revealed Pluribus, an extraordinary evolution of the poker-playing system, which can win multi-player poker games while using only a fraction of the processing power of its predecessor: 12,400 core hours to compute its blueprint strategy and just 28 CPU cores in live play.
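To put that "fraction" in concrete terms, the short sketch below divides the Libratus figures by the Pluribus figures quoted above. The constant and function names are illustrative only, and the numbers are taken as reported in this article rather than independently verified.

```python
# Figures as quoted in the article; names are hypothetical, for illustration.
LIBRATUS_BLUEPRINT_CORE_HOURS = 15_000_000
PLURIBUS_BLUEPRINT_CORE_HOURS = 12_400
LIBRATUS_LIVE_CORES = 1_400
PLURIBUS_LIVE_CORES = 28


def reduction_factor(before: float, after: float) -> float:
    """Return how many times smaller `after` is than `before`."""
    return before / after


blueprint_factor = reduction_factor(
    LIBRATUS_BLUEPRINT_CORE_HOURS, PLURIBUS_BLUEPRINT_CORE_HOURS
)
live_factor = reduction_factor(LIBRATUS_LIVE_CORES, PLURIBUS_LIVE_CORES)

print(f"Blueprint compute: roughly {blueprint_factor:,.0f} times less")
print(f"Live-play cores: {live_factor:.0f} times fewer")
```

On these reported figures, the blueprint took on the order of a thousand times less compute, and live play used one-fiftieth of the cores.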
Over the last few years we've seen a number of incredible milestones in AI development. Games have always been a compelling benchmark for assessing truly dynamic artificial intelligence systems, and from chess to Go we've witnessed increasingly sophisticated algorithms dominate human players. However, these games have primarily been zero-sum, two-player challenges in which both players can see the whole board. Multi-player poker, on the other hand, is far more complicated, relying on hidden information, bluffing, and unpredictable strategic play.
To test Pluribus, the researchers recruited a pool of poker champions to play 10,000 hands across a 12-day period. These were six-player games, pitting the AI against five professionals. Another series of experiments pitted a single professional against five independent copies of Pluribus. Across all experiments and games, Pluribus steadily beat the human pros.
"Playing a six-player game rather than head-to-head requires fundamental changes in how the AI develops its playing strategy," says Noam Brown, one of the Carnegie Mellon researchers who recently joined the Facebook AI Research lab. "We're elated with its performance and believe some of Pluribus' playing strategies might even change the way pros play the game."
Pluribus begins each competition with a blueprint strategy, produced by playing many games against itself. Almost immediately after the first round of gameplay, however, the system begins to adjust that strategy in real time. One of Pluribus' more interesting and successful strategies was a move known as "donk betting", which human players commonly avoid.
"Donk betting" is when a player starts a round with a bet, immediately following a round they ended with a call. Only on rare occasions is this considered a strong strategic play, and the name itself is a reference to calling bad players donkeys, as they may often unknowingly make this move without realizing what they are doing.
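The idea of producing a blueprint strategy through self-play can be illustrated with a toy example. The sketch below is not Pluribus' code, and the article does not describe the researchers' algorithm in this detail; it assumes a much simpler stand-in technique, regret matching, applied to rock-paper-scissors. All names and parameters are hypothetical. Two copies of the same learner play each other repeatedly, each shifting toward the actions it regrets not having played, and the averaged strategy serves as the "blueprint".

```python
# Toy illustration only: regret matching in self-play on rock-paper-scissors.
# This is NOT how Pluribus is implemented; the game, names, and parameters
# are assumptions chosen to keep the example small.

ACTIONS = ("rock", "paper", "scissors")

# PAYOFF[a][b] is the payoff to a player choosing action a
# against an opponent choosing action b.
PAYOFF = [
    [0, -1, 1],
    [1, 0, -1],
    [-1, 1, 0],
]


def strategy_from_regrets(regrets):
    """Play each action in proportion to its positive accumulated regret."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total == 0:
        return [1.0 / len(regrets)] * len(regrets)
    return [p / total for p in positive]


def self_play(iterations=20000):
    """Run two regret-matching learners against each other.

    Returns each player's average strategy over all iterations.
    """
    n = len(ACTIONS)
    # Start each player with a different bias so there is something to correct.
    regrets = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
    strategy_sums = [[0.0] * n, [0.0] * n]

    for _ in range(iterations):
        # Both players commit to a strategy before either one updates.
        strategies = [strategy_from_regrets(r) for r in regrets]
        for player in (0, 1):
            opponent = strategies[1 - player]
            # Expected payoff of each pure action against the opponent's mix.
            action_values = [
                sum(PAYOFF[a][b] * opponent[b] for b in range(n))
                for a in range(n)
            ]
            expected = sum(
                strategies[player][a] * action_values[a] for a in range(n)
            )
            for a in range(n):
                regrets[player][a] += action_values[a] - expected
                strategy_sums[player][a] += strategies[player][a]

    return [[s / iterations for s in sums] for sums in strategy_sums]


blueprint = self_play()
for player, average in enumerate(blueprint):
    print(f"player {player}:", dict(zip(ACTIONS, (round(p, 3) for p in average))))
```

In this toy game the averaged strategies drift toward playing each action about a third of the time, which is the unexploitable way to play rock-paper-scissors. Six-player poker is vastly larger and involves hidden cards, so this shows only the general flavor of learning from self-play.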
"It was incredibly fascinating getting to play against the poker bot and seeing some of the strategies it chose," explains Michael Gagliano, a professional player who was pitted against Pluribus. "There were several plays that humans simply are not making at all, especially relating to its bet sizing. Bots/AI are an important part in the evolution of poker, and it was amazing to have first-hand experience in this large step toward the future."
Alongside Pluribus' sophisticated and unpredictable gameplay comes a significantly reduced need for processing power compared to prior AI gameplay systems. The researchers note that 2016's AlphaGo system won its games using 1,920 CPUs, and that Libratus needed 100 CPUs to run its two-player poker games in 2017. Pluribus runs on just two Intel Haswell E5-2695 v3 CPUs and less than 128 GB of memory. When playing against copies of itself, Pluribus takes about 20 seconds per hand on average, roughly twice as fast as professional poker players tend to play.
This landmark achievement is undeniably an impressive leap forward for AI development, but it is reasonable to ask what it means for the highly profitable world of online poker. Although they released the code behind Libratus more openly back in 2017, the researchers say Pluribus' algorithms will have to remain secret and will not be publicly released at this point. Speaking to MIT Technology Review, Noam Brown suggests the system could win huge sums of money in the online poker environment.
"It could be very dangerous for the poker community," Brown warns.
The new research is published in the journal Science.
Source: Carnegie Mellon University