At some point, every poker player has wished for just a little more insight into a hand. A new piece of software created by the Computer Poker Research Group at the University of Alberta, however, has no such crises of confidence: Cepheus has "solved" heads-up limit Texas hold 'em.
"We define a game to be essentially solved if a lifetime of play is unable to statistically differentiate it from being solved at 95 percent confidence," explains lead author of the research Michael Bowling. "Imagine someone playing 200 hands of poker an hour for 12 hours a day without missing a day for 70 years. Furthermore, imagine them employing the worst-case, maximally exploitive opponent strategy, and never making a mistake. They still cannot be certain they are actually winning."
Heads-up limit Texas hold 'em is played with just two players, fixed bet sizes and a limited number of raises allowed. According to the University of Alberta, this version of poker has fewer possible situations than checkers, a game the institution says it solved in 2007.
In poker, however, not all of the game information is laid bare to the players as it is in checkers. Players cannot have full knowledge of past events or see their opponents' hands, for example. This "imperfect-information" nature of heads-up limit Texas hold 'em, says the University of Alberta, makes playing or solving it a much more challenging problem for computers.
Cepheus was programmed with the rules of the game and was then trained against itself. According to Bowling, the software was run for two months using more than four thousand CPUs, each considering over six billion hands every second. More than a billion billion hands were played, with the software learning and improving with each hand.
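As a rough sanity check on the scale of that training run, the quoted figures can be multiplied together. The Python sketch below assumes that "two months" means 60 days and that every CPU ran at the quoted rate for the whole period; both are assumptions for illustration. Because the article gives the CPU count and rate as lower bounds, the result is only a ballpark, and hands "considered" need not be the same thing as hands "played".

```python
# Figures quoted in the article (both stated as lower bounds).
CPUS = 4_000
HANDS_PER_CPU_PER_SECOND = 6_000_000_000

# Assumption: "two months" is taken as 60 days of continuous running.
DAYS = 60
SECONDS_PER_DAY = 24 * 60 * 60

BILLION_BILLION = 10**18


def hands_considered(cpus: int = CPUS,
                     rate: int = HANDS_PER_CPU_PER_SECOND,
                     days: int = DAYS) -> int:
    """Return the total hands considered across all CPUs over the run."""
    return cpus * rate * days * SECONDS_PER_DAY


total = hands_considered()
print(f"{total:.3e} hands considered")                 # 1.244e+20 hands considered
print(f"{total / BILLION_BILLION:.1f} billion billion")  # 124.4 billion billion
```

Under these assumptions the total comes to roughly 1.2 × 10^20 hands considered, comfortably above the "billion billion" (10^18) mark quoted for hands played.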
"The breakthroughs behind this result are general algorithmic advances that make game-theoretic reasoning in large-scale models of any sort more tractable," explains Bowling. "With real-life decision-making settings almost always involving uncertainty and missing information, algorithmic advances – such as those needed to solve Poker – are needed to drive future applications."
The university says that this is the first time a nontrivial imperfect-information game played competitively by humans has been solved. Although the research focused on a poker variation, imperfect-information situations in the real world to which it might be applied could include decision-making at airport checkpoints and coast guard patrolling.
Source: University of Alberta