In 2007, I took an interest in computer poker, specifically Texas Hold'em. At that time, the world champion in multiplayer games was Poki. You can play against it on the University of Alberta's online game. Even if Poki is a good starting point for learning how to play poker, it is no better than an average human poker player.
In what follows, I will assume you are familiar with the rules of Texas Hold'em poker. If not, you should have a look at this website (also available in French).
Multiplayer poker is fun, but at this time it raises so many difficulties that poker research actually focuses on heads-up matches. From a scientific point of view, this makes the game more tractable while the key difficulty remains: poker is a game of incomplete information. Poker is not only about statistics; it is a strategy game with randomness. Moreover, to play poker correctly, in some situations you have to bluff, that is, play some bad hands as if they were good ones. Not convinced? OK, let's see.
To understand the point of bluffing, consider this small example (thanks to Martin Zinkevich for showing me this trick). Assume a two-player game. Each player must pay $1 into the pot to receive a value drawn uniformly from [0,1]. They keep this value secret (an incomplete-information game). At this point, you have two choices:
- Check: add no money to the pot.
- Raise: add $1 to the pot.
If you Raise, your opponent then has two choices:
- Fold: he loses the pot but adds no more money to it.
- Call: he matches your bet and the secret numbers are revealed. The highest number wins the pot.
Now suppose you know your opponent will Check whenever it costs nothing, and will Call a Raise only if his secret number is greater than 0.4 (Folding otherwise). We can compute the expected value of our two actions as a function of our own secret number:
As you can see, the magic occurs: the red line (the expected value of the Raise strategy) lies above the expected value of the Check strategy both for the best hands (high secret numbers) and for the worst ones…
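Since the original figure is not reproduced here, the two expected-value curves can be recomputed with a short Monte Carlo sketch. It encodes the setup above (a $1 ante each, the opponent calling a raise only above 0.4), with profits measured net of the ante; the function names are mine, not from the original:

```python
import random

def payoffs(x, y, call_threshold=0.4):
    """Return (check_profit, raise_profit) for our secret number x
    against opponent number y, net of the $1 ante."""
    # Check: straight showdown for the $2 pot of antes.
    check = 1 if x > y else -1
    # Raise $1: the opponent folds at or below the threshold,
    # otherwise he calls $1 and we show down for a $4 pot.
    if y <= call_threshold:
        raise_ = 1                       # we take the antes; our raise comes back
    else:
        raise_ = 2 if x > y else -2
    return check, raise_

def estimate_evs(x, trials=200_000, seed=0):
    """Monte Carlo estimate of EV(Check) and EV(Raise) for secret number x."""
    rng = random.Random(seed)
    ev_check = ev_raise = 0.0
    for _ in range(trials):
        y = rng.random()
        pc, pr = payoffs(x, y)
        ev_check += pc
        ev_raise += pr
    return ev_check / trials, ev_raise / trials

for x in (0.0, 0.3, 0.5, 0.9):
    ev_c, ev_r = estimate_evs(x)
    print(f"x={x:.1f}  EV(Check)={ev_c:+.2f}  EV(Raise)={ev_r:+.2f}")
```

With the worst hands, Raising beats Checking because the opponent's folds alone are worth more than a near-certain showdown loss; with middling hands, Checking is better; with the best hands, Raising wins again. That is exactly the crossing pattern the figure showed.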
Another interesting point is that we have found an optimal response to the strategy "Call if greater than 0.4". But our strategy can itself be defeated by another one, and so on.
Traditional solution for real Poker - Nash Equilibrium
The main idea is to find a strategy which cannot be defeated (in expectation). Because poker is a zero-sum game, Nash's theorem guarantees that such strategies exist (but not that they are unique). A Nash equilibrium player in poker would never adapt its strategy and would make some stochastic decisions. Humans would see it as a very defensive player.
Even in heads-up play, the game tree is really huge, so big that solving it is intractable at this time. But the tree is far wider than it is deep: in a poker round, each player has few available actions. So the usual approach is to group hands into buckets, clustering hands with equivalent probabilities of winning (given the revealed community cards). After such a reduction of the game tree, linear programming techniques can solve it (see the publications of D. Koller), leading to a pseudo-optimal Nash equilibrium player. These techniques led to several top bots such as Sparbot, GS3, Hyperborean…
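The bucketing step can be sketched very simply: given an estimated winning probability per concrete hand, group hands into equal-size buckets by quantile. This is only an illustrative sketch (real abstractions are per betting round and much more refined); the function name and inputs are mine:

```python
def bucket_hands(win_probs, n_buckets=5):
    """Group hand indices into equal-size buckets by winning probability.

    win_probs: one P(win) estimate per concrete hand.
    Returns a bucket id per hand, 0 = weakest ... n_buckets-1 = strongest.
    """
    # Rank hands by estimated equity, then cut the ranking into quantiles.
    order = sorted(range(len(win_probs)), key=lambda i: win_probs[i])
    buckets = [0] * len(win_probs)
    for rank, i in enumerate(order):
        buckets[i] = rank * n_buckets // len(win_probs)
    return buckets

# Five hands, two buckets: the weak hands share one bucket, the strong another.
print(bucket_hands([0.10, 0.92, 0.48, 0.15, 0.81], n_buckets=2))  # → [0, 1, 0, 0, 1]
```

The game tree is then built over bucket identities instead of concrete hands, which is what shrinks it enough for linear programming to handle.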
These were the first bots able to play without losing against professional players (see the man-machine competition). But a problem remains: this kind of player is boring to play against. Moreover, they don't try to exploit their opponent's weaknesses; they just play their hands and wait for errors.
This problem can easily be seen in another game: Roshambo (Rock-Paper-Scissors). The only Nash equilibrium is to play each of the three options at random with probability 1/3. This player cannot be defeated (in expectation): it cannot lose, but it cannot win either. That's because adapting your strategy to defeat a weak opponent (say, an always-Rock player) turns you into another weak player (an always-Paper one).
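The Roshambo argument can be checked directly on the payoff matrix. A minimal sketch (function and variable names are mine) computing expected payoffs between mixed strategies:

```python
# Payoff for player 1 in Rock-Paper-Scissors: +1 win, 0 tie, -1 loss.
BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}

def payoff(a, b):
    if a == b:
        return 0
    return 1 if BEATS[a] == b else -1

def expected_payoff(strategy_a, strategy_b):
    """Expected payoff of mixed strategy a against mixed strategy b."""
    return sum(pa * pb * payoff(a, b)
               for a, pa in strategy_a.items()
               for b, pb in strategy_b.items())

uniform = {"rock": 1/3, "paper": 1/3, "scissors": 1/3}
always_rock = {"rock": 1.0, "paper": 0.0, "scissors": 0.0}
always_paper = {"rock": 0.0, "paper": 1.0, "scissors": 0.0}
always_scissors = {"rock": 0.0, "paper": 0.0, "scissors": 1.0}

print(expected_payoff(uniform, always_rock))        # 0.0: never loses, never wins
print(expected_payoff(always_paper, always_rock))   # 1.0: exploits always-Rock...
print(expected_payoff(always_paper, always_scissors))  # -1.0: ...but is itself weak
```

The uniform strategy scores 0 against everything, while the exploitative always-Paper strategy wins against always-Rock and loses just as badly against always-Scissors. That is the attack/defense trade-off in miniature.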
Another point is multiplayer games. The bucketing trick allows computing an approximation of a Nash equilibrium, but in multiplayer games it doesn't hold: the tree is still too big, and we are not even sure that a Nash equilibrium is a good player in this case (a Nash equilibrium plays as if all the other players colluded). In multiplayer games, the challenge for humans is to balance attack and defense so as to exploit the weakest players.
This is the harder part. Until 2006 the leader in this area was Vexbot. Even if Vexbot is good, it knows nothing about statistics, and some bad beats at the beginning of its learning can lead to strange play. It was outperformed by a version of Polaris in 2007. In any case, evaluating this kind of player is hard because, against an equilibrium bot, they either lose or play like an equilibrium themselves (which is normal). Note that expert players are able to play very close to a Nash equilibrium, though they need to know that they are facing a non-adaptive player.
With Raphaël Maîtrepierre, we developed our own poker bot: Brennus. At this time it differs from the previous ones on two main points:
- Hand abstraction: we don't use any. Instead, we use Monte Carlo simulation to evaluate the strength of our hand with respect to the modeled strategy of our opponent.
- Opponent modeling: we use several base strategies, and an adaptation of the UCB algorithm to choose the best one. This leads to interesting results, because our play is a mixture of strategies selected by a UCB-like algorithm, and the bias of each strategy is balanced by the biases of the others.
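The strategy-selection idea can be sketched with standard UCB1, treating each base strategy as a bandit arm and its per-hand profit as the reward. This is my illustrative sketch, not Brennus's exact variant (which is an adaptation of UCB); the toy reward distributions are made up:

```python
import math
import random

def ucb1_select(counts, mean_rewards, t):
    """Pick the arm (base strategy) maximizing the UCB1 index."""
    for arm, n in enumerate(counts):
        if n == 0:                     # play every strategy once first
            return arm
    return max(range(len(counts)),
               key=lambda a: mean_rewards[a]
                             + math.sqrt(2 * math.log(t) / counts[a]))

# Toy demo: three base strategies with hidden mean profits per hand.
rng = random.Random(1)
true_means = [-0.5, 0.0, 0.5]
counts = [0, 0, 0]
means = [0.0, 0.0, 0.0]
for t in range(1, 2001):
    arm = ucb1_select(counts, means, t)
    reward = true_means[arm] + rng.gauss(0, 1)   # noisy profit of one hand
    counts[arm] += 1
    means[arm] += (reward - means[arm]) / counts[arm]  # running average

print(counts)   # the best strategy (index 2) gets most of the plays
```

UCB1 keeps exploring the weaker strategies occasionally (the square-root bonus), which is what lets the mixture recover if the opponent changes style, while concentrating play on whichever base strategy is currently most profitable.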
How to play against Brennus?
At this time Brennus is offline.