January 20, 2015

Poker's Deus Ex Machina (Part II)—
How a Computer Proved Poker Is a Game of Chance

AUTHOR'S NOTE: This post is the second of two related posts. Part I is HERE.

* * * * *

As discussed in my last post, the recent news that a team of researchers has "essentially weakly solved" heads-up limit hold 'em poker (HULHE) should be considered significant support for the legal argument that poker is a "skill game" (i.e., a game where skill, rather than chance, is the "dominant factor" in determining the outcome). In fact, the Cepheus computer program's ability to play a non-exploitable, game theory optimal (GTO) strategy does advance the skill game argument by showing that the skill-chance analysis cannot be confined to a single hand, demonstrating the importance of making long-term strategic decisions (e.g., balancing ranges). Further, Cepheus proves that, at least for the HULHE variation of poker, a player can use a GTO strategy that is indifferent to the role of chance over the long term (i.e., the strategy will not lose to a non-GTO strategy over a statistically significant number of hands).

So, has Cepheus resolved the skill game legal argument in favor of poker as a game of skill? Unfortunately, the opposite may well be the case. As one of Cepheus' researchers explained, an important implication of a GTO strategy is that it is designed to be impervious not only to the effects of chance, but also to any counter-strategy (emphasis added):
"Since poker is a symmetrical game, the end strategy which Cepheus plays is an unbeatable one. While chips can, of course, be won from Cepheus in the short term, there is no decision which can be made against it which will be a winner in the long term. If a perfect opponent, either human or computerized, were to play a semi-infinite number of hands against Cepheus the best possible result would be for them to break even. Any imperfect opponent, which unfortunately includes all human players, would make mistakes along the way and lose. That being said, what Cepheus cannot do is maximize its winnings against weak opponents, a skill [at] which humans excel. Cepheus is simply an invincible, immovable bunker, a Maginot Line that actually works."
Thus, if two players face each other and both play a GTO strategy, then neither player will be able to exploit the other player, neither player will have a strategic advantage, and the result of the game will be left entirely to chance. In other words, the fact HULHE has a GTO strategy necessarily implies that the game can be played in such a way that the sole determining factor in the outcome is the effect of chance (i.e., which player gets luckier).

Now the possibility of a GTO v. GTO showdown may appear only theoretical. But let's consider the opposite situation, where two HULHE novices are matched up. Assuming neither player has any knowledge of proper strategy, again the results of the match are determined solely by chance. Now, let's take the next leap—two equally experienced, talented players who try to exploit the other player's flaws. In order to exploit those flaws, each player will necessarily make flawed strategic decisions (i.e., deviate from GTO strategy). However, over time, the constant back-and-forth of play should result in a series of game adjustments by both players which leaves each of them playing a close approximation of GTO strategy, such that neither player has a significant strategic edge. Again, the long-term results of such an even-skill match would be governed mostly (predominantly) by chance.

This implication for GTO strategy—that skilled players will eventually reach a close approximation of a GTO equilibrium strategy—is real and not theoretical. As was noted by Cepheus' researchers (emphasis added):
"So what does the availability of Cepheus’ data mean to limit hold’em play, particularly in the online environment where there are no effective checks against referencing Cepheus while play is ongoing? Not a great, deal unfortunately. While Cepheus would have undoubtedly had a detrimental and traumatic effect on a competitive online environment there is effectively no environment left to traumatize. Due to the rake, which is the share of the pot which the house claims as its fee, poker is a negative sum game. As the fundamental of heads up limit hold’em became better understood and the skill gap between competitors narrowed, many players found themselves in a position where they were able to beat their opponents but not both their opponents and the rake. More and more often, competition between players began to result in both players losing and the situation was exasperated [sic—exacerbated] by the decline of the online poker industry, which shifted a large portion of competitive play to lower stakes where the rake represents a larger percentage of a player’s potential winnings. Poker players, being rational people, did the only sane thing they could do, which was decline to play anyone who appeared to be of even remotely similar skill. At of the time of writing this article on a Saturday evening there are, on Pokerstars, the current market leader, thirty-five heads-up limit hold’em tables above the one dollar level where players are waiting for an opponent and one table at which two players are actually competing. Cepheus will undoubtedly prove a valuable sparring partner and research tool for casino players and enthusiasts looking to sharpen their skills, but the heyday of heads-up cash play has, unfortunately, already passed."
This concern about relative skill between players is common within the poker community. Online poker players have long engaged in the process of "bum hunting"—looking for games with known weak players to exploit. As poker professional Paul Ratchford explains:
"Poker is a zero sum game minus a cut that the house takes. So in an environment where all players are good and everybody plays game theory optimal poker EVERYBODY loses. The house takes money out of pots at an enormous rate especially at lower games. In fact, versus a bunch of skilled regulars (with zero recreational dollars in play) it may be impossible at a 6-max or full ring table for even some of the best in the world to win…. The bottom line is that if you are a professional poker player you need to be bum hunting / table selecting."
But the bum hunting problem is not limited to online play. Going back even to the early WSOP days, elite brick and mortar poker players like Doyle Brunson and Amarillo Slim would seek out games with easy marks like Archie Karas and Jimmy Chagra. More recently, high stakes professional poker players have pursued games with "whales" like Cirque du Soleil founder Guy Laliberté, baseball superstar Alex Rodriguez, and Texas banker Andy Beal (immortalized in the classic book, The Professor, the Banker, and the Suicide King: Inside the Richest Poker Game of All Time). Similarly, professional poker players Phil Ivey, Andrew Robl, Dan "Jungleman" Cates, and Tom Dwan have recently posted bail money or provided support to Paul and Darren Phua, individuals who reportedly control access to Macau's famed, whale-laden high stakes poker games. Back in Vegas, a number of poker pros jealously protect their whales from poaching by other pros:
"This game started about a week before the $1 million One Drop tournament and ran daily. Though a security guard kept gawkers and potential short-buys at bay, recognizable faces included One Drop Founder Guy Laliberte, Rick Salomon (the movie producer most famous for his Paris Hilton tape), and self-described model / actor / astronaut / asshole” Dan Bilzerian (@DanBilzerian). When this game runs, even the pros who play the regular 300-600 mix at Aria move elsewhere. 'Crazy' Mike Thorpe, who organizes many high-stakes mix games in town, says the regular 300-600 players at Aria, which include David 'Viffer' Peat and Ivey Room host Jean-Robert Bellande, have to move to Bellagio because Bobby Baldwin himself (Bobby’s Room namesake) would rather host his nosebleed no-limit game in The Ivey Room without pros."
Ironically, Bellande has himself been criticized by other poker players, including WSOP Main Event champion Greg Merson, for setting up high stakes poker games filled with whales, then excluding other poker pros from those games. Of course, this pattern of skilled "sharks" seeking out less talented "fish" to exploit isn't limited to high stakes play.

The irony of "bum hunting" or targeting "whales" and "fish" is that these weaker players generally play a highly flawed poker style that diverges markedly from GTO strategy. Although a GTO strategy would profit off these weak players over time, a non-GTO style will actually exploit weak players faster and for greater profit. So, in essence, poker's best players generally profit off of weaker players by utilizing a non-GTO strategy. Poker professional Paul Ratchford explains this irony (albeit in the context of no-limit hold 'em):
"Maximum exploitive NLHE occurs when a player chooses the most exploitive line to maximize his/her expected value. Most players do not play balanced ranges and, therefore, we should seek to maximize our edge by playing appropriately unbalanced in response. In a Rock, Paper, Scissors example, where we know that our opponent will throw rock 100% of the time, we would simply use paper 100% of the time. Even if we knew that our opponent threw rock 40% of the time, 30% paper, and 30% scissors, the maximum exploitive play would still be paper 100%. In NLHE, if you play heads up versus an opponent who folds 100% of the time to three-bets, your response would be to reraise 100%. It is important to note that if you are playing against a GTO opponent, the maximum exploitive strategy will be GTO. The appropriate response to a perfectly balanced Rock, Paper, and Scissors range is to be perfectly balanced yourself."
Or, as Cepheus' own researchers admit, "what Cepheus cannot do is maximize its winnings against weak opponents, a skill [at] which humans excel."  In other words, maximizing profits in poker requires deviation from Cepheus-style GTO poker strategy.
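For readers who want to check Ratchford's Rock, Paper, Scissors arithmetic, here is a minimal Python sketch. It assumes the usual scoring of +1 for a win, -1 for a loss, and 0 for a tie; the function names (payoff, expected_values, best_responses) are my own inventions for illustration, not anything taken from Ratchford or the Cepheus team.

```python
ACTIONS = ("rock", "paper", "scissors")

def payoff(mine, theirs):
    """Return +1 if `mine` beats `theirs`, -1 if it loses, 0 on a tie."""
    diff = (ACTIONS.index(mine) - ACTIONS.index(theirs)) % 3
    return {0: 0, 1: 1, 2: -1}[diff]

def expected_values(opponent_mix):
    """Expected payoff of each pure action against a mixed opponent strategy."""
    return {
        mine: sum(prob * payoff(mine, theirs)
                  for theirs, prob in opponent_mix.items())
        for mine in ACTIONS
    }

def best_responses(opponent_mix, tol=1e-9):
    """All actions whose expected payoff is (within tol of) the maximum."""
    evs = expected_values(opponent_mix)
    top = max(evs.values())
    return [action for action in ACTIONS if top - evs[action] <= tol]

if __name__ == "__main__":
    unbalanced = {"rock": 0.4, "paper": 0.3, "scissors": 0.3}
    balanced = {"rock": 1 / 3, "paper": 1 / 3, "scissors": 1 / 3}
    print(expected_values(unbalanced), best_responses(unbalanced))
    print(expected_values(balanced), best_responses(balanced))
```

Against the 40/30/30 opponent, paper is the only best response, even though its edge is just 0.1 units per throw. Against the perfectly balanced opponent, every action has an expected payoff of zero, so nothing is left to exploit.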

The analytical takeaways from the discussion above can be distilled into these Poker Postulates:
  • A poker player's relative skill advantage over his opponent matters more than his absolute skill level—i.e., "In the land of the blind, the one-eyed man is king."
  • As the difference in skill between poker players increases, the effect of chance on game results decreases, but is never eliminated altogether.
  • In poker games between players of substantially similar skill, results will be determined predominately by chance.
  • In poker games between players of substantially dissimilar skill, results will be determined predominately by skill, even though over a short period of time chance may permit a lesser-skilled player to prevail.
  • Skill in poker is more readily demonstrated by utilizing a non-GTO strategy to exploit weaker players than in utilizing a non-exploitable GTO strategy, at least insofar as success is measured by profits.
To be blunt, then, poker skill ultimately is not measured by how well a player selects starting hands, calculates pot and implied odds, or balances ranges. Likewise, poker skill is not measured by degree of similarity to or deviation from a GTO strategy. Rather, poker skill predominately turns on game selection; that is, being able to get into a game with weaker opponents whose flawed strategies can be exploited via a non-GTO strategy.
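The second through fourth postulates above can be put into rough numbers. The Python sketch below is my own back-of-the-envelope model, not anything from the Cepheus research: it assumes each hand's result is independent, with a fixed average edge and a fixed standard deviation (both measured in bets per hand, and both hypothetical figures), and it uses a normal approximation to estimate how often the more skilled player will be ahead after a given number of hands.

```python
import math

def prob_better_player_ahead(edge_per_hand, stdev_per_hand, hands):
    """Chance the more skilled player is in profit after `hands` hands.

    Normal approximation: total winnings ~ Normal(edge * n, stdev**2 * n),
    so P(total > 0) = Phi(edge * sqrt(n) / stdev).
    """
    z = edge_per_hand * math.sqrt(hands) / stdev_per_hand
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

if __name__ == "__main__":
    STDEV = 2.5  # hypothetical per-hand swing, in bets
    for edge in (0.0, 0.01, 0.05):  # hypothetical skill edges, in bets per hand
        for hands in (100, 1_000, 10_000):
            p = prob_better_player_ahead(edge, STDEV, hands)
            print(f"edge={edge:<5} hands={hands:<6} P(ahead)={p:.3f}")
```

With an edge of zero (two GTO players, or two equally matched players), the answer is a coin flip no matter how many hands are played. With the hypothetical edge of 0.05 bets per hand, the better player is ahead only about 58% of the time after 100 hands, but about 98% of the time after 10,000 hands.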

Returning, then, to the skill game legal argument, the clear implication of the Cepheus GTO strategy research is that poker advocates are left defending the awkward proposition that poker "skill" has little to do with game-related strategy and mostly means "preying on weak players" (or bum hunting, or fleecing fish—pick your own metaphor). Presented in this context, poker players begin to look less like mathematical savants and more like casino operators luring patrons to a -EV table game. In fact, for many poker players, their odds of winning money would actually be enhanced dramatically if they gave up poker for a seat at a house-run table game.

Is it any wonder the law treats poker the same as Mississippi Stud or Let It Ride?

“The creatures outside looked from pig to man, and from man to pig, and from pig to man again; but already it was impossible to say which was which.”

~~ George Orwell, Animal Farm

January 19, 2015

Poker's Deus Ex Machina (Part I)—
How a Computer Proved Poker Is a Game of Skill

One of my favorite TV shows is the CBS drama, Person of Interest. The show's plot is driven by the concept that a fully functional artificial intelligence (AI) computer program has actually been created. The AI system—known as "The Machine"—was originally created as an omnipresent surveillance tool to detect terrorist plots for the government. The Machine's creator feared those in power would abuse its abilities and went underground, programming The Machine with an ethical code and using it to predict and prevent criminal acts.

In last week's episode, the aptly titled "If-Then-Else", the show's protagonists were caught in a conundrum, needing both to hack into a computer system to prevent a stock market crash and to escape a trap meant to capture or kill them. The show flashed back to when The Machine was first created and its creator was teaching it to play chess. The Machine would calculate thousands of possible moves ahead, and when a chosen line of attack ultimately failed, would alter its strategy for future games, thereby learning how to play the game better. The rest of the show involved watching The Machine run through multiple alternative game plans for the team of heroes, with many of them ending in the team's demise. The Machine ultimately found a solution which gave the team a slim but real chance of survival. Of course, the show ended in a shocking cliffhanger (the apparent death of one of my favorite characters), which initially angered me, but ... well, as they say, "Spoiler Alert".

Just a few days after watching that Person of Interest episode, news broke via Science magazine that a team of scientists in Canada had "essentially weakly solved" heads-up limit hold 'em poker (HULHE). The methodology used to develop the Cepheus poker program is uncannily similar to that used with Person of Interest's fictitious Machine. Essentially, Cepheus played billions of billions of hands of poker against an identical program, beginning with random trial and error as to the proper strategy—bet, raise, call, or fold—at each decision point. Once the hand concluded, Cepheus would assign a "regret factor" to each decision based on the hindsight knowledge of the actual hand results. As tens of thousands of similar hands and situations accumulated, Cepheus would adjust its strategy to lessen or avoid decisions with higher regret factors, instead pursuing the balance of decisions which, overall, caused the least regret. For example, Cepheus might initially only check or call on most flops with a pocket pair higher than any card on the board, but would over time learn that betting or raising a high percentage of the time is a better (i.e., more profitable) strategy. Eventually, the program reached a point where further adjustments created more regret, meaning that the program had developed a non-exploitable, Game Theory Optimal (GTO) strategy.
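To make the "regret" idea concrete, here is a toy Python sketch of regret matching in self-play, applied to Rock, Paper, Scissors rather than poker. To be clear, this is not Cepheus' code: the researchers' published method is a far more elaborate variant of counterfactual regret minimization run over the full HULHE game tree, and every name below (payoff, current_strategy, train) is my own illustrative choice. The sketch only shows the core loop described above: play a hand, look back at how each alternative would have fared, and shift future play toward the choices that are regretted most.

```python
import random

N_ACTIONS = 3  # 0 = rock, 1 = paper, 2 = scissors

def payoff(mine, theirs):
    """+1 if action `mine` beats `theirs`, -1 if it loses, 0 on a tie."""
    return (0, 1, -1)[(mine - theirs) % 3]

def current_strategy(regrets):
    """Regret matching: play each action in proportion to its positive regret."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total > 0:
        return [p / total for p in positive]
    return [1.0 / N_ACTIONS] * N_ACTIONS  # no positive regret yet: play at random

def train(iterations, seed=0):
    """Self-play; returns each player's average strategy over all iterations."""
    rng = random.Random(seed)
    regrets = [[0.0] * N_ACTIONS for _ in range(2)]
    strategy_sums = [[0.0] * N_ACTIONS for _ in range(2)]
    for _ in range(iterations):
        strategies = [current_strategy(r) for r in regrets]
        moves = [rng.choices(range(N_ACTIONS), weights=s)[0] for s in strategies]
        for player in (0, 1):
            mine, theirs = moves[player], moves[1 - player]
            actual = payoff(mine, theirs)
            for action in range(N_ACTIONS):
                # Hindsight regret: how much better `action` would have done.
                regrets[player][action] += payoff(action, theirs) - actual
                strategy_sums[player][action] += strategies[player][action]
    return [[s / iterations for s in sums] for sums in strategy_sums]

if __name__ == "__main__":
    for player, strategy in enumerate(train(100_000)):
        print(player, [round(p, 3) for p in strategy])
```

The strategy on any single iteration can swing wildly; it is the average strategy over many iterations that settles toward the balanced one-third mix, which is the non-exploitable equilibrium for this toy game.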

The announcement that HULHE has been solved has been widely covered by both the tech/science media and the general media, generally in a positive light (see, e.g., FiveThirtyEight, Bloomberg, Yahoo News, The Washington Post, The Verge, Gizmodo, Nature, and Spectrum). One article, however, took a more skeptical view. Chris Hall, writing for The Guardian, claimed to have played Cepheus and found it to be flawed:
"At first, I was roundly stuffed by the computer’s non-stop aggression. Any bluffs I made failed miserably. To counteract this, I became more aggressive preflop and stopped bluffing almost entirely. Cepheus’s game did not adapt to my play and it made what I would consider several questionable plays. The program was reluctant to ever give up any sort of hand in a large pot making it easier to get lots of value from moderately weak hands."
Hall's claim is utterly at odds with the claim that Cepheus has solved HULHE, and reflects a lack of understanding of what GTO strategy means in game theory. If Cepheus in fact is playing a GTO strategy, then by definition Hall cannot play a style which attacks a flaw in Cepheus' strategy because GTO strategy has no flaws to exploit. As Cepheus' creators explain (emphasis added):
"Since poker is a symmetrical game, the end strategy which Cepheus plays is an unbeatable one. While chips can, of course, be won from Cepheus in the short term, there is no decision which can be made against it which will be a winner in the long term. If a perfect opponent, either human or computerized, were to play a semi-infinite number of hands against Cepheus the best possible result would be for them to break even. Any imperfect opponent, which unfortunately includes all human players, would make mistakes along the way and lose."
Rather than exposing a supposed flaw in Cepheus, Hall's short-term positive results were purely a matter of short-term variance—that is, Hall got lucky. Now, this is not to say that Hall's change in tactics had no effect; either:
  1. Hall originally was playing sub-optimal poker and correctly adjusted toward GTO strategy, improving his results (which were augmented by short-term variance); or,
  2. Hall incorrectly adjusted away from GTO strategy to exploit a perceived (but illusory) flaw, won over the short-term because of variance, but would in fact lose over the long-term utilizing that strategy.
To be fair, Hall acknowledged that the 400 hands he played—essentially one or two decent cash game sessions—were an insufficient sample size to evaluate Cepheus. Still, Hall doubled-down on his irrational doubting of Cepheus:
"Perhaps the best way to show off Cepheus would be to issue a challenge over a fixed amount of hands to a world-class professional player like Daniel Negreanu or Phil Ivey. This could create poker’s own version of Deep Blue v Garry Kasparov and would certainly be interesting for poker junkies like myself. I’d probably still take man over machine, though."
Assuming a statistically significant number of hands were played, Hall picking a human to defeat a computer playing GTO strategy? GTFO!

Hall's article did indirectly point out one crucial distinction between the Cepheus GTO strategy and the strategies employed by skilled human poker players: Human players will often deviate from GTO strategy in specific situations against specific opponents in order to maximize their profits through exploitation of weak players' worst errors, even though that particular non-GTO strategy would lose money over the long run against most opponents. As Cepheus' creators readily admit (link added):
"[W]hat Cepheus cannot do is maximize its winnings against weak opponents, a skill at which humans excel. Cepheus is simply an invincible, immovable bunker, a Maginot Line that actually works."
Although Cepheus is an impressive achievement in its own right, my thoughts immediately turned toward how Cepheus would play in the legal world. Does the development of a computer program capable of GTO poker play decisively prove that poker is a game of skill rather than a game of chance?

Ah, yes, our old friend, the "skill game argument", which you might recall from such classic crAAKKer posts as: "Tilting at Poker Windmills", "Why Poker Litigation Fails", and "Garnishing a Turd (Part IV): DiCristina Ends, Not With a Bang, But a Whimper". The foundation for the skill game argument is that, for purposes of gaming law in some states, "gambling" is determined by an evaluation of whether skill or chance is the "dominant factor" in determining the outcome of the game in question. The skill game argument seeks to prove that poker is a game in which skill predominates over chance, and therefore is not illegal under applicable state gaming laws. Unfortunately, every appellate court to have considered the skill game argument to date has rejected the argument or found it irrelevant (see Section E and FN3 of my discussion of the DiCristina appellate decision for the full summary of this litigation futility).

The skill game argument reached its apex in the DiCristina litigation, where a federal district court judge found poker to be a game of skill for purposes of the federal Illegal Gambling Business Act. As discussed in my analysis of the DiCristina district court decision, the court's analysis of the skill game issue was driven in large part by Dr. Randall Heeb's sophisticated statistical analysis of millions of online poker hand histories. Dr. Heeb was able to demonstrate that winning players displayed a skill edge greater than expected variance within a few thousand hands of play, and also that winning players won more money than losing players even when playing the same starting hands.

Cepheus advances the skill game argument by demonstrating that there is a theoretical strategy for playing HULHE which is optimal, in the sense of being non-exploitable over a sufficiently large number of hands. In fact, Cepheus' creators note that there may be multiple GTO strategies for HULHE: "different Nash equilibria may play differently" (p. 9).

Cepheus makes two significant contributions to the skill game argument. First, Cepheus demonstrates that individual poker decisions must be evaluated in the aggregate, over time. To this point, many of the examples of poker skill used to support the skill game argument have focused on individual or tactical poker plays—for example, how pot odds, stack size, or starting hand strength can be used to determine correct game decisions. Cepheus shows that the game is substantially more complex than any one play or hand, and that a strategic approach to game decisions is both necessary and possible. Although successful poker players recognize the importance of long-term strategic game theory concepts such as range-balancing, Cepheus is a rigorous mathematical and logical proof of the importance of poker players thinking beyond the immediate play or hand. In other words, Cepheus is a refutation of the superficial legal argument that poker is a game of chance because, regardless of skill, players are still "subject to defeat at the turn of a card" in a particular hand.

Cepheus also makes another, more significant contribution to the skill game argument. Some of the legal arguments made in favor of poker as a skill game overreach, trying to establish that nearly nothing in the game is beyond the control of the player; these arguments fall flat against the easily observable elements of chance in play. Cepheus, however, takes the element of chance head on and renders it irrelevant. Cepheus does not deny the presence of a significant element of chance in the game. Cepheus simply is indifferent—impervious even—to the effects of chance over the long-term. Regardless of what cards may fall by chance, Cepheus will over the long-term win against opponents playing a non-GTO strategy. For purposes of the legal skill game argument, Cepheus serves as the embodiment of the triumph of skill over chance.

So, does Cepheus mean that the legal skill game argument is over? Hardly. Cepheus is a proof of HULHE only. One of the Cepheus researchers, Neil Burch, is participating in a couple of threads in the Two Plus Two poker forums where he describes some of the limits of the Cepheus results. First, Burch does not think heads up no-limit poker is solvable using current methods and technology because of the exponentially greater number of decision points in play. Second, Burch points out that moving from a heads up to even a three-person game adds significant layers of complexity to the analysis, including the interesting possibility that two players could collude to exploit a third player, even if that third player was playing an equilibrium strategy. Consequently, Cepheus is better viewed as a "proof of concept" of the degree of skill involved in poker, not a proof that every version or permutation of poker has a GTO strategy impervious to the effects of chance.

Unfortunately, what Cepheus giveth, Cepheus also taketh away. Like a demon from a bad horror flick, Chance does not want to stay dead and buried. Stay tuned, true believers, for our next episode when Cepheus resurrects Chance to haunt the skill game argument once again.

* * * * *

AUTHOR'S NOTE: This post is the first of two related posts. Part II is HERE.