Optimal OOP play against a polarized range
Consider this simplified river spot. Hero is out of position (OOP) with a condensed range against Villain’s polarized range, so Hero’s hands can only beat Villain’s bluffs. Let’s also say that each player can only use an unmixed strategy for his entire range: Hero must always check-call or always check-fold, and Villain must always check or always bet.
How does Villain’s EV change as a function of his range composition?
Hero always calls
When Villain bets his entire range, his bets show a profit once his value bets outnumber his caught bluffs. Once we account for the money he loses on failed bluffs across all possible range compositions, the gains and losses from betting cancel, and his EV works out to his equity share.
Furthermore, integrating his EV over the number of value combos, from 0 to T, gives PT/2, where P is the pot. This is the mean EV, P/2, multiplied by the total number of combos, T.
When Villain checks his entire range, his EV is again equal to his equity share.
Now we can compare Villain’s perfect counter strategy (which is mixed) to the suboptimal unmixed strategies above. Since Hero calls with every combo in his range (all of which are bluff catchers), Villain should value bet his made hands and check behind with his air; he should never bluff.
The graph tells us that the prior strategies lose value from either bluffing or missing value bets. Each ‘bad’ strategy has the same EV across Villain’s entire range and the perfect strategy has a higher EV by B/2 on average; the perfect strategy makes BT/2 more dollars than the bad strategies across Villain’s entire range.
Furthermore, we can say that Villain’s bad strategies do not exploit Hero because his EV is equivalent to his equity share. The perfect strategy allows Villain to increase his EV by value betting. Another, less exploitative strategy is for Villain to check when his range is mostly bluffs and bet when he has mostly made hands.
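The comparison above can be checked numerically. The following is a minimal sketch, not the author's own calculation: it assumes Villain holds `V` value combos out of `T` (the name `V` and the function names are hypothetical), that `P` is the pot and `B` the bet, and that Hero always calls. It averages Villain's per-combo EV over every composition from 0 to `T` value combos. In this sketch the always-bet line matches the equity share on average over compositions, not for each composition separately.

```python
from fractions import Fraction

def ev_vs_always_call(strategy, V, T, P, B):
    """Villain's total EV over his T combos when Hero always calls.

    V is the number of value combos (hypothetical name); T - V is air.
    """
    air = T - V
    if strategy == "always_check":
        return V * P                    # value hands win the pot at showdown
    if strategy == "always_bet":
        return V * (P + B) - air * B    # value bets get paid, bluffs get caught
    if strategy == "value_bet_only":
        return V * (P + B)              # bet made hands, check air (air wins nothing)
    raise ValueError(f"unknown strategy: {strategy}")

def mean_ev_per_combo(strategy, T, P, B):
    """Average per-combo EV over all compositions V = 0..T."""
    total = sum(
        Fraction(ev_vs_always_call(strategy, V, T, P, B), T)
        for V in range(T + 1)
    )
    return total / (T + 1)

if __name__ == "__main__":
    for name in ("always_check", "always_bet", "value_bet_only"):
        print(name, mean_ev_per_combo(name, T=10, P=100, B=50))
```

With a pot of 100 and a bet of 50, both unmixed strategies average P/2 per combo and the value-bet-only strategy averages (P + B)/2, which is the B/2 gap described above.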
Hero always folds
This case is less interesting. When Villain checks behind, his EV is the same as when he always bets and Hero always calls (the second graph). When Villain bets his entire range, he always wins the pot. This result is indistinguishable from the perfect strategy, where Villain checks behind with his made hands and bluffs with his air, since again he always wins the pot.
Villain’s perfect strategy and his always-bet strategy both exploit Hero, as his EV is larger than his equity share. Always checking is a nonexploitative strategy for Villain. The exploitative strategies outperform the ‘bad’ strategy by P/2 on average and earn Villain an additional PT/2 in EV across his entire range.
Which unmixed strategy should Villain use if he’s not sure which unmixed strategy Hero is using? We know that Villain’s strategies break even against a Hero who always calls, and that a folding Hero can only be exploited by a betting Villain. Villain should therefore always bet: betting exploits Hero whenever he always folds, while checking can never exploit him.
What does the optimal strategy for Hero look like against a Villain who can only use the unmixed strategies? The optimal strategy would have to be at least as good as always calling for Hero. A strategy better than always calling would exploit Villain because Hero’s expected value would be more than his equity share. Provided Hero knows Villain’s range composition, a simple mixed exploitative strategy for Hero is to always call when Villain’s range is mostly bluffs and always fold when Villain’s range is mostly value. This demonstrates that Villain’s always bet strategy is not optimal against mixed strategies because Hero is able to exploit it.
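Hero's adjustment against an always-betting Villain can be sketched in a few lines. This is an illustration under assumptions, not the author's model: `V` is the number of value combos out of `T` (hypothetical names), `P` is the pot, `B` is the bet, and Hero is assumed to know `V`. Under these assumptions the exact cut-off is the point where calling stops being profitable, so "mostly bluffs" and "mostly value" act as rules of thumb; the precise threshold depends on the bet size.

```python
def villain_total_ev(villain, hero, V, T, P, B):
    """Villain's total EV for one pair of unmixed strategies.

    villain is "check" or "bet"; hero is "call" or "fold".
    """
    air = T - V
    if villain == "check":
        return V * P                  # showdown: only value hands win the pot
    if hero == "fold":
        return T * P                  # every bet takes the pot uncontested
    return V * (P + B) - air * B      # bet and get called by bluff catchers

def hero_call_ev(V, T, P, B):
    """Hero's total EV of calling a bet with a bluff catcher (folding is 0)."""
    air = T - V
    return air * (P + B) - V * B

def hero_best_response(V, T, P, B):
    """Call when calling shows a profit against this composition, else fold."""
    return "call" if hero_call_ev(V, T, P, B) > 0 else "fold"

if __name__ == "__main__":
    for V in (3, 6, 8):
        print(V, hero_best_response(V, T=10, P=100, B=100))
```

With a pot-sized bet, this sketch still calls when 6 of 10 combos are value, because the pot odds compensate for being behind more often than not.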
Now let’s move on to the more interesting case where both parties are permitted to use mixed strategies.
Optimal range balancing and bet sizing by Villain
Consider Villain’s EV equation.
Consider the number of bluffs as a fraction of the number of value hands, with A denoting the ratio of value hands to bluffs.
This version of the function shows Villain’s expected winnings per value hand in his range.
If Villain’s strategy is unexploitable, his EV shouldn’t depend on Hero’s calling frequency, C.
This function gives us the balanced bet sizing for a given pot size and bluff ratio.
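The steps above can be written out as a short derivation. This is a reconstruction under stated assumptions rather than the author's original equations: $V$ (a symbol introduced here) is the number of value hands Villain bets, $L$ the number of bluffs, $A = V/L$ the value-to-bluff ratio, $P$ the pot, $B$ the bet, and $C$ Hero's calling frequency.

```latex
\begin{align*}
EV &= V\,\bigl[P + C B\bigr] + L\,\bigl[(1 - C)\,P - C B\bigr]
   && \text{value hands and bluffs} \\
\frac{EV}{V} &= P + C B + \frac{1}{A}\bigl[P - C\,(P + B)\bigr]
   && \text{substituting } L = V/A \\
             &= P\left(1 + \frac{1}{A}\right) + C\left(B - \frac{P + B}{A}\right) \\
0 &= B - \frac{P + B}{A}
   && \text{no dependence on } C \\
B &= \frac{P}{A - 1}
   && \text{balanced bet size}
\end{align*}
```

Under these assumptions the last line is negative for $A < 1$, undefined for $A = 1$, and tends to zero as $A$ grows without bound.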
When A < 1, we have more bluffs than value hands. We can see that the optimal bet sizing for such a case is a negative number, implying that we can’t play optimally when we have more bluffs than made hands.
When A = 1, we have the same number of bluffs as made hands. This requires an infinitely large bet, so we can’t play optimally when we have the same number of bluffs as made hands.
As A approaches infinity (indicating that we have very few bluffs), the optimal bet sizing is zero. There is no incentive for Hero to call if Villain has no bluffs in his range, so Villain’s EV is equivalent to his equity share.
Now wrapping this all together we can consider Villain’s EV function without a dependence on C.
Consider the function when we replace the bluff constant with the bluff count, L.
Every bluff that Villain adds increases his EV as if it were a made hand and not a bluff. The number of bluffs that Villain can profitably add is bounded by the number of made hands in his range (and also the total number of hands, as Villain can’t make a bluff out of no hand).
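A quick numeric check makes the claim concrete. The sketch below is illustrative only: it assumes Villain bets `V` value combos and `L` bluffs (with `V` a hypothetical name), uses a balanced bet of P / (A - 1) with A = V / L, and confirms that the resulting EV equals P times (V + L) for any calling frequency `C`.

```python
def villain_ev(V, L, P, B, C):
    """Villain's total EV when betting V value combos and L bluffs
    into a Hero who calls with frequency C (0 <= C <= 1)."""
    value_ev = V * (P + C * B)               # always wins the pot, plus B when called
    bluff_ev = L * ((1 - C) * P - C * B)     # wins the pot on folds, loses B when called
    return value_ev + bluff_ev

def balanced_bet(P, A):
    """Bet size that makes Villain's EV independent of C, for ratio A = V / L."""
    if A <= 1:
        raise ValueError("no balanced bet size unless value hands outnumber bluffs")
    return P / (A - 1)

if __name__ == "__main__":
    V, L, P = 8, 4, 100
    B = balanced_bet(P, V / L)
    for C in (0.0, 0.25, 0.5, 1.0):
        print(C, villain_ev(V, L, P, B, C))  # same value for every C
```

With 8 value combos, 4 bluffs, and a pot of 100, the balanced bet is pot-sized and the EV is 1200 at every calling frequency, the same as if all 12 combos were made hands winning one pot each.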
What is the optimal number of bluffs for Villain to have in his range?
This chart also shows bet sizing.
Notice that as Villain adds bluffs, the multiplier on his EV rises from 1 toward 2. It never quite reaches 2, because that would require A = 1, which is not a valid value.
Let’s pretend that Villain is capable of getting his multiplier to the unattainable 2 factor. What is Villain’s EV when half of his range is value hands (and the other half is bluffs)?
Half of his range makes 2 pots per combo, which means that Villain takes all of Hero’s equity. We know that this case isn’t strictly possible because A can’t equal 1. Are there legal cases where Villain can take all of Hero’s equity? Consider that if Villain’s EV were more than the pot, Hero should simply always fold, which gives Villain an EV of exactly the pot. This means that Villain’s EV is bounded below by his equity share and above by the size of the pot.
Let’s look for other cases where Villain wins the entire pot.
Any A bounded as such will allow Villain to win the entire pot. This inequality hints that the amount of money that a range can win is bounded by the percentage of value hands in that range.
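One way to arrive at a bound of this kind is sketched below. It is a reading of the argument under assumptions, not necessarily the author's exact inequality: $V$ (a symbol introduced here) is the number of value hands among $T$ total combos, Villain's balanced EV is taken to be $P\,(V + L)$, and the bluff count $L$ cannot exceed the $T - V$ air combos he holds.

```latex
\begin{align*}
EV = P\,(V + L) = P\,T
  &\iff L = T - V
  && \text{every air combo is bluffed} \\
A = \frac{V}{L} = \frac{V}{T - V} > 1
  &\iff \frac{V}{T} > \frac{1}{2}
  && \text{balance requires } A > 1 \\
\frac{V}{T} \le \frac{1}{2}
  &\implies EV < 2\,P\,V
  && \text{since } L < V
\end{align*}
```

On this reading, a range that is more than half value hands can win the entire pot, while a range with fewer value hands is capped at just under two pots per value hand.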
How can Hero exploit Villain?
A discrepancy between Villain’s bet sizing and his value-bluff ratio indicates that he can be exploited. It follows that Hero needs two pieces of information to exploit Villain: 1) the number of bluffs that Villain’s bet sizing says he should have and 2) the number of bluffs that Villain actually has. The first number is a matter of arithmetic; the second is an estimate that Hero arrives at by hand reading.
Here the mathematics of exploiting an unbalanced opponent shows why hand reading matters. What is Hero’s strategy when Villain has too many bluffs? Too few bluffs?
Let’s take another look at Villain’s EV function. This is a linear function dependent on the variable C, Hero’s calling frequency. When Villain has too few bluffs, we expect his EV to increase with Hero’s calling frequency; Hero is paying off Villain’s made hands. When Villain has too many bluffs, we expect his EV to decrease with Hero’s calling frequency; Hero is catching Villain’s bluffs. Let’s see if we can find this pattern in Villain’s EV function.
We can see that the slope of the line is determined by the factor multiplying C. This is the m in the slope-intercept form mx + b.
Conditions for a down slope.
Conditions for an up slope.
These results aren’t super surprising when you consider that we already showed that when this form is an equality (as opposed to an inequality), Villain is using an unexploitable bet size. So it follows that when Villain’s bet is too small, Hero should call; and when the bet is too large, Hero should fold.
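The slope conditions can be spelled out as follows. This is a reconstruction under assumptions, not the author's original equations: $V$ (a symbol introduced here) is the number of value hands Villain bets, $L$ the number of bluffs, $A = V/L$, $P$ the pot, $B$ the bet, and $C$ Hero's calling frequency.

```latex
\begin{align*}
EV &= P\,(V + L) + C\,\bigl[V B - L\,(P + B)\bigr],
   \qquad m = V B - L\,(P + B) \\
m < 0 &\iff \frac{L}{V} > \frac{B}{P + B}
       \iff B < \frac{P}{A - 1} \quad (A > 1)
   && \text{down slope: too many bluffs, bet too small} \\
m > 0 &\iff \frac{L}{V} < \frac{B}{P + B}
       \iff B > \frac{P}{A - 1} \quad (A > 1)
   && \text{up slope: too few bluffs, bet too large}
\end{align*}
```

When $A \le 1$ the slope is negative for every positive bet size, which agrees with the earlier observation that no balanced bet exists in that case.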
Here’s an elucidating graph.
The larger the magnitude of the slope, the larger the discrepancy between the number of bluffs Villain actually has and the number of bluffs his bet sizing requires.
Hero’s strategy is as follows. When Villain has too many bluffs, always call. When Villain has too few bluffs, always fold. When Villain is balanced, do whatever you want. You’ll note that this is similar to the exploitative strategy Hero was able to use against a Villain who only used unmixed strategies.
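Hero's rule can be expressed as a small decision function. This is a minimal sketch under assumptions: `V` and `L` are Hero's estimates of Villain's value and bluff combos (`V` and the function names are hypothetical), `P` is the pot, `B` is the bet, and the slope of Villain's EV with respect to Hero's calling frequency is taken to be V*B - L*(P + B).

```python
def ev_slope(V, L, P, B):
    """Slope of Villain's EV as a function of Hero's calling frequency.

    Negative: each extra call costs Villain money (too many bluffs).
    Positive: each extra call pays Villain (too few bluffs).
    """
    return V * B - L * (P + B)

def hero_response(V, L, P, B):
    """Pure-strategy exploit based on the sign of the slope."""
    m = ev_slope(V, L, P, B)
    if m < 0:
        return "call"          # Villain is over-bluffing for this bet size
    if m > 0:
        return "fold"          # Villain is under-bluffing for this bet size
    return "indifferent"       # balanced: any calling frequency has the same EV

if __name__ == "__main__":
    # Pot-sized bet: balance needs one bluff per two value combos.
    for bluffs in (2, 4, 6):
        print(bluffs, hero_response(V=8, L=bluffs, P=100, B=100))
```

The magnitude of `ev_slope` also indicates how much Hero gains per unit of calling frequency by making the adjustment, which ties back to the point about the size of the discrepancy.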
Also note that by adjusting to Villain’s bluff discrepancy, Hero prevents himself from being exploited by Villain’s potentially exploitative strategy. Do we call such a strategy optimal?
What if Hero doesn’t know Villain’s “actual” bluff count? Is there an optimal calling percentage he can fall back to?
No, I don’t think so. This number can be thought of as an “optimal calling % against all possible range compositions and bluff ratios”. Presumably, this number would make Villain indifferent towards bluffing; all of Villain’s value/bluff compositions would result in the same EV. I think the crux of the issue is that in practice Hero isn’t equally likely to face each of Villain’s range compositions and bluff strategies.