How sharp are bookmakers?

Since early January 2019, we have collected odds from 106,204 matchups and 3-balls on the PGA and European Tours. The database spans 11 sportsbooks and includes opening and closing lines for 72-hole matchups, single-round matchups, and single-round 3-balls. In this blog we analyze some of this data and see how our model stacks up against the bookmakers. If you are primarily interested in the assessment of our model's performance, you can skip right to the final section.

In order to make an honest living, a bookmaker sets odds so that the implied probabilities on a given bet add up to more than 100%. The amount by which this sum exceeds 100% is called the overround, or the book's margin. This margin differs across books: Pinnacle generally offers odds with the lowest margin, while books like Bet365 and DraftKings are known as higher-margin books. Rather than summarize differences in the average margin, I'll characterize a book's advantage by calculating the average return from a "blind" betting strategy, i.e. betting on all outcomes. (In golf matchup betting, ties are sometimes treated as void, meaning your stake is returned, while other times ties are a loss. The possibility of void bets makes calculation of the margin ambiguous, and I don't want to include the margin from tie bets in this exercise anyway.) The table below shows the average return (profit per dollar bet, or "ROI") at each listed book from a blind betting strategy; I don't bet on the Tie when it is offered.

book	# of bets	blind return	margin-free return
5dimes	20882	-4.60%	0.05%
bet365	15848	-7.08%	0.06%
betcris	8508	-5.59%	-0.02%
betonline	6798	-4.88%	0.05%
bovada	17877	-6.80%	0.08%
draftkings	11868	-7.01%	-0.34%
pinnacle	11434	-3.31%	-0.01%

A bettor who is betting randomly at Pinnacle can expect to lose a meager 3.3 cents per dollar bet. Conversely, with that same random strategy at Bet365, you will be dishing out just over 7 cents on the dollar! Some of this difference is due to the fact that Bet365 offers 3-ball bets, which tend to be higher-margin bets, while Pinnacle does not. However, even after excluding 3-balls, DraftKings and Bet365 have blind returns below -6%. You may notice that at Pinnacle, for example, the blind betting return is substantially better than the typical margin on their bets (~3.7%). This is because ties are void for all bets offered by Pinnacle, but I still include those ties when calculating the average return [1].

Also included in the table is the so-called "margin-free return" from the same blind betting strategy at each book. This quantity requires some explaining. It will be useful for our purposes in this blog to be able to remove the margin from each bookmaker's odds. The simplest method for doing so is to divide each implied probability by the sum of implied probabilities in the bet. For example, suppose a bookmaker offered odds of 55% and 47% for a matchup where ties are void: the margin-free probabilities would then simply be 55/(55+47) = 53.9% and 46.1%. That is, this method assumes that the bookmaker applied the margin proportionally to each golfer; but of course, we don't know that this is the case. It is now well known that at more extreme odds bookmakers don't allocate margin proportionally; they put proportionally more margin on longshots than on favourites, a phenomenon known as the favourite-longshot bias. It's a fascinating topic but it's one we won't discuss here. Fortunately, we can test whether our method for removing the margin is reasonable. If, after removing the margin, a blind betting strategy yields an average return of zero, this means that the margin-free odds approximate the actual likelihood of the event occurring (that is, events with margin-free odds of 65% do in fact happen 65% of the time). This is a tricky concept to come to terms with; interested readers are referred to the footnotes [2]. Looking at the table, we see that margin-free blind returns at each book are near zero, meaning our method for removing the margin is sufficient here. (Because most odds in this database are between 35% and 65%, it appears that the favourite-longshot bias is not playing a large role.)
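As a minimal sketch of the proportional normalization described above (the function name is ours, purely for illustration):

```python
def remove_margin(implied_probs):
    """Rescale margin-included implied probabilities so they sum to 1.

    Assumes the book spread its margin proportionally across outcomes.
    """
    total = sum(implied_probs)
    return [p / total for p in implied_probs]


# The example from the text: a matchup priced at 55% and 47%, ties void.
offered = [0.55, 0.47]
fair = remove_margin(offered)
overround = sum(offered) - 1.0  # the book's margin on this bet

print([round(p, 3) for p in fair])  # [0.539, 0.461]
```

The same function works unchanged for 3-balls, since it only relies on the implied probabilities summing to more than 1.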

I have one final preliminary before getting to the more interesting parts of this analysis. The next table displays the distribution of possible returns from betting randomly against Pinnacle at several sample sizes. This gives a sense of how much variation in betting performance is possible due to randomness alone. The data in the table are produced from 2000 simulations at each sample size, where each simulation consists of simulating matchups using Pinnacle's margin-free probability (i.e. if Pinnacle gives a golfer a 45% win probability, they will win ~45% of the time across all simulations). Ties are not possible in the simulations which is why the average return is around -3.7%. Columns 2-6 are the percentiles of the distribution and column 7 is the fraction of returns that were above 0%.

N	5th	25th	50th	75th	95th	fraction > 0%
100	-19.83%	-10.53%	-3.57%	3.22%	12.55%	0.364
500	-10.83%	-6.73%	-3.86%	-0.74%	3.61%	0.201
1000	-8.78%	-6.00%	-3.84%	-1.70%	1.60%	0.116
2500	-7.04%	-5.14%	-3.78%	-2.43%	-0.36%	0.034
5000	-5.92%	-4.64%	-3.72%	-2.83%	-1.55%	0.003
10000	-5.44%	-4.40%	-3.69%	-3.04%	-2.07%	0.000

This table is an important one to keep in mind when interpreting the results that follow. At sample sizes below 5k bets randomness can still play a meaningful role, and even with 10k bets 2-3% deviations due to chance alone are possible.
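For readers who want to run this kind of simulation themselves, here is a rough sketch. It is not the code behind the table: the 3.7% margin, the uniform spread of win probabilities between 35% and 65%, and the function name are all assumptions made for illustration, so the output will resemble the table above without matching it exactly.

```python
import random
import statistics


def simulate_blind_roi(n_bets, rng, margin=0.037):
    """ROI from n_bets random 1-unit bets on two-way matchups with no ties."""
    profit = 0.0
    for _ in range(n_bets):
        # Assumed margin-free win probability of the golfer we happen to back.
        p = rng.uniform(0.35, 0.65)
        # Offered decimal odds with the margin applied proportionally.
        decimal_odds = 1.0 / (p * (1.0 + margin))
        # The golfer wins with probability p, as in the margin-free price.
        profit += (decimal_odds - 1.0) if rng.random() < p else -1.0
    return profit / n_bets


rng = random.Random(0)
rois = [simulate_blind_roi(1000, rng) for _ in range(2000)]

median = statistics.median(rois)
share_positive = sum(r > 0 for r in rois) / len(rois)
print(f"median ROI {median:.2%}, fraction > 0%: {share_positive:.3f}")
```

Changing the first argument of `simulate_blind_roi` shows how slowly the distribution tightens as the number of bets grows.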

Which closing line should we trust?

With the boring stuff out of the way, let's start to analyze the quality of bookmakers' odds. To assess bookmaker quality, I'll borrow a method I first encountered in work by Joseph Buchdahl. (I encourage you to read through the entire thing; much of the analysis here is based off of Buchdahl's work.) The idea is that to compare the odds from different bookmakers, we use one bookmaker's odds to bet against the others (and vice versa). As a jumping off point we'll analyze the closing odds from Pinnacle and DraftKings. (It's important to note that our "opening" and "closing" odds are not the official opening and closing lines, but rather our first and last efforts at scraping them.) Consider the two plots below which show the "calibration" of both Pinnacle and DraftKings' margin-free closing odds.

Each data point represents a group of predictions. For example, the point second from the left in the DraftKings plot is comprised of all golfers with margin-free odds between 30 and 35%; their mean was 32.6% (value on x-axis) and this set of golfers ultimately won 32.5% of the time (value on y-axis). The line in each plot represents perfect calibration (x=y). Overall, both sets of odds do a good job of approximating actual frequencies: when DraftKings says a golfer has a 60% chance of winning, they do in fact win about 60% of the time (as evidenced by the observed win rate of a large number of "60% predictions"). The same can be said for Pinnacle's odds. However, calibration plots, used as a means of evaluating the quality of a model, aren't particularly informative. Any halfway-decent model can be made to look good on a plot like this. To see this, suppose that in the 60% bin the golfers can be evenly divided into a subgroup of golfers that won 58% of the time and another subgroup that won 62% of the time. If Pinnacle (on average) assigns to these subgroups of golfers win probabilities of 58% and 62% respectively, while DraftKings (on average) assigns to golfers in both subgroups win probabilities of 60%, we can definitively say that Pinnacle's model is better. But, both models will look the same in these calibration plots.
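The subgroup example can be made concrete with a few lines of arithmetic. The sketch below uses expected log loss as the measure of prediction quality; that scoring rule is our choice for illustration and is not used elsewhere in this article.

```python
import math


def expected_log_loss(true_p, predicted_p):
    """Expected log loss of predicting predicted_p when the true win rate is true_p."""
    return -(true_p * math.log(predicted_p)
             + (1.0 - true_p) * math.log(1.0 - predicted_p))


true_rates = [0.58, 0.62]  # two equal-sized subgroups inside the "60% bin"
sharp = [0.58, 0.62]       # the model that separates the subgroups
flat = [0.60, 0.60]        # the model that prices both subgroups at 60%

# Within the bin, both models have the same mean prediction as the observed rate,
# so they land on the same point of a calibration plot.
observed = sum(true_rates) / 2
mean_sharp = sum(sharp) / 2
mean_flat = sum(flat) / 2

# A proper scoring rule still tells them apart.
loss_sharp = sum(expected_log_loss(t, q) for t, q in zip(true_rates, sharp)) / 2
loss_flat = sum(expected_log_loss(t, q) for t, q in zip(true_rates, flat)) / 2

print(f"bin means: {mean_sharp:.2f} vs {mean_flat:.2f}, observed {observed:.2f}")
print(f"expected log loss: sharp {loss_sharp:.5f}, flat {loss_flat:.5f}")
```

The gap in log loss is small, which is part of the point: differences of this size are invisible on a calibration plot but still matter when one set of odds is used to bet against the other.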

Fortunately a much sharper test of prediction quality can be found by using a bookmaker's odds to bet against other books. That is, we use one book's margin-free probabilities to estimate expected value using other books' offered (i.e. margin-included) odds. For example, consider the 1st Round 3-Ball between Andrew Landry, Jordan Spieth, and Cameron Champ at the 2020 Memorial. DraftKings offered closing odds of 3.25 (or, in American format, +225) on Landry, 2.35 (+135) on Spieth, and 2.8 (+180) on Champ; Bet365 offered respective odds of 3.75 (+275), 2.25 (+125), and 2.62 (+162). To calculate DraftKings' margin-free price for Andrew Landry, we simply divide his implied probability (30.77%) by the sum of all 3 implied probabilities (30.77% + 42.55% + 35.71% = 109.04%), which yields 28.2% (equivalently 3.54 or +254). This margin-free implied probability from DraftKings is then taken as our "true" probability when calculating expected value against Bet365's price; for Landry, this results in an expected value of 5.75%.
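The Landry calculation can be reproduced with the short sketch below. The helper names are ours, and ties (dead heats) are ignored for simplicity, which is an assumption of this illustration.

```python
def implied_probs(decimal_odds):
    """Convert decimal (European) odds to margin-included implied probabilities."""
    return [1.0 / o for o in decimal_odds]


def remove_margin(probs):
    """Proportionally rescale implied probabilities so they sum to 1."""
    total = sum(probs)
    return [p / total for p in probs]


# Closing decimal odds quoted above, in the order Landry, Spieth, Champ.
draftkings = [3.25, 2.35, 2.80]
bet365 = [3.75, 2.25, 2.62]

# DraftKings' margin-free probabilities are treated as the "true" probabilities.
fair = remove_margin(implied_probs(draftkings))

# Expected profit per unit staked at Bet365's offered price.
ev = [p * o - 1.0 for p, o in zip(fair, bet365)]

for name, p, e in zip(["Landry", "Spieth", "Champ"], fair, ev):
    print(f"{name}: fair prob {p:.1%}, EV at Bet365 {e:.2%}")
```

With unrounded probabilities the Landry EV comes out near 5.8%; rounding his probability to 28.2% first gives the 5.75% quoted above.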

Ideally we would perform this exercise book-by-book, but unfortunately the sample sizes simply aren't large enough. Part of the sample size issue stems from the fact that for this exercise we can only use bets that both books offered odds on. The next two tables show the results using Pinnacle's and DraftKings' odds, respectively, to place bets at all other books for various levels of expected value. There are many duplicate bets in these tables: for any given bet there might be several books offering it, and, further, both golfers from each bet will be assigned to an expected value bin. Pinnacle's table is based on a total of 10972 unique bets and DraftKings' is based on 8178 unique bets. We use 1 unit stakes for all bets.

Pinnacle

exp. value bin	# of bets	exp. roi	roi
< -10%	6848	-12.6%	-13.9%
> -10% & < -8%	7267	-8.9%	-9.3%
> -8% & < -6%	13728	-6.9%	-5.4%
> -6% & < -4%	20161	-5%	-5.8%
> -4% & < -2%	16520	-3.1%	-2.5%
> -2% & < 0%	6669	-1.1%	-3.6%
> 0% & < 2%	2901	0.8%	0.6%
> 2%	1902	4.5%	3.7%

DraftKings

exp. value bin	# of bets	exp. roi	roi
< -10%	13393	-13.8%	-9.3%
> -10% & < -8%	7818	-8.9%	-8.5%
> -8% & < -6%	8718	-7%	-5.7%
> -6% & < -4%	9500	-5%	-5%
> -4% & < -2%	8263	-3%	-4%
> -2% & < 0%	6219	-1.1%	-2.5%
> 0% & < 2%	3834	0.9%	-5.3%
> 2%	4989	5.6%	-4.1%

This is bad news for DraftKings and good news for Pinnacle. When Pinnacle says that a bet has an expected value of 4%, the realized return (averaged over many +4% EV bets) is about 4%. In Pinnacle's results there is a (roughly) 1 to 1 relationship between expected and actual ROI, while for DraftKings it is only a slightly positive relationship. For Pinnacle this means that closing odds from other books add little predictive value to their closing odds. Given that Pinnacle's odds move the most from opening to close, this result is not surprising; we would expect that their closing line incorporates all relevant information. Put differently, if you were to optimally predict matchup results using the closing odds from all books, the best predictions would be achieved by putting 95-100% of the weight on Pinnacle. More generally, a simple way to estimate the optimal weighting between the model odds being used to calculate expected value, and the odds being bet against, is to see where actual ROI sits in relation to random (or blind) betting returns and expected ROI. For example, if random returns are -5%, expected returns are 5%, and actual returns were 2%, we would conclude that the correct weighting is 70% on the model and 30% on the bookmaker (ignoring sample size issues).
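The weighting rule of thumb at the end of the paragraph above amounts to a simple linear interpolation. A minimal sketch (the function name is ours):

```python
def implied_model_weight(blind_roi, expected_roi, actual_roi):
    """Where actual ROI sits between blind returns (weight 0) and expected returns (weight 1)."""
    return (actual_roi - blind_roi) / (expected_roi - blind_roi)


# The example from the text: blind -5%, expected +5%, actual +2%.
w = implied_model_weight(blind_roi=-0.05, expected_roi=0.05, actual_roi=0.02)
print(f"weight on model: {w:.0%}, weight on bookmaker: {1 - w:.0%}")  # 70% and 30%
```

Like the rule itself, this ignores sampling noise, so it is only meaningful on large samples.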

For DraftKings, the slight positive relationship between expected and realized return indicates that their closing line adds a bit of predictive value to the other books' closing lines (perhaps not at Pinnacle, but at other books). If you repeat this exercise using closing lines from the other bookmakers in the first table of this article there are not many noteworthy takeaways. BetOnline, 5Dimes and Bovada tend to just copy Pinnacle (or Bet365, in the case of Bovada) when it's possible. Bet365 and Betcris, along with Pinnacle and DraftKings, appear to be the books that actually provide independent pricing; they don't move their lines as much as Pinnacle so we will defer their detailed discussion to the next section. As the above analysis suggests, the closing lines at Betcris and Bet365 do not add much value to Pinnacle's closing line.

To finish this section, here are the betting results that could be achieved with different expected value thresholds using Pinnacle's closing odds to bet against other books' closing odds (which is a feasible betting strategy). For example, the first row indicates there were a total of 4803 bets with an expected value of at least 0% according to Pinnacle's margin-free closing lines, 3435 of which were unique bets. Placing 1 unit on each of these bets yielded a profit of 88 units and a return of 1.8%. Returns begin to decline at higher thresholds, but sample sizes are also smaller.

threshold	# of bets	uniques	exp. roi	profit	roi
0%	4803	3435	2.28%	88.4	1.84%
1%	3040	2332	3.34%	93.3	3.07%
2%	1902	1532	4.47%	71.3	3.75%
3%	1140	955	5.82%	23.1	2.03%
4%	743	633	7.07%	8.1	1.08%
5%	511	446	8.25%	-3.0	-0.59%
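For anyone wanting to build a threshold table like this from their own data, here is a hedged sketch of the bookkeeping. The `Bet` class, its field names, and the four bets are made up purely to exercise the function; void bets and the count of unique bets are left out.

```python
from dataclasses import dataclass


@dataclass
class Bet:
    model_prob: float    # margin-free probability from the book used as the "model"
    decimal_odds: float  # offered (margin-included) odds being bet against
    won: bool

    @property
    def ev(self):
        """Expected profit per unit staked, according to the model probability."""
        return self.model_prob * self.decimal_odds - 1.0

    @property
    def profit(self):
        """Realized profit on a 1-unit stake."""
        return self.decimal_odds - 1.0 if self.won else -1.0


def threshold_row(bets, threshold):
    """Summarize all bets whose expected value is at least `threshold`."""
    taken = [b for b in bets if b.ev >= threshold]
    if not taken:
        return None
    n = len(taken)
    profit = sum(b.profit for b in taken)
    return {
        "threshold": threshold,
        "n_bets": n,
        "exp_roi": sum(b.ev for b in taken) / n,
        "profit": profit,
        "roi": profit / n,
    }


# Hypothetical bets, not drawn from the database.
bets = [
    Bet(0.52, 2.00, True),
    Bet(0.48, 2.05, False),
    Bet(0.55, 1.95, True),
    Bet(0.40, 2.40, False),
]
for t in (0.0, 0.02, 0.05):
    print(threshold_row(bets, t))
```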

Before you close, you have to open

Pinnacle's closing lines are accurate because they move in response to useful information revealing itself in the market. Next we look at how other bookmakers' opening odds influence Pinnacle's odds movement from open to close. Of course with this exercise I'm not suggesting that Pinnacle is directly responding to other bookmakers' odds — I have no idea how their odds are set — but there are many indirect ways bookmakers' odds could affect one another.

To start let's focus on Betcris and Pinnacle. Since we started tracking Betcris' odds in late 2019, we have 3492 bets that were offered by both bookmakers. Of those 3492 bets, there were 314 instances where we only scraped odds at one point in time for at least one of the books; that is, the opening and closing odds were the same. (This typically occurs either due to errors in our scraping scripts or because the odds were posted shortly before the golfers started.) We drop these bets for this analysis, leaving us with a sample size of 3178. Because Pinnacle and Betcris apply different margins to their odds, it will be easiest to focus on the margin-free probabilities for this exercise. Consider the 3rd Round Matchup between Christiaan Bezuidenhout and Sam Burns at the 2020 Arnold Palmer Invitational: Betcris' opening margin-free price for Bezuidenhout was 46.6% while Pinnacle's was 60.5%. Thus Pinnacle's opening price was 29.8% (\( \frac{60.5}{46.6} - 1 \)) higher than Betcris'. The closing margin-free price for Betcris was 49.2% while at Pinnacle it was 50.1%. Therefore Betcris' price, from opening to close, moved 18.9% of the way towards Pinnacle's opening price; Pinnacle's price, from opening to close, moved 74.5% of the way towards Betcris' opening price.

We repeat this calculation for all 3178 bets, focusing only on the odds offered for the first listed player in each matchup to avoid double counting. There were 1401 instances where one of Pinnacle or Betcris showed an advantage of at least 5% (the "advantage" in our example above was 29.8%). In these bets, Pinnacle's price on average moved 54.7% of the way to Betcris' opening probability, while Betcris on average moved just 15.6% of the way to Pinnacle's opener. Performing the same exercise except only with bets where there is a 10% advantage, we have a sample of 464 bets, and respective average moves of 62.3% and 15.0% for Pinnacle and Betcris. Moving the cutoff to 15%, there are 146 bets and average moves of 70.2% and 12.6% for Pinnacle and Betcris. Among other things, these movements can tell us about "closing line value": if you start with a 5% advantage against a book's opening odds, and the closing odds move 50% of the distance towards your model's odds, then closing line value would be 2.5% (using a book's margin-free odds).
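The movement and closing line value calculations above can be sketched as follows, using the rounded Bezuidenhout prices quoted earlier (the function names are ours):

```python
def fraction_moved(own_open, own_close, other_open):
    """Share of the opening gap to the other book's opener that a book's own price covered."""
    return (own_close - own_open) / (other_open - own_open)


def closing_line_value(opening_advantage, fraction_moved_toward_model):
    """Advantage captured if the closing price covers part of the opening gap."""
    return opening_advantage * fraction_moved_toward_model


# Margin-free prices for Bezuidenhout, rounded to one decimal as quoted in the text.
betcris_open, betcris_close = 0.466, 0.492
pinnacle_open, pinnacle_close = 0.605, 0.501

advantage = pinnacle_open / betcris_open - 1.0
betcris_move = fraction_moved(betcris_open, betcris_close, pinnacle_open)
pinnacle_move = fraction_moved(pinnacle_open, pinnacle_close, betcris_open)

print(f"opening advantage: {advantage:.1%}")
print(f"Betcris moved {betcris_move:.1%} of the way, Pinnacle moved {pinnacle_move:.1%}")

# The closing line value example: a 5% edge and a 50% move.
clv = closing_line_value(0.05, 0.50)
print(f"closing line value: {clv:.1%}")
```

With these rounded inputs the moves come out at roughly 18.7% and 74.8%, slightly different from the 18.9% and 74.5% quoted above, which were presumably computed from unrounded prices.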

An important caveat here is that for 72-hole matches (which constitute ~70% of this sample), Pinnacle posts their opening odds before Betcris. Therefore part of Pinnacle's price movement towards Betcris could be occurring before Betcris actually posts their odds [3]. In any case, the takeaway here is that Pinnacle's closing prices are influenced by Betcris' prices to a considerable degree. Also, given that in the previous section we argued that Pinnacle's closing lines are the most accurate in our database, Betcris' closing prices should move towards Pinnacle's more than they currently do (if their only concern when setting odds was accuracy, which it likely is not) [4]. The next table summarizes the results of this exercise for other pairs of books. Also included in this table is the correlation between the opening odds, and closing odds, of the relevant pair (for Bet365/DK we exclude 3-balls in the correlation to make it more comparable to the other pairs of books).

book1	book2	correlation b/w opening odds	correlation b/w closing odds	full sample size	advantage threshold	# of bets	book1 -> book2	book2 -> book1
bet365	pinnacle	0.94	0.94	2480	5%	777	8.1%	8.7%
bet365	pinnacle	0.94	0.94	2480	15%	85	10.3%	14.8%
bet365	betcris	0.82	0.85	836	5%	384	4.6%	12.5%
bet365	betcris	0.82	0.85	836	15%	65	11.0%	11.5%
bet365	draftkings	0.91*	0.94*	3925	5%	1926	5.6%	6.6%
bet365	draftkings	0.91*	0.94*	3925	15%	368	8.8%	10.5%
draftkings	pinnacle	0.94	0.94	4088	5%	1258	9.7%	21.8%
draftkings	pinnacle	0.94	0.94	4088	15%	58	18.4%	25.0%
draftkings	betcris	0.86	0.89	1022	5%	460	6.5%	13.4%
draftkings	betcris	0.86	0.89	1022	15%	55	17.3%	15.6%
betcris	pinnacle	0.83	0.95	3178	5%	1401	15.6%	54.7%
betcris	pinnacle	0.83	0.95	3178	15%	146	12.6%	70.2%

A minor point on the correlations is that each pair of books covers a different sample of bets, and therefore some samples may yield naturally higher or lower correlations. When the opening odds of DraftKings (or Bet365) have large discrepancies with Pinnacle, both books show some movement towards each other by closing. Given that Pinnacle's closing lines are pretty accurate, this tells us two things: 1) opening lines at Bet365 and DraftKings add some value to Pinnacle's opening lines, and 2) opening lines at Bet365 and DraftKings should be moving a lot more than they do (which should surprise nobody). Paradoxically, the correlation between DraftKings' and Pinnacle's closing odds actually declines slightly relative to the correlation between their opening odds; this is possible because Pinnacle's lines move a lot while DraftKings' do not [5]. We observe the same phenomenon with Bet365/Pinnacle.

When I do this analysis for, e.g., 5Dimes vs. Pinnacle, the timing issue becomes apparent: I find that Pinnacle's lines move a lot towards 5Dimes' opening odds (>50%). This is largely driven by the fact that 5Dimes often follows Pinnacle's odds closely (correlation 0.975 between their opening odds), and as a consequence posts their opening odds after Pinnacle. If Pinnacle's odds have already moved before we scrape 5Dimes' opening odds (which mimic Pinnacle's prices at the time), it will give us the impression that Pinnacle's odds "moved towards" 5Dimes' openers.

The previous table is basically sufficient for understanding the quality of the opening lines at these 4 books. If we agree that Pinnacle's closing line is accurate, and combine that with the observation that Pinnacle's movement from opening to closing is influenced by Betcris to a large degree, we would conclude that Betcris' opening lines are adding a lot of predictive value to Pinnacle's opening lines. By the same logic, the openers at DraftKings and Bet365 also add some value to Pinnacle's openers. Finally, we can also say that Pinnacle's opening lines themselves are providing a reasonable amount of predictive value, as evidenced by the fact that they only move 50-70% of the way to Betcris' opening odds (depending on the starting discrepancy). Given that Betcris moves Pinnacle's opening lines more than 50% of the way towards their opening lines, we would expect their openers to be the most accurate. If we did not want to rely on the assumption of the accuracy of Pinnacle's closing line to compare opening line quality, we could repeat the exercise from the previous section and use each book's opening odds to bet against other books' opening odds. However the sample sizes just aren't large enough to draw very sharp conclusions (for example, there are only ~1000 bets that Betcris and Bet365 both offered odds on). Without providing specifics, the main takeaways from the exercise were that Pinnacle and Betcris have opening lines that are a step above DraftKings and Bet365 in terms of accuracy, but also that all 4 books' opening lines add some degree of predictive value to each other.

We'll finish this section as we did the last: the next table shows the betting results using Pinnacle's opening lines to bet against all other books' openers (which is a feasible strategy given Pinnacle almost always opens first). As mentioned earlier, keep in mind that these might not literally be Pinnacle's opening odds (but should always be from within an hour or two of opening).

threshold	# of bets	uniques	exp. roi	profit	roi
0%	6098	3996	3.36%	-66.7	-1.09%
1%	4486	3122	4.41%	13.3	0.30%
2%	3259	2382	5.52%	4.4	0.13%
3%	2294	1725	6.80%	11.6	0.50%
4%	1651	1272	8.09%	10.5	0.64%
5%	1227	955	9.35%	15.8	1.29%
6%	948	755	10.49%	16.4	1.73%
7%	720	572	11.77%	-14.1	-1.96%
8%	568	448	12.93%	-14.7	-2.59%
9%	429	339	14.39%	-8.3	-1.94%
10%	338	270	15.71%	10.6	3.13%

A blind betting strategy returns about -5.5% on this sample of bets, which means that Pinnacle's realized ROI sits just over halfway between random and expected returns. At the higher thresholds this relationship seems to break down a bit, but there are some serious sample size issues. Another point about sample size to consider here (and throughout this article) is that some bets are correlated: for example, a bookmaker might offer 5 tournament matchups involving the same golfer, which decreases the effective sample size.

How good is the Data Golf model?

If a bettor can add predictive value to the bookmaker's odds in any way they stand a chance of making money; at the very least they will not lose as much money as they would if betting randomly. A successful bettor does not require a model that fits the data better than the bookmaker's odds, but rather just one that can improve upon those odds enough to overcome the margin they are up against.

In comparing our model's odds to those of bookmakers, I've developed a greater appreciation for the quality of their predictions. Even a "soft" book like Bet365 has margin-free probabilities that fit the data pretty well; since 2019, their matchup odds have performed similarly to ours. That is, while we can make money betting against Bet365 (evidence to come), they could also make money betting against us (if we added some reasonable margin to our odds and started bookmaking). As alluded to in the previous paragraph, if you have a model that fits the data as well as the model you are betting against (e.g. suppose a 50-50 weighting of the two sets of odds is optimal) you are likely in a very good position. Even if the optimal weighting puts only 15-30% on your model, that can be sufficient depending on the margin and the size of the discrepancies between your model and the offered odds.

Let's take a detailed look at how the Data Golf model performed over the last 2 years. Recall that in 2019 our model did not include course-specific adjustments, while in 2020 our PGA Tour model incorporated both course fit and course history adjustments and our European Tour model incorporated course history. The table below shows our overall betting performance across all books since early 2019 for various expected value thresholds (as before, using a 1-unit stake for all bets). Betting with a 2% threshold means that any bet with an expected value of at least 2% is taken.

threshold	# of bets	uniques	exp. roi	profit	roi
0%	45866	22846	5.16%	-420.0	-0.92%
1%	37610	19348	6.19%	-137.5	-0.37%
2%	30702	16303	7.25%	-86.1	-0.28%
3%	24832	13523	8.37%	-31.2	-0.13%
4%	20136	11203	9.51%	79.7	0.40%
5%	16420	9302	10.65%	237.5	1.45%
6%	13463	7731	11.79%	279.1	2.07%
7%	11045	6489	12.95%	379.0	3.43%
8%	9103	5435	14.12%	402.9	4.43%
9%	7468	4478	15.35%	359.0	4.81%
10%	6155	3730	16.60%	274.4	4.46%
11%	5174	3181	17.76%	202.4	3.91%
12%	4352	2700	18.94%	226.3	5.20%
13%	3697	2304	20.09%	212.5	5.75%
14%	3121	1955	21.31%	190.8	6.11%
15%	2630	1638	22.58%	225.6	8.58%
16%	2241	1410	23.81%	201.8	9.00%
17%	1928	1227	25.00%	173.2	8.98%
18%	1661	1060	26.21%	145.2	8.74%
19%	1433	921	27.44%	113.8	7.94%
20%	1242	796	28.67%	83.6	6.73%

A couple of points to note. First, the blind betting return averaged across all books was -5.9%. When evaluating your betting results this is the benchmark you should compare your performance to; against a 6% margin, breaking even is actually very good. Second, along with the number of bets made we also display the number of unique bets made; if multiple books are offering the same bet it's likely we will have placed a bet at each of them in this exercise.

Betting 1 unit on every positive "edge" according to our model since 2019 results in a cool loss of 420 units. However, with 45866 bets made, this yields a respectable ROI of -0.9%. Given the blind betting strategy returned -5.9%, it's clear we are adding some value to the odds we are betting against. To turn a meaningful profit with our model, an expected value threshold of 5% or higher is required, with profits peaking at the 8% threshold. The fact that ROI stops increasing at the highest expected value thresholds is not too worrisome given the smaller sample sizes (another reminder that 1000 bets is in fact a small sample). Overall, our actual ROI sits slightly under halfway between random betting returns and expected returns, which means that the optimal weighting of our model and the bookmakers is about 45-55 [6]. If our model added no predictive value, actual returns should equal the blind betting return, while if bookmaker odds added no value to our model, actual returns should equal expected returns.

Here are two further ways to break down these results. First, focusing on the 8% threshold, which is where profits were maximized, the breakdown across bet types was: 72-hole matchups (2191 bets; 2.1 units profit; 0.1% ROI), 1-round matchups (3370 bets; 58.2 units profit; 1.7% ROI), and 3-balls (3542 bets; 342.6 units profit; 9.7% ROI). Second, the table below displays the model's performance by bookmaker using the 8% threshold:

book	# of bets	exp. roi	profit	roi
bovada	1998	14.99%	147.3	7.37%
bet365	2240	15.11%	106.6	4.76%
willhill	335	14.62%	50.2	15.00%
unibet	184	16.68%	41.0	22.3%
betonline	185	12.07%	21.1	11.43%
5dimes	1254	12.88%	16.1	1.28%
fanduel	218	14.93%	14.2	6.51%
pinnacle	1040	12.78%	13.0	1.25%
betcris	324	11.91%	0.1	0.03%
sportsbook	321	13.26%	-1.5	-0.45%
draftkings	1004	13.65%	-5.3	-0.52%

In our published betting results from 2019 and 2020, our average return on matchups and 3-balls was about 0.9%. All bets were placed through Bet365 and instead of level staking we were using a version of the Kelly Criterion. In 2019 the EV thresholds used were probably closer to 5%, while halfway through 2020 we switched to higher thresholds (6-7% for matchups; 8-9% for 3-balls). Given the returns from Bet365 in this analysis I can't help but feel we got a bit unlucky with our actual betting results.

Next I restrict our sample of possible bets to the 2020 PGA Tour season, which used the most complete version of our model [7].

threshold	# of bets	uniques	exp. roi	profit	roi
0%	17420	8212	4.82%	163.6	0.94%
1%	14043	6884	5.86%	279.5	1.99%
2%	11228	5755	6.96%	249.3	2.22%
3%	8896	4715	8.13%	270.6	3.04%
4%	7144	3897	9.27%	344.7	4.82%
5%	5665	3156	10.53%	327.7	5.79%
6%	4632	2628	11.66%	331.0	7.15%
7%	3755	2175	12.87%	345.7	9.21%
8%	3090	1828	14.03%	334.3	10.82%
9%	2554	1505	15.19%	326.7	12.79%
10%	2096	1248	16.43%	271.5	12.95%
11%	1777	1068	17.5%	237.5	13.36%
12%	1497	907	18.62%	231.2	15.45%
13%	1267	781	19.74%	183.1	14.45%
14%	1053	659	21.01%	129.1	12.26%
15%	880	557	22.3%	133.5	15.17%
16%	754	475	23.44%	123.8	16.42%
17%	654	412	24.5%	94.0	14.37%
18%	567	361	25.59%	84.1	14.84%
19%	485	312	26.79%	64.6	13.31%
20%	420	270	27.92%	46.4	11.05%

Now betting every "edge" from our model would have turned a profit. This is a useful demonstration of the fine line between betting successfully and unsuccessfully. In our full betting results, optimal predictions put about 45% of the weight on our model's prediction and 55% on the bookmaker's; restricting to only 2020 PGA Tour predictions, the weight on our model increases to 60-65%. This increase was enough to flip our ROI from -1% to +1%, which is the difference between seeing your bankroll dwindle to nothing and seeing it double or triple in size. The reality is that the models that led to these two ROIs are not very different; thousands of bets are required to establish a statistically meaningful difference.

Profits in the 2020 PGA Tour season were maximized at the 7% threshold; the profit breakdown by bet type was: 72-hole matchups (653 bets; 38.7 units profit; 5.9% ROI), 1-round matchups (1362 bets; 70.0 units profit; 5.1% ROI), and 3-balls (1740 bets; 237.0 units profit; 13.6% ROI). Beyond the 5% threshold sample sizes are pretty small, and it looks like we got lucky in the 7-13% range where actual ROI is approaching expected ROI.

Finally, this last table shows how each book's margin-free odds moved from opening to close in relation to our model's probabilities (using the full time period, but excluding 3-balls). The correlations shown here, as before, might be a bit misleading as the samples they cover are different. For example, the correlation between DG probabilities and Bet365 margin-free probabilities is 0.86 in the Pinnacle-Bet365 overlap sample. This difference likely reflects the fact that those matchups were between more unevenly-matched golfers, which naturally produces stronger correlations. As before, matchups where we only managed to scrape odds at a single point in time are dropped. As an example for clarity's sake, the last 2 columns of the first row indicate the following: there were 874 matchup bets at Bet365 where the opening margin-free odds disagreed with our model's odds by at least 15% (that is, the ratio of our odds to Bet365, or the ratio of Bet365's odds to ours, was at least 1.15); on average in those 874 matchups, Bet365's closing odds moved 5.3% of the distance towards our model's odds.

book	correlation w/ opening odds	correlation w/ closing odds	sample size (>5% adv.)	book -> dg (>5% adv.)	sample size (>15% adv.)	book -> dg (>15% adv.)
bet365	0.74	0.76	4285	3.0%	874	5.3%
betcris	0.81	0.82	3464	10.0%	451	11.8%
draftkings	0.86	0.87	4081	3.4%	697	5.7%
pinnacle	0.87	0.90	5269	26.2%	797	30.0%

To conclude this section, here are a few takeaways. First, from our betting performance it's clear that 3-balls have the weakest odds-setting of the bet types we considered; this is at least in part because they are only offered by "softer" bookmakers. Second, matchup odds in golf are solid, regardless of the bookmaker you are considering. I am skeptical that there are many independent (that is, not incorporating market prices in some way) matchup models out there that fit the data better than most bookmakers' odds do. With the high margins (4-6%) typically built into matchup prices, it is therefore not an easy task to be profitable. In our case, it was disheartening at first to see that most bookmakers' prices predict matchup results as well as, or better than, our model does. There is clearly information that bookmakers are incorporating into their opening odds that our model is not (and vice versa). As this final section showed however, it is not necessary to have a perfect model to be successful betting. A model whose purpose is to generate a profitable betting strategy, versus one whose purpose is to set odds for bettors to bet against, will be quite different in how they are best built. One key difference between our model and a bookmaker's margin-free odds is that we generate more extreme predictions (i.e. closer to 0% or 100%); this doesn't really hurt you when betting (provided you use a sensible staking strategy) but it could be very detrimental to a bookmaker. A final takeaway is that our 2020 PGA Tour model performed very well; hopefully with some improvements in the off-season we can maintain or increase that advantage.

[1] For some reason, this is a hill I am willing to die on. If you are placing bets on matchups where ties are void, your expected value on golfer X is no longer equal to the standard formula of \( win\_prob \cdot euro\_odds - 1 \) where \( win\_prob \) is the fraction of matchups won by golfer X out of those that didn't result in a tie. Rather, we need to account for the fact that if the match is tied your bet is returned. Thus, expected value is equal to \( win\_prob\_o \cdot (euro\_odds - 1) - loss\_prob\_o \) where \( win\_prob\_o \) and \( loss\_prob\_o \) are the outright win and loss probabilities, respectively. The effect here is minor; using the second method will move your expected value closer to zero. Intuitively, the difference is simply that I think you should be including the void bets in the calculation of your overall rate of return and expected value. We have a worked example here. [Back to text]
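
To make the two formulas concrete, here is a small sketch comparing them. The probabilities and odds are made up for illustration, and the function names are hypothetical. One way to see why the second method is closer to zero: algebraically, it equals the first method's value multiplied by the probability that the match is not tied.

```python
def ev_conditional(win_prob, euro_odds):
    """Standard formula, using the win probability conditional on no tie."""
    return win_prob * euro_odds - 1


def ev_with_void_ties(win_prob_o, loss_prob_o, euro_odds):
    """Ties-void formula, using outright win and loss probabilities.

    A tie returns the stake, so it contributes zero to expected value.
    """
    return win_prob_o * (euro_odds - 1) - loss_prob_o


# Hypothetical matchup: 48% outright win, 44% outright loss, 8% tie, odds of 2.00.
win_o, loss_o, odds = 0.48, 0.44, 2.00
tie_prob = 1 - win_o - loss_o
win_given_no_tie = win_o / (win_o + loss_o)

ev_standard = ev_conditional(win_given_no_tie, odds)  # about 0.043
ev_void = ev_with_void_ties(win_o, loss_o, odds)      # about 0.040
```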

[2] Suppose you are betting blindly on an event that has a true probability of 30%. If a bookmaker offers fair prices of 30%-70%, then your expected value from betting randomly will be 0. Conversely, suppose the bookmaker offers odds of 35% and 65%. That is, they price the longshot too short and the favourite too long. With these odds your expected value from betting blindly is \( 0.5 \cdot (\frac{0.3}{0.35} - 1) + 0.5 \cdot (\frac{0.7}{0.65} - 1) = -0.033 \). If the prices instead were 25%-75%, your expected value would be 0.067. The practical point here is that the returns from blindly betting on all outcomes will only be zero if the bookmaker's odds are well-calibrated (i.e. they price 30% events at 30%). This doesn't mean their prices are perfect (as in this example), just that they are roughly calibrated. The fact that, after we remove the margin, the average return to blind betting is close to zero at all books indicates that our margin-free odds estimates are reasonable. There is a philosophical point here about why I am assuming that if we remove the margin "correctly" the odds will be well-calibrated. If that bothers you, you can alternatively just think of this exercise as producing the most accurate margin-free probabilities possible from the offered (margin-included) odds. That is, I am giving bookmakers the benefit of the doubt. [Back to text]
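
The arithmetic in this footnote can be checked with a few lines of code. This is just a sketch of the calculation above; the function name is hypothetical, and it assumes one equal-sized bet is placed on each outcome at margin-free prices.

```python
def blind_ev(true_probs, book_probs):
    """Average expected profit per unit staked when betting equally on every outcome.

    A bet priced at implied probability q pays 1 / q per unit staked, so a bet
    with true probability p has expected profit p / q - 1.
    """
    returns = [p / q - 1 for p, q in zip(true_probs, book_probs)]
    return sum(returns) / len(returns)


true_probs = [0.30, 0.70]

ev_fair = blind_ev(true_probs, [0.30, 0.70])        # fair prices
ev_compressed = blind_ev(true_probs, [0.35, 0.65])  # longshot priced too short
ev_stretched = blind_ev(true_probs, [0.25, 0.75])   # longshot priced too long
```

Rounded to three decimals, these reproduce the 0, -0.033, and 0.067 quoted above.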

[3] Anecdotally, it seems like Betcris posts their opening odds independently of Pinnacle. Still, when Pinnacle posts a line that is very different from what Betcris later posts, the market will likely have already started pushing it towards the (eventual) Betcris price before Betcris opens (simply because the Pinnacle opening price is likely bad). If we only consider 1-round matchups (where Pinnacle and Betcris typically post around the same time), the relevant numbers for the 5% advantage sample are 262 bets, a 32% move by Pinnacle towards Betcris, and a 19% move by Betcris towards Pinnacle. [Back to text]

[5] This declining correlation is surprising because we saw that when large discrepancies exist between DK and Pinnacle's openers, they tend to exhibit some movement, on average, in the direction of each other's opening lines by closing. However, consider this example: in bet 1, DK opens at 45% and closes at 45% while Pinnacle opens at 48% and closes at 41%; in bet 2, DK opens at 43% and closes at 43%, while Pinnacle opens at 42% and closes at 50%. This is an extreme example, but the correlation between openers here is 1 while the correlation between closing odds is -1, despite the fact that Pinnacle's odds moved towards DraftKings' opening odds from opening to close. This is not at all representative of what is actually happening in the DK-Pinnacle sample, but it should provide some intuition on how a lower closing line correlation could arise. [Back to text]
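
The two-bet example is easy to verify directly. Below is a minimal sketch using a hand-written Pearson correlation (the helper name is hypothetical); the numbers are the ones from the example above.

```python
from math import sqrt


def pearson(xs, ys):
    """Plain Pearson correlation coefficient for two equal-length lists."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Bet 1 and bet 2 from the example above.
dk_open, dk_close = [0.45, 0.43], [0.45, 0.43]
pin_open, pin_close = [0.48, 0.42], [0.41, 0.50]

corr_open = pearson(dk_open, pin_open)     # openers move together: +1
corr_close = pearson(dk_close, pin_close)  # closers move oppositely: -1
```

With only two bets the correlation can only be +1 or -1, which is part of why this example is so extreme.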

[6] Given this knowledge, we could adjust our model's predictions by moving them towards bookmakers' odds to make the "model's" expected value line up with realized returns. For example, assuming a 45% weight on our model, we would require about 7-8% EV according to our model to actually have "true" positive expected value (random returns are -6%, so true EV at 8% is \( 0.55 \cdot (-6\%) + 0.45 \cdot 8\% = 0.3\% \)). However, we don't do this for practical reasons: we want our website to be internally consistent, and by incorporating the market in some parts of the site but not others we move away from that. [Back to text]
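
As a rough sketch of this adjustment, the blended expected value and the model EV needed to break even can be computed as follows. The function names are hypothetical, and the 45% weight and -6% random return are the illustrative figures from this footnote rather than fitted parameters.

```python
def blended_ev(model_ev, market_ev=-0.06, model_weight=0.45):
    """Weighted average of the market's (random-bet) return and the model's EV."""
    return (1 - model_weight) * market_ev + model_weight * model_ev


def breakeven_model_ev(market_ev=-0.06, model_weight=0.45):
    """Model EV at which the blended EV is exactly zero."""
    return -(1 - model_weight) * market_ev / model_weight


true_ev_at_8 = blended_ev(0.08)   # about 0.003, i.e. 0.3%
threshold = breakeven_model_ev()  # about 0.073, i.e. 7.3%
```

The break-even value of roughly 7.3% is where the "about 7-8%" figure comes from under these assumptions.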