Using our PGA Tour live model archives, which date back to Nov 2018 (with a few missing events since then for various reasons),
339 players have entered the final round with a share of the lead or better. Using our pre-final-round win probabilities,
we predicted 127.8 wins from this group of players (37.6% win rate). In fact, this group won
134 times (39.5%).
Using our European Tour archives, which only go back to Feb 2023, 83 players had a lead or co-lead heading into the final round.
We predicted 26.6 wins from this group (32%) and 29 of them went on to win (34.9%).
So our model has slightly underestimated the actual win rate of 54-hole leaders.
This is interesting because it's one of the few spots where our model still consistently diverges from the betting market.
Unfortunately we don’t archive in-play betting odds, but anyone who follows our model closely
knows that we are consistently higher on leaders than the sportsbooks and exchanges.
This divergence seems to occur most often with unproven players, or players with a history of losing leads.
Here is how the win rates look when grouped by the leader’s pre-tournament skill:
| tour | player type | count | model wins (rate) | actual wins (rate) |
|------|---------------|-------|-------------------|--------------------|
| pga  | elite         | 73    | 38.3 (52.4%)      | 43 (58.9%)         |
| pga  | sub-elite     | 97    | 39.2 (40.4%)      | 36 (37.1%)         |
| pga  | above average | 115   | 36.8 (32%)        | 41 (35.7%)         |
| pga  | below average | 54    | 13.5 (25.1%)      | 14 (25.9%)         |
| euro | elite         | 11    | 4.9 (44.6%)       | 5 (45.5%)          |
| euro | sub-elite     | 22    | 8.4 (38.3%)       | 9 (40.9%)          |
| euro | above average | 34    | 10.2 (30.1%)      | 12 (35.3%)         |
| euro | below average | 16    | 3.1 (19.1%)       | 3 (18.8%)          |
(On the PGA Tour, elite players were defined as those with a skill above +1.6 strokes, sub-elite as between +0.8 and +1.6,
above average as between 0 and +0.8, and below average as below 0. For the European Tour,
the breakpoints are all 0.9 strokes lower.)
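To make the grouping concrete, here's a small helper implementing those breakpoints (a sketch: the function name is mine, skill is taken to be in strokes per round as implied by the 0.9-stroke offset, and how exact-boundary values are assigned is my assumption):

```python
def skill_group(skill, tour="pga"):
    """Bucket a player's pre-tournament skill (strokes per round) into the
    groups used above. European Tour breakpoints sit 0.9 strokes lower than
    the PGA Tour's; boundary-value handling here is an assumption."""
    offset = 0.0 if tour == "pga" else -0.9
    if skill > 1.6 + offset:
        return "elite"
    if skill > 0.8 + offset:
        return "sub-elite"
    if skill > 0.0 + offset:
        return "above average"
    return "below average"
```

For example, a +1.0 player is sub-elite by the PGA Tour cutoffs but elite by the European Tour's.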
Combining across tours, elite players outperformed our model’s win expectation the most, while sub-elite players were the only ones to underperform it.
The sample sizes in these subgroups are pretty small, so I wouldn't read into them too much.
For a useful reference point, the standard deviation in a proportion estimate is
\( \sqrt{\frac{p \cdot (1-p)}{N}} \), so for \( N=339 \) and \( p=0.39 \) the standard error is 2.6%.
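Spelled out in code (plain Python; just the formula above applied to the full PGA Tour sample):

```python
from math import sqrt

def proportion_se(p, n):
    """Standard error of a sample proportion: sqrt(p * (1 - p) / n)."""
    return sqrt(p * (1 - p) / n)

# Full PGA Tour sample of 54-hole leaders/co-leaders:
print(f"{proportion_se(0.39, 339):.1%}")  # -> 2.6%
```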
The model estimates shown above are true out-of-sample predictions in that they were made
before the outcome had occurred. However, to get a bigger sample size,
it’s useful to look back at what our model predictions would have been for tournaments that pre-dated our live model.
Mainly out of laziness, I'm going to use the probabilities that appear on our
pressure page, which go back to 2004
for the PGA and European Tours. Importantly, these probabilities
don’t include an adjustment for pressure (i.e. a player's position on the leaderboard),
and as a result they are, on average, ~3% higher for 54-hole leaders than the probabilities from our live model (which does
account for pressure).
These probabilities also differ for various idiosyncratic reasons (the live model accounts for weather, and updates skill in a more complex
way during the event), but these should even out in a sufficiently large sample. The upshot is that we'd expect the average live model probability
to be about 3%
lower than the "model" probabilities shown below. First, here are the overall predicted and actual win rates by tour since 2004:
| tour | count | model wins (rate) | actual wins (rate) |
|------|-------|-------------------|--------------------|
| pga  | 1231  | 492.1 (40%)       | 464 (37.7%)        |
| euro | 996   | 389.8 (39.1%)     | 390 (39.1%)        |
The actual win rate on the PGAT is 2.3% lower than our pressure-free model probabilities, and therefore roughly in line
with what we would have expected our live model to project. Shockingly, 54-hole leaders on the European Tour have won as often as
the pressure-free model predicted. This is a bit puzzling because when looking at strokes-gained relative to expectation,
leaders on the European Tour underperform significantly. However, this underperformance is not as large as on the PGA Tour, and the chasers also underperform more
on the European Tour versus the PGAT. With a sample size of 1000 the standard errors of these proportion estimates are about 1.5%, meaning
that a true win rate within +/- 3% of what we've observed is still within the realm of possibility.
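If you want to put a rough significance number on gaps like these, a normal-approximation z-score on the win count is a quick check (a sketch: the expected-wins total is really a sum of unequal per-player probabilities, i.e. Poisson-binomial, so treating the sample as a single binomial at the average rate is only an approximation):

```python
from math import sqrt

def win_count_z(actual_wins, expected_wins, n):
    """Approximate z-score of the observed win count versus the model's
    expectation, treating the sample as binomial with p = expected_wins / n."""
    p = expected_wins / n
    return (actual_wins - expected_wins) / sqrt(n * p * (1 - p))

# PGA Tour since 2004: 464 actual wins vs 492.1 expected from 1231 leaders
z = win_count_z(464, 492.1, 1231)  # roughly -1.6
```

A z around -1.6 echoes the point above: with these sample sizes, gaps of a couple percentage points can't be cleanly distinguished from noise.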
Here are the win rates divided up by the leader's skill as before:
| tour | player type | count | model wins (rate) | actual wins (rate) |
|------|---------------|-------|-------------------|--------------------|
| pga  | elite         | 233   | 134.8 (57.8%)     | 142 (60.9%)        |
| pga  | sub-elite     | 377   | 160.9 (42.7%)     | 139 (36.9%)        |
| pga  | above average | 397   | 134.7 (33.9%)     | 127 (32%)          |
| pga  | below average | 224   | 61.8 (27.6%)      | 56 (25%)           |
| euro | elite         | 202   | 106.9 (52.9%)     | 114 (56.4%)        |
| euro | sub-elite     | 287   | 119.7 (41.7%)     | 114 (39.7%)        |
| euro | above average | 331   | 116.4 (35.2%)     | 117 (35.3%)        |
| euro | below average | 176   | 46.7 (26.5%)      | 45 (25.6%)         |
Again, remember that our hypothetical live model numbers would be expected to be 3% lower than all the model figures here.
On both tours, elite players have won more than expected by a substantial margin. The only group that was more than 3%
lower than the pressure-free prediction was sub-elite players on the PGA Tour, which is probably noise (i.e. I don't see a good reason
why they would underperform more than average players). Seeing this makes me think we should consider
allowing pressure effects to vary by the leader's skill. Of course, any analysis of professional golf in the 2000s is incomplete without
a Tiger adjustment: sans Tiger, the elite win rates on the PGA Tour are 56.4% (predicted) and 57.2% (actual), meaning that Tiger's
ability to close significantly inflated the elite win rate and accounts for much of that group's outperformance.
Another interesting dimension to examine is how win rates varied with the size of the 54-hole lead:
| tour | lead (strokes) | count | model wins (rate) | actual wins (rate) |
|------|----------------|-------|-------------------|--------------------|
| pga  | 0 (co-lead)    | 527   | 135.3 (25.7%)     | 132 (25%)          |
| pga  | 1              | 303   | 109.6 (36.2%)     | 101 (33.3%)        |
| pga  | 2              | 165   | 78.2 (47.4%)      | 74 (44.8%)         |
| pga  | 3              | 107   | 65.6 (61.3%)      | 54 (50.5%)         |
| pga  | 4+             | 129   | 103.4 (80.2%)     | 103 (79.8%)        |
| euro | 0 (co-lead)    | 435   | 111.9 (25.7%)     | 106 (24.4%)        |
| euro | 1              | 253   | 93.4 (36.9%)      | 102 (40.3%)        |
| euro | 2              | 132   | 61.8 (46.8%)      | 58 (43.9%)         |
| euro | 3              | 102   | 62.7 (61.5%)      | 66 (64.7%)         |
| euro | 4+             | 74    | 59.9 (80.9%)      | 58 (78.4%)         |
The PGA Tour numbers tell a pretty consistent story, with larger leads usually resulting in fewer wins relative to expectation (the 3-stroke lead
numbers for the PGAT are a bit crazy, but still within 2 standard errors of the live model prediction). The European Tour numbers don't
show the same pattern.
It would be nice to have an archive of in-play betting odds to complete this analysis, but
it seems the betting market must have underestimated 54-hole leaders' win probabilities in recent years
(market odds on leaders are normally lower than our model's, and our model has
been slightly lower than the observed win rates).
The market is generally in line with or maybe even a bit higher than our model when a top player is leading, so that subset of odds might be
more accurate. It is important to remember that even when using 20 years of data, sample sizes are still relatively small:
as mentioned earlier, the win rate for 1000 golfers has a standard error of roughly 1.5%.
Extra: Here are the predicted and actual win rates for 36-hole leaders from our live model (we haven't added probabilities back to 2004
for 36-hole leaders yet):
| tour | count | model wins (rate) | actual wins (rate) |
|------|-------|-------------------|--------------------|
| pga  | 357   | 82.0 (23%)        | 82 (23%)           |
| euro | 80    | 17.7 (22.1%)      | 20 (25%)           |
And here is the breakdown by player skill:
| tour | player type | count | model wins (rate) | actual wins (rate) |
|------|---------------|-------|-------------------|--------------------|
| pga  | elite         | 60    | 22.8 (38%)        | 28 (46.7%)         |
| pga  | sub-elite     | 113   | 28.6 (25.3%)      | 24 (21.2%)         |
| pga  | above average | 118   | 22.2 (18.8%)      | 22 (18.6%)         |
| pga  | below average | 66    | 8.3 (12.6%)       | 8 (12.1%)          |
| euro | elite         | 11    | 4 (36.7%)         | 4 (36.4%)          |
| euro | sub-elite     | 22    | 6.3 (28.6%)       | 4 (18.2%)          |
| euro | above average | 26    | 4.9 (18.9%)       | 8 (30.8%)          |
| euro | below average | 21    | 2.4 (11.6%)       | 4 (19%)            |