At the 2019 PGA Championship at Bethpage Black, much was made of the Black course's length, narrow fairways,
and penal rough. The consensus opinion was that this setup heavily favoured longer hitters. That got us thinking:
do certain PGA Tour courses favour bombers, while others favour shorter, more accurate players?
And, if the answer to that question is yes,
what are the characteristics of these courses? The goal of this blog post is to make some headway on those two questions.
To evaluate whether a given course favours longer hitters, or more accurate players, there are two main
issues to overcome. First, it is simply a fact that the best players on the PGA Tour are also
some of the longest off the tee. Therefore, even at golf courses that don't favour those with power any more than
a typical PGA Tour course, we would expect long hitters to play well.
To get around this, a golfer's performance must be compared
relative to their baseline ability.
That is, we should compare Dustin Johnson's performance at Bethpage Black to
his performance at all other courses on the PGA Tour. If Bethpage
does in fact favour golfers with above-average power, then we should observe longer players
performing above their baselines there. Second, as with any golf analysis, the randomness inherent to golf
scores needs to be contended with. It simply is not possible to draw strong
conclusions from any single golf tournament. Even if a course setup is truly
favourable to power players, these players can easily underperform
their baselines in a single week just due to randomness.
To alleviate this concern, we will do two things. First, every season from 2005-2019 will be analyzed; if a given course
plays as favourable to long hitters in most or all seasons, then this is more compelling
evidence that something real is going on. Second,
we can restrict our analysis to just strokes-gained off-the-tee (OTT) and strokes-gained approach (APP); this
removes the noisiest parts of golf scores (around-the-green play and putting) and keeps the components of performance
that should be of most interest in an analysis of driving distance and accuracy [1].
On to the analysis! For every stroke-play PGA Tour event from 2005-2019, we calculate each golfer's average driving
distance (in units of yards relative to the field average) and average driving accuracy (in units of % fairways hit
relative to field average) over the previous 12 months. Using
our statistical model of golf scores,
we have an estimate of a player's baseline ability at each point in time (roughly, think of
this as a golfer's adjusted strokes-gained at all tournaments they have played in a 2-year window).
Then, we simply correlate golfers' performance relative-to-baseline at each course with their 12-month driving distance
average, or driving accuracy average, at the time of the tournament. This will tell us, for each course,
to what degree golfers with better-than-average distance, or accuracy, outperformed
their baseline skill levels in a given week. As mentioned above, performance in SG:OTT and
SG:APP relative-to-baseline is also analyzed.
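To make the calculation concrete, here is a minimal sketch for a single course in a single year. The five-golfer field, its numbers, and the helper `ols_slope` are all hypothetical and chosen only for illustration; the baseline values stand in for the output of the skill model, which is not reproduced here.

```python
def ols_slope(x, y):
    """Least-squares slope of y on x (simple regression with an intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


# Hypothetical field at one course in one year. Each row is:
# (12-month driving distance relative to field average, in yards,
#  strokes-gained per round this week,
#  baseline strokes-gained per round from the skill model)
field = [
    (12.0, 1.5, 1.2),
    (5.0, 0.4, 0.3),
    (-3.0, -0.2, -0.1),
    (-8.0, -0.9, -0.6),
    (-6.0, 0.1, 0.3),
]

distance = [row[0] for row in field]
vs_baseline = [row[1] - row[2] for row in field]  # performance relative to baseline

# Slope is in strokes per round per yard; rescale to a 10-yard difference.
slope_per_yard = ols_slope(distance, vs_baseline)
strokes_per_10_yards = 10 * slope_per_yard
print(f"Strokes above baseline per 10 yards: {strokes_per_10_yards:.2f}")
```

Swapping the distance column for relative driving accuracy, or the total strokes-gained column for SG:OTT or SG:APP, gives the other versions of the same calculation.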
The plot below summarizes the results. (You'll need to hover over the data to gain any
insight.) The larger dots are course-specific
averages for all seasons from 2005-2019, while the smaller dots are for
specific years. Each "column" of dots, which is highlighted on hover, is for a single course.
For the driving distance plot (note the clickable link at the top to toggle distance/accuracy), each dot is interpreted
as
the number of strokes per round a golfer who is 10 yards longer
than the PGA Tour average would be expected to perform above their baseline.
For driving accuracy,
the interpretation is the number of strokes above baseline per round for a golfer who is
5% more accurate than the PGA Tour average. Ten yards and five percent are roughly
the standard deviations of the season-long distributions of
driving distance and driving accuracy on the PGA Tour, which means these estimates can be
interpreted as the difference between the average golfer and the 85th percentile golfer in each skill (distance or accuracy).
For interested readers,
these numbers are coefficients from simple linear regressions
[2].
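The "85th percentile" reading rests on treating distance and accuracy as roughly normally distributed across golfers, which is an assumption made here rather than something verified. Under that assumption, a quick check with Python's standard library shows that one standard deviation above average sits at about the 84th percentile:

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1

# Share of golfers below someone who is one standard deviation above average.
percentile_at_one_sd = standard_normal.cdf(1.0)

# Number of standard deviations above average needed to reach the 85th percentile.
sd_units_for_85th = standard_normal.inv_cdf(0.85)

print(f"{percentile_at_one_sd:.3f} of golfers are below +1 SD")
print(f"The 85th percentile is {sd_units_for_85th:.2f} SD above average")
```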
[Interactive plot: strokes per round above baseline for longer (or more accurate) golfers, by course and year. Toggle: distance/accuracy. SG metric: Total, APP, OTT.]
To ensure we are on the same page, let's discuss the 4th data point from the left, which belongs
to Augusta National. Since 2005, Augusta has favoured longer hitters: a golfer who
is 10 yards longer than the tour average would be expected to perform 0.19 strokes
above their baseline at The Masters. Conversely, a golfer who is 5% more accurate than
the PGA Tour average would be expected to perform 0.13 strokes below their baseline
at Augusta National. You'll notice that, in general, courses that favour longer hitters
tend not to favour accurate players; this is not mechanical, but simply
reflects the fact that most long hitters on tour are not accurate, and vice versa.
This is a critical point: you should think of this analysis as looking at how two types of players — those who
hit it far, and those who hit it accurately —
perform at different courses. Evidently, the same player could be both long and accurate; but in general this
is not the case on the PGA Tour, which makes this analysis relatively clean.
For a more detailed discussion of this point, visit
[3].
The purpose of including the year-specific data points is to highlight the variance
across years at a given course. For example, Harbour Town Golf Links has yielded a negative
coefficient for driving distance every single year from 2005-2019. This makes a convincing case
that Harbour Town does in fact favour shorter hitters relative to other courses.
Conversely, Muirfield Village also had a negative overall average from 2005-2019, but some of
the specific years yielded positive correlations with driving distance; this leads us
to put a little less weight on this specific negative estimate. It is easy to see the impact
of statistical noise on the estimates by examining courses that host
tournaments with limited field sizes: for example, East Lake GC and Kapalua.
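A small simulation illustrates why limited fields give noisier estimates. Everything in it is an assumption made for illustration: the course has no true effect of distance, distance has a standard deviation of 10 yards, performance relative to baseline has a standard deviation of 2 strokes per round, and the two field sizes are 30 and 150. The helper names are hypothetical.

```python
import random
import statistics


def ols_slope(x, y):
    """Least-squares slope of y on x (simple regression with an intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


def simulate_estimates(field_size, n_events=300, seed=0):
    """Per-10-yard estimates from repeated simulated events with no true effect."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_events):
        distance = [rng.gauss(0.0, 10.0) for _ in range(field_size)]
        vs_baseline = [rng.gauss(0.0, 2.0) for _ in range(field_size)]  # pure noise
        estimates.append(10 * ols_slope(distance, vs_baseline))
    return estimates


small_field_sd = statistics.stdev(simulate_estimates(field_size=30))
full_field_sd = statistics.stdev(simulate_estimates(field_size=150))

print(f"Spread of estimates, 30-player field:  {small_field_sd:.2f}")
print(f"Spread of estimates, 150-player field: {full_field_sd:.2f}")
```

Even though the true effect is zero in every simulated event, the estimates from the small field scatter much more widely around zero, which is the pattern to keep in mind when reading the year-specific dots for limited-field events.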
The reason for performing this analysis using SG:OTT and SG:APP
is to better understand how the correlation with total strokes-gained arises (this will be discussed more below).
A useful statistical point to note is that the coefficient values from the analysis using each strokes-gained category
add up to the coefficient from the analysis using total strokes-gained.
For example, at Harbour Town the coefficient was -0.23 using total strokes-gained and -0.20 using
OTT+APP; this means that the coefficient on ARG+PUTT (around-the-green plus putting) must be slightly negative (-0.03).
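The additivity holds because a least-squares slope is linear in the outcome variable: if total strokes-gained is the sum of its categories, the slope on the total is the sum of the category slopes. The sketch below checks this on made-up numbers (the data and the helper `ols_slope` are hypothetical) and then repeats the Harbour Town arithmetic quoted above.

```python
def ols_slope(x, y):
    """Least-squares slope of y on x (simple regression with an intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


# Hypothetical golfers: relative driving distance (yards) and performance
# relative to baseline in each strokes-gained category (strokes per round).
distance = [12.0, 5.0, -3.0, -8.0, -6.0]
ott = [0.20, 0.05, -0.10, -0.15, 0.00]
app = [0.10, 0.00, 0.05, -0.20, -0.10]
arg_putt = [-0.05, 0.10, 0.00, 0.05, -0.15]
total = [o + a + s for o, a, s in zip(ott, app, arg_putt)]

slope_total = ols_slope(distance, total)
slope_sum = (
    ols_slope(distance, ott)
    + ols_slope(distance, app)
    + ols_slope(distance, arg_putt)
)
print(abs(slope_total - slope_sum) < 1e-12)  # the two agree up to rounding error

# Harbour Town figures from the text: total minus OTT+APP gives ARG+PUTT.
implied_arg_putt = -0.23 - (-0.20)
print(f"Implied ARG+PUTT coefficient: {implied_arg_putt:.2f}")
```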
To focus ideas,
the remainder of the article will centre on the driving distance analysis.
As was alluded to above, it's important to remember that these are just
correlations. A
correlation between driving distance and performance could arise for many reasons. Let's consider a few
explanations.
The most straightforward explanation for a negative correlation between the length a player possesses
and their performance-to-baseline
would be a course setup that limits a golfer's ability to hit driver — this directly
reduces the number of strokes longer hitters can expect to gain off the tee (e.g. Harbour Town).
On the flip side,
course setups that require driver to be hit on every hole should achieve the opposite.
Golf courses that are especially penal off the tee could
reduce the advantage of unreined power, resulting
in a negative correlation between a golfer's performance-to-baseline
at that course and their length off the tee (think Le Golf National). Or,
as Andy Johnson of The Fried Egg has described,
courses that are
less penal could, somewhat counterintuitively,
also help shorter hitters by allowing both short and long hitters to reach greens
after "missed" drives. According to this explanation, longer hitters
will be favoured at long, narrow courses with deep rough, such as Bethpage Black.
This is due to the fact that all players, both long and short, end up hitting
many approach shots from the rough, and therefore being able to hit short irons, instead of long irons,
from that
deep rough is a big advantage. Importantly, this explanation would specifically predict
an advantage for long hitters on approach shots; if we observe
that at courses such as Bethpage Black longer hitters are performing above their baseline
SG:APP, this would support Andy's hypothesis. However, it's also likely
that bombers would perform above their baseline SG:APP at long courses with wide fairways, simply
because the advantage of having raw power is likely to be greater with longer approaches than shorter ones.
There are also some spurious (i.e. uninteresting) ways that correlations between
golfers' driving distance and performance-to-baseline could arise.
As mentioned previously, golfers who have above-average driving distance also
tend to have other features in their games. Namely, they tend to have worse
short games, and better approach games, on average. Therefore, any course that emphasizes short game
more than other courses will hurt longer hitters, while courses that emphasize
approach shots will benefit them
[4]. We would say that the correlation driven by
short-game performance is an
uninteresting explanation because it is not being "caused"
by driving distance, but rather is just a byproduct of the fact that longer hitters tend to
be worse putters. A correlation driven by approach performance is more interesting, as
having more length likely improves a player's approach game, all else equal.
Because longer players tend to be better approach players,
it will be hard to distinguish between a course that
is simply a "second shot golf course" (i.e. emphasizes approach play), and one that actually gives
a special advantage to those with power on approach shots. Both these
explanations predict long hitters will outperform their baseline SG:APP.
Of course, all these explanations can be ignored when
only OTT performance is considered.
To tie all these explanations together, we can say, perhaps trivially, that longer hitters will outperform
their baselines at courses that emphasize the specific skills they possess more than
other courses do.
To start to understand which explanations are consistent with the data,
let's now look at which course
characteristics correlate with the over-performance or under-performance of longer hitters.
The plot below considers two simple characteristics: course length and average fairway width.
The colouring and opacity of the dots indicate whether, and to what degree, each
course saw players with above-average driving distance underperforming
or overperforming their baseline skill levels.
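[Interactive plot: course length versus average fairway width, with dots coloured by long hitters' performance relative to baseline. SG metric: Total, APP, OTT.]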
It can be a bit more difficult to see trends in this plot. The main noticeable
pattern is that most of the data points for longer courses are varying shades of green; that is,
longer players outperform their baselines at (surprise!) longer courses.
This is true for off-the-tee performance, and to a slightly lesser degree for approach performance,
and generally holds regardless of whether the
course has wide fairways or not.
With respect to fairway width, there are actually no strong
patterns to speak of. Bethpage Black, which we've highlighted throughout this article, has seen
long hitters outperform their baselines significantly on both approach and off-the-tee performance.
At Firestone CC, which is also a long course with narrow fairways and deep rough, longer players
have performed above their baselines overall, but below baseline on approach shots.
Torrey Pines, Congressional, and Quail
Hollow also fit the mold as long, narrow, and penal setups, and they all see overperformance
from long hitters in SG:OTT and SG:APP. (It is useful to refer back to the first plot
in this blog to see how much these numbers varied across years, in order to get a sense of the
variability underlying each overall estimate.) As alluded to earlier, this pattern could simply be
due to the length of these courses, and not necessarily due to longer hitters benefitting from their
superior ability to hack it out of the rough. At the two widest courses in recent years on the PGA Tour,
Kapalua and Trinity Forest, longer players performed above baseline off the tee, and were pretty much
right at their baselines on approach shots.
Statistically, when a regression is performed, there is no
relationship between bombers' advantage and fairway width after
controlling for the yardage of the course. In other words, if you vary fairway width
while holding the distance of the course constant, there does not seem to be any advantage
or disadvantage gained or lost by longer players. (For example, compare Bethpage and
Congressional with Doral and Bay Hill.)
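For readers who want to see what "controlling for yardage" means mechanically, here is a minimal sketch of a two-predictor regression. The six courses and their numbers are invented, and the distance coefficients are deliberately constructed to depend only on yardage, so the fitted fairway-width slope comes out at essentially zero. The function `two_predictor_ols` is a hypothetical helper, not the code behind the results described here.

```python
def two_predictor_ols(x1, x2, y):
    """Slopes (b1, b2) from least-squares regression of y on x1 and x2 with an intercept."""
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    d1 = [v - m1 for v in x1]
    d2 = [v - m2 for v in x2]
    dy = [v - my for v in y]
    s11 = sum(a * a for a in d1)
    s22 = sum(b * b for b in d2)
    s12 = sum(a * b for a, b in zip(d1, d2))
    s1y = sum(a * c for a, c in zip(d1, dy))
    s2y = sum(b * c for b, c in zip(d2, dy))
    # Solve the 2x2 normal equations for the centred variables.
    det = s11 * s22 - s12 * s12
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    return b1, b2


# Hypothetical courses: yardage, average fairway width (yards), and the
# course-level driving distance coefficient (strokes per round per 10 yards).
yardage = [7000.0, 7100.0, 7250.0, 7400.0, 7450.0, 7300.0]
fairway_width = [32.0, 26.0, 35.0, 27.0, 33.0, 29.0]
# Constructed so that the bombers' advantage depends on yardage alone.
distance_coef = [0.001 * (y - 7200.0) for y in yardage]

b_yardage, b_width = two_predictor_ols(yardage, fairway_width, distance_coef)
print(f"Effect per 100 yards of course length: {100 * b_yardage:.2f}")
print(f"Effect per yard of fairway width:      {b_width:.2f}")
```

With real data the fairway-width slope would not be exactly zero; the question is whether it is distinguishable from zero given the noise in the course-level estimates.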
In this blog we have tried to estimate the degree of underperformance
or overperformance of different player types at each PGA Tour course.
While most of the results align with intuition — longer players mainly gain an advantage
at longer courses — there
were a few surprises. Muirfield Village GC is the longest course in the data that yielded
a negative correlation between driving distance and performance-to-baseline. Club de Golf Chapultepec,
site of the WGC-Mexico, is a course that plays much shorter than
its yardage of 7,300 yards due to its high-altitude location, and yet yielded one of the strongest relationships
between driving distance and performance. While it is possible to provide a story
to rationalize each result from this analysis, it's important to remember that statistical noise
is still playing a large role. These results should be taken as suggestive, and far from the last
word on which factors in course setup allow power players
to thrive.