June 10, 2019
Where do power players succeed on the PGA Tour?
At the 2019 PGA Championship at Bethpage Black, much was made of the Black course's length, narrow fairways, and penal rough. The consensus opinion was that this setup heavily favoured longer hitters. That got us thinking: do certain PGA Tour courses favour bombers, while others favour shorter, more accurate players? And, if the answer to that question is yes, what are the characteristics of these courses? The goal of this blog post is to make some headway on those two questions.

To evaluate whether a given course favours longer hitters, or more accurate players, there are two main issues to overcome. First, it is simply a fact that the best players on the PGA Tour are also some of the longest off the tee. Therefore, even at golf courses that don't favour those with power any more than a typical PGA Tour course, we would expect long hitters to play well. To get around this, a golfer's performance must be compared to their baseline ability. That is, we should compare Dustin Johnson's performance at Bethpage Black to his performance at all other courses on the PGA Tour. If Bethpage does in fact favour golfers with above-average power, then we should observe longer players performing above their baselines there. Second, as with any golf analysis, the randomness inherent to golf scores needs to be contended with. It simply is not possible to draw strong conclusions from any single golf tournament. Even if a course setup is truly favourable to power players, it is entirely possible for these players to underperform their baselines in a single week just due to randomness. To alleviate this concern, we will do two things. First, every season from 2005-2019 will be analyzed; if a given course plays as favourable to long hitters in most or all seasons, then this is more compelling evidence that something real is going on. Second, we can restrict our analysis to just strokes-gained off-the-tee (OTT) and strokes-gained approach (APP); this will remove the noisiest part of golf scores, and it is also the component of performance that should be of interest in an analysis of driving distance and accuracy [1].

On to the analysis! For every stroke-play PGA Tour event from 2005-present, we calculate each golfer's average driving distance (in units of yards relative to the field average) and average driving accuracy (in units of % fairways hit relative to field average) over the previous 12 months. Using our statistical model of golf scores, we have an estimate of a player's baseline ability at each point in time (roughly, think of this as a golfer's adjusted strokes-gained at all tournaments they have played in a 2-year window). Then, we simply correlate golfers' performance relative-to-baseline at each course with their 12-month driving distance average, or driving accuracy average, at the time of the tournament. This will tell us, for each course, to what degree golfers with better-than-average distance, or accuracy, outperformed their baseline skill levels in a given week. As mentioned above, performance in SG:OTT and SG:APP relative-to-baseline is also analyzed.
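
For readers who want to see the mechanics, here is a minimal sketch of the calculation described above for a single tournament. It is not our production code: the function names and the synthetic field are made up for illustration, and the "true effect" used to generate the data is an arbitrary assumption.

```python
import random


def ols_slope(x, y):
    """Slope b of the least-squares line y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


def distance_coefficient(dist_vs_field, sg_over_baseline, per_yards=10.0):
    """Strokes above baseline per round for each `per_yards` of extra distance.

    dist_vs_field: 12-month driving distance, in yards relative to the field average.
    sg_over_baseline: strokes-gained per round minus the golfer's baseline.
    """
    return ols_slope(dist_vs_field, sg_over_baseline) * per_yards


# Synthetic field of 150 golfers (all numbers are invented for illustration).
random.seed(0)
dist = [random.gauss(0.0, 10.0) for _ in range(150)]
assumed_effect_per_yard = 0.02  # hypothetical, not an estimate from real data
sg = [assumed_effect_per_yard * d + random.gauss(0.0, 2.0) for d in dist]

print(f"SG/round per 10 yards: {distance_coefficient(dist, sg):+.2f}")
```

With week-to-week noise this large relative to the effect, a single-event estimate can land well away from the assumed value, which is why the analysis is repeated for every season.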

The plot below summarizes the results. (You'll need to hover over the data to gain any insight.) The larger dots are course-specific averages for all seasons from 2005-2019, while the smaller dots are for specific years. Each "column" of dots, which is highlighted on hover, is for a single course. For the driving distance plot (note the clickable link at the top to toggle distance/accuracy), each dot is interpreted as the number of strokes per round a golfer who is 10 yards longer than the PGA Tour average would be expected to perform above their baseline. For driving accuracy, the interpretation is the number of strokes above baseline per round for a golfer who is 5% more accurate than the PGA Tour average. Ten yards and five percent are roughly the standard deviations of the season-long distributions of driving distance and driving accuracy on the PGA Tour, which means these estimates can be interpreted as the difference between the average golfer and the 85th percentile golfer in each skill (distance or accuracy). For interested readers, these numbers are coefficients from simple linear regressions [2].
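
To make the interpretation concrete, the sketch below shows two things: where a golfer who is one standard deviation above average sits in the distribution (assuming, as an approximation, that the skill is normally distributed), and how a plotted coefficient scales to other distance gaps under the linear model. The coefficient value used here is hypothetical.

```python
from statistics import NormalDist


def percentile_of_z(z):
    """Percentile of a golfer who is z standard deviations above the mean,
    assuming the skill is normally distributed."""
    return 100.0 * NormalDist().cdf(z)


def expected_gain(coef, unit, advantage):
    """Strokes above baseline per round implied by a linear coefficient.

    coef: strokes per round per `unit` of skill (e.g. per 10 yards).
    advantage: the golfer's skill relative to the tour average, in the same units.
    """
    return coef * advantage / unit


# One standard deviation above average is in the mid-80s by percentile.
print(f"{percentile_of_z(1.0):.1f}")

# Hypothetical course with a coefficient of 0.10 strokes per 10 yards:
# a golfer 20 yards longer than average would be expected to gain twice that.
print(f"{expected_gain(0.10, 10.0, 20.0):.2f}")
```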

Where do bombers play well?
SG/round per 10 yards of driving distance
Notes: Plotted are coefficients from a regression of a golfer's strokes-gained over baseline on their historical driving distance average (or driving accuracy average), for selected PGA Tour courses from 2005-present. Larger dots are the average of the coefficients from specific years (i.e. the smaller dots) for each course. The toggle at the top allows you to see this analysis done for total strokes-gained, strokes-gained approach, and strokes-gained off-the-tee (all relative-to-baseline). For a golfer to be included in the analysis, they had to have played at least 30 PGA Tour rounds over the previous 12 months. For a course to be included, at least 3 stroke-play events had to be played there since 2005.
To ensure we are on the same page, let's discuss the 4th data point from the left, which belongs to Augusta National. Since 2005, Augusta has favoured longer hitters: a golfer who is 10 yards longer than the tour average would be expected to perform 0.19 strokes above their baseline at The Masters. Conversely, a golfer who is 5% more accurate than the PGA Tour average would be expected to perform 0.13 strokes below their baseline at Augusta National. You'll notice that, in general, courses that favour longer hitters tend to not favour accurate players; this is not mechanical, it is simply due to the fact that most long hitters on tour are not accurate, and vice versa. This is a critical point: you should think of this analysis as looking at how two types of players — those who hit it far, and those who hit it accurately — perform at different courses. Evidently, the same player could be both long and accurate; but in general this is not the case on the PGA Tour, which makes this analysis relatively clean. For a more detailed discussion of this point, visit [3].

The purpose of including the year-specific data points is to highlight the variance across years at a given course. For example, Harbour Town Golf Links has yielded a negative coefficient for driving distance every single year from 2005-2019. This makes it very convincing to claim that Harbour Town does in fact favour shorter hitters relative to other courses. Conversely, Muirfield Village also had a negative overall average from 2005-2019, but some of the specific years yielded positive correlations with driving distance; this leads us to put a little less weight on this specific negative estimate. It is easy to see the impact of statistical noise on the estimates by examining courses that host tournaments with limited field sizes: for example, East Lake GC and Kapalua.

The reason for performing this analysis using SG:OTT and SG:APP is to better understand how the correlation with total strokes-gained arises (this will be discussed more below). A useful statistical point to note is that the coefficient values from the analysis using each strokes-gained category add up to the coefficient from the analysis using total strokes-gained. For example, at Harbour Town the coefficient was -0.23 using total strokes-gained and -0.2 using OTT+APP; this means that there must be a slightly negative coefficient on ARG+PUTT (-0.03).
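
The adding-up property follows from the fact that a regression slope is linear in the outcome variable: if total strokes-gained is the sum of its components, and each regression uses the same golfers and the same driving distance variable, the slopes must sum as well. The sketch below checks this on synthetic data (all numbers are invented) and then repeats the Harbour Town arithmetic.

```python
import random


def ols_slope(x, y):
    """Slope b of the least-squares line y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


# Synthetic field: driving distance plus two strokes-gained components.
random.seed(1)
dist = [random.gauss(0.0, 10.0) for _ in range(120)]
sg_ott_app = [-0.02 * d + random.gauss(0.0, 1.5) for d in dist]
sg_arg_putt = [-0.003 * d + random.gauss(0.0, 1.5) for d in dist]
sg_total = [a + b for a, b in zip(sg_ott_app, sg_arg_putt)]

slope_total = ols_slope(dist, sg_total)
slope_sum = ols_slope(dist, sg_ott_app) + ols_slope(dist, sg_arg_putt)
print(abs(slope_total - slope_sum) < 1e-9)

# Harbour Town, using the coefficients quoted in the text.
total, ott_plus_app = -0.23, -0.20
arg_plus_putt = total - ott_plus_app
print(f"{arg_plus_putt:.2f}")
```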

To focus ideas, the remainder of the article will centre on the driving distance analysis. As was alluded to above, it's important to remember that these are just correlations. A correlation between driving distance and performance could arise for many reasons. Let's consider a few explanations.

The most straightforward explanation for a negative correlation between the length a player possesses and their performance-to-baseline would be a course setup that limits a golfer's ability to hit driver — this directly reduces the number of strokes longer hitters can expect to gain off the tee (e.g. Harbour Town). On the flip side, course setups that require driver to be hit on every hole should achieve the opposite.

Golf courses that are especially penal off the tee could reduce the advantage of unrestrained power, resulting in a negative correlation between a golfer's performance-to-baseline at that course and their length off the tee (think Le Golf National). Or, as has been described by Andy Johnson of The Fried Egg, courses that are less penal could, somewhat counterintuitively, also help shorter hitters by allowing both short and long hitters to reach greens after "missed" drives. According to this explanation, longer hitters will be favoured at long, narrow courses with deep rough, such as Bethpage Black. This is due to the fact that all players, both long and short, end up hitting many approach shots from the rough, and therefore being able to hit short irons, instead of long irons, from that deep rough is a big advantage. Importantly, this explanation would specifically predict an advantage for long hitters on approach shots; if we observe that at courses such as Bethpage Black longer hitters are performing above their baseline SG:APP, this would support Andy's hypothesis. However, it's also likely that bombers would perform above their baseline SG:APP at long courses with wide fairways, simply because the advantage of having raw power is likely to be greater with longer approaches than shorter ones.

There are also some spurious (i.e. uninteresting) ways that correlations between golfers' driving distance and performance-to-baseline could arise. As mentioned previously, golfers who have above-average driving distance also tend to have other features in their games. Namely, they tend to have worse short games, and better approach games, on average. Therefore, any course that emphasizes short game more than other courses will hurt longer hitters, while courses that emphasize approach shots will benefit them [4]. We would say that the correlation driven by short-game performance is an uninteresting explanation because it is not being "caused" by driving distance, but rather is just a byproduct of the fact that longer hitters tend to be worse putters. A correlation driven by approach performance is more interesting, as having more length likely improves a player's approach game, all else equal. Because longer players tend to be better approach players, it will be hard to distinguish between a course that is simply a "second shot golf course" (i.e. emphasizes approach play), and one that actually gives a special advantage to those with power on approach shots. Both these explanations predict long hitters will outperform their baseline SG:APP. Of course, all these explanations can be ignored when only OTT performance is considered.

To tie all these explanations together, we can say, perhaps trivially, that longer hitters will outperform their baselines at courses that emphasize the specific skills they possess more than other courses do. To start to understand which explanations are consistent with the data, let's now look at which course characteristics correlate with the over-performance or under-performance of longer hitters. The plot below considers two simple characteristics: course length and average fairway width. The colouring and opacity of the dots indicate whether, and to what degree, each course saw players with above-average driving distance underperforming or overperforming their baseline skill levels.

Which course types favour longer hitters?
Green favours bombers; red disadvantages bombers
Notes: Plotted are the regression coefficients from the driving distance and performance-to-baseline analysis. The x and y coordinates indicate the width of the fairways and length of the course, respectively, while the opacity and colouring reflect the value of the coefficient (green is positive, red is negative, and darker opacity means higher magnitude). The interpretation of each data point is "the number of strokes per round above baseline a golfer is expected to perform at the relevant course for every 10 yards above the PGA Tour driving distance average the golfer is". To be included in this plot, the course has to have been played at least twice since 2005 and once since 2012. Further, it had to be a course with the ShotLink system set up in order to estimate fairway width.
It can be a bit more difficult to see trends in this plot. The main noticeable pattern is that most of the data points for longer courses are varying shades of green; that is, longer players outperform their baselines at (surprise!) longer courses. This is true for off-the-tee performance, and to a slightly lesser degree for approach performance, and generally holds regardless of whether the course has wide fairways or not.

With respect to fairway width, there are actually no strong patterns to speak of. Bethpage Black, which we've highlighted throughout this article, has seen long hitters outperform their baselines significantly on both approach and off-the-tee performance. At Firestone CC, which is also a long course with narrow fairways and deep rough, longer players have performed above their baselines overall, but below baseline on approach shots. Torrey Pines, Congressional, and Quail Hollow also fit the mold as long, narrow, and penal setups, and they all see overperformance from long hitters in SG:OTT and SG:APP. (It is useful to refer back to the first plot in this blog to see how much these numbers varied across years, in order to get a sense of the variability underlying each overall estimate.) As alluded to earlier, this pattern could simply be due to the length of these courses, and not necessarily due to longer hitters benefitting from their superior ability to hack it out of the rough. At the two widest courses in recent years on the PGA Tour, Kapalua and Trinity Forest, longer players performed above baseline off the tee, and were pretty much right at their baselines on approach shots. Statistically, when a regression is performed, there is no relationship between bombers' advantage and fairway width after controlling for the yardage of the course. In other words, if you vary fairway width while holding the distance of the course constant, there does not seem to be any advantage or disadvantage gained or lost by longer players. (For example, compare Bethpage and Congressional with Doral and Bay Hill.)
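
One way to see what "controlling for yardage" means is to strip out the part of both fairway width and bombers' advantage that yardage explains, and then relate what is left over. The sketch below does this with two simple regressions; it is an illustration of the idea rather than the exact specification we ran, and the six courses are invented so that the advantage depends only on yardage.

```python
def fit_line(x, y):
    """Intercept and slope of the least-squares line y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def residuals(x, y):
    """What is left of y after removing its linear relationship with x."""
    a, b = fit_line(x, y)
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]


def partial_slope(y, x, control):
    """Slope of y on x after controlling for `control`.

    Both y and x are first residualized on the control variable; the slope
    between the two sets of residuals equals the coefficient on x in a
    regression of y on x and the control together.
    """
    return fit_line(residuals(control, x), residuals(control, y))[1]


# Invented courses: the bombers' advantage is driven by yardage alone,
# but longer courses also happen to have narrower fairways on average.
yardage = [7000, 7100, 7200, 7300, 7400, 7500]
width = [34, 28, 31, 26, 30, 27]  # average fairway width in yards
advantage = [0.001 * (yd - 7200) for yd in yardage]  # SG/round per 10 yards

raw = fit_line(width, advantage)[1]
controlled = partial_slope(advantage, width, yardage)
print(f"raw slope on width: {raw:+.4f}")
print(f"slope on width, controlling for yardage: {controlled:+.4f}")
```

In this made-up example the raw relationship with width is negative only because narrow courses tend to be long; once yardage is held fixed, the width coefficient is essentially zero, which is the pattern described above.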
In this blog we have tried to estimate the degree of underperformance or overperformance of different player types at each PGA Tour course. While most of the results align with intuition (longer players mainly gain an advantage at longer courses), there were a few surprises. Muirfield Village CC is the longest course in the data that yielded a negative correlation between driving distance and performance-to-baseline. Club de Golf Chapultepec, site of the WGC-Mexico, is a course that plays much shorter than its yardage of 7300 yards due to its high-altitude location, and yet yielded one of the strongest relationships between driving distance and performance. While it is possible to provide a story to rationalize each result from this analysis, it's important to remember that statistical noise is still playing a large role. These results should be taken as suggestive, and far from the last word on which factors in course setup allow power players to thrive.