Off-season model tweaks (2024)

PUBLISHED — 2024-01-08

As our site has become more complex and interconnected over time, it's become increasingly cumbersome to make changes to our model. A model update usually requires rewriting code in multiple places, and sometimes design tweaks to a few pages. This, combined with the fact that any changes to our model at this point have a pretty small effect on predictive power, can make for an aggravating off-season in front of the computer. With my venting out of the way, here are some details on a few tweaks to the model for 2024.

First, and most importantly, we've made a change to how we weight a player's past rounds when estimating their skill. In a previous post we described the two different ways we weight rounds: by the sequence in which they were played (e.g. 5th-most recent round) and by the time at which they were played (e.g. round was played 10 days ago). This change only affects the sequence-weighted average. Previously, we used a weighting scheme that decayed exponentially (i.e. the ratio of two consecutive weights was a constant); under the new scheme, the weights decrease faster (in proportional terms) for more recent rounds and slower for rounds further into the past. The difference in the weight being applied to each round is shown in the plot below:

The new weighting scheme puts more weight on rounds 1-19 (8.4% in total), less weight on rounds 20-126 (-14.1%) and more weight on rounds more than 126 into the past (5.6%). It puts nearly half the weight on a player's most recent 20 rounds (48%) but also puts a non-negligible amount on rounds more than 150 into the past (7.6%). This new scheme has an intuitive appeal to me for two reasons. First, it's always bothered me that we put no weight on a player's very long-term form; with a constant decay, it's not possible to put a lot of weight on short-term form while also keeping some weight on rounds beyond 200 into the past. The variable decay allows for this. Second, this weighting scheme is tailor-made to capture something the model has consistently struggled with in the past: players who find some form for a few weeks after going through a rough patch. The model would typically be slow to update a player's skill in these situations. A current example of this type of player-trend is Erik Van Rooyen; Van Rooyen was an average PGA Tour player or better from 2018-2022, struggled mightily for most of 2023, and then reeled off 6 straight +SG events to end the year. To start the 2024 season, Van Rooyen's weighted average using the new weighting scheme is +0.36 while under the old one it would be +0.11. Of course, there are plenty of players who go into slumps, show some signs of life, and then revert back to their slumping level, so it's not as though this new weighting scheme perfectly fits all golfer profiles. But, on the whole, it did perform better in backtesting.
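To make the contrast between constant and variable decay concrete, here is a minimal sketch. The exponential scheme follows the definition above (a constant ratio between consecutive weights). The variable-decay scheme is written as a power law, which is purely an assumed functional form chosen because its consecutive-weight ratio rises toward 1 for older rounds; the function names and all parameter values are illustrative and are not the weights used in the model.

```python
def exponential_weights(n_rounds, ratio=0.98):
    """Constant decay: each weight is `ratio` times the one before it.
    Index 0 is the most recent round. `ratio` is an illustrative value."""
    raw = [ratio ** i for i in range(n_rounds)]
    total = sum(raw)
    return [w / total for w in raw]


def variable_decay_weights(n_rounds, scale=15.0, power=1.5):
    """Assumed power-law form: decays quickly for recent rounds and
    slowly for old ones. `scale` and `power` are illustrative values."""
    raw = [(1.0 + i / scale) ** (-power) for i in range(n_rounds)]
    total = sum(raw)
    return [w / total for w in raw]


def consecutive_ratios(weights):
    """Ratio of each weight to the weight of the next-most-recent round."""
    return [later / earlier for earlier, later in zip(weights, weights[1:])]


def share_of_weight(weights, start, stop):
    """Fraction of total weight on rounds with index in [start, stop)."""
    return sum(weights[start:stop])


if __name__ == "__main__":
    old = exponential_weights(300)
    new = variable_decay_weights(300)
    for label, w in (("constant decay", old), ("variable decay", new)):
        print(label,
              "| most recent 20:", round(share_of_weight(w, 0, 20), 3),
              "| beyond 150:", round(share_of_weight(w, 150, 300), 3))
```

Printing `consecutive_ratios` for each scheme shows the difference directly: the ratio is flat for the exponential weights and climbs toward 1 for the power-law weights, which is what lets a scheme be aggressive on recent rounds while keeping some weight on very old ones.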

To get our final predicted skills based on total strokes-gained, the sequence-weighted average is still combined with the time-weighted average, so the actual effect of this change in weighting is only around half of what it seems. In our rankings, we are still using the old constant decay because we feel there would be too much of a short-term focus under the new weighting scheme.
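As a rough piece of arithmetic on the "around half" point, the sketch below blends the two averages with an equal split. The 50/50 share and the time-weighted value are assumptions made for illustration (the text only says the effect is roughly halved); the two sequence-weighted numbers are Van Rooyen's from above.

```python
def blended_skill(sequence_avg, time_avg, sequence_share=0.5):
    """Combine the two weighted averages. `sequence_share` is an assumed
    blend weight, not the model's actual value."""
    return sequence_share * sequence_avg + (1.0 - sequence_share) * time_avg


old_sequence_avg = 0.11   # Van Rooyen under the old constant decay
new_sequence_avg = 0.36   # Van Rooyen under the new variable decay
time_avg = 0.20           # hypothetical placeholder; it cancels in the difference

change = (blended_skill(new_sequence_avg, time_avg)
          - blended_skill(old_sequence_avg, time_avg))
print(round(change, 3))   # about 0.125 strokes, half of the 0.25 gap
```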

The second change we've made concerns how we treat the number of days since a player last teed it up. Previously, we allowed a player's "days since last event" (DSLE) to interact with their skill level; the result was that with every week a golfer takes off their predicted skill declined slightly, and for better golfers this decline occurred more quickly (this is the interaction part). This latter effect is picking up some regression to the mean: all else equal, if a +2 skill golfer and a -1 golfer take a few months off, we expect their skill levels to be closer together when they return. The problem with doing things this way is that it doesn't take into account your DSLE relative to the field you are playing against. An example might help: this week at the Sentry, as with all weeks, there will be some players who have played more recently than others, but all players will be coming off of extended layoffs. We should probably expect that playing 1 week more recently than your peers matters a lot less when that means you played 5 weeks ago instead of the field average of 6 weeks. Following this logic, we now look at a player's DSLE relative to the field and allow the effect of relative DSLE on performance to vary with the field's average raw DSLE. The more recently the field as a whole has played, the more important it is when you last played. Compared to the old model, DSLE can now have a much larger effect: if a player competed 1 week ago when the field mostly played 2 weeks ago, that could be worth 0.1-0.15 strokes. If the field's average DSLE is 21 (3 weeks), the benefit of playing 1 week more recently than the field might only be worth 0.04-0.06 strokes. I was surprised how large this effect seems to be, and it made me wonder whether more players should be playing the week before majors (obviously there are other variables to consider, but it at least provides some food for thought).
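The relative-DSLE idea can be sketched as follows. The only numbers taken from the text are the two anchor points (roughly 0.125 strokes per week of freshness when the field average is 14 days, and roughly 0.05 when it is 21 days, using the midpoints of the quoted ranges). The geometric interpolation between them, the linear effect in weeks, and the function names are all assumptions for illustration, not the fitted model.

```python
def strokes_per_week_fresher(field_avg_dsle_days):
    """Assumed value of having played one week more recently than the field.
    Anchored at the midpoints quoted in the text: 0.125 strokes at a field
    average of 14 days and 0.05 strokes at 21 days. Outside that range this
    is pure extrapolation."""
    anchor_days = 14.0
    anchor_value = 0.125
    weekly_shrink = 0.05 / 0.125  # the effect falls to 40% per extra week off
    weeks_past_anchor = (field_avg_dsle_days - anchor_days) / 7.0
    return anchor_value * weekly_shrink ** weeks_past_anchor


def dsle_adjustment(player_dsle_days, field_dsle_days):
    """Strokes added to (or removed from) a player's prediction based on
    how recently they played relative to the field average."""
    field_avg = sum(field_dsle_days) / len(field_dsle_days)
    weeks_fresher = (field_avg - player_dsle_days) / 7.0
    return weeks_fresher * strokes_per_week_fresher(field_avg)


if __name__ == "__main__":
    # Played last week, field mostly played two weeks ago.
    print(round(dsle_adjustment(7, [14, 14, 14, 14]), 3))
    # Played five weeks ago, field averaged six weeks off.
    print(round(dsle_adjustment(35, [42, 42, 42, 42]), 4))
```

Under these assumptions the second case comes out far smaller than the first, which matches the intuition in the Sentry example: one week of relative freshness matters much less when everyone is coming off a long layoff.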

Our third, and most inconsequential, change is that we've improved the skill estimates for the SG categories and distance/accuracy for players with less SG or driving data. As was outlined in a past post on Brent Grant's putting, we arrive at our SG category predictions by first estimating skill in each category only using a player's Shotlink rounds, and then adjusting these predictions so that they add up to our best estimate of a player's overall skill. If most of a player's data is not from Shotlink tournaments, these adjustments can be large. Further, even if the sum of the SG category skills and overall skill aren't that different, if a player hasn't played many recent SG rounds we should be less confident in their specific category predictions. For example, when Cameron Smith went to LIV in mid-2022, he was one of the best putters in the world. Since then, we only have 16 rounds of SG:P data from Smith (in which he has continued to putt very well). If we just used his Shotlink rounds, and ignored their recency, we would rate Smith as the best putter in the world, projected to gain 0.75 strokes on the greens in his next round. This is roughly what we did in the previous version of our model. Now, we account for the fact that only 25% of the weight going into Smith's overall skill is coming from rounds with SG category data, which causes Smith's predicted putting skill to fall to 0.59 strokes per round. More generally, the lower a player's fraction of rounds with detailed SG data, the more regression to the mean we apply to their predicted category skills, as you would expect.
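The two steps described above (making the category estimates add up to overall skill, then regressing them toward the mean when SG data is thin) can be sketched like this. Everything about the mechanics is assumed: the equal split of the gap across categories, the shrinkage form `fraction / (fraction + k)`, a population mean of zero, and the constant `SHRINK_K`, which is backed out solely so that the sketch reproduces the Cameron Smith numbers quoted above (0.75 falling to 0.59 at a 25% fraction). The category values in the usage example are made up.

```python
def reconcile_to_overall(category_skills, overall_skill):
    """Shift each category estimate so the categories sum to overall skill.
    Splitting the gap equally is an assumption; the real method is not
    described in the text."""
    gap = overall_skill - sum(category_skills.values())
    share = gap / len(category_skills)
    return {name: skill + share for name, skill in category_skills.items()}


# Calibrated only to reproduce the single example in the text:
# 0.75 strokes shrinks to 0.59 when the SG-data fraction is 0.25.
SHRINK_K = 0.25 * (0.75 / 0.59 - 1.0)


def shrink_category_skill(raw_skill, sg_weight_fraction, population_mean=0.0):
    """Regress a category estimate toward the mean; the smaller the fraction
    of weight coming from rounds with SG data, the stronger the pull."""
    factor = sg_weight_fraction / (sg_weight_fraction + SHRINK_K)
    return population_mean + factor * (raw_skill - population_mean)


if __name__ == "__main__":
    # Hypothetical category estimates for a player whose overall skill
    # has improved since their last rounds with SG data.
    stale = {"ott": -0.4, "app": -0.8, "arg": -0.2, "putt": -0.3}
    print(reconcile_to_overall(stale, overall_skill=0.3))
    print(round(shrink_category_skill(0.75, 0.25), 2))
```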

For driving distance and accuracy, because they aren't in units of strokes, we weren't making any adjustments analogous to what we did with the SG categories (i.e. making them add up to overall skill). Therefore, if a player hadn't played any recent tournaments with driving data, and their overall skill had changed a lot, we could be seriously mis-estimating their driving skill. A recent example of this can be found in Andrea Pavan's data; Pavan went through an awful stretch of golf from 2020-2022 before finding some form in 2023 on the Challenge Tour. If we compare the sum of his predicted SG category skills (based almost entirely on rounds he played in 2020-2022) to his overall skill (based primarily on data from 2023), we find that the latter is almost 2 strokes better than the former. As alluded to above, we have a method for distributing this difference of 2 strokes to each of the SG category estimates; but, until recently, we didn't do anything to adjust a player's distance and accuracy skill in these situations. Now we've fit a regression that uses a player's basic distance/accuracy skill (formed from their driving data), their SG:OTT skill (which has already been adjusted to add up to their overall skill), and the fraction of their recent rounds that had driving data. The result is that for players with lots of recent driving data this regression doesn't do much to their predicted distance/accuracy, but for a player like Pavan it adjusts his driving skill to be more in line with his recent play (while still accounting for what his driving profile looked like in the past).
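One way to picture how a regression on those three inputs could behave is sketched below. The structure (a correction term that is scaled by the share of rounds missing driving data, so it vanishes when data is plentiful) is an assumption about the shape of such a model, and the coefficients are hypothetical placeholders rather than fitted values. Distance here is expressed in yards relative to the average player, which is also an assumption.

```python
def adjusted_driving_skill(basic_skill, sg_ott_skill, driving_data_fraction, coefs):
    """Assumed form: start from the basic driving estimate and add a
    correction based on reconciled SG:OTT skill, scaled by how much
    recent driving data is missing."""
    missing = 1.0 - driving_data_fraction
    correction = (coefs["intercept"]
                  + coefs["basic"] * basic_skill
                  + coefs["ott"] * sg_ott_skill)
    return basic_skill + missing * correction


# Hypothetical placeholder coefficients, not fitted values.
COEFS = {"intercept": 0.0, "basic": -0.2, "ott": 6.0}

if __name__ == "__main__":
    # Player with stale driving data (5 yards short) but a reconciled
    # SG:OTT skill of +0.3: the estimate moves toward recent play.
    print(round(adjusted_driving_skill(-5.0, 0.3, 0.0, COEFS), 2))
    # Same inputs with complete recent driving data: no adjustment.
    print(adjusted_driving_skill(-5.0, 0.3, 1.0, COEFS))
```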