I wanted to know how much of an impact, if any, the weather has on the race times posted for the Ann Arbor Track Club’s annual Dexter to Ann Arbor Run.
I also pulled NOAA weather data for Ann Arbor for the corresponding years and compiled it into the file NOAA WEATHER station ANN ARBOR U OF MICH MI US.csv.
Here’s what we’re looking at: the average half-marathon race time for the top 50 men, compared against each of these weather factors:
- Race day high temperature
- Race day low temperature
- Year-to-date rain
- Year-to-date number of days with sleet
- Year-to-date snow
Here are pictures of the relationships between race times and weather factors (and also the Pearson coefficients) from that original data:
Without looking more closely, it looks like we discovered some really cool stuff. The average race time seems to increase (get slower…) whenever any of the weather factors increases. But when every single factor lines up that neatly, red flags should be going up about the possibility of spurious correlations.
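To make the spurious-correlation worry concrete, here is a minimal sketch of the Pearson coefficient in plain Python. The two series are made up for illustration (they are not the race or NOAA data): each one is just a straight upward line plus its own small wiggle, so the only thing they share is a trend over time.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Made-up numbers, NOT the race or NOAA data.
# Each series is an upward line plus its own unrelated wiggle.
years = list(range(10))
wiggle_a = [0.3, -0.3] * 5
wiggle_b = [0.5, 0.5, -0.5, -0.5, 0.5, 0.5, -0.5, -0.5, 0.5, 0.5]
fake_times = [70 + 0.5 * t + w for t, w in zip(years, wiggle_a)]
fake_rain = [10 + 1.0 * t + w for t, w in zip(years, wiggle_b)]

r_raw = pearson(fake_times, fake_rain)
print(f"raw Pearson r = {r_raw:.2f}")
```

The coefficient comes out high even though nothing connects the two wiggles, which is exactly the trap to watch for when both variables drift in the same direction over the years.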
OK, let’s look more closely. Here are graphs by year of the factors from the original data:
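So yeah, for whatever reason, all the factors actually seem to be trending up as the year increases. This is definitely a problem: any two series that both drift upward over time will look correlated with each other, whether or not one has anything to do with the other.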
I detrended the race times and weather factors using the best-fit linear model approach (fit a line to each series against year, then keep what’s left after subtracting that line), and here are the resulting detrended relationships and coefficients:
These graphs aren’t nearly as striking, but they’re probably more useful. Here are my guesses based on these:
- Race times appear to be slower when it’s warmer on race day. Nothing new here. This effect is commonly acknowledged within the running community.
- Race times may tend to be faster when there has been more rain from January to May of the race year. This seems kind of weird, if there is any real correlation at all. It might be a stretch, but the only explanation I can come up with is that more rain in January to May means warmer temperatures, and so more training days.
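For anyone who wants to try the detrending step on their own numbers, here is a minimal sketch in plain Python. It assumes "best fit linear model" means an ordinary least-squares line fitted against year, with the residuals kept as the detrended series. The helper names (`linear_fit`, `detrend`, `pearson`) and the data are hypothetical, made up for illustration, and are not the race or NOAA values.

```python
import math

def linear_fit(xs, ys):
    """Least-squares slope and intercept of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def detrend(xs, ys):
    """Residuals of ys after subtracting its best-fit line against xs."""
    slope, intercept = linear_fit(xs, ys)
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Made-up numbers, NOT the race or NOAA data: two upward-trending
# series whose year-to-year wiggles are unrelated to each other.
years = list(range(2000, 2010))
wiggle_a = [0.3, -0.3] * 5
wiggle_b = [0.5, 0.5, -0.5, -0.5, 0.5, 0.5, -0.5, -0.5, 0.5, 0.5]
fake_times = [70 + 0.5 * (y - 2000) + w for y, w in zip(years, wiggle_a)]
fake_rain = [10 + 1.0 * (y - 2000) + w for y, w in zip(years, wiggle_b)]

# Remove each series' own linear trend over the years.
resid_times = detrend(years, fake_times)
resid_rain = detrend(years, fake_rain)

r_raw = pearson(fake_times, fake_rain)
r_detrended = pearson(resid_times, resid_rain)
print(f"raw r = {r_raw:.2f}, detrended r = {r_detrended:.2f}")
```

With these made-up series the raw coefficient is large and the detrended one collapses toward zero, since the shared trend was the only thing tying them together. On real data, whatever correlation survives detrending is the part worth guessing about.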