Quantitative Analysis in Sports

I've written primarily about hockey and baseball, two sports that seem about as far apart as possible in how they are typically modelled, though they share a fair amount of common ground.

Hockey projects:

  • The Impact of Puck Possession and Location on Ice Hockey Strategy. Journal of Quantitative Analysis in Sports, 2(1). [2006] -- Offence and defence are highly entangled concepts: a team is less likely to be scored upon in situations where it is likely to score. I therefore break the game down by puck possession and location, and assess the offensive and defensive potential of each situation; the game is then simulated as a semi-Markov process. The data were manually collected from Harvard Crimson Men's games in 2004-2005 and are available upon request.
  • Inter-Arrival Times of Goals in Ice Hockey. Journal of Quantitative Analysis in Sports, 3(3). [2007] -- Taking the times between goals in NHL games, and accounting for the censored nature of the data, I estimate the probability distribution of the time between goals using survival analysis. I then use this distribution to estimate the value of a goal in terms of the change in a team's win probability. This approach was originally inspired by work in baseball by George Lindsey and others.
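
To illustrate how censoring enters the inter-arrival estimate, here is a minimal sketch of a Kaplan-Meier survival curve in plain Python. It is not the code from the paper, and Kaplan-Meier is simply one standard survival-analysis estimator chosen for the example; the function name `kaplan_meier` and the toy intervals are assumptions made up for illustration. An interval counts as censored when play ends before another goal is scored.

```python
from collections import Counter


def kaplan_meier(durations, observed):
    """Kaplan-Meier estimate of S(t) = P(time to next goal > t).

    durations: length of each goal-free interval (any consistent time unit).
    observed:  True if the interval ended with a goal, False if it was
               censored (for example, the game ended first).
    Returns a list of (time, survival probability) at each goal time.
    """
    goals = Counter(t for t, seen in zip(durations, observed) if seen)
    exits = Counter(durations)  # goals and censorings leaving the risk set
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = goals.get(t, 0)
        if d:
            # Censored intervals still count as "at risk" up to time t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]
    return curve


# Toy data (hypothetical, in minutes): two of five intervals are censored.
toy_durations = [5, 8, 8, 12, 20]
toy_observed = [True, True, False, True, False]
curve = kaplan_meier(toy_durations, toy_observed)
for t, s in curve:
    print(f"t = {t:>2} min  S(t) = {s:.3f}")
```

Dropping the censored intervals instead would bias the estimated waiting times downward, which is why the censoring indicator is carried through the calculation.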

Baseball projects:

  • Simulating Record Accomplishments in Baseball; Or, That's the Second-Biggest Hitting Streak I've Ever Seen! Working paper [2008] -- Given season statistics for players throughout the history of Major League Baseball, I shrink these statistics toward fitted career curves for each player, then use the shrunken estimates to simulate player-games and determine hitting and on-base streaks under an assumption of independent game outcomes. Having verified that the model accurately captures hitting streaks, I conclude (as have others) that the fabled DiMaggio 56-game streak, while an impressive accomplishment in itself, is not unexpected in the history of the game. The model fares much less well on on-base streaks.
  • The Catcher Spotting Project is an attempt to measure pitcher intent by recording the potential target of each pitch and adding this information to already available data sources.
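
The independence assumption behind the streak simulation can be sketched in a few lines of Python. This is a simplified illustration rather than the working paper's model: it skips the shrinkage toward career curves and assumes a fixed number of at-bats per game, and the function names, the .350 average, the 4 at-bats, and the 150-game season are all hypothetical choices for the example.

```python
import random


def game_hit_probability(hit_prob_per_ab, at_bats_per_game):
    """Chance of at least one hit in a game, assuming independent at-bats."""
    return 1.0 - (1.0 - hit_prob_per_ab) ** at_bats_per_game


def longest_streak(outcomes):
    """Length of the longest run of consecutive True values."""
    best = current = 0
    for hit in outcomes:
        current = current + 1 if hit else 0
        best = max(best, current)
    return best


def simulate_longest_streaks(p_game, n_games, n_seasons, seed=0):
    """Longest hitting streak in each simulated season of independent games."""
    rng = random.Random(seed)
    return [
        longest_streak(rng.random() < p_game for _ in range(n_games))
        for _ in range(n_seasons)
    ]


# Hypothetical player: .350 hitter, 4 at-bats per game, 150-game season.
p_game = game_hit_probability(0.350, 4)
streaks = simulate_longest_streaks(p_game, n_games=150, n_seasons=2000)
print(f"P(hit in a game) = {p_game:.3f}")
print(f"Longest simulated streak over 2000 seasons: {max(streaks)}")
```

Repeating this over every player-season in the historical record, with each player's own estimated probabilities, gives a distribution for the longest streak that the observed record can be compared against.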

Content copyright (c) 2008, Andrew C. Thomas.