Game Excitement and Win Probability in the NFL

Game excitement calculation and a win probability figure.

Max Bolger https://twitter.com/mnpykings
08-21-2020


Part 1: Importing and Preprocessing

First we need to import our dependencies. These packages are what make this analysis possible.


import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt

Next we will read in our data from the nflfastR data repo.


# Read in data
YEAR = 2019

data = pd.read_csv('https://github.com/guga31bb/nflfastR-data/blob/master/data/' \
                         'play_by_play_' + str(YEAR) + '.csv.gz?raw=True',
                         compression='gzip', low_memory=False)

Perfect! Our data and notebook are set up and ready to go. The next step is to filter our df to include only the game we would like to work with. We will subset by game_id (which we will need later). The new nflfastR game ids are very convenient and use the following format:

YEAR_WEEK_AWAY_HOME

Note that the year must be in YYYY format and single-digit weeks must be zero-padded (for example, week 9 is written as 09).
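As a small illustration of the format above, the helper below assembles a game id from its parts. The function name build_game_id is hypothetical (it is not part of nflfastR); it only encodes the two formatting rules just described.

```python
def build_game_id(year, week, away, home):
    """Assemble an nflfastR-style game id: YEAR_WEEK_AWAY_HOME.

    Hypothetical helper for illustration; not part of nflfastR.
    """
    # Four-digit year, two-digit zero-padded week, then team abbreviations
    return f"{year:04d}_{week:02d}_{away}_{home}"


print(build_game_id(2019, 9, "MIN", "KC"))  # 2019_09_MIN_KC
```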


#Subset the game of interest
game_df = data[
             (data.game_id== '2019_09_MIN_KC')
             ]

#View a random sample of our df to ensure everything is correct          
game_df.sample(3)

       play_id         game_id  ...  xyac_success   xyac_fd
23013     1294  2019_09_MIN_KC  ...      1.000000  1.000000
23080     3077  2019_09_MIN_KC  ...      0.140994  0.107368
23077     2992  2019_09_MIN_KC  ...           NaN       NaN

[3 rows x 340 columns]

The last step in preprocessing for this particular analysis is dropping null values to avoid jumps in our WP chart. To clean things up, we can filter the columns to show only those that are of importance to us.


cols = ['home_wp','away_wp','game_seconds_remaining']
game_df = game_df[cols].dropna()

#View new df to again ensure everything is correct
game_df

        home_wp   away_wp  game_seconds_remaining
22960  0.560850  0.439150                  3600.0
22961  0.560850  0.439150                  3600.0
22962  0.599848  0.400152                  3596.0
22963  0.612526  0.387474                  3590.0
22964  0.629503  0.370497                  3584.0
...         ...       ...                     ...
23132  0.697633  0.302367                    59.0
23134  0.806030  0.193970                    24.0
23135  0.910061  0.089939                     4.0
23136  0.927525  0.072475                     3.0
23137  1.000000  0.000000                     0.0

[166 rows x 3 columns]

Everything looks good to go! Before we use this data to create the WP chart, we are going to calculate the game’s excitement index.

Part 2: Game Excitement Index

We are using Luke Benz’s formula for GEI, which can be found here. It’s simple yet effective, which is why I like it so much. As Luke notes, “the formula sums the absolute value of the win probability change from each play”. Here, we create a function (inspired by ChiefsAnalytics) that follows his formula and scales the sum by the ratio of the average game length to this game’s length, so that games with different numbers of plays are comparable. The function takes a single parameter, game_id, which must use the new nflfastR game id format.


#Calculate average length of 2019 games (plays per game with a non-null epa) for use in our function
avg_length = data.groupby(by=['game_id'])['epa'].count().mean()

def calc_gei(game_id):
  game = data[(data['game_id']==game_id)]
  #Length of game
  length = len(game)
  #Adjusting for game length
  normalize = avg_length / length
  #Get win probability differences for each play
  win_prob_change = game['home_wp'].diff().abs()
  #Normalization
  gei = normalize * win_prob_change.sum()
  return gei

Let’s run the function by passing in our game id from earlier.


print(f"Vikings @ Chiefs GEI: {calc_gei('2019_09_MIN_KC')}")

Vikings @ Chiefs GEI: 4.652632439280925
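To make the formula concrete, here is a toy version with made-up numbers. The win probabilities and the average length of 8 plays are invented for illustration, and toy_gei is a hypothetical name; only the arithmetic mirrors calc_gei (the sum of absolute play-to-play WP changes, scaled by average length over game length).

```python
def toy_gei(home_wp, avg_length):
    """Sum of absolute WP changes, scaled by avg_length / number of plays."""
    # Absolute change in win probability between consecutive plays
    changes = [abs(b - a) for a, b in zip(home_wp, home_wp[1:])]
    # Shorter-than-average games are scaled up, longer ones scaled down
    normalize = avg_length / len(home_wp)
    return normalize * sum(changes)


# Made-up win probabilities for two four-play "games"
back_and_forth = [0.50, 0.70, 0.40, 1.00]
steady_win = [0.50, 0.70, 0.90, 1.00]

print(round(toy_gei(back_and_forth, avg_length=8), 2))  # 2.2
print(round(toy_gei(steady_win, avg_length=8), 2))      # 1.0
```

Both toy games end in a home win, but the one where the lead swings back and forth scores more than twice as high, which is the behavior the index is meant to capture.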

This seemed to be a pretty exciting game. Let’s compare it to other notable games from last season.


# Week 1 blowout between the Ravens and Dolphins
print(f"Ravens @ Dolphins GEI: {calc_gei('2019_01_BAL_MIA')}")

Ravens @ Dolphins GEI: 0.9723172478637379

# Week 14 thriller between the 49ers and Saints
print(f"49ers @ Saints GEI: {calc_gei('2019_14_SF_NO')}")

49ers @ Saints GEI: 5.190375267367869
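Comparing games one at a time works, but the same calculation can also be applied to every game at once and sorted. The sketch below uses a tiny made-up DataFrame rather than the real play-by-play data, and it measures game length with the row count for both the average and the individual game, which is an assumption that differs slightly from the epa-based average used above.

```python
import pandas as pd

# Made-up play-by-play rows; real data would come from nflfastR
toy = pd.DataFrame({
    "game_id": ["A", "A", "A", "A", "B", "B", "B", "B"],
    "home_wp": [0.5, 0.7, 0.4, 1.0, 0.5, 0.7, 0.9, 1.0],
})

grouped = toy.groupby("game_id")["home_wp"]

# Total absolute WP movement within each game
total_change = grouped.apply(lambda wp: wp.diff().abs().sum())

# Number of plays per game, used for the length adjustment
plays = grouped.size()

# GEI per game, most exciting first
gei = (plays.mean() / plays * total_change).sort_values(ascending=False)
print(gei.round(2))
```

With the full season loaded, swapping toy for the real DataFrame should give a ranking in which individual games can be located by their game_id.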

Yep, the Vikings vs Chiefs game was definitely one of the more exciting regular season games of last season. Let’s see how it looks visually with a WP chart!

Part 3: Win Probability Chart

Matplotlib and Seaborn can be used together to create some beautiful plots. Before we start, below is a useful line of code that prints out all usable matplotlib styles. You can also see how each of them looks by checking out the documentation.


#Print all matplotlib styles
print(plt.style.available)

['Solarize_Light2', '_classic_test_patch', 'bmh', 'classic', 'dark_background', 'fast', 'fivethirtyeight', 'ggplot', 'grayscale', 'seaborn', 'seaborn-bright', 'seaborn-colorblind', 'seaborn-dark', 'seaborn-dark-palette', 'seaborn-darkgrid', 'seaborn-deep', 'seaborn-muted', 'seaborn-notebook', 'seaborn-paper', 'seaborn-pastel', 'seaborn-poster', 'seaborn-talk', 'seaborn-ticks', 'seaborn-white', 'seaborn-whitegrid', 'tableau-colorblind10']

Since we already have all of our data set up from Part 1, we can jump straight to the plot!


#Set style
plt.style.use('dark_background')

#Create a figure
fig, ax = plt.subplots(figsize=(16,8))

#Generate lineplots
#x and y are passed as keywords because newer seaborn versions no longer accept them positionally
sns.lineplot(x='game_seconds_remaining', y='away_wp',
             data=game_df, color='#4F2683', linewidth=2, ax=ax)

sns.lineplot(x='game_seconds_remaining', y='home_wp',
             data=game_df, color='#E31837', linewidth=2, ax=ax)

#Generate fills for the favored team at any given time
ax.fill_between(game_df['game_seconds_remaining'], 0.5, game_df['away_wp'], 
                where=game_df['away_wp']>.5, color = '#4F2683',alpha=0.3)

ax.fill_between(game_df['game_seconds_remaining'], 0.5, game_df['home_wp'], 
                where=game_df['home_wp']>.5, color = '#E31837',alpha=0.3)

#Labels
plt.ylabel('Win Probability %', fontsize=16)
plt.xlabel('', fontsize=16)

#Divider lines for aesthetics
plt.axvline(x=900, color='white', alpha=0.7)
plt.axvline(x=1800, color='white', alpha=0.7)
plt.axvline(x=2700, color='white', alpha=0.7)
plt.axhline(y=.50, color='white', alpha=0.7)

#Format and rename xticks
ax.set_xticks(np.arange(0, 3601,900))
plt.gca().invert_xaxis()
x_ticks_labels = ['End','End Q3','Half','End Q1','Kickoff']
ax.set_xticklabels(x_ticks_labels, fontsize=12)

#Titles
plt.suptitle('Minnesota Vikings @ Kansas City Chiefs', 
             fontsize=20, style='italic',weight='bold')

plt.title('KC 26, MIN 23 - Week 9 ', fontsize=16, 
          style='italic', weight='semibold')

#Creating a textbox with GEI score
props = dict(boxstyle='round', facecolor='black', alpha=0.6)
plt.figtext(.133,.85,'Game Excitement Index (GEI): 4.65',style='italic',bbox=props)

#Citations
plt.figtext(0.131,0.137,'Graph: @mnpykings | Data: @nflfastR')

#Save figure if you wish
#plt.savefig('winprobchart.png', dpi=300)
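One detail in the tick code above is easy to trip over: the labels are listed in ascending order of seconds remaining, so 'End' comes first even though it appears on the right of the inverted axis. The short sketch below just pairs each tick position with its label; the variable names are illustrative.

```python
# Tick positions in seconds remaining: one per quarter boundary
tick_positions = list(range(0, 3601, 900))

# Labels are listed in the same ascending order as the positions,
# so 0 seconds remaining ('End') comes first
tick_labels = ['End', 'End Q3', 'Half', 'End Q1', 'Kickoff']

for seconds, label in zip(tick_positions, tick_labels):
    print(f"{seconds:>4} -> {label}")
```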

Wow, this game had a ton of WP changes. No wonder it had a high GEI!
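The GEI value in the chart's text box is typed in by hand. If you reuse the code for another game, one option is to build the label from the computed value instead. The sketch below hard-codes the number printed earlier so that it runs on its own; in the notebook you could assign gei from calc_gei instead.

```python
# Value printed by calc_gei('2019_09_MIN_KC') earlier in the post
gei = 4.652632439280925

# Two decimal places matches the hand-typed label in the chart
gei_label = f"Game Excitement Index (GEI): {gei:.2f}"
print(gei_label)  # Game Excitement Index (GEI): 4.65
```

The resulting gei_label string can then be passed to plt.figtext in place of the literal text.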

Things to be aware of:

That concludes this tutorial. Thanks for reading, I hope you learned some python in the process! Big thanks to Sebastian Carl and Ben Baldwin for everything they do; I’m looking forward to watching this platform grow! The future of sports analytics has never looked brighter.

Corrections

If you see mistakes or want to suggest changes, please create an issue on the source repository.

Reuse

Text and figures are licensed under Creative Commons Attribution CC BY-NC 4.0. Source code is available at https://github.com/mrcaseb/open-source-football, unless otherwise noted. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".

Citation

For attribution, please cite this work as

Bolger (2020, Aug. 21). Open Source Football: Game Excitement and Win Probability in the NFL. Retrieved from https://www.opensourcefootball.com/posts/2020-08-21-game-excitement-and-win-probability-in-the-nfl/

BibTeX citation

@misc{bolger2020game,
  author = {Bolger, Max},
  title = {Open Source Football: Game Excitement and Win Probability in the NFL},
  url = {https://www.opensourcefootball.com/posts/2020-08-21-game-excitement-and-win-probability-in-the-nfl/},
  year = {2020}
}