# Game Excitement and Win Probability in the NFL

Game excitement calculation and a win probability figure.

08-21-2020

## Part 1: Importing and Preprocessing

First we need to import our dependencies. These packages are what make this analysis possible.

``````
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt``````

Next we will read in our data from the nflfastR data repo.

``````
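YEAR = 2019

#The repo URL prefix is missing from this snippet as published; BASE_URL is a placeholder to set before running
BASE_URL = ''
data = pd.read_csv(BASE_URL + 'play_by_play_' + str(YEAR) + '.csv.gz?raw=True',
                   compression='gzip', low_memory=False)``````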

Perfect! Our data and notebook are set up and ready to go. The next step is to filter our df to include only the game we would like to work with. We will subset by `game_id` (which we will need later). The new nflfastR game ids are very convenient and use the following format:

`YEAR_WEEK_AWAY_HOME`

Note, the year needs to be in YYYY format and single digit weeks must lead with a 0.
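As a small illustration of that format, the hypothetical helper below (`make_game_id` is not part of nflfastR or of this post's original code) assembles an id from its parts and zero-pads the week. It is only a sketch, assuming the team abbreviations are already in the form nflfastR uses.

```python
def make_game_id(year, week, away, home):
    """Build an id in the YEAR_WEEK_AWAY_HOME format described above.

    Hypothetical helper for illustration; not part of nflfastR.
    """
    # :04d keeps the year at four digits, :02d pads single-digit weeks with a 0
    return f"{year:04d}_{week:02d}_{away}_{home}"


print(make_game_id(2019, 9, "MIN", "KC"))  # 2019_09_MIN_KC
```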

``````
#Subset the game of interest
game_df = data[data.game_id == '2019_09_MIN_KC']

#View a random sample of our df to ensure everything is correct
game_df.sample(3)``````
``````
       play_id         game_id  ...  xyac_success   xyac_fd
23013     1294  2019_09_MIN_KC  ...      1.000000  1.000000
23080     3077  2019_09_MIN_KC  ...      0.140994  0.107368
23077     2992  2019_09_MIN_KC  ...           NaN       NaN

[3 rows x 340 columns]``````

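The last step in preprocessing for this particular analysis is to keep only the columns that matter to us and then drop rows with null values in them, which avoids jumps in our WP chart. Selecting the columns first matters: calling `.dropna()` on all 340 columns would discard any row with a null in an unrelated column.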

``````
cols = ['home_wp','away_wp','game_seconds_remaining']
game_df = game_df[cols].dropna()

#View new df to again ensure everything is correct
game_df``````
``````
        home_wp   away_wp  game_seconds_remaining
22960  0.560850  0.439150                  3600.0
22961  0.560850  0.439150                  3600.0
22962  0.599848  0.400152                  3596.0
22963  0.612526  0.387474                  3590.0
22964  0.629503  0.370497                  3584.0
...         ...       ...                     ...
23132  0.697633  0.302367                    59.0
23134  0.806030  0.193970                    24.0
23135  0.910061  0.089939                     4.0
23136  0.927525  0.072475                     3.0
23137  1.000000  0.000000                     0.0

[166 rows x 3 columns]``````

Everything looks good to go! Before we use this data to create the WP chart, we are going to calculate the game’s excitement index.

## Part 2: Game Excitement Index

We are using Luke Benz’s formula for GEI, which can be found here. It’s simple yet effective, which is why I like it so much. As Luke notes, “the formula sums the absolute value of the win probability change from each play”. The code below also scales that sum by the ratio of the average game length to the length of the given game. Here, we are creating a function (inspired by ChiefsAnalytics) that follows his formula. It takes a single parameter, `game_id`, which must be in the new nflfastR game id format.

``````
#Average number of plays per 2019 game (rows with a non-null epa), used for normalization
avg_length = data.groupby(by=['game_id'])['epa'].count().mean()

def calc_gei(game_id):
    """Return the Game Excitement Index for a single nflfastR game_id."""
    game = data[(data['game_id']==game_id)]
    #Length of game (number of rows for this game)
    length = len(game)
    normalize = avg_length / length
    #Absolute change in home win probability from each play to the next
    win_prob_change = game['home_wp'].diff().abs()
    #Sum the changes (NaN values are skipped) and normalize
    gei = normalize * win_prob_change.sum()
    return gei``````
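To make the arithmetic concrete, here is a toy version of the same calculation in plain Python. The win probabilities and the `avg_length` values are made up for illustration, and `toy_gei` is a hypothetical name; the loop over consecutive pairs is meant to mirror what `.diff().abs().sum()` does in `calc_gei` above.

```python
def toy_gei(home_wp, avg_length):
    """Sum of absolute play-to-play WP changes, scaled by avg_length / len(home_wp)."""
    # Absolute change between each play and the one before it
    changes = [abs(curr - prev) for prev, curr in zip(home_wp, home_wp[1:])]
    # Scale the sum by how this game's length compares to the average length
    normalize = avg_length / len(home_wp)
    return normalize * sum(changes)


# Made-up win probabilities for a five-play "game"
wp = [0.50, 0.60, 0.45, 0.70, 1.00]

# Changes are 0.10, 0.15, 0.25 and 0.30, so the raw sum is about 0.80
print(f"{toy_gei(wp, avg_length=5):.2f}")   # avg_length equal to the game length: 0.80
print(f"{toy_gei(wp, avg_length=10):.2f}")  # game half as long as average: 1.60
```

A game whose win probability never moves scores 0 under this definition, and every swing adds to the total regardless of its direction.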

Let’s run the function by passing in our game id from earlier.

``````
print(f"Vikings @ Chiefs GEI: {calc_gei('2019_09_MIN_KC')}")``````
``````
Vikings @ Chiefs GEI: 4.652632439280925``````

This seemed to be a pretty exciting game. Let’s compare it to other notable games from last season.

``````
# Week 1 blowout between the Ravens and Dolphins
print(f"Ravens @ Dolphins GEI: {calc_gei('2019_01_BAL_MIA')}")``````
``````
Ravens @ Dolphins GEI: 0.9723172478637379``````
``````
# Week 14 thriller between the 49ers and Saints
print(f"49ers @ Saints GEI: {calc_gei('2019_14_SF_NO')}")``````
``````
49ers @ Saints GEI: 5.190375267367869``````

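Yep, with a GEI of about 4.65, the Vikings vs Chiefs game sits much closer to the 49ers @ Saints thriller (5.19) than to the Ravens @ Dolphins blowout (0.97), which suggests it was one of the more exciting regular season games of last season. Let’s see how it looks visually with a WP chart!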

## Part 3: Win Probability Chart

Matplotlib and Seaborn can be used together to create some beautiful plots. Before we start, below is a useful line of code that prints out all usable matplotlib styles. You can also see how each of them looks by checking out the documentation.

``````
#Print all matplotlib styles
print(plt.style.available)``````
``````
['Solarize_Light2', '_classic_test_patch', 'bmh', 'classic', 'dark_background', 'fast', 'fivethirtyeight', 'ggplot', 'grayscale', 'seaborn', 'seaborn-bright', 'seaborn-colorblind', 'seaborn-dark', 'seaborn-dark-palette', 'seaborn-darkgrid', 'seaborn-deep', 'seaborn-muted', 'seaborn-notebook', 'seaborn-paper', 'seaborn-pastel', 'seaborn-poster', 'seaborn-talk', 'seaborn-ticks', 'seaborn-white', 'seaborn-whitegrid', 'tableau-colorblind10']``````

Since we already have all of our data set up from Part 1, we can jump straight to the plot!

``````
#Set style
plt.style.use('dark_background')

#Create a figure
fig, ax = plt.subplots(figsize=(16,8))

#Generate lineplots (x and y passed as keywords; newer seaborn versions require this)
#Vikings (away) in purple, Chiefs (home) in red
sns.lineplot(x='game_seconds_remaining', y='away_wp',
             data=game_df, color='#4F2683', linewidth=2, ax=ax)

sns.lineplot(x='game_seconds_remaining', y='home_wp',
             data=game_df, color='#E31837', linewidth=2, ax=ax)

#Generate fills for the favored team at any given time
ax.fill_between(game_df['game_seconds_remaining'], 0.5, game_df['away_wp'],
                where=game_df['away_wp']>.5, color='#4F2683', alpha=0.3)

ax.fill_between(game_df['game_seconds_remaining'], 0.5, game_df['home_wp'],
                where=game_df['home_wp']>.5, color='#E31837', alpha=0.3)

#Labels
plt.ylabel('Win Probability %', fontsize=16)
plt.xlabel('', fontsize=16)

#Divider lines for aesthetics
plt.axvline(x=900, color='white', alpha=0.7)
plt.axvline(x=1800, color='white', alpha=0.7)
plt.axvline(x=2700, color='white', alpha=0.7)
plt.axhline(y=.50, color='white', alpha=0.7)

#Format and rename xticks
ax.set_xticks(np.arange(0, 3601,900))
plt.gca().invert_xaxis()
x_ticks_labels = ['End','End Q3','Half','End Q1','Kickoff']
ax.set_xticklabels(x_ticks_labels, fontsize=12)

#Titles
plt.suptitle('Minnesota Vikings @ Kansas City Chiefs',
             fontsize=20, style='italic', weight='bold')

plt.title('KC 26, MIN 23 - Week 9 ', fontsize=16,
          style='italic', weight='semibold')

#Creating a textbox with GEI score
props = dict(boxstyle='round', facecolor='black', alpha=0.6)
plt.figtext(.133,.85,'Game Excitement Index (GEI): 4.65',style='italic',bbox=props)

#Citations
plt.figtext(0.131,0.137,'Graph: @mnpykings | Data: @nflfastR')

#Save figure if you wish
#plt.savefig('winprobchart.png', dpi=300)``````

Wow, this game had a ton of WP changes. No wonder it had a high GEI!

Things to be aware of:

• Sometimes the plot generates small gaps in the fill. This only occurs when consecutive data points fall on opposite sides of the 50% threshold (this happens twice to the Chiefs’ WP line towards the end of the game). The `.fill_between()` function evaluates the `where` condition only at each data point and not in between, so the fill stops at the last point on the favored side rather than at the crossing itself. This is very minor and the dark background makes it hardly noticeable, but I wanted to address it to make sure nobody gets confused if this happens to them.

• The nflfastR win probability model is a little wonky in OT due to it not accounting for ties as Sebastian mentions here. Be mindful of this when calculating GEI or creating WP charts with OT games.
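On the first point above, one possible way to close those gaps is to add a point exactly where a WP line crosses 50% before plotting. The sketch below does this in plain Python with linear interpolation; `add_crossings` is a hypothetical helper, not something from this post's original code or from matplotlib. Matplotlib's `fill_between` also accepts an `interpolate=True` argument intended for crossings when `where` is used, which may well be the simpler route.

```python
def add_crossings(xs, ys, threshold=0.5):
    """Return copies of xs and ys with a point added at each threshold crossing.

    Hypothetical helper: assumes straight lines between consecutive points,
    which is how the line plot connects them.
    """
    new_xs, new_ys = [xs[0]], [ys[0]]
    for x0, y0, x1, y1 in zip(xs, ys, xs[1:], ys[1:]):
        # A sign change means the segment crosses the threshold strictly in between
        if (y0 - threshold) * (y1 - threshold) < 0:
            # Linear interpolation for the x position where y equals the threshold
            x_cross = x0 + (threshold - y0) * (x1 - x0) / (y1 - y0)
            new_xs.append(x_cross)
            new_ys.append(threshold)
        new_xs.append(x1)
        new_ys.append(y1)
    return new_xs, new_ys


# Made-up data: seconds remaining (descending) and one team's win probability
secs = [100, 50, 0]
wp = [0.25, 0.75, 0.25]
print(add_crossings(secs, wp))
# ([100, 75.0, 50, 25.0, 0], [0.25, 0.5, 0.75, 0.5, 0.25])
```

With the data from this tutorial, the same idea would be applied to `game_df['game_seconds_remaining']` and each WP column (converted to lists) before calling `fill_between`.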

That concludes this tutorial. Thanks for reading, I hope you learned some Python in the process! Big thanks to Sebastian Carl and Ben Baldwin for everything they do; I’m looking forward to watching this platform grow! The future of sports analytics has never looked brighter.

### Corrections

If you see mistakes or want to suggest changes, please create an issue on the source repository.

### Reuse

Text and figures are licensed under Creative Commons Attribution CC BY-NC 4.0. Source code is available at https://github.com/mrcaseb/open-source-football, unless otherwise noted. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".

### Citation

`Bolger (2020, Aug. 21). Open Source Football: Game Excitement and Win Probability in the NFL. Retrieved from https://mrcaseb.github.io/open-source-football/posts/2020-08-21-game-excitement-and-win-probability-in-the-nfl/`
```@misc{bolger2020game,