In this lesson you’ll learn about selecting data points in your visualizations. Implementing selection behavior is as easy as adding a few specific keywords when declaring your glyphs. You will start by modifying read_nba_data.py
and aggregating data from the player_stats
DataFrame.
For even more information about what you can do upon selection, check out Selected and Unselected Glyphs.
File: read_nba_data.py
import pandas as pd
# Read the csv files
player_stats = pd.read_csv('data/2017-18_playerBoxScore.csv',
parse_dates=['gmDate'])
team_stats = pd.read_csv('data/2017-18_teamBoxScore.csv',
parse_dates=['gmDate'])
standings = pd.read_csv('data/2017-18_standings.csv',
parse_dates=['stDate'])
# Create west_top_2
west_top_2 = (standings[(standings['teamAbbr'] == 'HOU') |
(standings['teamAbbr'] == 'GS')]
.loc[:, ['stDate', 'teamAbbr', 'gameWon']]
.sort_values(['teamAbbr', 'stDate']))
# Find players who took at least 1 three-point shot during the season
three_takers = player_stats[player_stats['play3PA'] > 0]
# Clean up the player names, placing them in a single column
three_takers['name'] = [f'{p["playFNm"]} {p["playLNm"]}'
for _, p in three_takers.iterrows()]
# Aggregate the total three-point attempts and makes for each player
three_takers = (three_takers.groupby('name')
.sum()
.loc[:,['play3PA', 'play3PM']]
.sort_values('play3PA', ascending=False))
# Filter out anyone who didn't take at least 100 three-point shots
three_takers = three_takers[three_takers['play3PA'] >= 100].reset_index()
# Add a column with a calculated three-point percentage (made/attempted)
three_takers['pct3PM'] = three_takers['play3PM'] / three_takers['play3PA']
File: ThreePointAttVsPct.py
# Bokeh Libraries
from bokeh.plotting import figure, show
from bokeh.io import output_file
from bokeh.models import ColumnDataSource, NumeralTickFormatter
# Import the data
from read_nba_data import three_takers
# Output to file
output_file('three_point_att_vs_pct.html',
title='Three-Point Attempts vs. Percentage')
# Store the data in a ColumnDataSource
three_takers_cds = ColumnDataSource(three_takers)
#Specify the selection tools to be made available
select_tools = ['box_select', 'lasso_select', 'poly_select', 'tap', 'reset']
# Create the figure
fig = figure(plot_height=400,
plot_width=600,
x_axis_label='Three-Point Shots Attempted',
y_axis_label='Percentage Made',
title='3PT Shots Attempted vs. Percentage Made (min. 100 3PA), 2017-18',
toolbar_location='below',
tools=select_tools)
# Format the y-axis tick label as percentages
fig.yaxis[0].formatter = NumeralTickFormatter(format='00.0%')
# Add square representing each player
fig.square(x='play3PA',
y='pct3PM',
source=three_takers_cds,
color='royalblue',
selection_color='deepskyblue',
nonselection_color='lightgray',
nonselection_alpha=0.3)
# Visualize
show(fig)
Pygator on Aug. 18, 2019
This set of video tutorials are great! I can already dream up some use cases. Some video series about some image manipulation packages would be great.