A small story about drawing multi-line plots with Plotly Express, a wrapper for the Python graphing library Plotly, whose modern design is irresistible.
I have extracted the 2019 MLB draft data from a site called Spotrac. I will omit the scraping and processing steps, but the data looks like this.
>>> df.head()
PICK TEAM NAME AGE POS SCHOOL SLOTTED_BONUS SIGNED_BONUS
0 1.0 BAL Adley Rutschman 21 C Oregon State 8415300.0 8100000.0
1 2.0 KC Bobby Witt Jr. 18 SS Colleyville Heritage HS 7789900.0 7789900.0
2 3.0 CHW Andrew Vaughn 21 1B California 7221200.0 7221200.0
3 4.0 MIA J.J. Bleday 21 OF Vanderbilt 6664000.0 6670000.0
4 5.0 DET Riley Greene 18 OF Hagerty HS 6180700.0 6180700.0
>>> df.dtypes
PICK float64
TEAM object
NAME object
AGE object
POS object
SCHOOL object
SLOTTED_BONUS float64
SIGNED_BONUS float64
dtype: object
The DataFrame columns mean the following. This is MLB-nerd territory, so please bear with me.
PICK
-- Overall pick number in the draft
TEAM
-- Team that made the pick
NAME
-- Name of the drafted player
AGE
-- Age of the drafted player
POS
-- Position
SLOTTED_BONUS
-- Slot amount assigned to that pick
-- Every pick in the MLB draft has a slot amount, and the sum of a team's slot amounts is the amount it is allowed to spend in the draft.
SIGNED_BONUS
-- Signing bonus actually paid to the player taken at that pick
-- For example, a team can sign a high pick for less than the slot amount and give the savings to a player it drafted at a later pick.
What I want to do is overlay the line plots of SLOTTED_BONUS and SIGNED_BONUS to visualize how the actual signing amount compares with the slot amount.
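Before plotting, the comparison between slot and signing amounts can be checked numerically. The following is a minimal sketch that rebuilds only the five rows shown in the df.head() output; the names `sample` and `DIFF` are introduced here for illustration and are not part of the original data.

```python
import pandas as pd

# First five picks, copied from the df.head() output in this post.
sample = pd.DataFrame({
    "PICK": [1.0, 2.0, 3.0, 4.0, 5.0],
    "TEAM": ["BAL", "KC", "CHW", "MIA", "DET"],
    "SLOTTED_BONUS": [8415300.0, 7789900.0, 7221200.0, 6664000.0, 6180700.0],
    "SIGNED_BONUS": [8100000.0, 7789900.0, 7221200.0, 6670000.0, 6180700.0],
})

# DIFF is a hypothetical helper column:
# negative means signed below slot, positive means signed above slot.
sample["DIFF"] = sample["SIGNED_BONUS"] - sample["SLOTTED_BONUS"]

print(sample[["PICK", "TEAM", "DIFF"]])
```

In these five rows, BAL signed its pick for 315,300 below the slot amount and MIA paid 6,000 above it, while the other three signed exactly at slot.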
If you take the time to reshape the data into tidy data with pandas.melt(), you can write the plot quickly by feeding the result to Plotly Express.
>>> import pandas as pd
>>> mdf = pd.melt(
...     df,
...     id_vars=["PICK", "TEAM", "NAME", "AGE", "POS", "SCHOOL"],
...     value_vars=["SLOTTED_BONUS", "SIGNED_BONUS"],
...     var_name="BONUS_TYPE",
...     value_name="AMOUNT"
... )
>>> mdf.head()
PICK TEAM NAME AGE POS SCHOOL BONUS_TYPE AMOUNT
0 1.0 BAL Adley Rutschman 21 C Oregon State SLOTTED_BONUS 8415300.0
1 2.0 KC Bobby Witt Jr. 18 SS Colleyville Heritage HS SLOTTED_BONUS 7789900.0
2 3.0 CHW Andrew Vaughn 21 1B California SLOTTED_BONUS 7221200.0
3 4.0 MIA J.J. Bleday 21 OF Vanderbilt SLOTTED_BONUS 6664000.0
4 5.0 DET Riley Greene 18 OF Hagerty HS SLOTTED_BONUS 6180700.0
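The head of mdf only shows SLOTTED_BONUS rows, so it may help to see the whole reshaped table on a tiny input. This is a sketch on three rows copied from the data above, with a reduced set of id columns; `sample` and `long_df` are names made up for this example.

```python
import pandas as pd

# Three picks copied from the data above, keeping only a few columns.
sample = pd.DataFrame({
    "PICK": [1.0, 2.0, 4.0],
    "TEAM": ["BAL", "KC", "MIA"],
    "SLOTTED_BONUS": [8415300.0, 7789900.0, 6664000.0],
    "SIGNED_BONUS": [8100000.0, 7789900.0, 6670000.0],
})

# melt stacks the value columns: every input row appears once per value column.
long_df = pd.melt(
    sample,
    id_vars=["PICK", "TEAM"],
    value_vars=["SLOTTED_BONUS", "SIGNED_BONUS"],
    var_name="BONUS_TYPE",
    value_name="AMOUNT",
)

print(long_df)
```

The result has twice as many rows as the input: the SLOTTED_BONUS rows come first, followed by the SIGNED_BONUS rows for the same picks.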
I reshaped SLOTTED_BONUS and SIGNED_BONUS into long format: the BONUS_TYPE column holds the original column name, and the AMOUNT column holds the value. In this form, you can draw a multi-line plot by passing BONUS_TYPE as the color parameter to Plotly Express.
import plotly.express as px
px.line(
    mdf, x="PICK", y="AMOUNT", color="BONUS_TYPE",
)