A machine learning beginner's project in Python ... the second article in a series.
In the previous article, "A machine learning beginner built a horse racing prediction model with Python", I summarized the process of building a simple learning model with horse racing as the theme.
Since this is the practical edition, I would like to actually predict the Arima Kinen, the race that settles the accounts for 2020. As I am in charge of the final slot of the Advent calendar, the timing could hardly be better!
The Arima Kinen is one of the GI races of central horse racing in Japan and the big race that concludes the year. It is also called the Grand Prix, and the runners are selected by fan voting.
The race outline is as follows.

- **Date: 2020/12/27 (Sun)**
- **Racecourse: Nakayama Racecourse**
- **Course: Turf 2500m**

Because it is a long-distance race at Nakayama Racecourse, which has tight turns, I personally feel it is a race that is prone to upsets. In fact, unpopular horses often finish in the top three, which makes it a difficult race to predict.
Last time, I used all of the data from the past five years. This time the purpose is clear: I want to hit the Arima Kinen! So let's consider a dataset that fits that goal.
Here is a summary of the number of data samples for each condition.
Condition | Number of races | Number of racehorse records |
---|---|---|
All races | 20,677 | 293,120 |
Turf races at Nakayama Racecourse | 1,276 | 18,109 |
Turf races of 2000m or more at Nakayama Racecourse | 479 | 6,621 |
Turf 2500m races at Nakayama Racecourse | 58 | 711 |
This time, rather than building a general-purpose model, I will focus on building a model that fits the characteristics of the Arima Kinen. However, since a certain number of samples is still needed, I decided to use the dataset of "**turf races of 2000m or more at Nakayama Racecourse**".
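As a reference, filtering such a subset out of the full race data might look like the following minimal sketch. The file name 'races.csv' and the column names 'Racecourse', 'Track', and 'Distance' are assumptions for illustration; the actual CSV layout may differ.

import pandas as pd

# Load the full dataset (hypothetical file name)
df_all = pd.read_csv('races.csv')

# Keep only turf races of 2000m or more at Nakayama Racecourse
cond = (
    (df_all['Racecourse'] == 'Nakayama')
    & (df_all['Track'] == 'Turf')
    & (df_all['Distance'] >= 2000)
)
df = df_all[cond]
print(len(df))  # number of racehorse records in the subset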
As last time, we carry out preprocessing of the data.
In the end, the following data is loaded into a pandas DataFrame.
Data item | Role | Description |
---|---|---|
race_index | index | Identification ID that identifies the race to be held |
Horse number | Explanatory variable | Racehorse's horse number |
Race class | Explanatory variable | The class of the race converted to a numerical value (*1) |
Time index | Explanatory variable | Median of the time index (*2) over the racehorse's last three races |
Passing order 4 corners | Explanatory variable | Median of the racehorse's position at the fourth (final) corner over its last three races |
Jockey name | Explanatory variable | Use the jockey name as a dummy variable |
Stallion name | Explanatory variable | Use the stallion name as a dummy variable |
Within 3 | Objective variable | The finish order of the racehorse converted to 1 if it is within 3rd place and 0 if it is 4th or lower |
(*1) The race class is converted using the following rules.
Race class | Converted number |
---|---|
New horse | 250 |
Not won | 250 |
1 win class/5 million | 500 |
2 win class/Ten million | 1000 |
3 win class/16 million | 1500 |
OP | 2000 |
G3 | 3000 |
G2 | 4500 |
G1 | 7000 |
(*2) The time index is an index of running time in past races, provided by the data acquisition source.
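As an illustration only, the race class conversion in table (*1) and the "median over the last three races" features might be computed roughly as in the sketch below. The class labels in the mapping follow the table above, while the 'Horse ID' and 'Date' columns are hypothetical names used only for this example.

# Convert the race class to a numerical value, following table (*1)
race_class_map = {
    'New horse': 250, 'Not won': 250,
    '1 win class': 500, '2 win class': 1000, '3 win class': 1500,
    'OP': 2000, 'G3': 3000, 'G2': 4500, 'G1': 7000,
}
df['Race class'] = df['Race class'].map(race_class_map)

# Median time index over each horse's previous three races
# ('Horse ID' and 'Date' are hypothetical column names)
df = df.sort_values('Date')
df['Time index'] = (
    df.groupby('Horse ID')['Time index']
      .transform(lambda s: s.shift(1).rolling(3, min_periods=1).median())
)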
The original data this time is this group of CSV files.
These files are cleansed, integrated, and transformed, and then loaded into the DataFrame as shown below.
After that, as last time, the objective variable is generated and the explanatory variables are converted into dummy variables.
sample.ipynb
(Omission)
# Add a column indicating whether the finish order is within 3rd place
f_ranking = lambda x: 1 if x in [1, 2, 3] else 0
df['Within 3'] = df['Confirmed order of arrival'].map(f_ranking)
# Generate dummy variables
df = pd.get_dummies(df, columns=['Jockey name'])
df = pd.get_dummies(df, columns=['Stallion name'])
# Set the index (use the first 16 characters of the race ID, which identify the race itself)
df['race_index'] = df['Race ID'].astype(str).str[0:16]
df.set_index('race_index', inplace=True)
# Delete columns that are no longer needed
df.drop(['Race ID', 'Confirmed order of arrival'], axis=1, inplace=True)
This completes the data preprocessing.
Next, we train the model. This time I would like to compare the following classification algorithms, including the logistic regression used last time.
algorithm | Overview |
---|---|
Logistic regression | A method that classifies using a two-class prediction returned as a probability between 0 and 1 |
Support vector machine | A method that classifies by drawing the boundary that separates the classes with the maximum margin |
K-nearest neighbor method | A method that classifies by majority vote among the data points nearest to the point being predicted |
Random forest | A method that builds many decision trees (yes/no branching conditions) and classifies by majority vote |
The following article organizes these methods in an easy-to-understand way. Reference: Roughly organize machine learning information centered on methods
The above algorithms are all included in sklearn, and apart from the step that creates each classifier class, they can be used with exactly the same implementation.
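To illustrate this shared interface, here is a rough sketch of how the same fit/predict/evaluate flow could be written as a single loop. It reuses the classifier settings that appear in the individual sections below and assumes the X_train/X_test/y_train/y_test split prepared in the next steps.

from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, precision_score, f1_score

classifiers = {
    'Logistic regression': LogisticRegression(max_iter=10000),
    'Support vector machine': SVC(kernel='rbf', gamma=0.1, probability=True),
    'K-nearest neighbor': KNeighborsClassifier(n_neighbors=9),
    'Random forest': RandomForestClassifier(random_state=100, n_estimators=50, min_samples_split=100),
}
for name, clf in classifiers.items():
    clf.fit(X_train, y_train)     # learning
    y_pred = clf.predict(X_test)  # prediction
    print(name,
          accuracy_score(y_test, y_pred),
          precision_score(y_test, y_pred),
          f1_score(y_test, y_pred))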
We will carry out the following processing in the same way as last time.
The data is divided into training data and evaluation data, separately for the explanatory variables and the objective variable. This time, to save effort, the explanatory variables are standardized before the split.
sample.ipynb
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
#Store explanatory variables in dataX
dataX = df.drop(['Within 3'], axis=1)
#Store objective variable in dataY
dataY = df['Within 3']
#Standardize the explanatory variables at this stage
sc = StandardScaler()
dataX_std = pd.DataFrame(sc.fit_transform(dataX), columns=dataX.columns, index=dataX.index)
# Divide the data (training data 0.8, evaluation data 0.2)
X_train, X_test, y_train, y_test = train_test_split(dataX_std, dataY, test_size=0.2, stratify=dataY)
Variable name | Type of data | Role |
---|---|---|
X_train | Explanatory variable | Training data |
X_test | Explanatory variable | Evaluation data |
y_train | Objective variable | Training data |
y_test | Objective variable | Evaluation data |
sample.ipynb
from imblearn.under_sampling import RandomUnderSampler
# Undersample the training data (class 0 is reduced to twice the number of class 1 samples)
f_count = y_train.value_counts()[1] * 2
t_count = y_train.value_counts()[1]
rus = RandomUnderSampler(sampling_strategy={0: f_count, 1: t_count})
X_train, y_train = rus.fit_resample(X_train, y_train)
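To confirm that the undersampling produced the intended roughly 2:1 class ratio, a quick check like the following sketch can be run.

import pandas as pd

# Class distribution of the objective variable after undersampling
print(pd.Series(y_train).value_counts())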
From here, we will train and evaluate the model using each algorithm. The first is logistic regression.
sample.ipynb
from sklearn.linear_model import LogisticRegression
#Create a classifier (logistic regression)
clf = LogisticRegression(max_iter=10000)
#Learning
clf.fit(X_train, y_train)
#Forecast
y_pred = clf.predict(X_test)
#Display correct answer rate
from sklearn.metrics import accuracy_score
print(accuracy_score(y_test, y_pred))
0.7488372093023256
#Show precision
from sklearn.metrics import precision_score
print(precision_score(y_test, y_pred))
0.4158878504672897
#Display F value
from sklearn.metrics import f1_score
print(f1_score(y_test, y_pred))
0.39732142857142855
Next, let's verify the support vector machine.
sample.ipynb
from sklearn.svm import SVC
#Create a classifier (support vector machine)
clf = SVC(kernel='rbf', gamma=0.1, probability=True)
#Learning
clf.fit(X_train, y_train)
#Forecast
y_pred = clf.predict(X_test)
#Display correct answer rate
from sklearn.metrics import accuracy_score
print(accuracy_score(y_test, y_pred))
0.7581395348837209
#Show precision
from sklearn.metrics import precision_score
print(precision_score(y_test, y_pred))
0.42168674698795183
#Display F value
from sklearn.metrics import f1_score
print(f1_score(y_test, y_pred))
0.35000000000000003
You can see that the implementation is the same as for logistic regression, except for the part that creates the classifier class. In addition, tuning the classifier's parameters can improve accuracy and prevent overfitting.
See the reference below for details. Reference: https://scikit-learn.org/stable/modules/generated/sklearn.svm.SVC.html
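As one way to tune these parameters, a grid search over C and gamma could look like the following sketch. The candidate values in the grid are arbitrary examples, not the settings used in this article.

from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Search for the combination of C and gamma with the best F value (5-fold cross-validation)
param_grid = {
    'C': [0.1, 1, 10],
    'gamma': [0.01, 0.1, 1],
}
gs = GridSearchCV(SVC(kernel='rbf'), param_grid, scoring='f1', cv=5)
gs.fit(X_train, y_train)
print(gs.best_params_, gs.best_score_)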
Next, let's verify the K-nearest neighbor method.
sample.ipynb
from sklearn.neighbors import KNeighborsClassifier
#Create a classifier (K-nearest neighbor method)
clf = KNeighborsClassifier(n_neighbors=9)
#Learning
clf.fit(X_train, y_train)
#Forecast
y_pred = clf.predict(X_test)
#Display correct answer rate
from sklearn.metrics import accuracy_score
print(accuracy_score(y_test, y_pred))
0.68
#Show precision
from sklearn.metrics import precision_score
print(precision_score(y_test, y_pred))
0.31543624161073824
#Display F value
from sklearn.metrics import f1_score
print(f1_score(y_test, y_pred))
0.3533834586466166
This is also the same implementation as logistic regression, except for the part that creates the classifier class. Among the classifier's parameters, it is important to set n_neighbors (the number of neighboring data points used in the majority vote).
See the reference below for details. Reference: https://scikit-learn.org/stable/modules/generated/sklearn.neighbors.KNeighborsClassifier.html
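For example, the effect of n_neighbors could be checked with a simple sweep like this sketch; the candidate values are arbitrary.

from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import f1_score

# Compare the F value on the evaluation data for several values of n_neighbors
for k in [3, 5, 7, 9, 11, 15]:
    clf_k = KNeighborsClassifier(n_neighbors=k)
    clf_k.fit(X_train, y_train)
    print(k, f1_score(y_test, clf_k.predict(X_test)))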
Finally, let's examine Random Forest.
sample.ipynb
from sklearn.ensemble import RandomForestClassifier
#Create a classifier (random forest)
clf = RandomForestClassifier(
random_state=100,
n_estimators=50,
min_samples_split=100
)
#Learning
clf.fit(X_train, y_train)
#Forecast
y_pred = clf.predict(X_test)
#Display correct answer rate
from sklearn.metrics import accuracy_score
print(accuracy_score(y_test, y_pred))
0.7851162790697674
#Show precision
from sklearn.metrics import precision_score
print(precision_score(y_test, y_pred))
0.5121951219512195
#Display F value
from sklearn.metrics import f1_score
print(f1_score(y_test, y_pred))
0.35294117647058826
This is also the same implementation as logistic regression, except for the part that creates the classifier class. Be sure to tune and optimize the classifier's parameters as well.
See the reference below for details. Reference: https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.RandomForestClassifier.html
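As one way to tune the random forest parameters, a grid search similar to the one sketched for the support vector machine could be used; again the candidate values below are arbitrary examples.

from sklearn.model_selection import GridSearchCV
from sklearn.ensemble import RandomForestClassifier

# Search over the main random forest parameters (5-fold cross-validation, F value)
param_grid = {
    'n_estimators': [50, 100, 200],
    'min_samples_split': [20, 50, 100],
    'max_depth': [None, 5, 10],
}
gs = GridSearchCV(RandomForestClassifier(random_state=100), param_grid, scoring='f1', cv=5)
gs.fit(X_train, y_train)
print(gs.best_params_, gs.best_score_)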
Overfitting (also called over-learning) is, as the name implies, a state in which the model fits only the data it was trained on.
As an easy way to check whether a model tends to overfit, you can run predictions on both the training data and the evaluation data and compare the difference in accuracy.
sample.ipynb
#Create a classifier (random forest) * Try it without parameters
clf = RandomForestClassifier()
#Learning
clf.fit(X_train, y_train)
#Display the correct answer rate using evaluation data for prediction (normal evaluation flow)
y_pred = clf.predict(X_test)
print(accuracy_score(y_test, y_pred))
0.7609302325581395
#Use training data for prediction to display accuracy rate
y_pred_for_train = clf.predict(X_train)
print(accuracy_score(y_train, y_pred_for_train))
0.9992892679459844
I got a striking result when I ran the random forest without any parameters, so I reproduced it above. When a model is overfitting, the accuracy on the training data tends to be excessively higher than the accuracy on the evaluation data, as shown above.
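A somewhat more systematic way to gauge generalization is cross-validation on the training data; a minimal sketch:

from sklearn.model_selection import cross_val_score
from sklearn.ensemble import RandomForestClassifier

# 5-fold cross-validation accuracy of the parameterless random forest on the training data
scores = cross_val_score(RandomForestClassifier(), X_train, y_train, cv=5)
print(scores.mean(), scores.std())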
Next, I would like to use data from past runnings of the Arima Kinen to check the accuracy of the predictions. After training and evaluating with each algorithm as above, the following processing is performed.
sample.ipynb
#Arima Kinen race_index list
target_race_indexes = [
'2015122706050810',
'2016122506050910',
'2017122406050811',
'2018122306050811',
'2019122206050811'
]
for idx in target_race_indexes:
    # Get the Arima Kinen explanatory variables (X_target) and objective variable (y_target)
    X_target = dataX_std[dataX_std.index == idx]
    y_target = dataY[idx]
    # Prediction
    y_pred = clf.predict(X_target)
    # Display the results
    print('y=', idx[0:4], 'pred=', y_pred, 'result=', y_target.values, 'precision_score=', precision_score(y_target, y_pred))
The output results of each algorithm are as follows.
sample.ipynb
#For logistic regression
y= 2015 pred= [0 0 0 1 0 0 0 0 0 1 0 1 0 0 0 0] result= [0 0 0 0 0 0 1 0 1 0 1 0 0 0 0 0] precision_score= 0.0
y= 2016 pred= [0 0 0 0 0 0 0 0 0 1 1 0 0 0 1 0] result= [1 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0] precision_score= 0.3333333333333333
y= 2017 pred= [0 0 1 0 0 0 0 0 0 1 0 0 0 0 1 0] result= [0 1 1 0 0 0 0 0 0 1 0 0 0 0 0 0] precision_score= 0.6666666666666666
y= 2018 pred= [0 0 0 0 1 0 0 0 0 0 0 1 0 1 1 0] result= [0 0 0 0 0 0 0 1 0 0 0 1 0 0 1 0] precision_score= 0.5
y= 2019 pred= [0 0 0 0 0 0 0 0 1 1 1 0 0 0 0 0] result= [0 0 0 0 0 1 1 0 0 1 0 0 0 0 0 0] precision_score= 0.3333333333333333
#Support vector machine
y= 2015 pred= [0 0 0 0 0 0 1 0 1 0 1 0 0 0 0 0] result= [0 0 0 0 0 0 1 0 1 0 1 0 0 0 0 0] precision_score= 1.0
y= 2016 pred= [1 1 0 0 0 1 0 0 1 0 1 0 0 0 0 0] result= [1 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0] precision_score= 0.6
y= 2017 pred= [0 1 0 0 1 0 0 0 0 1 0 0 0 0 0 0] result= [0 1 1 0 0 0 0 0 0 1 0 0 0 0 0 0] precision_score= 0.6666666666666666
y= 2018 pred= [0 0 0 0 0 0 0 0 0 0 0 1 1 0 1 1] result= [0 0 0 0 0 0 0 1 0 0 0 1 0 0 1 0] precision_score= 0.5
y= 2019 pred= [0 0 0 0 0 1 1 0 0 1 0 0 0 1 0 0] result= [0 0 0 0 0 1 1 0 0 1 0 0 0 0 0 0] precision_score= 0.75
#K-nearest neighbor method
y= 2015 pred= [0 0 0 1 0 0 1 0 0 1 0 1 0 0 0 1] result= [0 0 0 0 0 0 1 0 1 0 1 0 0 0 0 0] precision_score= 0.2
y= 2016 pred= [1 1 0 0 0 0 0 0 1 0 1 0 0 1 0 1] result= [1 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0] precision_score= 0.5
y= 2017 pred= [0 1 1 0 1 1 1 0 0 1 1 1 0 0 0 0] result= [0 1 1 0 0 0 0 0 0 1 0 0 0 0 0 0] precision_score= 0.375
y= 2018 pred= [1 0 0 0 0 0 0 0 1 0 1 1 0 0 1 1] result= [0 0 0 0 0 0 0 1 0 0 0 1 0 0 1 0] precision_score= 0.3333333333333333
y= 2019 pred= [0 0 1 0 0 0 1 0 0 0 1 0 0 0 0 1] result= [0 0 0 0 0 1 1 0 0 1 0 0 0 0 0 0] precision_score= 0.25
#For random forest
y= 2015 pred= [0 0 0 1 0 0 1 0 1 1 0 1 0 0 0 0] result= [0 0 0 0 0 0 1 0 1 0 1 0 0 0 0 0] precision_score= 0.4
y= 2016 pred= [1 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0] result= [1 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0] precision_score= 1.0
y= 2017 pred= [0 1 1 0 0 1 0 0 0 1 0 0 0 0 1 0] result= [0 1 1 0 0 0 0 0 0 1 0 0 0 0 0 0] precision_score= 0.6
y= 2018 pred= [1 0 1 0 1 0 0 0 0 0 0 1 0 1 1 0] result= [0 0 0 0 0 0 0 1 0 0 0 1 0 0 1 0] precision_score= 0.3333333333333333
y= 2019 pred= [0 0 0 0 0 0 1 0 1 0 0 0 0 1 0 0] result= [0 0 0 0 0 1 1 0 0 1 0 0 0 0 0 0] precision_score= 0.3333333333333333
As I expected, the model predicts multiple horses as **1 (within 3rd place)** in each race. Ideally, I would narrow down the horses to buy by comparing the horses running in the same race against one another, but with the current implementation each horse is judged against the dataset as a whole, independently of its rivals. I would like to treat this as an issue for the future.
The accuracy figures for each algorithm are summarized in the table below.
Algorithm | Overall precision | Overall F value | Arima Kinen precision (5 years) |
---|---|---|---|
Logistic regression | 0.42 | 0.40 | 0.0, 0.33, 0.66, 0.5, 0.33 |
Support vector machine | 0.42 | 0.35 | 1.0, 0.6, 0.66, 0.5, 0.75 |
K-nearest neighbor method | 0.32 | 0.35 | 0.2, 0.5, 0.38, 0.33, 0.25 |
Random forest | 0.51 | 0.35 | 0.4, 1.0, 0.6, 0.33, 0.33 |
Which algorithm is best depends on the characteristics of the dataset, the number of samples, and even the timing of the run. With that in mind, based on the above results, we will use either the support vector machine or the random forest (although the support vector machine may be overfitting).
Because the model predicts multiple horses as **1 (within 3rd place)**, the raw classification is hard to use as it is. So I would like to narrow down the purchase targets by ranking the racehorses by their predicted probability of finishing **within 3rd place**.
sample.ipynb
#Arima Kinen race_index list
target_race_indexes = [
'2015122706050810',
'2016122506050910',
'2017122406050811',
'2018122306050811',
'2019122206050811'
]
for idx in target_race_indexes:
    # Get the Arima Kinen explanatory variables (X_target) and objective variable (y_target)
    X_target = dataX_std[dataX_std.index == idx]
    y_target = dataY[idx]
    # Prediction (predict the probability of being classified as 0 or 1)
    y_pred_proba = clf.predict_proba(X_target)
    # Convert to a dictionary (key: horse number, value: probability of being 1)
    keys = list(range(1, y_pred_proba[:, 1].size + 1))
    values = y_pred_proba[:, 1]
    pred_dict = dict(zip(keys, values))
    # Display the results in descending order of probability
    print('y=', idx[0:4])
    print(dict(sorted(pred_dict.items(), key=lambda x: x[1], reverse=True)))
The output is a bit verbose, but it looks like the following. The key of each dictionary is the horse number, and the value is the probability of being classified within 3rd place (each dictionary is sorted in descending order of probability).
sample.ipynb
y= 2015
{7: 0.5696133455536686, 9: 0.4905907696112562, 11: 0.49035299894918755, 13: 0.35007505837022596, 12: 0.34220680265218334, 3: 0.31354320341453473, 4: 0.30980352572486725, 6: 0.30215860817620876, 10: 0.28490440087889995, 16: 0.27909507104899467, 1: 0.27533238657398446, 8: 0.24462710225495993, 2: 0.24459098148537395, 14: 0.24457566067758357, 5: 0.2445741121569982, 15: 0.23657499952423014}
y= 2016
{1: 0.6170252668074172, 2: 0.6051853981429345, 11: 0.5713617761448656, 9: 0.477082991798865, 6: 0.46056067001143736, 12: 0.30720442615574284, 3: 0.30215860817620876, 13: 0.30215860817620876, 8: 0.3007077278874594, 16: 0.2824267715516374, 7: 0.24464207649468928, 10: 0.24460750167495196, 4: 0.24459032539440356, 5: 0.2445880535923202, 14: 0.24458580009313594, 15: 0.24457449358955205}
y= 2017
{10: 0.6170803259427108, 2: 0.617026799448752, 5: 0.4606653690190285, 11: 0.3979634800224914, 14: 0.34913956740973595, 15: 0.3483806159861276, 12: 0.30215860817620876, 4: 0.3021535584865604, 13: 0.30024466402472444, 9: 0.2922074543922137, 1: 0.28743844415935654, 8: 0.2835192845558853, 6: 0.24461953217712495, 7: 0.2445971287209923, 16: 0.24458997746828753, 3: 0.2398748266004306}
y= 2018
{15: 0.5931962545935543, 12: 0.5631034477026525, 13: 0.46364861217784636, 16: 0.4423252760260589, 10: 0.3453931564376497, 3: 0.31157557743661457, 14: 0.30392079440550224, 8: 0.303732258765211, 6: 0.30219848678824074, 2: 0.3021586072259061, 7: 0.302143337075652, 1: 0.2981084912586054, 4: 0.27316635690234087, 5: 0.2445861267179151, 11: 0.2445764568939144, 9: 0.2445733900887549}
y= 2019
{6: 0.6170145067552477, 7: 0.5872900780905845, 10: 0.4904861419159532, 14: 0.43700495515775173, 12: 0.3512586575980933, 2: 0.3087214186649427, 9: 0.30553764130552913, 15: 0.3021220272592637, 16: 0.24776137832454997, 11: 0.2446323520236049, 5: 0.2446088059727512, 13: 0.24459614207316613, 8: 0.24459434296808064, 1: 0.24458784939997164, 4: 0.24457367329291685, 3: 0.24452744515587446}
By using the **predict_proba** method, you can obtain the probability of a sample being classified as 0 or 1 instead of the hard 0/1 classification result. Here we use the probability of being classified as 1.
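Note that the columns returned by predict_proba are ordered according to clf.classes_; a quick check like this sketch confirms that column index 1 really corresponds to class 1.

# The column order of predict_proba follows clf.classes_
print(clf.classes_)            # expected to be [0 1]
print(y_pred_proba[:, 1][:3])  # probabilities of class 1 for the first three horses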
Now let's move on to the main subject: predicting this year's Arima Kinen.
sample.ipynb
# Arima Kinen race_index
target_race_index = '2020122706050811'
# Get the Arima Kinen explanatory variables (X_target)
X_target = dataX_std[dataX_std.index == target_race_index]
# Prediction
y_pred_proba = clf.predict_proba(X_target)
# Convert to a dictionary (key: horse number, value: probability of being 1)
keys = list(range(1, y_pred_proba[:, 1].size + 1))
values = y_pred_proba[:, 1]
pred_dict = dict(zip(keys, values))
# Display the results in descending order of probability
print(dict(sorted(pred_dict.items(), key=lambda x: x[1], reverse=True)))
With the above implementation, I ran the prediction three times each with the support vector machine and the random forest, and each time extracted the three horses with the highest probability of finishing within 3rd place. The results are shown below.
Run | Horse number | Horse name | Probability |
---|---|---|---|
First time | 5 | World premiere | 0.58 |
First time | 13 | Fierement | 0.48 |
First time | 15 | Ocean Great | 0.42 |
Run | Horse number | Horse name | Probability |
---|---|---|---|
Second time | 5 | World premiere | 0.58 |
Second time | 13 | Fierement | 0.58 |
Second time | 14 | Salacia | 0.41 |
Run | Horse number | Horse name | Probability |
---|---|---|---|
Third time | 13 | Fierement | 0.55 |
Third time | 5 | World premiere | 0.53 |
Third time | 10 | Curren Bouquetdore | 0.42 |
Run | Horse number | Horse name | Probability |
---|---|---|---|
First time | 13 | Fierement | 0.63 |
First time | 5 | World premiere | 0.52 |
First time | 4 | Loves Only You | 0.49 |
Run | Horse number | Horse name | Probability |
---|---|---|---|
Second time | 13 | Fierement | 0.64 |
Second time | 5 | World premiere | 0.55 |
Second time | 9 | Chrono Genesis | 0.54 |
Run | Horse number | Horse name | Probability |
---|---|---|---|
Third time | 13 | Fierement | 0.60 |
Third time | 4 | Loves Only You | 0.57 |
Third time | 5 | World premiere | 0.56 |
The result is that #5 World Premiere and #13 Fierement stand clearly ahead, with the third spot contested. However, it is a little troubling that the horse in third place changes on every run. Since the number of data samples is not that large and train_test_split is used to split the training and evaluation data randomly each time, the split itself seems to be influencing the result.
This time I will leave it as it is and treat it as an issue for the future.
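As a note for that future work, one minimal way to at least make the split (and hence the prediction) reproducible would be to fix the random seed of train_test_split, for example:

# Fixing random_state makes the train/test split reproducible across runs
X_train, X_test, y_train, y_test = train_test_split(
    dataX_std, dataY, test_size=0.2, stratify=dataY, random_state=0
)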
This time I tried out the learning model I had built in a practical setting, and, as expected, practice and the real thing turned out to be different, which gave me a number of insights. Next time, I would like to work on the issues I postponed and on building a more general-purpose model.
I actually wanted to buy a betting ticket, upload a captured image, and close with that, but the sale has not started yet... sorry. I will add it once I actually make the purchase.
**Added on December 26, 2020:** Referring to the prediction results, I purchased a Wide 5-13 ticket for one point.