Purpose

Notes to help me remember what I learned on PyQ.

pandas

About drawing

・ To draw a histogram, use `plt.hist`

`plt.hist(df[df["y"] == 1]["x"], label="men 16years old", bins=100, range=(140, 187), alpha=0.3, color="green")` (`df` is data loaded from a CSV file)

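・ `df[df["y"] == 1]["x"]`: the values of column `x` in the rows of `df` where column `y` equals 1
・ `label="men 16years old"`: the label shown for this data in the legend
・ `bins=100`: the number of bins (classes); the width of one bin is the width of `range` divided by `bins`, here (187 - 140) / 100 = 0.47
・ `alpha=0.3`: the transparency of the bars (0 is fully transparent, 1 is fully opaque)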

・ `plt.xlabel("height [cm]")`: sets the x-axis title
・ `plt.legend()`: shows the legend (the description given by `label`)
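The histogram call above can be tried out with a small sketch like the following. The data here is randomly generated for illustration (the original notes load `df` from a CSV file that is not shown), and the `Agg` backend is an assumption so that the script runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no window is opened
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical data: "x" is a height in cm, "y" is 1 for men and 0 for women.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "x": np.concatenate([rng.normal(168, 6, 500), rng.normal(157, 5, 500)]),
    "y": np.repeat([1, 0], 500),
})

# Column "x" for the rows where column "y" equals 1.
men_heights = df[df["y"] == 1]["x"]

# plt.hist returns the per-bin counts and the bin edges.
counts, edges, _ = plt.hist(
    men_heights, label="men 16years old", bins=100,
    range=(140, 187), alpha=0.3, color="green",
)
plt.xlabel("height [cm]")
plt.legend()

# Width of one bin = (187 - 140) / 100
bin_width = edges[1] - edges[0]
```

In a notebook or an interactive session, `plt.show()` would display the figure.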

・ To draw a scatter plot, use `plt.scatter`

`plt.scatter(men["height"], men["weight"], color="green")`

The first argument is the data for the horizontal axis. The second argument is the data for the vertical axis.
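As a minimal sketch of the scatter plot call, assuming a small hand-made `men` DataFrame (the values are invented for illustration) and the non-interactive `Agg` backend:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no window is opened
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical data standing in for the "men" DataFrame in the notes.
men = pd.DataFrame({
    "height": [160.0, 165.5, 170.2, 175.8, 181.0],
    "weight": [52.0, 58.5, 63.0, 70.4, 76.1],
})

# First argument: horizontal axis values. Second argument: vertical axis values.
points = plt.scatter(men["height"], men["weight"], color="green")
plt.xlabel("height [cm]")
plt.ylabel("weight [kg]")

# Each plotted point is stored as an (x, y) pair.
xy = points.get_offsets()
```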

・ To draw a scatter plot matrix, use `pd.plotting.scatter_matrix(df)`
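A short sketch of the scatter plot matrix, assuming a DataFrame with three numeric columns filled with random values (the column names `a`, `b`, `c` are placeholders):

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no window is opened
import numpy as np
import pandas as pd

# Hypothetical numeric data with three columns.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.normal(size=(50, 3)), columns=["a", "b", "c"])

# One subplot per pair of columns; the result is a 2-D array of Axes.
axes = pd.plotting.scatter_matrix(df)
```

With three columns the result is a 3 x 3 grid: scatter plots off the diagonal and, by default, a histogram of each column on the diagonal.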

DataFrame

・ To extract column values, pass a list of column names: `df[["alcohol content", "density"]]`

・ To extract by position, use `df.iloc[rows to retrieve, columns to retrieve]` (note the square brackets)
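The two ways of extracting columns can be compared in a small sketch. The DataFrame below is invented for illustration; only the column names `alcohol content` and `density` come from the notes, and `quality` is a hypothetical extra column.

```python
import pandas as pd

# Hypothetical data; "quality" is an extra column added for illustration.
df = pd.DataFrame({
    "alcohol content": [12.5, 13.0, 11.8],
    "density": [0.996, 0.994, 0.998],
    "quality": [5, 6, 5],
})

# Select columns by name (a list inside the brackets returns a DataFrame).
by_name = df[["alcohol content", "density"]]

# Select the same columns by integer position: all rows, columns 0 and 1.
by_position = df.iloc[:, [0, 1]]

# Slices work too: the first two rows of the first two columns.
first_two_rows = df.iloc[0:2, 0:2]
```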

・ To split the data into training and evaluation (test) sets, use `train_test_split`

`from sklearn.model_selection import train_test_split`

`(X_train, X_test, y_train, y_test) = train_test_split(X, y, test_size=0.3, random_state=0)`

`test_size=0.3`: the fraction of the data set aside for testing (here 30%)

`random_state=0`: the random seed used when splitting the data; fixing it makes the split reproducible (usually not specified)
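A minimal sketch of the split, assuming ten made-up samples so that the 70/30 proportions are easy to check:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical data: 10 samples with 2 features each, and 10 target values.
X = np.arange(20).reshape(10, 2)
y = np.arange(10)

# 30% of the samples go to the test set, the rest to the training set.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# The same random_state gives the same split again.
X_train_again, X_test_again, _, _ = train_test_split(
    X, y, test_size=0.3, random_state=0
)
```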

Decision tree

A decision tree is "a series of if statements whose conditions are learned automatically".
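To make the "series of if statements" idea concrete, here is a small sketch using scikit-learn's `DecisionTreeClassifier` and `export_text`. The toy data and the feature names `height` and `weight` are assumptions for illustration, not something from the notes.

```python
from sklearn.tree import DecisionTreeClassifier, export_text

# Hypothetical toy data: [height_cm, weight_kg] -> class 0 or class 1.
X = [[150, 45], [155, 50], [160, 55], [170, 65], [175, 70], [180, 80]]
y = [0, 0, 0, 1, 1, 1]

# The tree learns the threshold conditions from the data.
clf = DecisionTreeClassifier(random_state=0).fit(X, y)

# export_text prints the learned conditions as nested "feature <= threshold" rules.
rules = export_text(clf, feature_names=["height", "weight"])
print(rules)

small = clf.predict([[152, 48]])[0]
large = clf.predict([[178, 75]])[0]
```

The printed rules read like nested if/else branches, except that the thresholds were chosen by the learning algorithm rather than written by hand.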

NumPy

**How to create a multidimensional array whose elements are all the same**

・ `zeros(shape)`: a multidimensional array whose elements are all 0
・ `ones(shape)`: a multidimensional array whose elements are all 1
・ `full(shape, value)`: a multidimensional array whose elements are all `value`
・ `zeros_like(array)`: an array with the same shape as `array`, with all elements 0
・ `ones_like(array)`: an array with the same shape as `array`, with all elements 1
・ `full_like(array, value)`: an array with the same shape as `array`, with all elements `value`
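The creation functions listed above can be sketched as follows; the shapes and fill values are arbitrary choices for illustration.

```python
import numpy as np

# Arrays built from a shape tuple (2 rows, 3 columns).
a = np.zeros((2, 3))      # all elements 0.0
b = np.ones((2, 3))       # all elements 1.0
c = np.full((2, 3), 7)    # all elements 7

# Arrays that copy the shape and dtype of an existing array.
base = np.array([[1, 2], [3, 4]])
d = np.zeros_like(base)   # same shape as base, all 0
e = np.ones_like(base)    # same shape as base, all 1
f = np.full_like(base, 9) # same shape as base, all 9
```

Note that `zeros` and `ones` default to floating-point elements, while the `*_like` variants take their dtype from the array passed in.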

**Continuous data**

・ `arange([start,] stop[, step], dtype=None)`: creates evenly spaced data, like `range`
・ `linspace(start, stop, num=50, endpoint=True, retstep=False, dtype=None)`: creates evenly spaced data when the range and the number of points `num` are known
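The difference between the two functions is easiest to see side by side. A minimal sketch, with arbitrary ranges chosen for illustration:

```python
import numpy as np

# arange: fix the step; the stop value is excluded, like range().
by_step = np.arange(0, 10, 2)

# linspace: fix the number of points; the stop value is included by default.
by_count = np.linspace(0, 1, num=5)

# endpoint=False excludes the stop value; retstep=True also returns the spacing.
half_open, step = np.linspace(0, 1, num=5, endpoint=False, retstep=True)
```

Here `by_step` holds 0, 2, 4, 6, 8, `by_count` holds 0, 0.25, 0.5, 0.75, 1, and `half_open` holds 0, 0.2, 0.4, 0.6, 0.8 with a step of 0.2.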

**Identity matrix and diagonal matrix**

・ `numpy.eye`: an identity matrix, with 1 on the diagonal and 0 elsewhere
・ `numpy.diag`: a diagonal matrix with the given values on the diagonal
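A short sketch of both functions, using a 3 x 3 size and the diagonal values 1, 2, 3 as arbitrary examples:

```python
import numpy as np

# 3 x 3 identity matrix: 1.0 on the diagonal, 0.0 elsewhere.
identity = np.eye(3)

# Diagonal matrix built from a 1-D list of diagonal values.
diagonal = np.diag([1, 2, 3])

# Given a 2-D array, np.diag goes the other way and extracts the diagonal.
extracted = np.diag(diagonal)
```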

Evaluation criteria

An evaluation criterion is the kind of measuring stick used to judge a model. It also determines the scale on which the result is read.

Notes on PyQ machine learning python grammar
