Notes to remember what I learned in PyQ
pandas
・ When drawing a histogram, use `plt.hist`
plt.hist(df[df["y"] == 1]["x"], label="men 16 years old", bins=100, range=(140, 187), alpha=0.3, color="green")
(df is a DataFrame loaded from CSV data)
plt.xlabel("height [cm]")
: x-axis title
plt.legend()
: show the legend (the label given to each data series)
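As a rough sketch of the histogram calls above, the following uses randomly generated data in place of the CSV file. The DataFrame contents, the random seed, and the label text are assumptions made up for illustration; only the column names `x` and `y` follow the notes.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical stand-in for the CSV data: x is a height in cm, y is a group flag.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "x": rng.normal(loc=165, scale=6, size=500),
    "y": rng.integers(0, 2, size=500),
})

# Keep only the rows where y == 1, then take the x column.
heights = df[df["y"] == 1]["x"]

# plt.hist returns the bin counts, the bin edges, and the drawn patches.
counts, bin_edges, _ = plt.hist(
    heights, label="group y == 1", bins=100, range=(140, 187),
    alpha=0.3, color="green",
)
plt.xlabel("height [cm]")  # x-axis title
plt.legend()               # show the label passed to plt.hist
```

Values outside `range` are ignored, so the counts can add up to fewer rows than were passed in. In a script rather than a notebook, `plt.show()` would be needed to display the figure.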
・ When drawing a scatter plot, use `plt.scatter`
plt.scatter(men["height"], men["weight"], color="green")
The first argument is the data for the horizontal axis. The second argument is the data for the vertical axis.
・ When drawing a scatter plot matrix
pd.plotting.scatter_matrix(df)
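The scatter plot and scatter plot matrix calls can be tried with a small made-up table. The values in `men` below are randomly generated assumptions, not the data from the original exercise.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical data: weight loosely follows height plus some noise.
rng = np.random.default_rng(0)
height = rng.normal(170, 6, size=200)
weight = 0.9 * height - 90 + rng.normal(0, 5, size=200)
men = pd.DataFrame({"height": height, "weight": weight})

# First argument: horizontal axis. Second argument: vertical axis.
plt.scatter(men["height"], men["weight"], color="green")
plt.xlabel("height [cm]")
plt.ylabel("weight [kg]")

# scatter_matrix draws one panel per pair of columns in a new figure
# and returns the grid of axes; by default the diagonal shows histograms.
axes = pd.plotting.scatter_matrix(men)
```

With two columns the matrix is a 2 x 2 grid, which makes it a quick way to look at every pairwise relationship at once.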
DataFrame
・ Extract column values
Specify the column names, as in df[["alcohol content", "density"]]
To extract by position, use df.iloc
df.iloc[rows to retrieve, columns to retrieve]
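A minimal sketch of both ways of extracting data, assuming a small hand-written table. The `quality` column and all values are hypothetical.

```python
import pandas as pd

# Hypothetical values, only for illustration.
df = pd.DataFrame({
    "alcohol content": [9.4, 9.8, 10.5],
    "density": [0.9978, 0.9968, 0.9970],
    "quality": [5, 5, 6],
})

# A list of column names returns a DataFrame with just those columns.
two_columns = df[["alcohol content", "density"]]

# iloc selects by integer position: [rows, columns].
first_two_rows = df.iloc[0:2, :]      # rows 0 and 1, every column
first_row_last_col = df.iloc[0, -1]   # a single value
subset = df.iloc[[0, 2], [0, 1]]      # rows 0 and 2, columns 0 and 1
```

Note that `iloc` slices exclude the end position, like ordinary Python slices.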
・ Divide the data for training and evaluation (test)
Use train_test_split
from sklearn.model_selection import train_test_split
(X_train, X_test, y_train, y_test) = train_test_split(X, y, test_size=0.3, random_state=0)
test_size = 0.3: the fraction of the data used for testing (here 30%)
random_state = 0: random seed used when dividing the data (usually not specified; fixing it makes the split reproducible)
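To see what `test_size` and `random_state` do, here is a small sketch with ten made-up samples. The arrays `X` and `y` are assumptions for illustration only.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical data: 10 samples with 2 features each, and 10 targets.
X = np.arange(20).reshape(10, 2)
y = np.arange(10)

# 30% of the samples go to the test set, the rest to the training set.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# The same seed gives the same split again.
X_train_again, X_test_again, _, _ = train_test_split(
    X, y, test_size=0.3, random_state=0
)
same_split = bool((X_test == X_test_again).all())

print(len(X_train), len(X_test), same_split)
```

The rows are shuffled before splitting, so the test set is not simply the last three rows.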
A decision tree is "a series of if statements whose conditions are learned automatically"
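The "series of if statements" can be made visible by printing a fitted tree as text. This is only a sketch on toy data: the samples, the feature names `height_cm` and `weight_kg`, and the `max_depth` setting are all assumptions.

```python
from sklearn.tree import DecisionTreeClassifier, export_text

# Hypothetical toy data: [height_cm, weight_kg] -> label 0 or 1.
X = [[150, 45], [155, 50], [160, 52], [172, 68], [178, 75], [185, 82]]
y = [0, 0, 0, 1, 1, 1]

tree = DecisionTreeClassifier(max_depth=2, random_state=0)
tree.fit(X, y)

# export_text prints the learned conditions as nested if/else style rules.
rules = export_text(tree, feature_names=["height_cm", "weight_kg"])
print(rules)

# Predicting a new sample follows those rules from the top down.
predictions = tree.predict([[152, 48], [180, 77]])
```

Each line of the printed rules is a threshold comparison on one feature, which the tree chose from the training data rather than from hand-written conditions.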
NumPy
**How to create a multidimensional array with the same elements**
zeros(size): a multidimensional array whose elements are all 0
ones(size): a multidimensional array whose elements are all 1
full(size, value): a multidimensional array whose elements are all value
zeros_like(multidimensional array): an array with the same shape as the given array, all elements 0
ones_like(multidimensional array): an array with the same shape as the given array, all elements 1
full_like(multidimensional array, value): an array with the same shape as the given array, all elements value
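A short sketch of these constructors follows. The shapes and fill values are arbitrary choices for illustration.

```python
import numpy as np

a = np.zeros((2, 3))     # 2 x 3 array of 0.0 (float by default)
b = np.ones((2, 3))      # 2 x 3 array of 1.0
c = np.full((2, 3), 7)   # 2 x 3 array filled with 7

# The *_like variants copy the shape and dtype of an existing array.
base = np.array([[1, 2], [3, 4]])
d = np.zeros_like(base)    # same shape as base, all 0
e = np.ones_like(base)     # same shape as base, all 1
f = np.full_like(base, 9)  # same shape as base, all 9
```

Because `zeros_like` and its relatives also keep the dtype, `d`, `e`, and `f` are integer arrays here, while `a` and `b` are floating point.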
**Sequential data**
arange([start,] stop[, step], dtype=None): creates sequential data in the same way as range
linspace(start, stop, num=50, endpoint=True, retstep=False, dtype=None): creates sequential data when the range and the number of points num are already decided
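The difference between the two is easiest to see side by side. The ranges below are arbitrary example values.

```python
import numpy as np

# arange: fixed step, the stop value is excluded (like range).
evens = np.arange(0, 10, 2)

# linspace: fixed number of points, the stop value is included by default.
quarters = np.linspace(0, 1, num=5)

# endpoint=False leaves the stop value out.
fifths = np.linspace(0, 1, num=5, endpoint=False)

# retstep=True also returns the spacing between points.
points, step = np.linspace(0, 1, num=5, retstep=True)

print(evens, quarters, fifths, step)
```

As a rule of thumb, `arange` fits when the step is known and `linspace` fits when the number of points is known.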
**Identity matrix and diagonal matrix**
numpy.eye: identity matrix whose diagonal elements are all 1
numpy.diag: an arbitrary diagonal matrix
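A brief sketch of both functions, with an arbitrary size and arbitrary diagonal values:

```python
import numpy as np

# 3 x 3 identity matrix: 1 on the diagonal, 0 elsewhere.
identity = np.eye(3)

# Given a 1-D sequence, diag builds a matrix with those values on the diagonal.
diagonal = np.diag([1, 2, 3])

# Given a 2-D array, diag does the reverse and extracts the diagonal.
extracted = np.diag(diagonal)

print(identity)
print(diagonal)
print(extracted)
```

So `numpy.diag` works in both directions depending on the dimension of its input.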
The evaluation metric is the type of measuring rod. The evaluation criterion is the scale on that measuring rod.