As a memo, this post summarizes the outline, scikit-learn classes, example use cases, keywords, and reference sites that were helpful when learning about "supervised learning" and "unsupervised learning".
Very roughly: a prediction model is built by training on data that represents the features together with the corresponding answer (label) data. Prediction problems divide into classification problems and regression problems.
Linear regression: among all candidate straight lines, find the parameters that give the smallest value of the loss function (error function) (minimal sketch after the list below).
--Class to use: sklearn.linear_model.LinearRegression
--Example: Relationship between the number of visitors and sales, etc.
--Keywords: simple regression, multiple regression, polynomial regression, non-linear regression
--Reference site: [Linear regression with scikit-learn (single regression analysis / multiple regression analysis)](https://pythondatascience.plavox.info/scikit-learn/%E7%B7%9A%E5%BD%A2%E5%9B%9E%E5%B8%B0)
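A minimal sketch of simple regression with `LinearRegression`; the visitor/sales figures are made up for illustration:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Made-up data: number of visitors (feature) vs. sales (target)
X = np.array([[10], [20], [30], [40], [50]])
y = np.array([120, 210, 310, 390, 510])

model = LinearRegression()
model.fit(X, y)  # finds the slope/intercept minimizing squared error

print(model.coef_, model.intercept_)  # learned parameters
print(model.predict([[60]]))          # predicted sales for 60 visitors
```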
Logistic regression: a binary classification algorithm, applied to classification problems (example after the list below).
--Class to use: sklearn.linear_model.LogisticRegression
--Example: Relationship between sales visits / satisfaction and sales, etc.
--Keywords: sigmoid function, cross entropy error function
--Reference site: Classification of iris by logistic regression of scikit-learn
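A minimal sketch, keeping only two iris classes so the problem stays binary:

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

# Keep two classes (setosa vs. versicolor) for binary classification
X, y = load_iris(return_X_y=True)
X, y = X[y < 2], y[y < 2]

clf = LogisticRegression()
clf.fit(X, y)
print(clf.predict(X[:3]))        # predicted labels
print(clf.predict_proba(X[:3]))  # class probabilities (via the sigmoid)
```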
Support vector machine (SVM): an algorithm that learns a decision boundary (a straight line) placed as far from the data points as possible; it can be used for both classification and regression (sketch after the list below).
--Class to use: sklearn.svm.SVC
--Case: Text classification, digit recognition, etc.
--Keywords: hard margin, soft margin
--Reference site: What is a support vector machine (SVM)? ~ From basic to Python implementation ~
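A minimal sketch of digit recognition with a linear-kernel `SVC`; the value of `C` (margin softness) is an illustrative choice:

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = SVC(kernel="linear", C=1.0)  # smaller C -> softer margin
clf.fit(X_train, y_train)
print(clf.score(X_test, y_test))   # accuracy on held-out digits
```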
Kernel SVM: a kernel function maps the data from the original space into a space where it can be separated by a hyperplane, and the data set is then separated there (example after the list below).
--Class to use: sklearn.svm.SVC
--Case example: Product identification from color information, etc.
--Keywords: Kernel functions (sigmoid kernel, polynomial kernel, RBF [radial basis function] kernel)
--Reference site: [Python] Implementing support vector machines using various kernel functions [iris dataset]
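A minimal sketch with the RBF kernel on the iris data; the hyperparameter values are illustrative:

```python
from sklearn.datasets import load_iris
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# The RBF kernel implicitly maps the data into a higher-dimensional space
clf = SVC(kernel="rbf", gamma="scale", C=1.0)
clf.fit(X, y)
print(clf.score(X, y))

# Alternatives: kernel="poly" (polynomial), kernel="sigmoid"
```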
Naive Bayes: under the assumption that the features are mutually independent, computes the probability that the data belongs to each label (toy example after the list below).
--Class to use: sklearn.naive_bayes.MultinomialNB (others: GaussianNB, BernoulliNB, etc.)
--Case: Judgment of junk mail, etc.
--Keyword: Smoothing
--Reference site: Naive Bayes classifier by scikit-learn
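A minimal sketch of junk-mail judgment; the four mails and their labels are made up, and `alpha` is the smoothing parameter:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Made-up toy corpus: 1 = junk mail, 0 = normal mail
mails = ["win money now", "meeting at noon", "win a free prize", "lunch tomorrow"]
labels = [1, 0, 1, 0]

vec = CountVectorizer()
X = vec.fit_transform(mails)     # word-count features
clf = MultinomialNB(alpha=1.0)   # alpha: Laplace smoothing
clf.fit(X, labels)
print(clf.predict(vec.transform(["free money"])))  # likely judged junk
```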
Random forest: collects the outputs of many diverse decision trees and produces the classification result by majority vote (sketch after the list below).
--Class to use: sklearn.ensemble.RandomForestClassifier
--Case: Classification by behavior history and attributes
--Keywords: Gini impurity, bootstrap method
--Reference site: [Introduction] Decision tree analysis for beginners by beginners
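A minimal sketch on the iris data; `n_estimators=100` (the number of trees) is an illustrative choice:

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Each tree sees a bootstrap sample; prediction is the trees' majority vote
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)
print(clf.score(X_test, y_test))
print(clf.feature_importances_)  # per-feature importance
```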
Neural network: by inserting hidden (intermediate) layers between the input and the output, complex decision boundaries can be learned (example after the list below).
--Class to use: sklearn.neural_network.MLPClassifier
--Case: Image recognition, voice recognition
--Keywords: simple perceptron, activation function, early stopping
--Reference site: Let's make a neural network by yourself
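A minimal sketch on the digits data; the hidden-layer size and `max_iter` are illustrative, and `early_stopping=True` enables the early-stopping keyword above:

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# One hidden layer of 100 units with a ReLU activation function;
# training stops when the validation score stops improving
clf = MLPClassifier(hidden_layer_sizes=(100,), activation="relu",
                    early_stopping=True, max_iter=500, random_state=0)
clf.fit(X_train, y_train)
print(clf.score(X_test, y_test))
```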
k-nearest neighbors: classifies an input by majority vote among the k training points nearest to it (sketch after the list below).
--Class to use: sklearn.neighbors.KNeighborsClassifier
--Reference site: Machine learning ~ K-nearest neighbor method ~
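A minimal sketch with k=5, an illustrative choice:

```python
from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

# Each prediction is a majority vote among the 5 nearest training points
clf = KNeighborsClassifier(n_neighbors=5)
clf.fit(X, y)
print(clf.predict(X[:3]))
```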
- **a. For classification problems** (combined sketch after this list)
--a-1. Confusion matrix
Class to use: sklearn.metrics.confusion_matrix
--a-2. Accuracy (correct answer rate)
Class to use: sklearn.metrics.accuracy_score
--a-3. Precision
Class to use: sklearn.metrics.precision_score
--a-4. Recall
Class to use: sklearn.metrics.recall_score
--a-5. F1 score (F-measure)
Class to use: sklearn.metrics.f1_score
--a-6. ROC curve / AUC
Class to use: sklearn.metrics.roc_curve
Reference sites: Generate a confusion matrix and calculate precision, recall, F1 score, etc. with scikit-learn; Calculate the ROC curve and its AUC with scikit-learn
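A minimal sketch computing all of the above on made-up binary labels; `y_score` stands in for probabilities that would normally come from `predict_proba`:

```python
from sklearn.metrics import (confusion_matrix, accuracy_score, precision_score,
                             recall_score, f1_score, roc_curve, auc)

y_true  = [0, 0, 1, 1, 1, 0, 1, 0]   # made-up ground truth
y_pred  = [0, 1, 1, 1, 0, 0, 1, 0]   # made-up hard predictions
y_score = [0.1, 0.6, 0.8, 0.9, 0.4, 0.2, 0.7, 0.3]  # made-up probabilities

print(confusion_matrix(y_true, y_pred))
print(accuracy_score(y_true, y_pred))
print(precision_score(y_true, y_pred))
print(recall_score(y_true, y_pred))
print(f1_score(y_true, y_pred))

fpr, tpr, _ = roc_curve(y_true, y_score)
print(auc(fpr, tpr))  # area under the ROC curve
```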
- **b. For regression problems** (sketch after this list)
--b-1. Mean squared error
Class to use: sklearn.metrics.mean_squared_error
--b-2. Mean absolute error
Class to use: sklearn.metrics.mean_absolute_error
--b-3. Coefficient of determination
Class to use: sklearn.metrics.r2_score
Reference site: [Evaluate the results of the regression model with scikit-learn](https://pythondatascience.plavox.info/scikit-learn/%E5%9B%9E%E5%B8%B0%E3%83%A2%E3%83%87%E3%83%AB%E3%81%AE%E8%A9%95%E4%BE%A1)
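A minimal sketch with made-up true values and predictions:

```python
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score

y_true = [3.0, 5.0, 2.5, 7.0]  # made-up targets
y_pred = [2.8, 5.4, 2.1, 6.5]  # made-up predictions

print(mean_squared_error(y_true, y_pred))   # b-1. MSE
print(mean_absolute_error(y_true, y_pred))  # b-2. MAE
print(r2_score(y_true, y_pred))             # b-3. coefficient of determination
```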
- **a. Hyperparameters** (example after this list)
--a-1. Grid search
Class to use: sklearn.model_selection.GridSearchCV
--a-2. Random search
Class to use: sklearn.model_selection.RandomizedSearchCV
Reference site: Let's tune the model hyperparameters with scikit-learn!
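A minimal grid-search sketch; the parameter grid is an illustrative choice (`RandomizedSearchCV` has a similar interface but samples candidates randomly):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# Try every combination in the grid, scored by 5-fold cross-validation
param_grid = {"C": [0.1, 1, 10], "gamma": [0.01, 0.1, 1]}
search = GridSearchCV(SVC(kernel="rbf"), param_grid, cv=5)
search.fit(X, y)
print(search.best_params_, search.best_score_)
```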
- **b. Splitting the data (training data & validation data)** (sketch after this list)
--b-1. Holdout method
Class to use: sklearn.model_selection.train_test_split
--b-2. Cross-validation method
Class to use: sklearn.model_selection.cross_val_score, sklearn.model_selection.KFold
--b-3. Leave-one-out method
Class to use: sklearn.model_selection.LeaveOneOut
Reference site: [About the method of dividing learning data and test data in machine learning and deep learning](https://newtechnologylifestyle.net/%E6%A9%9F%E6%A2%B0%E5%AD%A6%E7%BF%92%E3%80%81%E3%83%87%E3%82%A3%E3%83%BC%E3%83%97%E3%83%A9%E3%83%BC%E3%83%8B%E3%83%B3%E3%82%B0%E3%81%A7%E3%81%AE%E5%AD%A6%E7%BF%92%E3%83%87%E3%83%BC%E3%82%BF%E3%81%A8/)
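A minimal sketch of all three splitting methods; the split ratio and fold count are illustrative:

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import (train_test_split, cross_val_score,
                                     KFold, LeaveOneOut)

X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=1000)

# b-1. Holdout: keep 30% of the data aside for validation
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3,
                                                    random_state=0)

# b-2. Cross-validation with 5 folds
scores = cross_val_score(model, X, y,
                         cv=KFold(n_splits=5, shuffle=True, random_state=0))
print(scores.mean())

# b-3. Leave-one-out: each sample is the validation set exactly once
print(cross_val_score(model, X, y, cv=LeaveOneOut()).mean())
```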
- **c. Regularization** (example after this list)
--c-1. Ridge regression
Class to use: sklearn.linear_model.Ridge
--c-2. Lasso regression
Class to use: sklearn.linear_model.Lasso
Reference site: Explanation of ridge regression and lasso regression in the shortest time (learning of machine learning # 3)
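A minimal sketch on made-up data; the `alpha` values (regularization strength) are illustrative:

```python
import numpy as np
from sklearn.linear_model import Ridge, Lasso

# Made-up data: only the first two of three features matter
rng = np.random.RandomState(0)
X = rng.rand(50, 3)
y = 2 * X[:, 0] + 0.5 * X[:, 1] + 0.1 * rng.rand(50)

ridge = Ridge(alpha=1.0).fit(X, y)  # L2 penalty shrinks coefficients
lasso = Lasso(alpha=0.1).fit(X, y)  # L1 penalty can zero them out
print(ridge.coef_)
print(lasso.coef_)  # irrelevant features tend toward exactly 0
```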
Very roughly: unlike supervised learning, there is no objective (target) variable. Instead, the structure of the feature data is extracted by transforming it into another form or by finding subsets of it. Techniques include dimensionality reduction and clustering.
Principal component analysis (PCA): summarizes a large number of quantitative explanatory variables into fewer indicators (composite variables), reducing the number of variables in the data (sketch after the list below).
--Class to use: sklearn.decomposition.PCA
--Keywords: Covariance matrix, eigenvalue problem, cumulative contribution rate
--Reference site: Principal component analysis and eigenvalue problem
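A minimal sketch projecting the four iris features onto two principal components:

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

X, _ = load_iris(return_X_y=True)

pca = PCA(n_components=2)
X_2d = pca.fit_transform(X)           # 4 features -> 2 components
print(pca.explained_variance_ratio_)  # contribution rate of each component
print(np.cumsum(pca.explained_variance_ratio_))  # cumulative contribution rate
```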
k-means: partitions the data into a given number of clusters, grouping similar points together (example after the list below).
--Class to use: sklearn.cluster.KMeans
--Case: Marketing data analysis, image classification
--Keywords: sum of squares in cluster, elbow method, silhouette analysis, k-means ++, k-medoids method
--Reference site: How to find the optimum number of clusters for k-means
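A minimal sketch with three clusters (an illustrative choice); `inertia_` is the within-cluster sum of squares plotted by the elbow method:

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris

X, _ = load_iris(return_X_y=True)

km = KMeans(n_clusters=3, init="k-means++", n_init=10, random_state=0)
labels = km.fit_predict(X)  # cluster assignment of each point
print(labels[:10])
print(km.inertia_)          # within-cluster sum of squares
```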
Latent semantic analysis (LSA): for text data, reduces the features from the number of words down to the number of latent topics, which makes it possible to measure similarity between words and between documents (toy example after the list below).
--Class to use: sklearn.decomposition.TruncatedSVD
--Keywords: Singular value decomposition, topic model, tf-idf
--Reference site: Machine Learning Latent Semantics Theory
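A minimal sketch on a made-up four-document corpus, combining tf-idf features with `TruncatedSVD`:

```python
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer

docs = ["apples and oranges are fruit", "cars and bikes are vehicles",
        "oranges are orange", "bikes have two wheels"]  # made-up corpus

X = TfidfVectorizer().fit_transform(docs)  # tf-idf word features
svd = TruncatedSVD(n_components=2, random_state=0)
X_topics = svd.fit_transform(X)            # word space -> 2 latent topics
print(X_topics)  # document coordinates in topic space
```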
Non-negative matrix factorization (NMF): a dimensionality reduction method with the property that all input and output values are non-negative (sketch after the list below).
--Class to use: sklearn.decomposition.NMF
--Case: Recommendation, text mining
--Reference site: Understanding non-negative matrix factorization (NMF) softly
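A minimal sketch factorizing a made-up non-negative user-by-item matrix, a toy recommendation setting:

```python
import numpy as np
from sklearn.decomposition import NMF

# Made-up non-negative matrix (users x items)
X = np.array([[5, 3, 0, 1],
              [4, 0, 0, 1],
              [1, 1, 0, 5],
              [0, 1, 5, 4]], dtype=float)

nmf = NMF(n_components=2, init="nndsvd", max_iter=500, random_state=0)
W = nmf.fit_transform(X)   # both factors stay non-negative
H = nmf.components_
print(np.round(W @ H, 1))  # approximate reconstruction of X
```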
Latent Dirichlet allocation (LDA): builds topics from the words in the documents and infers which topics each document is composed of (toy example after the list below).
--Class to use: sklearn.decomposition.LatentDirichletAllocation
--Case: Natural language processing
--Keywords: Topic model, Dirichlet distribution
--Reference site: Explanation of points that are difficult for beginners to understand in the topic model (LDA)
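A minimal sketch on a made-up corpus that mixes two topics (sports and politics, by construction):

```python
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

docs = ["soccer baseball sports game", "election government policy vote",
        "tennis game player score", "policy tax government budget"]

X = CountVectorizer().fit_transform(docs)  # word counts

lda = LatentDirichletAllocation(n_components=2, random_state=0)
doc_topics = lda.fit_transform(X)  # topic mixture of each document
print(doc_topics)
print(lda.components_.shape)       # (n_topics, n_words): word weights per topic
```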
Gaussian mixture model: performs clustering by modeling the data as a linear combination (mixture) of several Gaussian distributions (sketch after the list below).
--Class to use: sklearn.mixture.GaussianMixture
--Keyword: Gaussian distribution
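A minimal sketch fitting three Gaussian components to the iris features (an illustrative choice):

```python
from sklearn.datasets import load_iris
from sklearn.mixture import GaussianMixture

X, _ = load_iris(return_X_y=True)

gmm = GaussianMixture(n_components=3, random_state=0).fit(X)
print(gmm.predict(X)[:10])       # hard cluster assignments
print(gmm.predict_proba(X)[:3])  # soft (probabilistic) assignments
```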
Locally linear embedding (LLE): performs dimensionality reduction on non-linear data (example after the list below).
--Class to use: sklearn.manifold.LocallyLinearEmbedding
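A minimal sketch unrolling the classic swiss-roll dataset (a 2-D sheet rolled up in 3-D) into two dimensions:

```python
from sklearn.datasets import make_swiss_roll
from sklearn.manifold import LocallyLinearEmbedding

X, _ = make_swiss_roll(n_samples=1000, random_state=0)

lle = LocallyLinearEmbedding(n_neighbors=10, n_components=2)
X_2d = lle.fit_transform(X)  # non-linear 3-D structure -> flat 2-D embedding
print(X_2d.shape)
```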
t-SNE: a method that reduces high-dimensional data to two or three dimensions, used for data visualization (sketch after the list below).
--Class to use: sklearn.manifold.TSNE
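A minimal sketch embedding the 64-dimensional digits data into 2-D for plotting:

```python
from sklearn.datasets import load_digits
from sklearn.manifold import TSNE

X, y = load_digits(return_X_y=True)

X_2d = TSNE(n_components=2, random_state=0).fit_transform(X)
print(X_2d.shape)  # (1797, 2); scatter-plot these, colored by y
```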