Google translation of http://scikit-learn.org/0.18/modules/unsupervised_reduction.html, part of [scikit-learn 0.18 User Guide 4. Dataset Conversion](http://qiita.com/nazoking@github/items/267f2371757516f8c168#4-%E3%83%87%E3%83%BC%E3%82%BF%E3%82%BB%E3%83%83%E3%83%88%E5%A4%89%E6%8F%9B).
If the number of features is high, it may be useful to reduce it with an unsupervised step prior to the supervised steps. Many of the [unsupervised learning methods](http://qiita.com/nazoking@github/items/267f2371757516f8c168#2-%E6%95%99%E5%B8%AB%E3%81%AA%E3%81%97%E5%AD%A6%E7%BF%92) implement a transform method that can be used to reduce the dimensionality. Below are two specific examples of this frequently used pattern.
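A minimal sketch of the pattern, assuming the digits dataset and an arbitrary choice of 30 components: an unsupervised transformer is chained in front of a supervised estimator, so the learned reduction is applied before classification.

```python
# A minimal sketch of the pattern (illustrative dataset and parameters):
# an unsupervised dimensionality reduction step feeds a supervised estimator.
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

digits = load_digits()
X, y = digits.data, digits.target        # 64 features per sample

# Unsupervised step (PCA to 30 components) before the supervised step.
pipe = make_pipeline(PCA(n_components=30), LogisticRegression())
pipe.fit(X, y)
print(pipe.score(X, y))                  # training accuracy, for illustration only
```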
**PCA: principal component analysis**

decomposition.PCA looks for a combination of features that captures the variance of the original features well. See [Decomposing signals in components (matrix factorization problems)](http://scikit-learn.org/0.18/modules/decomposition.html#decompositions).
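A small sketch (the random data and n_components=2 are assumptions for illustration): fit_transform returns the projected data, and explained_variance_ratio_ shows how much of the original variance each retained component captures.

```python
# A minimal sketch: PCA projects the data onto the directions of maximal variance.
# The input data and n_components=2 are illustrative assumptions.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.RandomState(0)
X = rng.rand(100, 10)                    # 100 samples, 10 original features

pca = PCA(n_components=2)
X_reduced = pca.fit_transform(X)         # shape (100, 2)
print(pca.explained_variance_ratio_)     # variance captured by each component
```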
**Random projections**

The random_projection module provides several tools for data reduction by random projections. See the relevant section of the documentation: [Random Projection](http://qiita.com/nazoking@github/items/16f65bbcfda517a74df2).
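A minimal sketch, assuming a high-dimensional random input and a target of 500 components, using GaussianRandomProjection:

```python
# A minimal sketch of data reduction by random projection.
# The input shape and target dimensionality are illustrative assumptions.
import numpy as np
from sklearn.random_projection import GaussianRandomProjection

rng = np.random.RandomState(0)
X = rng.rand(100, 10000)                 # high-dimensional input

transformer = GaussianRandomProjection(n_components=500, random_state=0)
X_new = transformer.fit_transform(X)     # shape (100, 500)
print(X_new.shape)
```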
**Feature agglomeration**

cluster.FeatureAgglomeration applies [hierarchical clustering](http://scikit-learn.org/0.18/modules/clustering.html#hierarchical-clustering) to group together features that behave similarly.
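A minimal sketch, assuming the digits data and an arbitrary choice of 32 clusters: the 64 pixel features are merged into 32 agglomerated features.

```python
# A minimal sketch: FeatureAgglomeration merges similar features into clusters.
# The digits data and n_clusters=32 are illustrative assumptions.
from sklearn import cluster, datasets

digits = datasets.load_digits()
X = digits.data                          # shape (1797, 64)

agglo = cluster.FeatureAgglomeration(n_clusters=32)
X_reduced = agglo.fit_transform(X)       # shape (1797, 32)
print(X_reduced.shape)
```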
**Feature scaling**
Note that if features have very different scaling or statistical properties, cluster.FeatureAgglomeration may not be able to capture the links between related features. Using preprocessing.StandardScaler can be useful in these settings.
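A minimal sketch of that setting, assuming synthetic data where half the features live on a much larger scale: StandardScaler is placed before FeatureAgglomeration in a pipeline so that differences in scale do not dominate the grouping.

```python
# A minimal sketch: standardize features before FeatureAgglomeration so that
# scale differences do not dominate the grouping. Data is an illustrative assumption.
import numpy as np
from sklearn.cluster import FeatureAgglomeration
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.RandomState(0)
# Features with very different scales.
X = np.hstack([rng.rand(100, 5), 1000 * rng.rand(100, 5)])

pipe = make_pipeline(StandardScaler(), FeatureAgglomeration(n_clusters=3))
X_reduced = pipe.fit_transform(X)        # shape (100, 3)
print(X_reduced.shape)
```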
[scikit-learn 0.18 User Guide 4. Dataset Conversion](http://qiita.com/nazoking@github/items/267f2371757516f8c168#4-%E3%83%87%E3%83%BC%E3%82%BF%E3%82%BB%E3%83%83%E3%83%88%E5%A4%89%E6%8F%9B) © 2010-2016, scikit-learn developers (BSD License).