Translated from http://scikit-learn.org/0.18/tutorial/statistical_inference/settings.html (scikit-learn 0.18 tutorial, Statistical learning tutorial for scientific data processing)

Statistical learning: Settings and estimator objects in scikit-learn

Datasets

scikit-learn learns information from one or more datasets, each represented as a two-dimensional array. Such an array can be understood as a list of multidimensional observations: the first axis is the samples axis and the second is the features axis.
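As a minimal sketch of this layout, the snippet below builds a small array by hand with NumPy. The values and variable names are made up for illustration and are not part of the original tutorial.

```python
import numpy as np

# Three observations (rows), each described by two features (columns).
# The numbers are made up for illustration.
X = np.array([
    [5.1, 3.5],
    [4.9, 3.0],
    [6.2, 2.9],
])

# First axis: samples. Second axis: features.
n_samples, n_features = X.shape

first_sample = X[0]      # one observation: all of its features
first_feature = X[:, 0]  # one feature: its value for every observation
```

Indexing along the first axis selects observations, while slicing along the second axis selects a feature across all observations.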

** A simple example shipped with scikit-learn: the iris dataset **

>>> from sklearn import datasets
>>> iris = datasets.load_iris()
>>> data = iris.data
>>> data.shape
(150, 4)

It consists of 150 observations of irises, each described by 4 features: the length and width of its sepals and petals, as detailed in `iris.DESCR`.

If the data is not in the form (n_samples, n_features), it must be preprocessed for use with scikit-learn.

** An example of reshaping data is the digits dataset **

The digits dataset consists of 1797 8x8 images of handwritten digits.

>>> digits = datasets.load_digits()
>>> digits.images.shape
(1797, 8, 8)
>>> import matplotlib.pyplot as plt 
>>> plt.imshow(digits.images[-1], cmap=plt.cm.gray_r) 
<matplotlib.image.AxesImage object at ...>

To use this dataset with scikit-learn, convert each 8x8 image into a feature vector of length 64:

>>> data = digits.images.reshape((digits.images.shape[0], -1))
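To see what this reshape does without loading the dataset, here is a small sketch on a stand-in array. The `images` array below is fake data with the same 8x8 layout, assumed only for illustration.

```python
import numpy as np

# Stand-in for digits.images: 3 fake "images" of 8x8 pixels (not real digits).
images = np.arange(3 * 8 * 8).reshape((3, 8, 8))

# Keep the samples axis and let NumPy infer the rest (-1 becomes 8 * 8 = 64).
flat = images.reshape((images.shape[0], -1))

# Each row of `flat` holds the pixels of one image, read row by row.
same_pixels = np.array_equal(flat[0], images[0].ravel())
```

The number of samples is unchanged; only the two pixel axes are merged into a single features axis. The `load_digits` loader also provides this flattened form directly as `digits.data`.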

Estimator object

** Fitting data: ** The main API implemented by scikit-learn is the estimator API. An estimator is any object that learns from data. It may be a classification, regression, or clustering algorithm, or a transformer that extracts or filters useful features from the raw data. All estimator objects expose a fit method that takes a dataset (usually a two-dimensional array) as an argument.

>>> estimator.fit(data)
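As a concrete instance of the generic call above, the following sketch fits one particular estimator, `PCA`, to the iris data. The choice of `PCA` and of `n_components=2` is an assumption made for illustration; any other estimator could be substituted.

```python
from sklearn import datasets
from sklearn.decomposition import PCA

iris = datasets.load_iris()

# PCA is an unsupervised estimator, so fit takes only the data array.
pca = PCA(n_components=2)
fitted = pca.fit(iris.data)

# fit returns the estimator itself, which allows chaining calls.
returns_self = fitted is pca
```

Supervised estimators such as classifiers and regressors take the targets as a second argument, as in `fit(X, y)`.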

** Estimator parameters: ** All the parameters of an estimator can be set when it is instantiated or by modifying the corresponding attribute.

>>> estimator = Estimator(param1=1, param2=2)
>>> estimator.param1
1
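The placeholder `Estimator` above can be replaced by a real class. The sketch below uses `KMeans` as an arbitrary example (the choice is an assumption, not from the original page) and also shows `set_params` and `get_params`, which scikit-learn estimators provide alongside plain attribute access.

```python
from sklearn.cluster import KMeans

# Set a parameter at instantiation.
kmeans = KMeans(n_clusters=3)
at_init = kmeans.n_clusters

# Change it afterwards, either through the attribute...
kmeans.n_clusters = 4
after_attribute = kmeans.n_clusters

# ...or through set_params, which returns the estimator.
kmeans.set_params(n_clusters=5)
after_set_params = kmeans.get_params()["n_clusters"]
```

None of these calls touches any data: parameters set this way only configure the estimator before it is fitted.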

** Estimated parameters: ** When an estimator is fitted to data, its parameters are estimated from the data at hand. All estimated parameters are attributes of the estimator object whose names end in an underscore.

>>> estimator.estimated_param_
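To make the trailing-underscore convention concrete, here is a small sketch using `LinearRegression` on made-up data that follows y = 2x + 1 exactly. The estimator and the data are assumptions chosen for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Made-up data following y = 2 * x + 1 exactly.
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = 2.0 * X[:, 0] + 1.0

# fit_intercept is a parameter chosen by the user (True by default).
reg = LinearRegression()
reg.fit(X, y)

# Estimated parameters exist only after fit and end with an underscore.
slope = reg.coef_[0]
intercept = reg.intercept_
```

Here `coef_` and `intercept_` are learned from the data, whereas `fit_intercept` has no underscore because it is set by the user rather than estimated.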
