This article is a personal memo written while implementing One-Class SVM with scikit-learn. It summarizes, with figures, the points I understood from reading other articles and the points I want to remember.
The modules to import are as follows.
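The original import block is not preserved here; a minimal set that covers everything used later in this memo would be:

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.svm import OneClassSVM
```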
This time, we define the training data, test data, and outlier data as follows.
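The original data definitions are not shown; a sketch based on the classic scikit-learn One-Class SVM example (the seed, blob centers, and sample counts are assumptions, not the article's originals):

```python
import numpy as np

rng = np.random.RandomState(42)  # fixed seed for reproducibility (assumption)

# Training data: two Gaussian blobs centered at (2, 2) and (-2, -2)
X = 0.3 * rng.randn(100, 2)
X_train = np.r_[X + 2, X - 2]

# Test data: drawn from the same distribution as the training data
X = 0.3 * rng.randn(20, 2)
X_test = np.r_[X + 2, X - 2]

# Outlier data: drawn uniformly over a wider square
X_outliers = rng.uniform(low=-4, high=4, size=(20, 2))
```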
The scatter plots of X_train, X_test, and X_outliers are shown below, respectively.
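The original plotting code is not preserved; a self-contained sketch that draws the three datasets side by side (the data generation repeats the assumed definitions above, and the filename is hypothetical):

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend so this runs as a script (assumption)
import matplotlib.pyplot as plt

rng = np.random.RandomState(42)
X_train = np.r_[0.3 * rng.randn(100, 2) + 2, 0.3 * rng.randn(100, 2) - 2]
X_test = np.r_[0.3 * rng.randn(20, 2) + 2, 0.3 * rng.randn(20, 2) - 2]
X_outliers = rng.uniform(low=-4, high=4, size=(20, 2))

# One panel per dataset, with shared axes for easy comparison
fig, axes = plt.subplots(1, 3, figsize=(12, 4), sharex=True, sharey=True)
for ax, (name, data) in zip(
    axes, [("X_train", X_train), ("X_test", X_test), ("X_outliers", X_outliers)]
):
    ax.scatter(data[:, 0], data[:, 1], s=20)
    ax.set_title(name)
fig.savefig("datasets.png")  # hypothetical output filename
```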
The parameters used for training can be displayed with clf.get_params().
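The article's training code is not shown; a minimal sketch that fits the model and inspects its parameters (the nu and gamma values here are illustrative assumptions, not the article's settings):

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.RandomState(42)
X_train = np.r_[0.3 * rng.randn(100, 2) + 2, 0.3 * rng.randn(100, 2) - 2]

# nu bounds the fraction of training errors; gamma is the RBF kernel width.
# Both values below are assumptions for illustration.
clf = OneClassSVM(nu=0.1, kernel="rbf", gamma=0.1)
clf.fit(X_train)

# Prints the full hyperparameter dict (kernel, nu, gamma, ...)
print(clf.get_params())
```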
clf.decision_function(X_test) returns the signed distance of each point from the decision boundary: a positive value means the point lies inside the learned region, and a negative value means it lies outside.
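A self-contained sketch of this step, under the same assumed data and hyperparameters as above:

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.RandomState(42)
X_train = np.r_[0.3 * rng.randn(100, 2) + 2, 0.3 * rng.randn(100, 2) - 2]
X_test = np.r_[0.3 * rng.randn(20, 2) + 2, 0.3 * rng.randn(20, 2) - 2]

clf = OneClassSVM(nu=0.1, kernel="rbf", gamma=0.1).fit(X_train)

# Signed distance to the boundary: > 0 inside the learned region, < 0 outside
scores = clf.decision_function(X_test)
print(scores)
```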
clf.predict(X_test) returns 1 or -1 for each given point, indicating whether it is an inlier (1) or an outlier (-1).
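A matching sketch for the prediction step (same assumed data and hyperparameters):

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.RandomState(42)
X_train = np.r_[0.3 * rng.randn(100, 2) + 2, 0.3 * rng.randn(100, 2) - 2]
X_test = np.r_[0.3 * rng.randn(20, 2) + 2, 0.3 * rng.randn(20, 2) - 2]

clf = OneClassSVM(nu=0.1, kernel="rbf", gamma=0.1).fit(X_train)

# 1 = inlier (inside the learned region), -1 = outlier
pred = clf.predict(X_test)
print(pred)
```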
The following table lists predict and decision_function for X_outliers, the data given as outliers, to check their range.
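The original table is not reproduced here; one way to build such a table is with a pandas DataFrame (pandas and the column names are assumptions, since the article's code is not shown):

```python
import numpy as np
import pandas as pd
from sklearn.svm import OneClassSVM

rng = np.random.RandomState(42)
X_train = np.r_[0.3 * rng.randn(100, 2) + 2, 0.3 * rng.randn(100, 2) - 2]
X_outliers = rng.uniform(low=-4, high=4, size=(20, 2))

clf = OneClassSVM(nu=0.1, kernel="rbf", gamma=0.1).fit(X_train)

# One row per outlier candidate: coordinates, score, and label
df = pd.DataFrame({
    "x": X_outliers[:, 0],
    "y": X_outliers[:, 1],
    "decision_function": clf.decision_function(X_outliers),
    "predict": clf.predict(X_outliers),
})
print(df)
```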
Since most of these points are outliers, many of them are labeled -1, but a few are labeled 1, which shows that those points happen to fall within the range of the training data.
sample_set_1 is the subset of the outlier data that fell within the learned region (predicted 1), and sample_set_mina1 is the subset that was judged as outliers (predicted -1).
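The original code for this split is not shown; it can be sketched with boolean masks, reusing the article's variable names sample_set_1 and sample_set_mina1 (data and hyperparameters are the same assumptions as above):

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.RandomState(42)
X_train = np.r_[0.3 * rng.randn(100, 2) + 2, 0.3 * rng.randn(100, 2) - 2]
X_outliers = rng.uniform(low=-4, high=4, size=(20, 2))

clf = OneClassSVM(nu=0.1, kernel="rbf", gamma=0.1).fit(X_train)
pred = clf.predict(X_outliers)

sample_set_1 = X_outliers[pred == 1]      # predicted as inliers (1)
sample_set_mina1 = X_outliers[pred == -1]  # predicted as outliers (-1)
```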
The two points marked OK below are the ones predicted as 1.
This is a simple implementation memo.