1. About this article

This article explains how to improve the F1 score of a binary classifier, which labels each input as 0 or 1, by introducing a two-step learning model.

Suppose you want to build a binary classifier that labels each input as 0 or 1. The classifier is fitted on training data.

Here, classifier 1 (the primary model) is first trained on the training data. Its output is then appended to the training data, and classifier 2 (the secondary model) is trained on the result. Because the two-step result contains fewer false positives than classifier 1 alone, the F1 score improves.
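To make the link between false positives and the F1 score concrete, here is a minimal sketch that computes F1 from confusion-matrix counts. The function name `f1_from_counts` and all counts are made up for illustration; they are not results from this article.

```python
def f1_from_counts(tp: int, fp: int, fn: int) -> float:
    """F1 is the harmonic mean of precision and recall."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts: a high-recall first step with many false positives...
f1_step1 = f1_from_counts(tp=99, fp=40, fn=1)
# ...and a second step that removes most false positives at a small recall cost.
f1_step2 = f1_from_counts(tp=97, fp=5, fn=3)

print(round(f1_step1, 4), round(f1_step2, 4))
```

With these illustrative counts, giving up two true positives to remove 35 false positives raises F1, which is the trade the two-step model aims for.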

2. Contents

2-1 Prepare and process data

[1] Load the MNIST handwritten digits (0-9): 28 x 28 pixel grayscale images, with 60,000 training images and 10,000 test images.

sample.py

from keras.datasets import mnist

(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train = x_train / 255.0  # Normalize pixel values to [0, 1] by dividing by 255.
x_test = x_test / 255.0  # Normalize pixel values to [0, 1] by dividing by 255.

x_train holds the 28 x 28 pixel handwritten digit images, with pixel values scaled to the range 0 to 1. y_train holds the digit that each image represents.

Execution result: x_train shape -> (60000, 28, 28)

y_train -> the digit that each image represents (size 60000): [5 0 4 ... 5 6 8]

[2] Extract the samples labeled "3" or "5" from x_train, y_train, x_test, and y_test. -> x_sub_train, y_sub_train, x_sub_test, y_sub_test

sample.py

# Change these params if you want to change the numbers selected
num1 = 3
num2 = 5

# Subset on only two numbers: from the train data, take the samples where y_train is 3 or 5.
x_sub_train = x_train[(y_train == num1) | (y_train == num2)]
y_sub_train = y_train[(y_train == num1) | (y_train == num2)]

# Subset on only two numbers: from the test data, take the samples where y_test is 3 or 5.
x_sub_test = x_test[(y_test == num1) | (y_test == num2)]
y_sub_test = y_test[(y_test == num1) | (y_test == num2)]

[3] Convert the data format (reshape the dimensions) and encode the labels.

sample.py

from keras.utils import to_categorical
from sklearn.model_selection import train_test_split

# Convert 3D data (11552, 28, 28) to 2D data (11552, 28*28).
x_train_flat = x_sub_train.reshape(x_sub_train.shape[0], 28*28)
# Convert 3D data (1902, 28, 28) to 2D data (1902, 28*28).
x_test_flat = x_sub_test.reshape(x_sub_test.shape[0], 28*28)

# One hot encode target variables
# When an element of y_sub_train is 3 -> 1, which to_categorical converts to [0, 1].
# When an element of y_sub_train is 5 -> 0, which to_categorical converts to [1, 0].
y_sub_train_encoded = to_categorical([1 if value == num1 else 0 for value in y_sub_train])

# Split the data into training data and validation data.
X_train, X_val, Y_train, Y_val = train_test_split(x_train_flat, y_sub_train_encoded, test_size=0.1, random_state=42)
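The three preparation steps (boolean-mask subsetting, flattening, one-hot encoding) can be checked on a tiny array without downloading MNIST. This is a sketch on made-up 2 x 2 "images". The `np.eye(2)[...]` indexing is used here only as a stand-in that produces the same layout as `to_categorical` for two classes.

```python
import numpy as np

# Toy stand-in for MNIST: six 2x2 "images" with made-up labels.
images = np.arange(6 * 2 * 2, dtype=float).reshape(6, 2, 2)
labels = np.array([3, 5, 7, 3, 0, 5])

# Step [2]: keep only the samples labeled 3 or 5.
mask = (labels == 3) | (labels == 5)
sub_images = images[mask]
sub_labels = labels[mask]

# Step [3]: flatten each image into one row of features.
flat = sub_images.reshape(sub_images.shape[0], -1)

# Label 3 -> 1 -> [0, 1]; label 5 -> 0 -> [1, 0].
binary = (sub_labels == 3).astype(int)
one_hot = np.eye(2)[binary]

print(flat.shape)
print(one_hot)
```

Column 1 of the one-hot array is therefore the "is a 3" indicator, which is why the later code reads `i[1]` to get the positive class.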

2-3 Build the first learning model (Primary ML)

Build the first learning model. The training model is built using the neural network of the Keras library.

sample.py

from keras.models import Sequential
from keras.layers import Dense

# Build primary model
model = Sequential()
# units: number of outputs
# activation: activation function (https://keras.io/ja/activations/#relu)
model.add(Dense(units=2, activation='softmax'))

# Specify the loss function. Here, categorical_crossentropy.
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

# epochs: number of passes over the whole training set (X_train).
# batch_size: number of samples used for each weight update.
# The batch size is deliberately large so that the model is poorly fit;
# otherwise it is easy to get 99% accuracy.
model.fit(x=X_train, y=Y_train, validation_data=(X_val, Y_val), epochs=3, batch_size=320)

(Reference information) http://marupeke296.com/IKDADV_DL_No2_Keras.html

2-4 Evaluate the first learning model (neural network) constructed.

Score the training data with the neural network and draw a ROC curve.

sample.py

import numpy as np
import matplotlib.pyplot as plt
from sklearn import metrics as sk_metrics

def plot_roc(actual, prediction):
    # Calculate ROC / AUC
    fpr, tpr, thresholds = sk_metrics.roc_curve(actual, prediction, pos_label=1)
    roc_auc = sk_metrics.auc(fpr, tpr)

    # Plot
    plt.plot(fpr, tpr, color='darkorange', lw=2, label='ROC curve (area = %0.2f)' % roc_auc)
    plt.plot([0, 1], [0, 1], color='navy', lw=2, linestyle='--')
    plt.xlim([0.0, 1.0])
    plt.ylim([0.0, 1.05])
    plt.xlabel('False Positive Rate')
    plt.ylabel('True Positive Rate')
    plt.title('Receiver Operating Characteristic Example')
    plt.legend(loc="lower right")
    plt.show()

# Plot ROC
print('X_train', '\n', X_train, len(X_train))  # length: 10396
prediction = model.predict(X_train)  # prediction: neural network output
# length: 10396; each row is [probability of 5, probability of 3]
print('prediction', '\n', prediction, len(prediction))
prediction = np.array([i[1] for i in prediction])  # Keep the probability of 3 (the positive class).
print('prediction', '\n', prediction, len(prediction))  # length: 10396
print('Y_train', '\n', Y_train)  # [0, 1] (a 3) or [1, 0] (a 5)
actual = np.array([i[1] for i in Y_train]) == 1
plot_roc(actual, prediction)

Since the ROC curve lies well above the diagonal (the dashed line, which corresponds to random guessing), the model works well as a binary classifier.
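To see what the ROC curve and its area measure, the following sketch computes the curve by hand on six made-up scores. It is a simplified illustration and not the scikit-learn implementation; `roc_points` and `trapezoid_auc` are hypothetical helper names, and the data must contain at least one positive and one negative.

```python
def roc_points(actual, scores):
    """Sweep the threshold from high to low and record (FPR, TPR) pairs."""
    pos = sum(1 for a in actual if a)
    neg = len(actual) - pos
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    fpr, tpr = [0.0], [0.0]
    tp = fp = 0
    for rank, i in enumerate(order):
        if actual[i]:
            tp += 1
        else:
            fp += 1
        # Emit a point only after all samples sharing this score are counted.
        last = rank == len(order) - 1
        if last or scores[order[rank + 1]] != scores[i]:
            fpr.append(fp / neg)
            tpr.append(tp / pos)
    return fpr, tpr


def trapezoid_auc(fpr, tpr):
    """Area under the piecewise-linear ROC curve."""
    return sum(
        (fpr[k] - fpr[k - 1]) * (tpr[k] + tpr[k - 1]) / 2
        for k in range(1, len(fpr))
    )


# Made-up labels and scores for illustration.
toy_actual = [True, True, False, True, False, False]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

toy_fpr, toy_tpr = roc_points(toy_actual, toy_scores)
toy_auc = trapezoid_auc(toy_fpr, toy_tpr)
print(round(toy_auc, 4))
```

An area of 0.5 corresponds to the diagonal, and 1.0 to a classifier that ranks every positive above every negative.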

Adjust the threshold so that recall = 0.99.

sample.py

# Create a model with high recall; change the threshold until a good recall level is reached.
threshold = .30
print(prediction)  # Probability of 3 (the positive class).
prediction_int = np.array(prediction) > threshold  # prediction_int -> [False, True, ...]
print("prediction_int", prediction_int)

# Classification report
print(sk_metrics.classification_report(actual, prediction_int))

# Confusion matrix
cm = sk_metrics.confusion_matrix(actual, prediction_int)
print('Confusion Matrix')
print(cm)
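The threshold above is chosen by hand. As a sketch of how that search could be automated, the helper below scans a 0.01 grid from high to low and returns the first threshold whose recall meets a target. It uses the same strict `>` comparison as the code above. The function names and the toy data are assumptions for illustration only.

```python
def recall_at(actual, scores, threshold):
    """Recall when a sample is predicted positive if score > threshold."""
    tp = sum(1 for a, s in zip(actual, scores) if a and s > threshold)
    fn = sum(1 for a, s in zip(actual, scores) if a and s <= threshold)
    return tp / (tp + fn)


def highest_threshold_for_recall(actual, scores, target):
    """Scan 0.99, 0.98, ..., 0.01 and return the first threshold meeting the target."""
    for step in range(99, 0, -1):
        candidate = step / 100
        if recall_at(actual, scores, candidate) >= target:
            return candidate
    return 0.0  # Fall back to accepting everything with a positive score.


# Made-up labels and positive-class probabilities.
toy_actual = [True, True, True, True, False, False, False, False]
toy_scores = [0.95, 0.80, 0.55, 0.35, 0.60, 0.40, 0.20, 0.10]

strict = highest_threshold_for_recall(toy_actual, toy_scores, target=0.99)
relaxed = highest_threshold_for_recall(toy_actual, toy_scores, target=0.75)
print(strict, relaxed)
```

Picking the highest threshold that still meets the recall target keeps the number of false positives handed to the second model as small as possible.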

2-5 Build a second learning model (neural network).

・ Output of the 1st model + X_train → input features for training the 2nd model
・ Output of the 1st model & Y_train → training labels (meta labels) for the 2nd model

The F1 score is increased by filtering out false positives after the primary model has already identified most of the positive cases. In other words, the role of the secondary machine learning algorithm is to determine whether a positive judgment by the primary model is true or false.

sample.py

# Get meta labels: True only where the primary model said positive and was right.
meta_labels = prediction_int & actual
print("prediction_int", prediction_int)  # [False True True ...]
print("meta_labels", meta_labels)  # [False True True ...]
meta_labels_encoded = to_categorical(meta_labels)  # [1,0] [0,1] [0,1], ...
print(meta_labels_encoded)

# Reshape data into a column vector: [False, True, ...] -> [[False], [True], ...]
prediction_int = prediction_int.reshape((-1, 1))
print("prediction_int", prediction_int)  # [False], [True], [True], ...
print("X_train", X_train)  # 28*28 values per row: [0, 0, ..., 0]

# np.concatenate joins arrays: prediction_int + MNIST data
new_features = np.concatenate((prediction_int, X_train), axis=1)
print("new_features", new_features)  # [1. 0. 0. ... 0. 0. 0.], ...

# Train a new model
# Build model
meta_model = Sequential()
meta_model.add(Dense(units=2, activation='softmax'))
meta_model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

# new_features = prediction_int + MNIST data -> [1. 0. 0. ... 0. 0. 0.], [1. 0. 0. ... 0. 0. 0.], ...
# meta_labels_encoded = [1,0] [0,1] [0,1], ...
# x and y are Numpy arrays, just like in the Scikit-Learn API.
meta_model.fit(x=new_features, y=meta_labels_encoded, epochs=4, batch_size=32)
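The meta-label logic is easier to follow on a handful of booleans. In this sketch the primary and secondary predictions are made-up values rather than model outputs, and `count_false_positives` is a hypothetical helper.

```python
def count_false_positives(actual, predicted):
    """Number of samples predicted positive that are actually negative."""
    return sum(1 for a, p in zip(actual, predicted) if p and not a)


# Made-up ground truth and primary-model decisions (high recall, low precision).
toy_actual = [True, True, False, False, True, False]
primary_pred = [True, True, True, True, False, False]

# Meta label: True only when the primary model said positive AND was right.
meta_labels = [p and a for p, a in zip(primary_pred, toy_actual)]

# Made-up secondary-model decisions ("should we trust this positive?").
secondary_pred = [True, True, False, True, True, False]

# Final decision: positive only when both models agree.
final_pred = [p and s for p, s in zip(primary_pred, secondary_pred)]

print(meta_labels)
print(final_pred)
print(count_false_positives(toy_actual, primary_pred),
      count_false_positives(toy_actual, final_pred))
```

Because the final decision is a logical AND, the second model can only remove positives. It cannot recover a case the primary model missed, which is why the first step is tuned for high recall.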

2-6 Evaluate the second learning model.

X_train was fed to the first learning model (neural network) alone, and to the first and second learning models (neural networks) combined, to obtain predictions. These were compared with Y_train and a classification report was output. The second learning model improved on the metrics of the first.

sample.py

def test_meta_label(primary_model, secondary_model, x, y, threshold):
    """
    :param primary_model: model object (First, we build a model that achieves high recall,
        even if the precision is not particularly high)
    :param secondary_model: model object (the role of the secondary ML algorithm is to determine
        whether a positive from the primary (exogenous) model is true or false. It is not its
        purpose to come up with a betting opportunity. Its purpose is to determine whether we
        should act or pass on the opportunity that has been presented.)
    :param x: Explanatory variables
    :param y: Target variable (One hot encoded)
    :param threshold: The confidence threshold applied to the primary model's output.
    :return: Print the classification report for both the base model and the meta model.
    """
    # Get the actual labels (y) from the encoded y labels
    actual = np.array([i[1] for i in y]) == 1

    # Use primary model to score the data x
    primary_prediction = primary_model.predict(x)
    primary_prediction = np.array([i[1] for i in primary_prediction]).reshape((-1, 1))
    primary_prediction_int = primary_prediction > threshold  # binary labels

    # Print output for base model
    # Note: the report uses a fixed 0.50 cutoff; the confusion matrix and accuracy use `threshold`.
    print('Base Model Metrics:')
    print(sk_metrics.classification_report(actual, primary_prediction > 0.50))
    print('Confusion Matrix')
    print(sk_metrics.confusion_matrix(actual, primary_prediction_int))
    accuracy = (actual == primary_prediction_int.flatten()).sum() / actual.shape[0]
    print('Accuracy: ', round(accuracy, 4))
    print('')

    # Secondary model
    new_features = np.concatenate((primary_prediction_int, x), axis=1)

    # Use secondary model to score the new features
    meta_prediction = secondary_model.predict(new_features)
    meta_prediction = np.array([i[1] for i in meta_prediction])
    meta_prediction_int = meta_prediction > 0.5  # binary labels

    # Now combine primary and secondary model in a final prediction
    final_prediction = (meta_prediction_int & primary_prediction_int.flatten())

    # Print output for meta model
    print('Meta Label Metrics: ')
    print(sk_metrics.classification_report(actual, final_prediction))
    print('Confusion Matrix')
    print(sk_metrics.confusion_matrix(actual, final_prediction))
    accuracy = (actual == final_prediction).sum() / actual.shape[0]
    print('Accuracy: ', round(accuracy, 4))

test_meta_label(primary_model=model, secondary_model=meta_model, x=X_train, y=Y_train, threshold=threshold)

The second neural network also improved on the first when evaluated on the held-out validation data (X_val, Y_val) instead of the training data.

sample.py

test_meta_label(primary_model=model, secondary_model=meta_model, x=X_val, y=Y_val, threshold=threshold)
