This is a story about participating in a Kaggle competition. Last time, I worked on Titanic with scikit-learn models. This time, I would like to study with Keras.
According to the Detailed Deep Learning book, Keras models are roughly divided into the following three types.
- Perceptron
- Convolutional neural network (CNN)
- Recurrent neural network (RNN)
The perceptron is one of the most basic neural networks. Convolutional neural networks (CNNs) are mainly used for image processing, and recurrent neural networks (RNNs) for time-series data. Since this task involves neither images nor time series, let's train with a perceptron.
A perceptron is the kind of model you often see depicted in images like the following.
Source: [Image materials such as neural networks and Deep Learning [WTFPL] for presentations and seminars](http://nkdkccmbr.hateblo.jp/entry/2016/10/06/222245)

Building a perceptron in Keras is relatively easy. It can be built by adding layers to a Sequential model, as shown below.
##############################
# Model building: 5-layer perceptron
##############################
def create_model_5dim_layer_perceptron(activation="relu", optimizer="adam", out_dim=100, dropout=0.5):
    model = Sequential()
    # Input layer - Hidden layer 1
    # x_train must already be defined; its column count gives the input dimension
    model.add(Dense(input_dim=len(x_train.columns), units=out_dim))
    model.add(BatchNormalization())
    model.add(Activation(activation))
    model.add(Dropout(dropout))
    # Hidden layer 1 - Hidden layer 2
    model.add(Dense(units=out_dim))
    model.add(BatchNormalization())
    model.add(Activation(activation))
    model.add(Dropout(dropout))
    # Hidden layer 2 - Hidden layer 3
    model.add(Dense(units=out_dim))
    model.add(BatchNormalization())
    model.add(Activation(activation))
    model.add(Dropout(dropout))
    # Hidden layer 3 - Output layer
    model.add(Dense(units=1))
    model.add(Activation("sigmoid"))
    model.compile(loss='binary_crossentropy', optimizer=optimizer, metrics=['accuracy'])
    return model
Layers are defined with Dense. The first Dense specifies the input dimension with input_dim, which must match the number of features in the input data. The units argument specifies the number of neurons in that layer. Since the output layer here is one-dimensional (survived: 1 or 0), the last Dense needs units=1.
The activation function is defined with Activation. The hidden layers can use any activation function, but since this is binary classification, the output layer should use "sigmoid".
In addition, Dropout suppresses overfitting, and BatchNormalization normalizes each mini-batch.
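To make the dimensions concrete, here is a minimal sketch (with a hypothetical 10-feature input, not the actual Titanic data) that stacks the same Dense → BatchNormalization → Activation → Dropout pattern and checks the layer shapes with model.summary():

from keras.models import Sequential
from keras.layers import Dense, Activation
from keras.layers.core import Dropout
from keras.layers.normalization import BatchNormalization

# Hypothetical example: 10 input features, 100 hidden units, 1 output
demo = Sequential()
demo.add(Dense(input_dim=10, units=100))   # output shape: (None, 100)
demo.add(BatchNormalization())
demo.add(Activation("relu"))
demo.add(Dropout(0.5))
demo.add(Dense(units=1))                   # output shape: (None, 1)
demo.add(Activation("sigmoid"))
demo.summary()                             # prints each layer's output shape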
In the 5-layer perceptron function above, activation, optimizer, and so on are received as arguments. As you can see, even a Keras perceptron has parameters that need to be set. In the previous article I tuned parameters with scikit-learn's grid search, and grid search can be used with Keras as well, via KerasClassifier.
from keras.wrappers.scikit_learn import KerasClassifier
# Pass the function itself to build_fn (it is called internally); do not call it here
model = KerasClassifier(build_fn=create_model_5dim_layer_perceptron, verbose=0)
By wrapping the model in KerasClassifier as described above, you can treat a Keras model in the same way as a scikit-learn model. After that, you can use GridSearchCV just as with scikit-learn. Note that each name in param_grid must match either a keyword argument of the build function (activation, optimizer, out_dim, dropout) or an argument of fit() (nb_epoch, batch_size). The code looks like this:
from sklearn.model_selection import GridSearchCV

# Define candidate values for each parameter
activation = ["tanh", "relu"]
optimizer = ["adam", "adagrad"]
out_dim = [234, 468, 702]
nb_epoch = [25, 50]
batch_size = [8, 16]
dropout = [0.2, 0.4, 0.5]

param_grid = dict(activation=activation,
                  optimizer=optimizer,
                  out_dim=out_dim,
                  nb_epoch=nb_epoch,
                  batch_size=batch_size,
                  dropout=dropout)

grid = GridSearchCV(estimator=model, param_grid=param_grid)

# Run the grid search
grid_result = grid.fit(x_train, y_train)

print(grid_result.best_score_)
print(grid_result.best_params_)
If you print best_score_ and best_params_, the following will be output.
0.7814285721097673
{'activation': 'relu', 'batch_size': 16, 'dropout': 0.5, 'nb_epoch': 25, 'optimizer': 'adam', 'out_dim': 702}
acc:0.72
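Rather than copying these values by hand, the best parameters can also be passed back into the model-building function programmatically. A minimal sketch, assuming the create_model_5dim_layer_perceptron function and grid_result from above:

# Rebuild and retrain with the parameters found by the grid search
best = grid_result.best_params_
model = create_model_5dim_layer_perceptron(activation=best['activation'],
                                           optimizer=best['optimizer'],
                                           out_dim=best['out_dim'],
                                           dropout=best['dropout'])
model.fit(x_train, y_train, epochs=best['nb_epoch'], batch_size=best['batch_size'], verbose=2)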
Once the parameters are decided, train with Keras and predict the results. The full code is below. The data preparation is the same as last time.
import numpy
import pandas
import datetime
from sklearn.preprocessing import LabelEncoder
from sklearn.preprocessing import StandardScaler
from keras.models import Sequential
from keras.layers import Dense, Activation
from keras.layers.core import Dropout
from keras.layers.normalization import BatchNormalization
############################################################
# One-hot encode SibSp
############################################################
def get_dummies_sibSp(df_all, df, df_test):
    # Take the categories from the combined data so that train and
    # test end up with exactly the same dummy columns
    categories = set(df_all['SibSp'].unique())
    df['SibSp'] = pandas.Categorical(df['SibSp'], categories=categories)
    df_test['SibSp'] = pandas.Categorical(df_test['SibSp'], categories=categories)
    df = pandas.get_dummies(df, columns=['SibSp'])
    df_test = pandas.get_dummies(df_test, columns=['SibSp'])
    return df, df_test
############################################################
# One-hot encode Parch
############################################################
def get_dummies_parch(df_all, df, df_test):
    categories = set(df_all['Parch'].unique())
    df['Parch'] = pandas.Categorical(df['Parch'], categories=categories)
    df_test['Parch'] = pandas.Categorical(df_test['Parch'], categories=categories)
    df = pandas.get_dummies(df, columns=['Parch'])
    df_test = pandas.get_dummies(df_test, columns=['Parch'])
    return df, df_test
############################################################
# One-hot encode Ticket
############################################################
def get_dummies_ticket(df_all, df, df_test):
    # Keep only ticket numbers that appear more than once
    ticket_values = df_all['Ticket'].value_counts()
    ticket_values = ticket_values[ticket_values > 1]
    ticket_values = pandas.Series(ticket_values.index, name='Ticket')
    categories = set(ticket_values.tolist())
    df['Ticket'] = pandas.Categorical(df['Ticket'], categories=categories)
    df_test['Ticket'] = pandas.Categorical(df_test['Ticket'], categories=categories)
    df = pandas.get_dummies(df, columns=['Ticket'])
    df_test = pandas.get_dummies(df_test, columns=['Ticket'])
    return df, df_test
############################################################
# Standardization
############################################################
def standardization(df, df_test):
    # Fit the scaler on the training data only, then apply it to both sets
    standard = StandardScaler()
    df_std = pandas.DataFrame(standard.fit_transform(df[['Pclass', 'Fare']].values), columns=['Pclass', 'Fare'])
    df.loc[:, 'Pclass'] = df_std['Pclass']
    df.loc[:, 'Fare'] = df_std['Fare']
    df_test_std = pandas.DataFrame(standard.transform(df_test[['Pclass', 'Fare']].values), columns=['Pclass', 'Fare'])
    df_test.loc[:, 'Pclass'] = df_test_std['Pclass']
    df_test.loc[:, 'Fare'] = df_test_std['Fare']
    return df, df_test
############################################################
# Data preparation
############################################################
def prepareData():
    ##############################
    # Data preprocessing
    # Extract the required items
    ##############################
    # Load train.csv and test.csv
    df = pandas.read_csv('/kaggle/input/titanic/train.csv')
    df_test = pandas.read_csv('/kaggle/input/titanic/test.csv')
    df_all = pandas.concat([df, df_test], sort=False)

    df_test_index = df_test[['PassengerId']]

    df = df[['Survived', 'Pclass', 'Sex', 'SibSp', 'Parch', 'Ticket', 'Fare']]
    df_test = df_test[['Pclass', 'Sex', 'SibSp', 'Parch', 'Ticket', 'Fare']]

    ##############################
    # Data preprocessing
    # Handle missing values
    ##############################
    # Remove rows with anomalous fares
    df = df[df['Fare'] != 5].reset_index(drop=True)
    df = df[df['Fare'] != 0].reset_index(drop=True)

    ##############################
    # Data preprocessing
    # Digitize labels
    ##############################
    # Sex
    encoder_sex = LabelEncoder()
    df['Sex'] = encoder_sex.fit_transform(df['Sex'].values)
    df_test['Sex'] = encoder_sex.transform(df_test['Sex'].values)

    ##############################
    # Data preprocessing
    # One-hot encoding
    ##############################
    # SibSp
    df, df_test = get_dummies_sibSp(df_all, df, df_test)
    # Parch
    df, df_test = get_dummies_parch(df_all, df, df_test)
    # Ticket
    df, df_test = get_dummies_ticket(df_all, df, df_test)

    ##############################
    # Data preprocessing
    # Standardize numeric columns
    ##############################
    df, df_test = standardization(df, df_test)

    ##############################
    # Data preprocessing
    # Fill remaining missing values
    ##############################
    df.fillna({'Fare': 0}, inplace=True)
    df_test.fillna({'Fare': 0}, inplace=True)

    ##############################
    # Separate features and labels
    ##############################
    x = df.drop(columns='Survived')
    y = df[['Survived']]

    return x, y, df_test, df_test_index
##############################
# Model building: 5-layer perceptron
##############################
def create_model_5dim_layer_perceptron(input_dim,
                                       activation="relu",
                                       optimizer="adam",
                                       out_dim=100,
                                       dropout=0.5):
    model = Sequential()
    # Input layer - Hidden layer 1
    model.add(Dense(input_dim=input_dim, units=out_dim))
    model.add(BatchNormalization())
    model.add(Activation(activation))
    model.add(Dropout(dropout))
    # Hidden layer 1 - Hidden layer 2
    model.add(Dense(units=out_dim))
    model.add(BatchNormalization())
    model.add(Activation(activation))
    model.add(Dropout(dropout))
    # Hidden layer 2 - Hidden layer 3
    model.add(Dense(units=out_dim))
    model.add(BatchNormalization())
    model.add(Activation(activation))
    model.add(Dropout(dropout))
    # Hidden layer 3 - Output layer
    model.add(Dense(units=1))
    model.add(Activation("sigmoid"))
    model.compile(loss='binary_crossentropy', optimizer=optimizer, metrics=['accuracy'])
    return model
if __name__ == "__main__":
    # Prepare the data
    x_train, y_train, x_test, y_test_index = prepareData()

    # Build the model
    model = create_model_5dim_layer_perceptron(len(x_train.columns),
                                               activation="relu",
                                               optimizer="adam",
                                               out_dim=702,
                                               dropout=0.5)

    # Train
    fit = model.fit(x_train, y_train, epochs=25, batch_size=16, verbose=2)

    # Predict
    y_test_proba = model.predict(x_test)
    y_test = numpy.round(y_test_proba).astype(int)

    # Combine PassengerId with the predicted result
    df_output = pandas.concat([y_test_index, pandas.DataFrame(y_test, columns=['Survived'])], axis=1)

    # Write result.csv to the current directory
    df_output.to_csv('result.csv', index=False)
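As an aside, model.fit() returns a History object (stored in fit above), and inspecting it is a quick way to confirm training behaved sensibly. A minimal sketch; note that the accuracy key is 'acc' in older Keras and 'accuracy' in newer versions:

# Inspect the training history returned by model.fit()
acc_key = 'acc' if 'acc' in fit.history else 'accuracy'
print('final training loss:', fit.history['loss'][-1])
print('final training accuracy:', fit.history[acc_key][-1])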
The result was "0.79425".
Compared to scikit-learn, Keras gives the impression of allowing more detailed configuration, such as the number of layers and the number of neurons. On the other hand, apart from convolutional neural networks (CNNs) and recurrent neural networks (RNNs), there seem to be fewer ready-made model types. It seems good to use scikit-learn and Keras appropriately depending on the situation.
Next time, I would like to try learning with the R language.
Reference: [Parameter optimization automation with Keras using GridSearchCV](https://qiita.com/cvusk/items/285e2b02b0950537b65e)
2020/02/15 First edition released
2020/02/22 Link to the next article added