Is the recent progress in network architectures, as argued in [AI competitions don't produce useful models](https://lukeoakdenrayner.wordpress.com/2019/09/19/ai-competitions-dont-produce-useful-models/), just overfitting to problem-specific datasets and winning a game of luck? The purpose of this article is to investigate that question with the wide-angle fundus dataset [^1]. In addition, for beginners, I explain how to fine-tune from the ImageNet [^2] pre-trained weights using the networks implemented in tf.keras.
This time, I evaluate VGG16 [^3], DenseNet121 [^4], ResNet50 [^5], InceptionV3 [^6], NASNetMobile [^7], and EfficientNet-B0 [^8] on the wide-angle fundus dataset [^1] to check whether the networks proposed after VGG16 really represent progress.
For a description of the wide-angle fundus dataset and how the data was split, see the previous article (Image classification with the wide-angle fundus image dataset).
This chapter explains, for getting started, how to fine-tune the implemented networks from the ImageNet [^2] pre-trained weights. Note that the image domain of the wide-angle fundus dataset is very different from that of ImageNet, so fine-tuning does not improve accuracy here, but it does improve convergence speed. For details, see [Article (Effectiveness of Transfer Learning for Medical Imaging)](https://speakerdeck.com/uchi_k/lun-wen-shao-jie-yi-yong-hua-xiang-hefalsezhuan-yi-xue-xi-falseyou-xiao-xing-nituite-transfusion-understanding-transfer-learning-for-medical-imaging).
I change the model-construction part of the previous article (Image classification with the wide-angle fundus image dataset); among the training parameters, only the learning rate is changed, to 0.00001. First, `import` the libraries.
```python
from tensorflow.keras.applications import VGG16
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Dense, Activation
```
Then build the network. I will use VGG16 as the example. First, load VGG16 as the base model. Since the goal is fine-tuning from the ImageNet [^2] pre-trained weights, build the network only up to the final convolution block with `include_top=False`, and load the weights trained on ImageNet [^2] with `weights='imagenet'`. Then, as in the previous article (Image classification with the wide-angle fundus image dataset), attach the fully connected layers to the loaded VGG16.
```python
# Load VGG16 up to the final conv block, with ImageNet weights and global average pooling
base_model = VGG16(include_top=False, weights='imagenet', pooling='avg',
                   input_shape=(image_size[0], image_size[1], 3))
# Attach the same fully connected head as in the previous article
x = Dense(256, kernel_initializer='he_normal')(base_model.output)
x = Dense(classes, kernel_initializer='he_normal')(x)
outputs = Activation('softmax')(x)
model = Model(inputs=base_model.inputs, outputs=outputs)
```
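As mentioned above, the only training-parameter change is the learning rate of 0.00001. The compile step is not shown at this point in the article; a minimal sketch, where the choice of the Adam optimizer and categorical cross-entropy is my assumption:

```python
from tensorflow.keras.optimizers import Adam

# Assumed setup: the original article does not specify optimizer or loss here
model.compile(optimizer=Adam(learning_rate=0.00001),
              loss='categorical_crossentropy',
              metrics=['accuracy'])
```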
To use another network, for example ResNet50, simply change `VGG16` to `ResNet50`, as shown below. Sample code for the other networks is uploaded [here](https://github.com/burokoron/StaDeep/tree/master/fine-turning). By the way, EfficientNet-B0 was not yet available in tf.keras, but an implementation exists, so I downloaded and used [efficientnet.py from keras-team/keras-applications on GitHub](https://github.com/keras-team/keras-applications/blob/master/keras_applications/efficientnet.py).
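Concretely, only the import and the base-model constructor change; the fully connected head stays the same. A minimal sketch:

```python
from tensorflow.keras.applications import ResNet50

# Only the base model changes relative to the VGG16 example above
base_model = ResNet50(include_top=False, weights='imagenet', pooling='avg',
                      input_shape=(image_size[0], image_size[1], 3))
```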
The training results were as follows. NASNetMobile [^7] was the only model whose training and validation losses did not diverge, although its accuracy curves did. EfficientNet-B0 [^8], as you would expect of the latest model, showed a beautifully clean learning curve.
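These observations come from the per-epoch loss and accuracy curves recorded by `model.fit`. A minimal plotting sketch, assuming a Keras `History` object named `history` (a hypothetical name, not shown in the original):

```python
import matplotlib.pyplot as plt

# history = model.fit(...) is assumed to have run already
plt.plot(history.history['loss'], label='training loss')
plt.plot(history.history['val_loss'], label='validation loss')
plt.xlabel('epoch')
plt.ylabel('loss')
plt.legend()
plt.show()
```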
The evaluation was done with the F1 score, as in the previous article (Image classification with the wide-angle fundus image dataset); the per-model results are shown after the short sketch below.
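The tables below match the output format of scikit-learn's `classification_report`. A minimal sketch of producing one, where `y_true` and `y_pred` are hypothetical integer-label arrays standing in for the real test-set labels and predictions:

```python
import numpy as np
from sklearn.metrics import classification_report

# Class names as they appear in the reports below
class_names = ['AMD', 'DR_DM', 'Gla', 'MH', 'Normal', 'RD', 'RP', 'RVO']

# Hypothetical labels for illustration only; in the article these come from the test set
y_true = np.array([0, 1, 2, 3, 4, 5, 6, 7])
y_pred = np.array([0, 1, 2, 4, 4, 5, 6, 7])

print(classification_report(y_true, y_pred, target_names=class_names))
```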
```
# VGG16
              precision    recall  f1-score   support

         AMD       0.28      0.53      0.37        75
       DR_DM       0.81      0.75      0.78       620
         Gla       0.73      0.76      0.74       459
          MH       0.12      0.22      0.16        32
      Normal       0.75      0.75      0.75       871
          RD       0.93      0.77      0.84       176
          RP       0.91      0.78      0.84        50
         RVO       0.84      0.60      0.70       107

    accuracy                           0.73      2390
   macro avg       0.67      0.65      0.65      2390
weighted avg       0.76      0.73      0.74      2390
```

```
# ResNet50
              precision    recall  f1-score   support

         AMD       0.26      0.56      0.36        75
       DR_DM       0.95      0.61      0.74       620
         Gla       0.78      0.61      0.68       459
          MH       0.30      0.25      0.27        32
      Normal       0.64      0.84      0.73       871
          RD       0.85      0.85      0.85       176
          RP       0.64      0.88      0.74        50
         RVO       0.85      0.57      0.68       107

    accuracy                           0.71      2390
   macro avg       0.66      0.65      0.63      2390
weighted avg       0.76      0.71      0.71      2390
```

```
# InceptionV3
              precision    recall  f1-score   support

         AMD       0.28      0.53      0.37        75
       DR_DM       0.84      0.68      0.75       620
         Gla       0.74      0.68      0.71       459
          MH       0.29      0.16      0.20        32
      Normal       0.69      0.81      0.74       871
          RD       0.91      0.80      0.85       176
          RP       0.83      0.76      0.79        50
         RVO       0.64      0.52      0.57       107

    accuracy                           0.72      2390
   macro avg       0.65      0.62      0.62      2390
weighted avg       0.74      0.72      0.72      2390
```

```
# DenseNet121
              precision    recall  f1-score   support

         AMD       0.25      0.60      0.36        75
       DR_DM       0.94      0.66      0.78       620
         Gla       0.82      0.58      0.68       459
          MH       0.45      0.16      0.23        32
      Normal       0.65      0.87      0.74       871
          RD       0.94      0.82      0.88       176
          RP       0.98      0.86      0.91        50
         RVO       0.91      0.64      0.75       107

    accuracy                           0.73      2390
   macro avg       0.74      0.65      0.67      2390
weighted avg       0.78      0.73      0.73      2390
```

```
# NASNetMobile
              precision    recall  f1-score   support

         AMD       0.25      0.52      0.34        75
       DR_DM       0.84      0.66      0.74       620
         Gla       0.59      0.81      0.69       459
          MH       0.16      0.22      0.18        32
      Normal       0.72      0.70      0.71       871
          RD       0.94      0.76      0.84       176
          RP       0.94      0.60      0.73        50
         RVO       0.75      0.43      0.55       107

    accuracy                           0.69      2390
   macro avg       0.65      0.59      0.60      2390
weighted avg       0.73      0.69      0.70      2390
```

```
# EfficientNet-B0
              precision    recall  f1-score   support

         AMD       0.32      0.44      0.37        75
       DR_DM       0.94      0.60      0.73       620
         Gla       0.79      0.57      0.66       459
          MH       0.21      0.38      0.27        32
      Normal       0.63      0.88      0.73       871
          RD       0.94      0.85      0.89       176
          RP       0.80      0.80      0.80        50
         RVO       0.83      0.56      0.67       107

    accuracy                           0.71      2390
   macro avg       0.68      0.63      0.64      2390
weighted avg       0.76      0.71      0.71      2390
```
In addition, the ImageNet [^2] results and this experiment's results are summarized in the table below. The ImageNet [^2] numbers are taken from keras-team/keras-applications on GitHub. On ImageNet [^2], the networks after VGG16 are indeed more accurate than VGG16, but on this wide-angle fundus dataset [^1] DenseNet121 is the only network superior to VGG16. Moreover, since there are only 2390 test samples, a 95% confidence interval has a margin of roughly ±1 to 2%, so it cannot be said that DenseNet121 is statistically superior to VGG16.
| Network | ImageNet Top-1 Acc [%] | Wide-angle fundus dataset macro F1 |
|---|---|---|
| VGG16 | 71.268 | 0.65 |
| ResNet50 | 74.928 | 0.63 |
| InceptionV3 | 77.898 | 0.62 |
| DenseNet121 | 74.972 | 0.67 |
| NASNetMobile | 74.366 | 0.60 |
| EfficientNet-B0 | 77.190 | 0.64 |
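To back up the ±1 to 2% figure: for a proportion p measured on n = 2390 samples, the 95% binomial confidence half-width is 1.96 · sqrt(p(1−p)/n). Strictly this applies to accuracy-like proportions rather than macro F1, but it gives the right scale. A quick back-of-the-envelope check (this calculation is mine, not from the original article):

```python
import math

n = 2390  # test set size
for p in (0.65, 0.67, 0.73):  # macro F1 of VGG16 / DenseNet121, and overall accuracy
    half_width = 1.96 * math.sqrt(p * (1 - p) / n)
    print(f'p={p:.2f}: +/- {half_width * 100:.1f}%')
# Prints roughly +/- 1.8 to 1.9%, so the 0.02 gap between models is within the noise
```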
In this article, I used the wide-angle fundus dataset [^1] to investigate whether the networks after VGG16 have really improved. As an introduction, I also explained how to fine-tune the implemented networks from the ImageNet [^2] pre-trained weights. In the experiment, only DenseNet121 outperformed VGG16, and even that difference was not statistically significant.

I ran this on a wide-angle fundus dataset, so the results may well differ on other datasets, and since I applied almost no data augmentation, using recent augmentation methods could also change the outcome. That said, I usually work with medical images, mainly in ophthalmology, and DenseNet tends to give good results there, so this matches my experience. (I don't know why DenseNet tends to do well, so please let me know if you do.) Incidentally, the F1 score of the simple 10-layer CNN from last time (Image classification with the wide-angle fundus image dataset) was 0.59, so switching the network to DenseNet121 improved it by 0.08, to 0.67. I would like to keep applying various state-of-the-art methods to improve accuracy further.