Introduction to and implementation of JoCoR-Loss (CVPR2020)

Introduction of the paper

This work, named Combating Noisy Labels by Agreement: A Joint Training Method with Co-Regularization [CVPR2020], is largely based on co-teaching [NIPS2018]. To explain what this paper is about, we first need some basic concepts from co-teaching.

Co-teaching

Co-teaching is a framework proposed for training that is robust to label noise. Its aim is to mitigate the negative impact of noisy labels on model training.

The specific method is:


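This is the concept of co-teaching, where A and B denote the two models: in each mini-batch, each model picks the samples on which its own loss is small and passes them to the other model for its update.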
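To make the exchange between A and B concrete, here is a minimal, framework-free sketch of one co-teaching selection step. The function names, the toy logits, and the keep_ratio value are all my own assumptions for illustration; a real implementation would work on tensors and follow the schedule from the co-teaching paper.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of floats.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, label):
    # Per-sample cross-entropy against an integer class label.
    return -math.log(softmax(logits)[label])

def small_loss_indices(losses, keep_ratio):
    # Indices of the keep_ratio fraction of samples with the smallest loss.
    k = max(1, int(keep_ratio * len(losses)))
    return sorted(range(len(losses)), key=lambda i: losses[i])[:k]

def co_teaching_exchange(logits_a, logits_b, labels, keep_ratio):
    # Each model ranks the batch by its own loss...
    loss_a = [cross_entropy(z, y) for z, y in zip(logits_a, labels)]
    loss_b = [cross_entropy(z, y) for z, y in zip(logits_b, labels)]
    picked_by_a = small_loss_indices(loss_a, keep_ratio)
    picked_by_b = small_loss_indices(loss_b, keep_ratio)
    # ...and is then updated on the samples its peer picked.
    return {"train_a_on": picked_by_b, "train_b_on": picked_by_a}

# Toy batch (hypothetical values): the third label disagrees with both models.
logits_a = [[4.0, 0.0], [0.0, 4.0], [3.0, 0.0]]
logits_b = [[3.0, 0.0], [0.0, 3.0], [2.0, 0.0]]
labels = [0, 1, 1]
exchange = co_teaching_exchange(logits_a, logits_b, labels, keep_ratio=0.7)
print(exchange)
```

In this toy batch the third sample has a large loss under both models, so neither model passes it to its peer.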


This work follows most of the settings of co-teaching, like

The core contribution of this paper is a novel loss, named the JoCoR loss.




This new loss function forces the two models to give the same prediction. According to the authors, this contrastive loss should be lower when the label is clean, so the model can use it to tell clean labels from noisy ones.
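As a rough sketch of what such a loss looks like, the code below assumes the per-sample form (1 - lam) * (CE1 + CE2) + lam * (KL(p1 || p2) + KL(p2 || p1)), which is how I read the paper. The function names, the lam value, and the toy logits are my own choices for illustration, not the authors' code.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of floats.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, label):
    # Per-sample cross-entropy against an integer class label.
    return -math.log(softmax(logits)[label])

def kl_divergence(p, q):
    # KL(p || q) for two discrete distributions with non-zero entries.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

def jocor_loss(logits_1, logits_2, label, lam):
    p1, p2 = softmax(logits_1), softmax(logits_2)
    # Supervised part: both models are scored against the (possibly noisy) label.
    supervised = cross_entropy(logits_1, label) + cross_entropy(logits_2, label)
    # Agreement part: symmetric KL between the two predictions, no label involved.
    agreement = kl_divergence(p1, p2) + kl_divergence(p2, p1)
    return (1.0 - lam) * supervised + lam * agreement

# Hypothetical logits: the two models agree in the first case, disagree in the second.
loss_agree = jocor_loss([2.0, 0.0], [2.0, 0.0], label=0, lam=0.5)
loss_disagree = jocor_loss([2.0, 0.0], [0.0, 2.0], label=0, lam=0.5)
print(loss_agree, loss_disagree)
```

When the two models output identical predictions the agreement part vanishes, so the loss reduces to the weighted sum of the two cross-entropies.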


Here are the experimental results on the noisy CIFAR-10 dataset. This method clearly outperforms the previous methods.

My opinion about JoCoR loss


This loss compares two prediction vectors, not two cross-entropy losses, so the KL divergence has nothing to do with the labels. I therefore don't think this loss helps with selecting clean samples. I think the predictions of the two models will be similar for images with salient features that are easy to distinguish. But this novel loss did help the model achieve better performance; I think the reason should be:

To support my opinion, I ran a simple experiment: I let the model select clean samples without the JoCoR loss (the original way), and backpropagated with the JoCoR loss. I got a similar test accuracy.
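The experiment can be sketched roughly as follows. This is not my actual training code: the helper names, the select_by switch, the use of the summed cross-entropy as the ranking score, and the toy numbers are assumptions made for illustration. The only difference between the two variants is which score ranks the samples; the loss that is backpropagated is the joint loss in both cases.

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, label):
    return -math.log(softmax(logits)[label])

def symmetric_kl(p, q):
    # KL(p || q) + KL(q || p); it does not depend on any label.
    return sum(pi * math.log(pi / qi) + qi * math.log(qi / pi) for pi, qi in zip(p, q))

def training_loss(logits_1, logits_2, labels, keep_ratio, lam, select_by="ce"):
    supervised, joint = [], []
    for z1, z2, y in zip(logits_1, logits_2, labels):
        s = cross_entropy(z1, y) + cross_entropy(z2, y)
        c = symmetric_kl(softmax(z1), softmax(z2))
        supervised.append(s)
        joint.append((1.0 - lam) * s + lam * c)
    # "ce": rank by cross-entropy only (selection without the agreement term).
    # "joint": rank by the full joint loss.
    score = supervised if select_by == "ce" else joint
    k = max(1, int(keep_ratio * len(labels)))
    keep = sorted(range(len(labels)), key=lambda i: score[i])[:k]
    # In both variants the loss to backpropagate is the mean joint loss of kept samples.
    return keep, sum(joint[i] for i in keep) / k

# Toy batch (hypothetical values): the third label disagrees with both models.
logits_1 = [[4.0, 0.0], [0.0, 4.0], [3.0, 0.0]]
logits_2 = [[3.0, 0.0], [0.0, 3.0], [2.0, 0.0]]
labels = [0, 1, 1]
keep_ce, loss_ce = training_loss(logits_1, logits_2, labels, 0.7, 0.5, select_by="ce")
keep_joint, loss_joint = training_loss(logits_1, logits_2, labels, 0.7, 0.5, select_by="joint")
print(keep_ce, keep_joint, loss_ce, loss_joint)
```

On this toy batch both ranking scores keep the same two samples, which is the kind of behaviour that would explain getting a similar accuracy with either selection rule.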

So for me, the significance of this paper is in showing that training two models and forcing them to make the same predictions is useful. I think this idea can be applied to other tasks.

What I have done

