Bug where 'val_loss' is reported as not found when using Early Stopping in pytorch-lightning (0.5.3.2)

Premise

(Note: this article was written on January 9, 2020. I expect the problem described here to be resolved in the near future.)

PyTorch Lightning is a Keras-like wrapper for PyTorch. Its appeal is that you can write the model, the training loop, and the data handling compactly.

For details, see the following article by @fam_taro.

PyTorch Sangokushi (Ignite / Catalyst / Lightning) - Qiita

It looks very convenient, but I ran into a bug right after I started using it, so I will report the details and the solution.

Environment

- OS: macOS 10.14.6
- Python: 3.7.3
- pytorch-lightning: 0.5.3.2
- Installation method: pip install pytorch-lightning

Bug

PyTorch Lightning implements early stopping, and you can write it concisely with the following code.

Model definition (excerpt)


import torch
import torch.nn.functional as F
import pytorch_lightning as pl


class MyModel(pl.LightningModule):
    ...

    def validation_step(self, batch, batch_nb):
        # Compute the loss for one validation batch.
        x, y = batch
        y_hat = self.forward(x)
        return {'val_batch_loss': F.cross_entropy(y_hat, y)}

    def validation_end(self, outputs):
        # Average the per-batch losses and log the result as 'val_loss'.
        val_loss = torch.stack([x['val_batch_loss'] for x in outputs]).mean()
        log = {'val_loss': val_loss}
        return {'log': log}
    ...

Early stopping setup


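from pytorch_lightning.callbacks import EarlyStopping

# Stop training when 'val_loss' has not improved for one epoch.
early_stop_callback = EarlyStopping(
    min_delta=0.00,
    patience=1,
    verbose=False,
    monitor='val_loss',
    mode='min',
)
model = MyModel()
trainer = pl.Trainer(early_stop_callback=early_stop_callback)
trainer.fit(model)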

However, when I ran it, I got the following error and early stopping did not work. (I don't know whether it always happens, but so far it has reproduced every time in my environment.)

Early stopping conditioned on metric `val_loss` which is not available. Available metrics are: loss,train_loss
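
To illustrate what this message means, here is a minimal pure-Python sketch of an early-stopping check that looks up the monitored key in a metrics dictionary. It is not PyTorch Lightning's actual implementation: the class name SimpleEarlyStopping and the method should_stop are hypothetical, and only the mode='min' case is covered.

```python
class SimpleEarlyStopping:
    """Hypothetical stand-in for an early-stopping callback (not the library's code)."""

    def __init__(self, monitor='val_loss', min_delta=0.0, patience=1):
        self.monitor = monitor
        self.min_delta = min_delta
        self.patience = patience
        self.best = float('inf')  # mode='min': lower is better
        self.wait = 0             # epochs since the last improvement

    def should_stop(self, metrics):
        """Return True when training should stop, given this epoch's metrics dict."""
        current = metrics.get(self.monitor)
        if current is None:
            # The monitored key is missing, so no decision can be made.
            print(f"Early stopping conditioned on metric `{self.monitor}` "
                  f"which is not available. Available metrics are: "
                  f"{','.join(metrics)}")
            return False
        if current < self.best - self.min_delta:
            # Improvement: remember the new best value and reset the counter.
            self.best = current
            self.wait = 0
            return False
        self.wait += 1
        return self.wait >= self.patience


es = SimpleEarlyStopping(monitor='val_loss', patience=1)
es.should_stop({'loss': 0.9, 'train_loss': 0.9})  # prints the message, never stops
```

In this sketch, if the metrics passed to the check only ever contain loss and train_loss, the check is skipped every epoch and training never stops early, which matches the symptom described above.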

This bug has also been reported as an issue in the official repository.

https://github.com/williamFalcon/pytorch-lightning/issues/490

Solution

The bug has been fixed in the latest master branch, so installing from master with the following command resolves the problem.

pip install git+https://github.com/williamFalcon/pytorch-lightning.git@master --upgrade

Caveats

Installing the latest master branch is likely to cause API discrepancies with the documentation as of January 9, 2020.

Example: the arguments of the initializer of the checkpoint-saving class pytorch_lightning.callbacks.ModelCheckpoint have changed (Applicable page).

- When installed with pip install pytorch-lightning (same as the official documentation)
  - save_best_only: a Boolean that specifies whether to save only the best model
- When the latest master branch is installed with the command in "Solution" (differs from the official documentation)
  - save_top_k: an integer that specifies how many of the top models to save
  - (save_best_only has been removed in favor of this argument)
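
One way to cope with this kind of argument change is to inspect the constructor signature instead of assuming which version is installed. The following is only a sketch: the helper best_only_kwargs and the two stand-in classes are hypothetical, and it assumes that save_top_k=1 corresponds to keeping only the best model.

```python
import inspect


def best_only_kwargs(checkpoint_cls):
    """Return kwargs that ask checkpoint_cls to keep only the best model.

    Hypothetical helper: it checks which argument the constructor accepts.
    """
    params = inspect.signature(checkpoint_cls.__init__).parameters
    if 'save_top_k' in params:
        return {'save_top_k': 1}         # newer API (master branch)
    if 'save_best_only' in params:
        return {'save_best_only': True}  # older API (0.5.3.2 from pip)
    return {}                            # neither argument is known


# Stand-in classes so the sketch runs without pytorch-lightning installed.
class OldCheckpoint:
    def __init__(self, filepath, save_best_only=True):
        self.filepath = filepath


class NewCheckpoint:
    def __init__(self, filepath, save_top_k=1):
        self.filepath = filepath


print(best_only_kwargs(OldCheckpoint))
print(best_only_kwargs(NewCheckpoint))
```

With the real library, the same helper would presumably be called with pytorch_lightning.callbacks.ModelCheckpoint, and the returned dictionary unpacked into its constructor call.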

Once a version newer than 0.5.3.2 is released, the usual pip install pytorch-lightning will probably be fine.
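
To check whether an installed version is newer than 0.5.3.2, a comparison along the following lines could be used. This is a sketch only: the function is_newer_than is hypothetical, and it handles purely numeric dotted versions (a development version string containing letters would raise ValueError).

```python
def is_newer_than(installed, baseline='0.5.3.2'):
    """Compare dotted numeric version strings component by component."""
    def to_tuple(version):
        return tuple(int(part) for part in version.split('.'))
    return to_tuple(installed) > to_tuple(baseline)


print(is_newer_than('0.6.0'))
print(is_newer_than('0.5.3.2'))
```

Assuming the package exposes its version as pytorch_lightning.__version__, that string could be passed as the installed argument.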

References

- PyTorch Sangokushi (Ignite / Catalyst / Lightning) - Qiita
- PyTorch Lightning repository
- PyTorch Lightning documentation
- Bug report and fix issue
- How to install the fixed version
