When I wrote my graduation thesis, I did semantic segmentation with PyTorch and ran into a lot of trouble, so I am summarizing the pitfalls here as a memo.
IoU

torchvision has box_iou, but that is for bounding boxes, and I was stuck because I could not find an equivalent for segmentation. I ended up using this implementation.
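For reference, a minimal sketch of mean IoU over integer label maps might look like the following. It is not the implementation mentioned above; the function name mean_iou and the choice to skip classes that appear in neither tensor are assumptions made for illustration.

```python
import torch


def mean_iou(pred, target, num_classes):
    """Mean IoU for integer label maps of shape [N, H, W] (hypothetical helper)."""
    ious = []
    for cls in range(num_classes):
        pred_mask = pred == cls
        target_mask = target == cls
        union = (pred_mask | target_mask).sum().item()
        if union == 0:
            # Assumption: a class absent from both tensors is skipped
            # rather than counted as IoU 0 or 1.
            continue
        intersection = (pred_mask & target_mask).sum().item()
        ious.append(intersection / union)
    return sum(ious) / len(ious) if ious else float("nan")


# Tiny example: one 2x2 image, two classes.
pred = torch.tensor([[[0, 1], [1, 1]]])
target = torch.tensor([[[0, 1], [0, 1]]])
miou = mean_iou(pred, target, num_classes=2)
```

If the network outputs logits of shape [N, C, H, W], pred would typically be obtained with logits.argmax(dim=1) before calling the helper.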
torch.nn.CrossEntropyLoss

This one is really annoying. The documentation says $ Input: [Minibatch, C, d_1 ...] $, so I made both the inference result and the ground-truth mask one-hot tensors of shape [batch size, number of classes, H, W]. I then got an endless stream of errors and was stuck for a day or two.
In fact, the shape of the inference result is correct as above, but the ground truth is $ target: [Minibatch, d_1 ...] $: the correct shape is [batch size, H, W], with each pixel holding its class label. BinaryCrossEntropy takes the same shape for both input and target, so I got burned when I matched the shapes here as well. ~~Read the documentation properly~~
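The shape difference can be sketched as follows. The tensor sizes and variable names are assumptions chosen for illustration, and BCEWithLogitsLoss stands in for the binary case. Recent PyTorch releases also accept floating-point class probabilities with the same shape as the input, but the class-index form below is the one described here.

```python
import torch
import torch.nn as nn

# Illustrative sizes, not taken from the original post.
batch_size, num_classes, height, width = 2, 20, 4, 4

# Input: raw scores (logits) of shape [N, C, H, W].
logits = torch.randn(batch_size, num_classes, height, width)

# Target: class indices of shape [N, H, W], dtype int64, no channel dimension.
target = torch.randint(0, num_classes, (batch_size, height, width))

criterion = nn.CrossEntropyLoss()
loss = criterion(logits, target)  # scalar tensor

# Binary case for comparison: input and target share one shape,
# and the target is a float tensor of 0s and 1s.
bce_logits = torch.randn(batch_size, 1, height, width)
bce_target = torch.randint(0, 2, (batch_size, 1, height, width)).float()
bce_loss = nn.BCEWithLogitsLoss()(bce_logits, bce_target)
```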
Also, each pixel's label in target must match the index of the corresponding class channel in input. In other words, with 20 classes, the labels in target must be values from 0 to 19.
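If the masks are already stored as one-hot tensors, one way to get index labels is argmax over the class dimension. This is a sketch under the assumption that the one-hot mask has shape [N, C, H, W]; the variable names are hypothetical. The one exception to the 0 to 19 range is the value given as ignore_index (default -100), which CrossEntropyLoss skips.

```python
import torch
import torch.nn.functional as F

num_classes = 20

# Build a one-hot mask of shape [N, C, H, W] from random labels (demo data only).
labels = torch.randint(0, num_classes, (2, 4, 4))
one_hot = F.one_hot(labels, num_classes).permute(0, 3, 1, 2)

# Collapse the class dimension back to index labels of shape [N, H, W].
target = one_hot.argmax(dim=1)

# Sanity check before passing target to CrossEntropyLoss:
# every label must be a valid channel index of the input.
labels_in_range = bool(target.min() >= 0) and bool(target.max() < num_classes)
```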