Issue
I am very new to PyTorch, so please excuse my ignorance. I am trying to create my own CNN using PyTorch. The problem is that my fully connected layer (the nn.LazyLinear module) shows no learnable parameters, and the network is obviously not learning anything.
import torch
from torch import nn
import pytorch_lightning as pl
from pytorch_lightning.core.decorators import auto_move_data
class ThreeConvLayer(pl.LightningModule):
    def __init__(self, num_classes, num_images):
        super(ThreeConvLayer, self).__init__()
        self.convolutionlayer1 = nn.Conv2d(in_channels=1, out_channels=16, kernel_size=3, padding=1)
        self.BatchNormalization1 = nn.BatchNorm2d(16)
        self.ReLU1 = nn.ReLU()
        self.maxpool1 = nn.MaxPool2d(kernel_size=2, stride=2)
        self.convolutionlayer2 = nn.Conv2d(in_channels=16, out_channels=32, kernel_size=3, padding=1)
        self.BatchNormalization2 = nn.BatchNorm2d(32)
        self.ReLU2 = nn.ReLU()
        self.maxpool2 = nn.MaxPool2d(kernel_size=2, stride=2)
        self.convolutionlayer3 = nn.Conv2d(in_channels=32, out_channels=64, kernel_size=3, padding=1)
        self.BatchNormalization3 = nn.BatchNorm2d(64)
        self.ReLU3 = nn.ReLU()
        self.fc1 = nn.LazyLinear(num_classes)
        self.softmax = nn.Softmax(dim=1)
        self.loss = nn.CrossEntropyLoss()

    def forward(self, x):
        #print('shape x: ', x.shape)
        out = self.convolutionlayer1(x)
        out = self.BatchNormalization1(out)
        out = self.ReLU1(out)
        out = self.maxpool1(out)
        #print('shape out1: ', out.shape)
        out = self.convolutionlayer2(out)
        out = self.BatchNormalization2(out)
        out = self.ReLU2(out)
        out = self.maxpool2(out)
        #print('shape out2: ', out.shape)
        out = self.convolutionlayer3(out)
        out = self.BatchNormalization3(out)
        out = self.ReLU3(out)
        #print('shape out3: ', out.shape)
        out = out.reshape(out.size(0), -1)
        #print('out after reshape: ', out.shape)
        out = self.fc1(out)
        out = self.softmax(out)
        #print('final out: ', out.shape)
        return out

    def training_step(self, batch, batch_no):
        # implement single training step
        x, y = batch
        y = y.long()
        logits = self(x)
        loss = self.loss(logits, y)
        self.log('val_loss', loss)
        return loss

    def configure_optimizers(self):
        # choose your optimizer
        return torch.optim.RMSprop(self.parameters(), lr=0.05)  #lr=0.005
Here is the PyTorch Lightning printout of learnable parameters:
| Name | Type | Params
----------------------------------------------------------
0 | convolutionlayer1 | Conv2d | 160
1 | BatchNormalization1 | BatchNorm2d | 32
2 | ReLU1 | ReLU | 0
3 | maxpool1 | MaxPool2d | 0
4 | convolutionlayer2 | Conv2d | 4.6 K
5 | BatchNormalization2 | BatchNorm2d | 64
6 | ReLU2 | ReLU | 0
7 | maxpool2 | MaxPool2d | 0
8 | convolutionlayer3 | Conv2d | 18.5 K
9 | BatchNormalization3 | BatchNorm2d | 128
10 | ReLU3 | ReLU | 0
11 | fc1 | LazyLinear | 0
12 | softmax | Softmax | 0
13 | loss | CrossEntropyLoss | 0
----------------------------------------------------------
Here you can see that the fully connected layer has no parameters, which is clearly wrong. What did I do wrong here?
Solution
This is one limitation of lazy modules such as LazyLinear. If you read through the documentation of nn.modules.lazy.LazyModuleMixin, you will see:

Modules that lazily initialize parameters, or "lazy modules", derive the shapes of their parameters from the first input(s) to their forward method. Until that first forward they contain torch.nn.UninitializedParameters that should not be accessed or used, and afterward they contain regular torch.nn.Parameters.
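You can observe this state directly on your model before any forward pass. A minimal sketch (the constructor arguments here are illustrative, not taken from your setup):

    model = ThreeConvLayer(num_classes=10, num_images=1)   # example args
    print(model.fc1.has_uninitialized_params())            # True: fc1's weights are not materialized yet
    print(type(model.fc1.weight))                          # torch.nn.parameter.UninitializedParameter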
In other words, you first need to perform a dry run with dummy data before you can proceed with your calls. Lazy modules infer the missing arguments (such as in_features for nn.Linear) from the shape of the first input they receive, i.e. on the first inference call. This dry run is required before:

- getting the number of registered parameters;
- properly registering them inside an optimizer.
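As a rough sketch of such a dry run, assuming single-channel 28x28 images and 10 classes (adjust the shapes and num_classes to your actual data):

    model = ThreeConvLayer(num_classes=10, num_images=1)
    dummy = torch.randn(2, 1, 28, 28)      # (batch, channels, height, width), shape is an assumption
    _ = model(dummy)                       # first forward pass materializes fc1's parameters
    total = sum(p.numel() for p in model.parameters())
    print(f"trainable parameters after dry run: {total}")

If you run this before handing the model to the Trainer, the model summary will report the correct count for fc1 and the optimizer returned by configure_optimizers will receive the materialized parameters.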
Answered By - Ivan