Issue
I am very new to PyTorch, so please excuse my ignorance. I am trying to create my own CNN using PyTorch. The problem is that my fully connected layer (the nn.LazyLinear module) shows no learnable parameters, and the network is obviously not learning anything.
import torch
from torch import nn
import pytorch_lightning as pl
from pytorch_lightning.core.decorators import auto_move_data
class ThreeConvLayer(pl.LightningModule):
    def __init__(self, num_classes, num_images):
        super(ThreeConvLayer, self).__init__()
        self.convolutionlayer1 = nn.Conv2d(in_channels=1, out_channels=16, kernel_size=3, padding=1)
        self.BatchNormalization1 = nn.BatchNorm2d(16)
        self.ReLU1 = nn.ReLU()
        self.maxpool1 = nn.MaxPool2d(kernel_size=2, stride=2)
        self.convolutionlayer2 = nn.Conv2d(in_channels=16, out_channels=32, kernel_size=3, padding=1)
        self.BatchNormalization2 = nn.BatchNorm2d(32)
        self.ReLU2 = nn.ReLU()
        self.maxpool2 = nn.MaxPool2d(kernel_size=2, stride=2)
        self.convolutionlayer3 = nn.Conv2d(in_channels=32, out_channels=64, kernel_size=3, padding=1)
        self.BatchNormalization3 = nn.BatchNorm2d(64)
        self.ReLU3 = nn.ReLU()
        self.fc1 = nn.LazyLinear(num_classes)
        self.softmax = nn.Softmax(dim=1)
        self.loss = nn.CrossEntropyLoss()

    def forward(self, x):
        #print('shape x: ', x.shape)
        out = self.convolutionlayer1(x)
        out = self.BatchNormalization1(out)
        out = self.ReLU1(out)
        out = self.maxpool1(out)
        #print('shape out1: ', out.shape)
        out = self.convolutionlayer2(out)
        out = self.BatchNormalization2(out)
        out = self.ReLU2(out)
        out = self.maxpool2(out)
        #print('shape out2: ', out.shape)
        out = self.convolutionlayer3(out)
        out = self.BatchNormalization3(out)
        out = self.ReLU3(out)
        #print('shape out3: ', out.shape)
        out = out.reshape(out.size(0), -1)
        #print('out after reshape: ', out.shape)
        out = self.fc1(out)
        out = self.softmax(out)
        #print('final out: ', out.shape)
        return out

    def training_step(self, batch, batch_no):
        # implement single training step
        x, y = batch
        y = y.long()
        logits = self(x)
        loss = self.loss(logits, y)
        self.log('val_loss', loss)
        return loss

    def configure_optimizers(self):
        # choose your optimizer
        return torch.optim.RMSprop(self.parameters(), lr=0.05)  #lr=0.005
Here is the PyTorch Lightning printout of learnable parameters:
| Name | Type | Params
----------------------------------------------------------
0 | convolutionlayer1 | Conv2d | 160
1 | BatchNormalization1 | BatchNorm2d | 32
2 | ReLU1 | ReLU | 0
3 | maxpool1 | MaxPool2d | 0
4 | convolutionlayer2 | Conv2d | 4.6 K
5 | BatchNormalization2 | BatchNorm2d | 64
6 | ReLU2 | ReLU | 0
7 | maxpool2 | MaxPool2d | 0
8 | convolutionlayer3 | Conv2d | 18.5 K
9 | BatchNormalization3 | BatchNorm2d | 128
10 | ReLU3 | ReLU | 0
11 | fc1 | LazyLinear | 0
12 | softmax | Softmax | 0
13 | loss | CrossEntropyLoss | 0
----------------------------------------------------------
Here you can see that the fully connected layer has no parameters, which is clearly wrong. What did I do wrong here?
Solution
This is one limitation of lazy modules such as LazyLinear. If you read through the documentation of nn.modules.lazy.LazyModuleMixin, you will see:

Modules that lazily initialize parameters, or "lazy modules", derive the shapes of their parameters from the first input(s) to their forward method. Until that first forward they contain torch.nn.UninitializedParameters that should not be accessed or used, and afterward they contain regular torch.nn.Parameters.
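You can observe this state directly on your model before any forward pass. A minimal sketch (the constructor arguments here are illustrative, not taken from your setup):

    model = ThreeConvLayer(num_classes=10, num_images=1)   # example args
    print(model.fc1.has_uninitialized_params())            # True: fc1's weights are not materialized yet
    print(type(model.fc1.weight))                          # torch.nn.parameter.UninitializedParameter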
In other words, you first need to perform a dry run with dummy data before you can proceed with your calls. Lazy modules infer the missing arguments (such as in_features for nn.Linear) from the shape of the first input they receive, i.e. on the first inference call. This dry run is required before:

- getting the number of registered parameters;
- properly registering them inside an optimizer.
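As a rough sketch of such a dry run, assuming single-channel 28x28 images and 10 classes (adjust the shapes and num_classes to your actual data):

    model = ThreeConvLayer(num_classes=10, num_images=1)
    dummy = torch.randn(2, 1, 28, 28)      # (batch, channels, height, width), shape is an assumption
    _ = model(dummy)                       # first forward pass materializes fc1's parameters
    total = sum(p.numel() for p in model.parameters())
    print(f"trainable parameters after dry run: {total}")

If you run this before handing the model to the Trainer, the model summary will report the correct count for fc1 and the optimizer returned by configure_optimizers will receive the materialized parameters.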
Answered By - Ivan