Issue
I know this is because the shapes don't match for the matrix multiplication, but why does it happen when my code is similar to most of the example code I have found?
import torch.nn as nn
...
# input is a 256x256 image
num_input_channels = 3
self.encoder = nn.Sequential(
    nn.Conv2d(num_input_channels*2**0, num_input_channels*2**1, kernel_size=3, padding=1, stride=2),  # 1 6 128 128
    nn.Tanh(),
    nn.Conv2d(num_input_channels*2**1, num_input_channels*2**2, kernel_size=3, padding=1, stride=2),  # 1 12 64 64
    nn.Tanh(),
    nn.Conv2d(num_input_channels*2**2, num_input_channels*2**3, kernel_size=3, padding=1, stride=2),  # 1 24 32 32
    nn.Tanh(),
    nn.Conv2d(num_input_channels*2**3, num_input_channels*2**4, kernel_size=3, padding=1, stride=2),  # 1 48 16 16
    nn.Tanh(),
    nn.Conv2d(num_input_channels*2**4, num_input_channels*2**5, kernel_size=3, padding=1, stride=2),  # 1 96 8 8
    nn.Tanh(),
    nn.Conv2d(num_input_channels*2**5, num_input_channels*2**6, kernel_size=3, padding=1, stride=2),  # 1 192 4 4
    nn.LeakyReLU(),
    nn.Conv2d(num_input_channels*2**6, num_input_channels*2**7, kernel_size=3, padding=1, stride=2),  # 1 384 2 2
    nn.LeakyReLU(),
    nn.Conv2d(num_input_channels*2**7, num_input_channels*2**8, kernel_size=2, padding=0, stride=1),  # 1 768 1 1
    nn.LeakyReLU(),
    nn.Flatten(),
    nn.Linear(768, 1024*32),
    nn.ReLU(),
    nn.Linear(1024*32, 256),
    nn.ReLU(),
).cuda()
I get the error "RuntimeError: mat1 and mat2 shapes cannot be multiplied (768x1 and 768x32768)"
To my understanding, I should end up with a tensor of shape [1, 768, 1, 1] after the convolutions and [1, 768] after flattening, so I can use a fully connected Linear layer that expands to 1024*32 features (my attempt to give the network more capacity to store information).
Using nn.Linear(1, 1024*32) instead runs, but later produces the warning "UserWarning: Using a target size (torch.Size([3, 256, 256])) that is different to the input size (torch.Size([768, 3, 256, 256]))". I think that comes from my decoder, though.
What am I not understanding correctly here?
Solution
All torch.nn modules require batched inputs, and it seems that in your case the batch dimension is missing. Without knowing your code, I'm assuming you are using
my_input.shape == (3, 256, 256)
But you will need to add a batch dimension, that is, you need to have
my_input.shape == (1, 3, 256, 256)
You can easily do that by introducing a dummy dimension using:
my_input = my_input[None, ...]  # equivalent to my_input.unsqueeze(0)
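As a minimal sketch of what goes wrong with your encoder: nn.Flatten defaults to start_dim=1, so it always treats dimension 0 as the batch. Fed an unbatched (768, 1, 1) tensor, it produces (768, 1) instead of (1, 768), which is exactly the "(768x1 and 768x32768)" mismatch from your error message:

```python
import torch
import torch.nn as nn

flatten = nn.Flatten()  # default start_dim=1 keeps dim 0 as the batch

# Unbatched tensor: dim 0 (the 768 channels) is mistaken for the batch
# dimension, so nothing is actually flattened and the result is (768, 1).
unbatched = torch.randn(768, 1, 1)
print(flatten(unbatched).shape)  # torch.Size([768, 1])

# With a dummy batch dimension, the shape comes out as expected.
batched = unbatched[None, ...]   # shape (1, 768, 1, 1)
print(flatten(batched).shape)    # torch.Size([1, 768])
```

With the batch dimension in place, nn.Linear(768, 1024*32) receives a (1, 768) input and the multiplication goes through.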
Answered By - flawr