Issue
I have read other people's questions about similar issues, but I can't figure out what is wrong in my case. My code is below. How do I fix this? Thank you.
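import torch
import torch.nn.functional as F
from torchvision import transforms
from torchvision.datasets import ImageFolder

# data_dir is the path to the image folder (defined elsewhere)
data = ImageFolder(data_dir, transform=transforms.Compose([transforms.Resize((224, 224)), transforms.ToTensor()]))
trainloader = torch.utils.data.DataLoader(data, batch_size=3600,
                                          shuffle=True, num_workers=2)
dataiter = iter(trainloader)
x_train, y_train = next(dataiter)  # built-in next(); newer PyTorch releases no longer provide dataiter.next()
print(x_train.size())
print(y_train.size())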
torch.Size([3600, 3, 224, 224])
torch.Size([3600])
class Net(torch.nn.Module):
    def __init__(self):
        super().__init__()
        # here we set up the tensors......
        self.layer1 = torch.nn.Linear(224, 12)
        self.layer2 = torch.nn.Linear(12, 10)

    def forward(self, x):
        # here we define the (forward) computational graph,
        # in terms of the tensors, and elt-wise non-linearities
        x = F.relu(self.layer1(x))
        x = self.layer2(x)
        return x

net = Net()
y = net.forward(x_train)
lossFn = torch.nn.CrossEntropyLoss()
loss = lossFn(y, y_train)
print(loss)
Solution
Your input to the network is a 2D image. That is a tensor with 4 dimensions: batch, channel, height and width.
However, you treat the 2D input as a 1D signal by applying nn.Linear layers to its width dimension only, resulting in an output of shape batch x channel x height x output_dim (here [3600, 3, 224, 10]). In contrast, nn.CrossEntropyLoss expects only one output vector per target label, i.e. predictions of shape batch x num_classes for targets of shape batch.
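To make the mismatch concrete, here is a minimal sketch that reproduces the shapes without the dataset. The random tensors and the batch size of 4 are stand-ins I chose for illustration; only the layer sizes come from the question.

```python
import torch

# Stand-in batch with the same layout as x_train: (batch, channel, height, width).
x = torch.randn(4, 3, 224, 224)
targets = torch.randint(0, 10, (4,))  # one class index per image, like y_train

# nn.Linear only transforms the last dimension (width: 224 -> 12 -> 10);
# batch, channel and height are carried along untouched.
layer1 = torch.nn.Linear(224, 12)
layer2 = torch.nn.Linear(12, 10)
out = layer2(torch.relu(layer1(x)))
print(out.shape)  # torch.Size([4, 3, 224, 10])

# The loss wants logits of shape (batch, num_classes) for targets of shape (batch,),
# so a 4D output paired with 1D targets is rejected.
try:
    torch.nn.CrossEntropyLoss()(out, targets)
    loss_failed = False
except (RuntimeError, ValueError) as err:
    loss_failed = True
    print(type(err).__name__, err)
```

The exact exception type and message depend on the PyTorch version, which is why the sketch catches both.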
You need to change your Net to properly process images into a single vector of predictions.
You can check out milestone image classification architectures here.
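As one possible direction, here is a rough sketch of a small convolutional Net that reduces each image to a single vector of class scores. The layer sizes, the num_classes parameter (10, guessed from the output size of the original layer2) and the random stand-in batch are my assumptions, not something from the question; an established architecture will almost certainly perform better.

```python
import torch
import torch.nn.functional as F


class Net(torch.nn.Module):
    def __init__(self, num_classes=10):  # assumed: 10 classes, as in the original layer2
        super().__init__()
        self.conv1 = torch.nn.Conv2d(3, 8, kernel_size=3, padding=1)
        self.conv2 = torch.nn.Conv2d(8, 16, kernel_size=3, padding=1)
        self.pool = torch.nn.AdaptiveAvgPool2d(1)  # collapses height and width to 1 x 1
        self.fc = torch.nn.Linear(16, num_classes)

    def forward(self, x):
        x = F.max_pool2d(F.relu(self.conv1(x)), 2)  # (N, 8, 112, 112)
        x = F.max_pool2d(F.relu(self.conv2(x)), 2)  # (N, 16, 56, 56)
        x = self.pool(x)                            # (N, 16, 1, 1)
        x = torch.flatten(x, 1)                     # (N, 16)
        return self.fc(x)                           # (N, num_classes)


net = Net()
x_batch = torch.randn(4, 3, 224, 224)   # stand-in for x_train
y_batch = torch.randint(0, 10, (4,))    # stand-in for y_train
logits = net(x_batch)                   # call the module, not net.forward, so hooks run
loss = torch.nn.CrossEntropyLoss()(logits, y_batch)
print(logits.shape, loss.item())
```

Because the output is now one row of scores per image, it lines up with the one label per image in y_train. A batch of 3600 images at 224 x 224 is also large for a single forward pass, so a smaller batch_size may be needed in practice.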
Answered By - Shai