Issue
I'm trying to create a module that contains layers of nn.Parameter. If I initialize a layer as follows:
self.W = nn.Parameter(torch.randn(4,4), requires_grad=True).double()
then the layer doesn't get registered in the module's parameters.
However, this initialization does work:
self.W = nn.Parameter(torch.FloatTensor(4,4), requires_grad=True)
Full example:
class TestNet(nn.Module):
    def __init__(self):
        super(TestNet, self).__init__()
        self.W = nn.Parameter(torch.randn(4,4), requires_grad=True).double()

    def forward(self, x):
        x = torch.matmul(x, self.W.T)
        x = torch.sigmoid(x)
        return x

tnet = TestNet()
print(list(tnet.parameters()))
# Output: [] (an empty list)
Compared to:
class TestNet(nn.Module):
    def __init__(self):
        super(TestNet, self).__init__()
        self.W = nn.Parameter(torch.FloatTensor(4,4), requires_grad=True)

    def forward(self, x):
        x = torch.matmul(x, self.W.T)
        x = torch.sigmoid(x)
        return x

tnet = TestNet()
print(list(tnet.parameters()))
Which prints:
[Parameter containing:
tensor([[-1.8859e+26,  6.0240e-01,  1.0842e-19,  3.8177e-05],
        [ 1.5229e-27, -8.5899e+09,  1.5226e-27, -3.6893e+19],
        [ 4.2039e-45, -4.6566e-10,  1.5229e-27, -2.0000e+00],
        [ 2.8026e-45,  0.0000e+00,  0.0000e+00,  4.5918e-40]],
       requires_grad=True)]
So what is the difference? Why doesn't the torch.randn() version work? I couldn't find anything about this in the docs or in previous answers online.
Solution
Calling randn is completely fine. The issue is the .double() call at the end: it returns a new plain torch.Tensor rather than an nn.Parameter, and nn.Module.__setattr__ only registers attributes that are nn.Parameter instances, so self.W never ends up in the module's parameters. The fix is to convert the tensor before wrapping it in nn.Parameter:
class TestNet(nn.Module):
    def __init__(self):
        super(TestNet, self).__init__()
        self.W = nn.Parameter(torch.randn(4, 4, dtype=torch.double), requires_grad=True)
        # self.W = nn.Parameter(torch.randn(4,4).double(), requires_grad=True)  # also works

    def forward(self, x):
        x = torch.matmul(x, self.W.T)
        x = torch.sigmoid(x)
        return x

tnet = TestNet()
print(tnet.W.dtype)
# torch.float64
print(list(tnet.parameters()))
# [Parameter containing:
# tensor([[-1.9645, -1.5445,  0.2435,  0.4380],
#         [ 1.1403,  0.8836,  0.1811, -0.1212],
#         [ 1.5983, -0.1854, -0.2626,  0.2881],
#         [-1.2364, -0.4802, -0.6038,  0.1164]], dtype=torch.float64,
#        requires_grad=True)]
Now the code registers the parameter. I added dtype=torch.double to the randn call to make sure that self.W contains doubles, as before.
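You can check this directly; the following quick sketch (my addition, not part of the original answer) shows the type change that breaks registration:
import torch
import torch.nn as nn

p = nn.Parameter(torch.randn(4, 4))
print(type(p))                      # <class 'torch.nn.parameter.Parameter'>

q = p.double()                      # the conversion produces a new tensor
print(type(q))                      # <class 'torch.Tensor'>
print(isinstance(q, nn.Parameter))  # False, so nn.Module won't register it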
In summary: we cannot take an nn.Parameter, convert it to another data type, and expect the result to be registered as a weight of the network, because the converted tensor is no longer a Parameter.
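As a follow-up (my suggestion, not part of the original answer): if you want float64 weights without touching the layer definitions, you can also cast the whole module after construction. nn.Module.double() converts every registered parameter in place, so registration is preserved. Using the float32 variant of TestNet from the question:
tnet = TestNet().double()            # casts all registered parameters to float64
print(tnet.W.dtype)                  # torch.float64
print(len(list(tnet.parameters())))  # 1, the parameter is still registered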
Answered By - C-3PO