Issue
Suppose I have a PyTorch autoencoder model defined as:
class ae(torch.nn.Module):
    def __init__(self, z_dim, n_channel=3, size_=8):
        super(ae, self).__init__()
        self.encoder = Encoder()
        self.decoder = Decoder()

    def forward(self, x):
        z = self.encoder(x)
        x_reconstructed = self.decoder(z)
        return z, x_reconstructed
Now, instead of defining a specific ae model and loading it, I can use the Encoder and Decoder code directly in my code. I know the total number of parameters wouldn't change, but here's my question: since these two models are now separated, is it possible that the code can run with lower RAM/GPU memory? Does separating them mean they do not need to be loaded into memory at once?
(Note that the autoencoder is just an example; my question is really about any model that consists of several sub-modules.)
Solution
Is it possible that the code can run with lower RAM/GPU memory?
The way you created it right now, no, it isn't. If you instantiate it and move it to the device, something along these lines:
autoencoder = ae(z_dim=...)  # builds Encoder() and Decoder() internally
autoencoder.to("cuda")
It will take, in total, encoder + decoder GPU memory when moved to the device, and both sub-modules will be loaded into memory at once.
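As a rough check (a minimal sketch, assuming ae, Encoder, and Decoder are defined as above and a CUDA device is available; z_dim=128 is an arbitrary example value), you can measure this with torch.cuda.memory_allocated():

import torch

before = torch.cuda.memory_allocated()
autoencoder = ae(z_dim=128).to("cuda")
after = torch.cuda.memory_allocated()
# Both sub-modules occupy GPU memory even before any forward pass
print(f"{(after - before) / 1e6:.1f} MB allocated for encoder + decoder together")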
But, instead, you could do this:
inputs = ...
inputs = inputs.to("cuda")

encoder = ...
encoder.to("cuda")
output = encoder(inputs)
encoder.to("cpu")  # free the GPU memory held by the encoder's weights

decoder = ...
decoder.to("cuda")  # only the decoder is resident now, so peak usage is lower
result = decoder(output)
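One caveat worth noting (a general PyTorch detail, not part of the original answer): moving a module to "cpu" releases its tensors back to PyTorch's caching allocator, but the memory is not necessarily returned to the driver. If you need it visible as free to other processes, call torch.cuda.empty_cache() afterwards:

import torch

encoder.to("cpu")          # weights leave the GPU
torch.cuda.empty_cache()   # hand cached, unused blocks back to the CUDA driver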
You could wrap this idea in a model (or a function); still, one would have to wait for parts of the network to be copied to the GPU, so performance will be worse (but peak GPU memory usage will be smaller).
Depending on where you instantiate the models, the RAM footprint could also be lower (Python automatically destroys objects that go out of function scope). Let's look at this option (no need to cast back to cpu, as the object will be garbage collected, as mentioned above):
def encode(inputs):
    encoder = ...
    encoder.to("cuda")
    results = encoder(inputs)
    return results  # encoder goes out of scope here and can be garbage collected

def decode(inputs):
    decoder = ...
    decoder.to("cuda")
    return decoder(inputs)

outputs = encode(inputs)
result = decode(outputs)
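Since the question is really about any model made of several sub-modules, the same idea generalizes; below is a hedged sketch with a hypothetical run_on_gpu helper (the helper name and the no_grad choice are assumptions, not part of the original answer):

import torch

def run_on_gpu(make_module, inputs):
    # Instantiate the sub-module inside the function so it is
    # garbage collected as soon as we return.
    module = make_module().to("cuda")
    with torch.no_grad():  # inference only; drop this if gradients are needed
        return module(inputs)

inputs = inputs.to("cuda")
z = run_on_gpu(Encoder, inputs)   # only the encoder is resident on the GPU
x_rec = run_on_gpu(Decoder, z)    # then only the decoder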
Answered By - Szymon Maszke