Issue
Suppose I have a PyTorch autoencoder model defined as:
class ae(torch.nn.Module):
    def __init__(self, z_dim, n_channel=3, size_=8):
        super(ae, self).__init__()
        self.encoder = Encoder()
        self.decoder = Decoder()

    def forward(self, x):
        z = self.encoder(x)
        x_reconstructed = self.decoder(z)
        return z, x_reconstructed
Now, instead of defining a specific ae model and loading it, I can use the Encoder and Decoder code directly in my code. I know the total number of parameters wouldn't change, but here's my question: since these two models are now separated, is it possible that the code can run with lower RAM/GPU memory? Does separating them mean they do not need to be loaded into memory at once?
(Note that the autoencoder is just an example; my question is really about any model that consists of several sub-modules.)
Solution
Is it possible that the code can run with lower RAM/GPU memory?
The way you created it right now, no, it isn't. If you instantiate it and move it to the device, something along these lines:
autoencoder = ae(z_dim=...)  # builds Encoder() and Decoder() internally
autoencoder.to("cuda")
It will take, in total, encoder + decoder GPU memory when moved to the device, and both sub-modules will be loaded into memory at once.
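As a rough check (a minimal sketch, assuming ae, Encoder, and Decoder are defined as above and a CUDA device is available; z_dim=128 is an arbitrary example value), you can measure this with torch.cuda.memory_allocated():

import torch

before = torch.cuda.memory_allocated()
autoencoder = ae(z_dim=128).to("cuda")
after = torch.cuda.memory_allocated()
# Both sub-modules occupy GPU memory even before any forward pass
print(f"{(after - before) / 1e6:.1f} MB allocated for encoder + decoder together")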
But, instead, you could do this:
inputs = ...
inputs = inputs.to("cuda")

encoder = ...
encoder.to("cuda")
output = encoder(inputs)
encoder.to("cpu")  # free the GPU memory held by the encoder's weights

decoder = ...
decoder.to("cuda")  # only the decoder is resident now, so peak usage is lower
result = decoder(output)
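One caveat worth noting (a general PyTorch detail, not part of the original answer): moving a module to "cpu" releases its tensors back to PyTorch's caching allocator, but the memory is not necessarily returned to the driver. If you need it visible as free to other processes, call torch.cuda.empty_cache() afterwards:

import torch

encoder.to("cpu")          # weights leave the GPU
torch.cuda.empty_cache()   # hand cached, unused blocks back to the CUDA driver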
You could wrap this idea in a model (or a function); still, one would have to wait for parts of the network to be copied to the GPU, so performance will be worse (but peak GPU memory usage will be smaller).
Depending on where you instantiate the models, the RAM footprint could also be lower (Python automatically destroys objects that go out of function scope). Let's look at this option (no need to cast back to cpu, as the object will be garbage collected, as mentioned above):
def encode(inputs):
    encoder = ...
    encoder.to("cuda")
    results = encoder(inputs)
    return results  # encoder goes out of scope here and can be garbage collected

def decode(inputs):
    decoder = ...
    decoder.to("cuda")
    return decoder(inputs)

outputs = encode(inputs)
result = decode(outputs)
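Since the question is really about any model made of several sub-modules, the same idea generalizes; below is a hedged sketch with a hypothetical run_on_gpu helper (the helper name and the no_grad choice are assumptions, not part of the original answer):

import torch

def run_on_gpu(make_module, inputs):
    # Instantiate the sub-module inside the function so it is
    # garbage collected as soon as we return.
    module = make_module().to("cuda")
    with torch.no_grad():  # inference only; drop this if gradients are needed
        return module(inputs)

inputs = inputs.to("cuda")
z = run_on_gpu(Encoder, inputs)   # only the encoder is resident on the GPU
x_rec = run_on_gpu(Decoder, z)    # then only the decoder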
Answered By - Szymon Maszke