Issue
I want to train a VGG model on a GPU because I have many images (137,099) and need training to go faster.
For this, I have a notebook, test.ipynb, open in VSCode. The GPU is on a SLURM cluster that I connect to over SSH with the VSCode Remote-SSH extension.
I am working in a conda environment, env2, with Python 3.7.12 and torch 1.8.1+cu101 (torch.version.cuda == 10.1).
In my first cell, I do
import torch
print(torch.cuda.is_available())
DEVICE = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
and I get
False
DEVICE = 'cpu'
It looks like the notebook cannot access the GPU, and training my VGG is indeed very slow.
However, if I run !nvidia-smi in the notebook, I can see the GPU (TITAN X Pascal).
Now, if I run the same code from a Python file, test.py, instead of the notebook test.ipynb (still in VSCode with env2), I get
torch.cuda.is_available() = True,
and the training gets much faster.
And if I run test.ipynb with JupyterLab, I also get
torch.cuda.is_available() = True,
So it looks like VSCode cannot access the GPU from the notebook (test.ipynb) but can from a Python file (test.py), even though I am using the same Python kernel (env2) for both files. The problem seems to come from VSCode, since the notebook works fine in JupyterLab.
Does anyone know where this comes from?
Remark:
import sys
print(sys.executable)
> /home/manon/.conda/envs/env2/bin/python
for both the test.py and test.ipynb files.
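Since sys.executable is identical in both cases, the difference probably lies elsewhere in the process environment. The post does not identify the cause, so the following is only a diagnostic sketch: run it both in the notebook and in test.py, then compare the output. The function name gpu_report and the choice of environment variables to inspect are assumptions made for illustration.

```python
import os
import socket
import sys


def gpu_report() -> dict:
    """Collect facts that commonly differ between two Python processes."""
    report = {
        "executable": sys.executable,
        "hostname": socket.gethostname(),
        # An empty or unexpected value here can hide GPUs from CUDA.
        "CUDA_VISIBLE_DEVICES": os.environ.get("CUDA_VISIBLE_DEVICES"),
        "LD_LIBRARY_PATH": os.environ.get("LD_LIBRARY_PATH"),
        # Set by SLURM inside a job allocation; None outside of one.
        "SLURM_JOB_ID": os.environ.get("SLURM_JOB_ID"),
    }
    try:
        import torch  # imported lazily so the report works without torch

        report["torch"] = torch.__version__
        report["torch_cuda_build"] = torch.version.cuda
        report["cuda_available"] = torch.cuda.is_available()
    except ImportError:
        report["torch"] = None
    return report


for key, value in gpu_report().items():
    print(f"{key}: {value}")
```

Any line that differs between the two runs is a candidate explanation for why one process sees the GPU and the other does not.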
Solution
I actually figured it out. I first had to create a tunnel so that I could run my notebook with a Jupyter kernel on a remote Jupyter server. I started that server with:
jupyter notebook --ip localhost --port 3001 --no-browser
This command gave me a URI:
http://localhost:3001/?token=8afee394ef093456
Then I selected a remote Jupyter server by clicking the "Jupyter Server: Local" button in the VSCode status bar (you can also use the "Jupyter: Specify local or remote Jupyter server for connections" command from the Command Palette)
and pasted the URI obtained previously into "Enter the URI of the running Jupyter server".
After this, everything worked fine.
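Once the notebook is connected to the remote server, it can help to make a missing GPU fail loudly instead of silently falling back to the CPU as in the first cell of the question. This is a minimal sketch, not part of the original answer; select_device and require_gpu are hypothetical names and are not part of torch.

```python
def select_device(cuda_available: bool, require_gpu: bool = True) -> str:
    """Return a torch device string, refusing a silent CPU fallback."""
    if cuda_available:
        return "cuda:0"
    if require_gpu:
        raise RuntimeError(
            "CUDA is not available in this kernel; check which Jupyter "
            "server the notebook is connected to."
        )
    return "cpu"
```

In the notebook, the first cell would then read DEVICE = torch.device(select_device(torch.cuda.is_available())), so a misconfigured kernel stops at the first cell rather than training slowly on the CPU.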
Answered By - manon chossegros