Issue
I am using CuPy to run CUDA code alongside PyTorch.
My environment is Ubuntu 20, Anaconda Python 3.7.6, NVIDIA driver 440, CUDA 10.2, cupy-cuda102, and torch 1.4.0.
First, I wrote a simple main script:
import data_load_test
from tqdm import tqdm
import torch
from torch.utils.data import DataLoader

def main():
    dataset = data_load_test.DataLoadTest()
    training_loader = DataLoader(dataset, batch_size=1)
    with torch.cuda.device(0):
        pbar = tqdm(training_loader)
        for epoch in range(3):
            for i, img in enumerate(pbar):
                print("see the message")

if __name__ == "__main__":
    main()
The data loader (data_load_test.py) looks like this:
from torch.utils.data import Dataset
import cv2
import cupy as cp

def read_cuda_file(cuda_path):
    f = open(cuda_path, 'r')
    source_line = ""
    while True:
        line = f.readline()
        if not line: break
        source_line = source_line + line
    f.close()
    return source_line

class DataLoadTest(Dataset):
    def __init__(self):
        source = read_cuda_file("cuda/cuda_code.cu")
        cuda_source = '''{}'''.format(source)
        module = cp.RawModule(code=cuda_source)
        self.myfunc = module.get_function('myfunc')
        self.input = cp.asarray(cv2.imread("hi.png",-1), cp.uint8)
        h, w, c = self.input.shape
        self.h = h
        self.w = w
        self.output = cp.zeros((w, h, 3), dtype=cp.uint8)
        self.block_size = (32, 32)
        self.grid_size = (h // self.block_size[1], w // self.block_size[0])

    def __len__(self):
        return 1

    def __getitem__(self, idx):
        self.myfunc(self.grid_size, self.block_size, (self.input, self.output, self.h, self.w))
        return cp.asnumpy(self.output)
And my CUDA code (cuda/cuda_code.cu) is:
#define PI 3.14159265358979323846f
extern "C"{
__global__ void myfunc(const unsigned char* refImg, unsigned char* warpImg, const long long cols, const long long rows)
{
    long long x = blockDim.x * blockIdx.x + threadIdx.x;
    long long y = blockDim.y * blockIdx.y + threadIdx.y;
    long long indexImg = y * cols + x;
    warpImg[indexImg * 3] = 0;
    warpImg[indexImg * 3 + 1] = 1;
    warpImg[indexImg * 3 + 2] = 2;
}
}
I have two GPUs: a TITAN V (device 0) and a TITAN RTX (device 1).
When I run this code on the TITAN V (third line of main()),
    with torch.cuda.device(0):
it works fine, but on the TITAN RTX,
    with torch.cuda.device(1):
it fails with this error:
File "cupy/core/raw.pyx", line 66, in cupy.core.raw.RawKernel.__call__
File "cupy/cuda/function.pyx", line 162, in cupy.cuda.function.Function.__call__
File "cupy/cuda/function.pyx", line 144, in cupy.cuda.function._launch
File "cupy/cuda/driver.pyx", line 293, in cupy.cuda.driver.launchKernel
File "cupy/cuda/driver.pyx", line 118, in cupy.cuda.driver.check_status
cupy.cuda.driver.CUDADriverError: CUDA_ERROR_CONTEXT_IS_DESTROYED: context is destroyed
Please help.
Solution
In main(), DataLoadTest() is instantiated while the default device 0 is still current, so CuPy compiles myfunc() there.
I assume the next line, "with torch.cuda.device(0):", is where you switch to device 1 in the version that fails?
What happens if you import cupy in the main script and call
    cupy.cuda.Device(1).use()
as the first line in main(), to make sure myfunc() gets created on device 1?
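
The suggestion above can be sketched as follows. This is not the answerer's code, only a minimal illustration under stated assumptions: `fill_rgb`, `KERNEL_SOURCE`, `blocks_for` and `fill_on_device` are hypothetical names, and the inline kernel is a stand-in for cuda/cuda_code.cu with a bounds check added. The idea is that the kernel compilation, the array allocation and the launch all happen while the same CuPy device is current. In the question's code, the compilation and the allocations happen in `__init__` on device 0, while the launch happens later inside the `torch.cuda.device(1)` block.

```python
# Stand-in kernel (an assumption, not the original cuda_code.cu): writes
# 0, 1, 2 into the three channels of every pixel, with a bounds check.
KERNEL_SOURCE = r'''
extern "C" __global__
void fill_rgb(unsigned char* out, const long long n_pixels)
{
    long long i = (long long)blockDim.x * blockIdx.x + threadIdx.x;
    if (i < n_pixels) {
        out[i * 3]     = 0;
        out[i * 3 + 1] = 1;
        out[i * 3 + 2] = 2;
    }
}
'''

THREADS_PER_BLOCK = 256


def blocks_for(n_items, threads=THREADS_PER_BLOCK):
    """Number of blocks needed to cover n_items (ceiling division)."""
    return (n_items + threads - 1) // threads


def fill_on_device(device_id, n_pixels):
    """Compile, allocate and launch on one GPU; return the result as NumPy."""
    import cupy as cp  # imported here so blocks_for() is usable without a GPU

    # Everything CuPy-related runs while device_id is the current device:
    # module compilation, output allocation and the kernel launch.
    with cp.cuda.Device(device_id):
        module = cp.RawModule(code=KERNEL_SOURCE)
        kernel = module.get_function("fill_rgb")
        out = cp.zeros((n_pixels, 3), dtype=cp.uint8)
        # A Python int argument is passed to the kernel as a 64-bit integer.
        kernel((blocks_for(n_pixels),), (THREADS_PER_BLOCK,), (out, n_pixels))
        return cp.asnumpy(out)
```

On a machine with a second GPU, `fill_on_device(1, 4)` would run the whole sequence on device 1. Applied to the question, the same pattern would mean either calling `cupy.cuda.Device(1).use()` before `DataLoadTest()` is constructed, or wrapping the body of `__init__` and `__getitem__` in `with cp.cuda.Device(1):`.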
Answered By - Stripedbass