Issue
I am trying to train a PyTorch model on Amazon SageMaker Studio.
It works when I train on an EC2 instance with:
estimator = PyTorch(entry_point='train_script.py',
                    role=role,
                    sagemaker_session=sess,
                    train_instance_count=1,
                    train_instance_type='ml.c5.xlarge',
                    framework_version='1.4.0',
                    source_dir='.',
                    git_config=git_config,
                    )
estimator.fit({'stockdata': data_path})
It also works in local mode in a classic SageMaker notebook instance (not Studio) with:
estimator = PyTorch(entry_point='train_script.py',
                    role=role,
                    train_instance_count=1,
                    train_instance_type='local',
                    framework_version='1.4.0',
                    source_dir='.',
                    git_config=git_config,
                    )
estimator.fit({'stockdata': data_path})
But when I run the same code (with train_instance_type='local') in SageMaker Studio, it does not work and I get the following error: No such file or directory: 'docker': 'docker'
I tried to install Docker with pip install, but the docker command is still not found when I run it in the terminal.
Solution
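This error indicates that the docker executable cannot be found. Local mode runs the training container on the notebook's own machine, so it needs a working Docker installation there.
By default, Docker is not installed in SageMaker Studio (a response on a GitHub ticket confirms this). Note that installing the docker package with pip only provides the Python client library for the Docker API, not the Docker engine or the docker command-line tool, which is why the command is still missing from the terminal.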
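One way to avoid hitting this error at fit() time is to check for the docker executable before choosing the instance type. The following is only a minimal sketch, not part of the SageMaker SDK: the helper names docker_available and choose_instance_type are hypothetical, and the fallback value 'ml.c5.xlarge' is simply taken from the question.

```python
import shutil


def docker_available(which=shutil.which):
    """Return True if a 'docker' executable is found on PATH.

    The 'which' parameter is a hypothetical hook so the lookup can be
    replaced in tests; by default it uses shutil.which from the stdlib.
    """
    return which("docker") is not None


def choose_instance_type(fallback="ml.c5.xlarge", which=shutil.which):
    """Pick 'local' only when Docker is present, else a managed instance type.

    The fallback default mirrors the instance type used in the question;
    it is an assumption, not a recommendation.
    """
    return "local" if docker_available(which) else fallback


if __name__ == "__main__":
    # In SageMaker Studio without Docker this prints the fallback type.
    print(choose_instance_type())
```

The returned value could then be passed as train_instance_type when building the estimator, so the same notebook runs in local mode where Docker exists and on a managed training instance where it does not.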
Answered By - Dawid Laszuk