How to access the GPU using Docker
Docker is a platform as a service (PaaS) tool that uses OS-level virtualization to deliver applications in packages called containers. Each container is isolated from the others and each of them bundles their own software, libraries, and configuration files.
Unlike virtual machines, containers share the services of a single operating system kernel. This reduces unnecessary overhead and makes them lightweight and portable. Docker containers can run on any computer running macOS, Windows, or Linux, either on-premises or in a public cloud environment, such as Scaleway.
All Scaleway GPU Instances come with prebuilt Docker images which can be launched as soon as you connect to your Instance. Each image provides a different AI environment. When you launch one of these images, you are in your chosen environment within seconds with all your favorite python packages already installed. Using Docker for your AI projects in this way allows you to ensure that your working environments are both isolated and portable, since they are in containers which can be easily transferred between machines.
You can also run Docker images provided by other sources and use them with your GPU Instance - for instance, you might want to use Docker images provided by Nvidia, Google, etc. Alternatively, you could also choose to build your own Docker images.
- Connect to your GPU Instance via SSH.
- Choose a Docker image from the containers shipped with your GPU Instance. See our additional content for more information about the available Docker images, including the specific commands to run each of them.
- Use the following command to pull/run the Docker container. Remember to replace
[OPTIONS]with any relevant options parameters, and
<IMAGE_NAME:TAG>with the image name and tag of your chosen image, as shown in the additional content linked above:docker run --rm -it [OPTIONS] <IMAGE_NAME:TAG> [COMMAND]
We recommend that you map volumes from your GPU Instance to your Docker containers, so that your work is not lost when you exit the Docker container. For increased data redundancy, we also recommend that the directories you map to your Docker containers are stored on Block Storage.
You can map directories from your GPU Instance’s local storage, to your Docker container, using the
-v <local_storage>:<container_mountpoint> flag. See the example command below:
docker run -it --rm -v /root/mydata/:/workspace nvidia/cuda:11.2.1-runtime-ubuntu20.04# use the `exit` command for exiting this docker container
In the above example, everything in the
/root/mydata directory on the Instance will be available in the Docker container, in a directory called
/workspace. Everything then created, deleted or modified in the
/workspace directory will be mapped back to
/root/mydata on the Instance, and remain there after the container is exited.
When you connect to your GPU Instance, you are probably connecting as the
root user. Once you then run and enter the Scaleway Docker container, the user
jovyan is used to run Jupyter Lab. You should therefore adjust the user access rights of your local folder on your GPU Instance OS so that files can be read and written from and to the container as required. This can be achieved by setting the ownership of the directory to map on your GPU Instance OS to the user/group
1000:100, as used by the user
jovyan in Jupyter Lab.
In the following example, we create a directory called
my-data, create a “Hello World” text file inside that directory, then use the
chown command to set appropriate ownership for the directory before running the Docker container and specifying the mapped directories. The “Hello World” file is then available inside the Docker container:
mkdir -p /root/my-data/echo "Hello World" > /root/my-data/hello.txtchown -R 1000:100 /root/my-datadocker run --runtime=nvidia --rm -it -p 8888:8888 -v /root/my-data/:/home/jovyan/ai/my-data rg.fr-par.scw.cloud/scw-ai/tensorflow:latest
You can also map Block Storage volumes into your containers. Block Storage is fully backed by SSDs. These three-time replicated, high-speed drives allow up to 5,000 IOPS. Once attached and mounted in the host OS of the GPU Instance, you can map the volume like a local volume, as we did above.
Block Storage volumes are independent of your GPU Instance and provide three-time replicated storage. It is recommended to use Block Storage for storing your datasets, training logs, model source code, etc.
Below is a list of the most common commands you use when dealing with Docker containers:
docker login <private-registry-url>
|This command is used to login to Docker’s default repository (Docker Hub) or any other private Docker registry
docker pull <docker_image>
|This command is used to pull images from the Docker Hub.
docker run -it -d <docker_image>
|This command is used to create and execute a container from an image.
|This command is used to list all running containers.
docker ps -a
|This command is used to display all running and exited containers.
docker exec -it <container_id> bash
|This command is used to access the
bash command prompt on a running container with the ID
docker stop <container_id>
|This command is used to stop a running container with the ID
<container_id>. This command allows the container to shut down gracefully.
docker kill <container_id>
|This command is used to “kill” a running container with the ID
<container_id>. This commands ends the execution of the container immediately.
docker build <path_to_Dockerfile>
|This command is used to build a new image from a specified Dockerfile.
docker commit <conatainer_id> <registry_user/docker_image>
|This command is used to create a new local image of an edited container.
docker tag <image_name:tag> <image_name:new_tag> or
docker tag <local-image-name:tag> <registry_url>/<image_name:tag>
|This command is used to tag a local image (necessary before pushing to a registry). An image can have multiple tags, like a specific version or “latest”.
docker push <registry_user/docker_image>
|This command is used to push a local image to a remote repository.
|This command is used to list all available Docker images on the local system.
docker rm <container_id>
|This command is used to remove a stopped container from the local system.
docker rmi <image_id>
|This command is used to delete an image from the local storage.
|This command is used to display information about the currently installed version of Docker.
For more information regarding the
docker run command, refer to the official documentation.
The default behavior of Docker when running a container using
docker run, is to not publish any internal ports of the container to the outside world. To access services on a container outside of Docker, you have to map the containers internal ports using the
|This flag maps TCP port 80 in the container to port 8080 on the Docker host.
|This flag maps TCP port 80 in the container to port 8080 on the Docker host for connections to host IP 192.168.1.100.
|This flag maps UDP port 80 in the container to port 8080 on the Docker host.
-p 8080:80/tcp -p 8080:80/udp
|This flag maps TCP port 80 in the container to TCP port 8080 on the Docker host, and map UDP port 80 in the container to UDP port 8080 on the Docker host.
You can access the GPU of your Instance from the inside of a Docker container using the preinstalled
With a “vanilla” Docker setup, you need to manually specify the Nvidia runtime when launching a Docker container. That way, you can use the GPU inside your container. Check the following example to get an idea of what it looks like:
docker run --runtime=nvidia -it --rm nvidia/cuda:11.2.1-runtime-ubuntu20.04 nvidia-smi
You can omit this option if using the “Ubuntu Focal GPU OS11” Operating System image with your GPU Instance (as this option is set by default in the Docker configuration files)
If your Instance has several GPUs, you can specify which GPU(s) to use with the container via the Docker CLI using either the
--gpus option (starting with Docker 19.03) or using the environment variable
The possible values of the
NVIDIA_VISIBLE_DEVICES variable are:
|a comma-separated list of GPU UUID(s) or index(es).
|all GPUs will be accessible. This is the default value.
|no GPUs will be accessible, but driver capabilities will be enabled.
void or empty or unset
|nvidia-container-runtime will have the same behavior as run (i.e. neither GPUs nor capabilities are exposed).
When using the
--gpus option to specify the GPUs, the device parameter should be used. This is shown in the examples below. The format of the device parameter should be encapsulated within single quotes, followed by double quotes for the devices you want enumerated to the container. For example:
'"device=2,3"' will enumerate GPUs 2 and 3 to the container.
When using the
NVIDIA_VISIBLE_DEVICES variable, you may need to set
nvidia, unless already set as default.
Starting a GPU enabled CUDA container (using
--gpus)docker run --runtime=nvidia -it --rm --gpus all nvidia/cuda:11.2.1-runtime-ubuntu20.04 nvidia-smi
Starting a GPU enabled container using
NVIDIA_VISIBLE_DEVICESand specify the nvidia runtimedocker run --runtime=nvidia -it --rm --e NVIDIA_VISIBLE_DEVICES=all nvidia/cuda:11.2.1-runtime-ubuntu20.04 nvidia-smi
Starting a GPU enabled Tensorflow container with a Jupyter notebook using
NVIDIA_VISIBLE_DEVICESand map the port
88888to access the web GUI:docker run --runtime=nvidia -e NVIDIA_VISIBLE_DEVICES=all -it --rm -p 8888:8888 tensorflow/tensorflow:latest-gpu-jupyter
Query the GPU UUID of the first GPU using nvidia-smi and then specifying that to the container:nvidia-smi -i 0 --query-gpu=uuid --format=csvuuidGPU-18a3e86f-4c0e-cd9f-59c3-55488c4b0c24docker run -it --rm --gpus device=GPU-b40b736a-9a07-9cf6-d9da-ed22a1ae16c5 nvidia/cuda:11.2.1-runtime-ubuntu20.04 nvidia-smi
GPU Instances are compatible with Scaleway Container Registry. You can create a Container Registry to store and pull your own Docker images.
See detailed information on how to pull your own custom Docker images on your GPU Instance.