Dockerize CUDA-Accelerated Applications
Before we start
This guide assumes the reader is already familiar with Docker, PyTorch, CUDA, etc. It does not explain how or why things work; instead, it describes how to get particular things done.
Abstract
Dockerizing applications has been the norm in the software industry for a while now. Almost everything runs in a container nowadays, and almost every developer knows how to build one. However, if your application requires GPU acceleration (e.g. AI/ML applications), containerizing it works slightly differently: you have to make sure your Docker container can harness the CUDA cores in your machine. In this post, we will see how to do that.
Prerequisites
You have Docker installed. Confirm this by running the following command and checking for similar output:
$ docker -v
Docker version 20.10.21, build baeda1f
You have NVIDIA GPU drivers installed and set up properly on your system. You can verify this by running:
$ nvidia-smi
Wed Feb 22 12:55:05 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.65 Driver Version: 527.56 CUDA Version: 12.0 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 NVIDIA GeForce ... On | 00000000:01:00.0 Off | N/A |
| N/A 39C P8 9W / 30W | 0MiB / 4096MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| 0 N/A N/A 21 G /Xwayland N/A |
| 0 N/A N/A 34 G /Xwayland N/A |
| 0 N/A N/A 45 G /Xwayland N/A |
+-----------------------------------------------------------------------------+
If either of the above commands does not produce the expected output, stop here and install the required software and drivers before continuing with the rest of this post.
You also need to check which CUDA version your system supports. For instance, nvidia-smi reports CUDA Version 12.0 on my machine, so I can use CUDA container images up to version 12.0. Pick a version that your target system supports and that meets your requirements.
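The version check above can be sketched in Python. This is a minimal illustration, not part of the original project: the function names are hypothetical, and it assumes the image tag starts with a `major.minor` CUDA version (as in `nvidia/cuda:11.3.0-base-ubuntu20.04`) and that nvidia-smi prints a `CUDA Version: X.Y` field as in the output shown earlier.

```python
import re
from typing import Tuple


def parse_driver_cuda_version(smi_output: str) -> Tuple[int, int]:
    """Extract the 'CUDA Version: X.Y' field from nvidia-smi output."""
    match = re.search(r"CUDA Version:\s*(\d+)\.(\d+)", smi_output)
    if match is None:
        raise ValueError("no 'CUDA Version' field found in nvidia-smi output")
    return int(match.group(1)), int(match.group(2))


def parse_image_cuda_version(image: str) -> Tuple[int, int]:
    """Extract major.minor from a tag such as 'nvidia/cuda:11.3.0-base-ubuntu20.04'."""
    tag = image.rsplit(":", 1)[-1]
    match = re.match(r"(\d+)\.(\d+)", tag)
    if match is None:
        raise ValueError(f"cannot read a CUDA version from tag {tag!r}")
    return int(match.group(1)), int(match.group(2))


def image_is_supported(smi_output: str, image: str) -> bool:
    """True if the image's CUDA version does not exceed what the driver reports."""
    return parse_image_cuda_version(image) <= parse_driver_cuda_version(smi_output)


# Header line copied from the nvidia-smi output in this post.
header = "| NVIDIA-SMI 525.65 Driver Version: 527.56 CUDA Version: 12.0 |"
print(image_is_supported(header, "nvidia/cuda:11.3.0-base-ubuntu20.04"))  # True
```

In practice the `smi_output` string would come from running nvidia-smi on the target machine; a hard-coded header is used here only to keep the sketch self-contained.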
Example
Let’s assume we have a simple project that uses PyTorch and CUDA which we want to dockerize. Our app is very straightforward and the project tree is as follows -
app
├── main.py
├── requirements.txt
└── Dockerfile
We will be going through each of the files for a better illustration.
main.py
import torch

# Report whether PyTorch can see a CUDA device inside the container.
if torch.cuda.is_available():
    print(f"Using CUDA. Version: {torch.version.cuda}")
else:
    print("CUDA is not available")
The main script simply imports the torch module and checks whether it can use CUDA.
requirements.txt
In this test app, we are going to use a specific version of torch 1.12.1 with CUDA 11.3. Hence, the requirements.txt file -
torch==1.12.1+cu113
You can pick any version as per your requirements from: https://pytorch.org/get-started/previous-versions/
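The `+cu113` suffix in the pin names the CUDA build of the wheel, and the same tag appears in the PyTorch wheel index URL used later in the Dockerfile. As a rough sketch (the helper names are hypothetical, and the pattern only handles pins of the exact form `torch==<version>+cu<digits>`), the index URL can be derived from the pin like this:

```python
import re

# Base URL taken from the pip command used in the Dockerfile of this post.
PYTORCH_WHEEL_BASE = "https://download.pytorch.org/whl"


def cuda_tag_from_pin(requirement: str) -> str:
    """Return the CUDA tag ('cu113') from a pin such as 'torch==1.12.1+cu113'."""
    match = re.fullmatch(r"\s*torch==[\w.]+\+(cu\d+)\s*", requirement)
    if match is None:
        raise ValueError(f"not a CUDA-tagged torch pin: {requirement!r}")
    return match.group(1)


def extra_index_url(requirement: str) -> str:
    """Build the --extra-index-url value that serves wheels for this CUDA tag."""
    return f"{PYTORCH_WHEEL_BASE}/{cuda_tag_from_pin(requirement)}"


print(extra_index_url("torch==1.12.1+cu113"))
# https://download.pytorch.org/whl/cu113
```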
Dockerfile
With the previous steps done, you are ready to write a Dockerfile. Add a new Dockerfile to your project directory:
FROM nvidia/cuda:11.3.0-base-ubuntu20.04
RUN apt-get update && \
    apt-get install --no-install-recommends -y python3-pip python3-dev ffmpeg libsm6 libxext6 gcc g++ && \
    rm -rf /var/lib/apt/lists/*
WORKDIR /app
COPY requirements.txt ./
RUN pip install -r requirements.txt --extra-index-url https://download.pytorch.org/whl/cu113 --verbose
COPY . ./
ENTRYPOINT ["python3", "main.py"]
Explanation
In the first layer, we use an official CUDA image by NVIDIA based on Ubuntu 20.04. When the container is started with GPU access, this image works with your machine's GPU driver and gives your application GPU acceleration. Note that we are using an image based on CUDA 11.3, matching the torch build pinned in our requirements file. Be careful when selecting the version: it is a good idea to keep it the same across the application.
In the second layer, we install a few commonly needed packages. Not all of them are necessary for this test application, but they are often needed in a production app.
Then, we set the working directory and copy the requirements.txt file into it. We could copy the whole project at this stage, but it is good practice to copy the requirements file first and install the requirements, because we don't want to re-install all those packages every time we change our source code. During development, the requirements file usually changes far less often than the source code, so the cached layers save a lot of time when rebuilding the image.
In the remaining layers, we install the requirements, copy the source code, and set the command that runs the application.
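Keeping the CUDA version consistent between the base image and the torch pin can be checked mechanically. The following is only a sketch under stated assumptions: the function names are hypothetical, and it assumes the last digit of a `cuNNN` tag is the minor version (so `cu113` means 11.3), which matches the tags used in this post.

```python
import re


def cuda_version_from_image(image: str) -> str:
    """'nvidia/cuda:11.3.0-base-ubuntu20.04' -> '11.3'."""
    tag = image.rsplit(":", 1)[-1]
    match = re.match(r"(\d+)\.(\d+)", tag)
    if match is None:
        raise ValueError(f"cannot read a CUDA version from tag {tag!r}")
    return f"{match.group(1)}.{match.group(2)}"


def cuda_version_from_torch_pin(requirement: str) -> str:
    """'torch==1.12.1+cu113' -> '11.3' (assumes the last digit is the minor version)."""
    match = re.search(r"\+cu(\d{2,})\s*$", requirement)
    if match is None:
        raise ValueError(f"no CUDA tag in requirement {requirement!r}")
    digits = match.group(1)
    return f"{digits[:-1]}.{digits[-1]}"


def versions_match(image: str, requirement: str) -> bool:
    """True if the base image and the torch wheel target the same CUDA major.minor."""
    return cuda_version_from_image(image) == cuda_version_from_torch_pin(requirement)


print(versions_match("nvidia/cuda:11.3.0-base-ubuntu20.04", "torch==1.12.1+cu113"))  # True
```

A check like this could run before the build, reading the two strings from the Dockerfile's FROM line and from requirements.txt.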
Build Image & Run Container
In the final step, we build the Docker image and run it in a container. To build the image, run the following command in the project directory -
$ docker build --rm -t image_name .
After the image is built, we can run the app through a container -
$ docker run --gpus all image_name
Using CUDA. Version: 11.3
This exposes all available GPUs to the container and lets it use them. If everything goes right, you should see the output shown on the second line above.
If you have multiple GPUs and want to expose only specific device(s) to your container, use the device parameter:
$ docker run --gpus device={DEVICE_ID} image_name
The above command exposes the specified GPU device to your container. To give a container access to multiple GPUs, list their IDs separated by commas. For instance, the following exposes the first and second available GPUs to the container:
$ docker run --gpus '"device=0,1"' image_name
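If the container is launched from a script rather than typed by hand, the same `--gpus` values can be assembled programmatically. The helper below is a hypothetical sketch, not part of the original project; it only builds the argument list and does not run Docker. Note that for a comma-separated device list the double quotes are part of the value itself, which is why the shell form above wraps them in single quotes.

```python
import shlex
from typing import List, Optional, Sequence


def docker_run_command(image: str, devices: Optional[Sequence[int]] = None) -> List[str]:
    """Build a 'docker run --gpus ...' argument list.

    devices=None exposes all GPUs; otherwise only the listed device IDs.
    """
    if devices is None:
        gpus = "all"
    elif len(devices) == 0:
        raise ValueError("devices must contain at least one GPU ID")
    elif len(devices) == 1:
        gpus = f"device={devices[0]}"
    else:
        ids = ",".join(str(d) for d in devices)
        # A comma-separated list must be wrapped in literal double quotes.
        gpus = f'"device={ids}"'
    return ["docker", "run", "--gpus", gpus, image]


# shlex.join renders the list as it would be typed in a shell.
print(shlex.join(docker_run_command("image_name", [0, 1])))
# docker run --gpus '"device=0,1"' image_name
```

The returned list can be passed directly to `subprocess.run`, which avoids a second layer of shell quoting.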