Running OpenGL & CUDA Applications with nvidia-docker2 on a Remote (headless) GPU System

Bugra Turan
Jan 3, 2020

This story tackles the problem of running OpenGL-accelerated GUI applications with hardware GPU support in a Docker container on a remote, headless server system (or in the cloud).

A possible use case for this is the popular Robot Operating System (ROS) together with its simulator Gazebo. However, all OpenGL applications are in scope (e.g. Unity, OpenAI's Gym).

Desktop environment within the docker container with glxgears running and output of glxinfo.

For a long time, multiple issues hindered the use cases mentioned above, for example:

  • X11 forwarding of OpenGL applications from Docker. Plain X11 forwarding works fine as long as you do not need HW acceleration. For more details please check: https://medium.com/@pigiuz/hw-accelerated-gui-apps-on-docker-7fd424fe813e
  • Needing GLX in Docker with an NVIDIA graphics card: https://github.com/NVIDIA/nvidia-docker/issues/534. “OpenGL is not supported at the moment and there is no plan to support GLX in the near future (same as 1.0). OpenGL+EGL however is on the roadmap and will be supported. We will update #11 once we publish it.” I found this pretty frustrating, since Gazebo disables certain functionality if no OpenGL is available.
  • Running Xorg with HW acceleration on a headless server system, with persistent settings that keep the NVIDIA driver's capabilities available.

Most of these issues were caused by gaps in my own knowledge. But now that working solutions have been found, I want to give the reader a guide that covers all of the use cases. If you want to read more about the background and the tutorials my guide is based on, please refer to the Sources section at the end.

Structure

At the end, we will have a Docker container running on a remote, headless server (or cloud, VM, etc.) with NVIDIA GPU acceleration and OpenGL capabilities. The container accesses the host's X11 server, but the GUI functionality is enabled by a combination of:

  • VirtualGL
  • TurboVNC
  • noVNC

With nvidia-docker2, NVIDIA released container images in 2018 based on the libglvnd library (the vendor-neutral OpenGL dispatch library), which allow for OpenGL support together with Docker. We therefore start from these images (often called cudagl).
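
As a sketch, the header of such a Dockerfile simply builds on one of these images. The tag below is only an example; pick the CUDA/Ubuntu combination you need:

FROM nvidia/cudagl:9.0-devel-ubuntu16.04
# Make sure the NVIDIA container runtime exposes OpenGL in addition to
# compute (the cudagl images typically set this already):
ENV NVIDIA_DRIVER_CAPABILITIES compute,utility,graphics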

To allow for multiple use cases, a simple Python script was written that generates the final Dockerfile. The following image shows the possibilities and the structure.

Flowchart for the Dockerfile-generation wizard.

The header part is always required. Here we select:

  • The Ubuntu base version (e.g. 16.04)
  • The CUDA version (e.g. 9.0)

  1. Afterwards, we are asked whether we want to include the cuDNN library, which is useful for Deep Learning applications.
  2. In the main part, all modules and packages required for the GUI abilities are installed (VirtualGL, TurboVNC, noVNC, Lubuntu desktop).
  3. Here we can add a ROS distribution if we want.
  4. Next, if we require Gazebo, it can be added here.
  5. Sometimes it is useful to install common Python packages for machine learning and development.
  6. For fast inference of neural networks, the popular TensorRT can be added here. Please note that the .deb needs to be present in the folder since it cannot be downloaded automatically (NVIDIA portal credentials are required).

In the tail part, the entrypoint, command, etc. of the Docker container are defined. The next section gives a step-by-step walkthrough of the project.
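
The tail is short; based on the description above it boils down to something like this sketch (our templates use a plain shell as CMD, and the desktop is started manually later):

# Tail of the generated Dockerfile: drop into a shell by default
CMD ["/bin/bash"]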

Step-by-Step Guide

Prerequisites:

  • Properly set-up host OS (Ubuntu in our case)
  • docker and nvidia-docker2
  • NVIDIA Driver (more on this in the following)

Before we can start, we need to check that the Xorg process on the remote host is running and using the dedicated NVIDIA hardware. For this, we look at the output of the command:

nvidia-smi

The output should look like this:

Output of nvidia-smi showing the HW-accelerated Xorg process.

The important part is the HW-accelerated Xorg process; this is essential for everything to work properly. If the process is missing, please follow my guide on GitHub to fix the problem.
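
For reference, a minimal sketch of the usual fix (the full guide has more detail): generate an xorg.conf that lets Xorg start without a connected monitor, then restart the display manager. The bus ID below is an example; use the one reported by nvidia-smi:

sudo nvidia-xconfig --allow-empty-initial-configuration \
    --use-display-device=None --busid=PCI:1:0:0
# Restart the display manager so Xorg picks up the new configuration:
sudo systemctl restart display-manager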

Now we can start the actual guide. All the required scripts and files live in the repository https://github.com/trn84/recipe-wizard.

1. Clone the repository:

git clone https://github.com/trn84/recipe-wizard

2. Execute the wizard:

python3 wizard.py

Follow the instructions of the script. Please be aware that for the cuDNN selection, the full version number is required.

3. Check the generated Dockerfile. Take a look into the build.sh file.

It is useful to use a meaningful image tag name.
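
build.sh is essentially a thin wrapper around docker build, roughly along these lines (the tag recipe-wiz is the one run.sh expects; the actual script may pass additional options):

# Build the generated Dockerfile and tag the resulting image:
docker build -t recipe-wiz .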

4. Create the docker image:

./build.sh

Now wait; this takes some time. The Dockerfile.XXX templates can be optimized. Right now, the repo is geared towards robotics and Deep Learning applications.

5. If everything went through without errors, take a look into the run.sh file. I will go through the arguments and parameters step by step:

nvidia-docker run -it \
--name=super-container \
--security-opt seccomp=unconfined \
--init \
--net=host \
--privileged=true \
--rm=true \
-e DISPLAY=:1 \
-v /tmp/.X11-unix/X0:/tmp/.X11-unix/X0:ro \
-v /etc/localtime:/etc/localtime:ro \
-v /share-docker:/share-docker \
recipe-wiz
  • We use nvidia-docker as the command. You can also use plain docker with the --runtime=nvidia flag for the same effect.
  • The security-opt and privileged flags allow for a close coupling of the container to the host OS. Potentially this is not required.
  • The flag --net=host allows for native port forwarding. Otherwise, you need to explicitly map the ports for VNC, ROS, Gazebo, etc. (see the sketch after this list).
  • The other flags are quite standard. We have added a shared volume mount. You should edit the flags to your needs.
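
If you do drop --net=host, you would publish the ports explicitly instead. A sketch with common default ports (TurboVNC usually serves display :1 on 5901 and noVNC on 6080; check your setup):

nvidia-docker run -it \
--name=super-container \
-p 5901:5901 \
-p 6080:6080 \
-e DISPLAY=:1 \
-v /tmp/.X11-unix/X0:/tmp/.X11-unix/X0:ro \
recipe-wiz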

6. Execute container creation and run:

./run.sh

This will give you a /bin/bash shell, since in our template the CMD was set to /bin/bash. You can change this to your needs. We can start the desktop inside the container with the following command:

/usr/local/bin/start_desktop.sh

You will see a link to the noVNC web server. Please note that the printed IP/hostname may be wrong; if so, replace it with the remote host's actual IP.
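
If you need a second shell in the running container, for example to watch logs while the desktop is up, the usual docker exec works (super-container is the name set in run.sh):

docker exec -it super-container /bin/bash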

7. Start noVNC and connect to the container.
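
If the remote host's ports are firewalled, an SSH tunnel from your workstation is a simple workaround. Here 6080 is noVNC's usual default port; use whatever port the start script printed, and user@remote-host is your login on the GPU server:

ssh -L 6080:localhost:6080 user@remote-host
# then browse to http://localhost:6080/vnc.html on your workstation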

After you are connected to the Lubuntu desktop, open Terminator from the desktop and check that the GPU features work properly by executing:

glxinfo
glxgears
gazebo
roscore
python -c "import tensorflow"
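
A quick way to verify that you really get HW rendering is to check the renderer string. The exact string depends on your GPU; llvmpipe would indicate a software fallback:

glxinfo | grep -i "opengl renderer"
# Expected: OpenGL renderer string: GeForce ... (your NVIDIA GPU)
# If it says llvmpipe, OpenGL has fallen back to software rendering.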

You should be good to go now :-)

Sources

This project is an aggregation and combination of many different sources.

Known Issues

Of course, there are issues with certain combinations of OS, CUDA, cuDNN, ROS, and Gazebo in the wizard. Two specific problem areas are the versions of libarmadilloX (where X is the major version) and libsdformatX; they depend on the chosen CUDA/OS combination and the ROS/Gazebo versions. So some manual work is required here.
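
Before patching the templates, it can help to check which versions apt would actually install inside the container. The package names below are illustrative; the major versions depend on your Ubuntu/ROS/Gazebo selection:

# Run inside the container to inspect the candidate versions:
apt-cache policy libarmadillo-dev libsdformat6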

I am happy to get feedback for improvements…
