
Run Ollama with GPU Acceleration in Docker on NixOS

I have yet to see any problem, however complicated, which, when you looked at it in the right way, did not become still more complicated, Poul Anderson

Computers are like Old Testament gods; lots of rules and no mercy, Joseph Campbell


Ollama is an open-source tool for running large language models locally, offering a chat experience similar to OpenAI’s ChatGPT without relying on cloud-based services. It makes it easy to get up and running with LLMs on your own hardware, which is appealing if you care about privacy or offline access. We are going to configure a NixOS environment for running Ollama with GPU support using Docker.

NixOS GPU Configuration (OpenGL, NVIDIA Drivers, DRI)

To leverage GPU acceleration for running Ollama in Docker containers on NixOS, you need to configure the system properly. Here’s a step-by-step breakdown of the settings in configuration.nix that enable graphics support and the proprietary NVIDIA drivers:

# Allow proprietary NVIDIA drivers (non-free license).
nixpkgs.config.allowUnfree = true;
# OpenGL Settings
hardware.opengl = {
    enable = true; # Activates OpenGL support, essential for graphical applications and GPU computations.
    # GPU compute: CUDA (a parallel computing platform and application programming interface (API) developed by NVIDIA) and OpenCL (Open Computing Language, an open standard for parallel programming across heterogeneous systems).
    driSupport = true; # Enables Direct Rendering Infrastructure (DRI) to allow direct access to the GPU.
    driSupport32Bit = true; # Provides support for 32-bit applications, which is important if you run legacy or specific software that requires it.
    # Needed for Steam/Proton/WINE compatibility
    setLdLibraryPath = true; # Ensures that the library path is set correctly for OpenGL drivers.
};

# NVIDIA GPU Configuration
hardware.nvidia = {
    modesetting.enable = true; # Required to load the NVIDIA kernel module.
    open = false; # Indicates that proprietary (closed-source) NVIDIA drivers should be used instead of the open-source alternatives.
    nvidiaSettings = true; # Enables the NVIDIA settings utility for managing GPU settings (not needed for CLI).
    package = config.boot.kernelPackages.nvidiaPackages.stable;
    # Specifies the version of the NVIDIA driver package to use.
  };

services.xserver.videoDrivers = [ "nvidia" ];
# Specifies that the NVIDIA driver should be used for the X server, enabling graphical applications to utilize the GPU.
# Required for X11-based containers (omit if pure Wayland)
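
After saving configuration.nix, rebuild and check that the proprietary driver is active. A minimal sanity check (note: on recent NixOS releases the hardware.opengl options have been renamed to hardware.graphics, so adjust the names above if the rebuild reports undefined options):

sudo nixos-rebuild switch   # Apply the new configuration (a reboot may be needed to load the kernel module)
nvidia-smi                  # Should report your GPU, the driver version, and the supported CUDA version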

Install Docker and NVIDIA Container Toolkit

Let’s enable Docker in NixOS (edit configuration.nix, then rebuild):

virtualisation.docker = {
  enable = true;              # Install Docker service
  enableOnBoot = true;
  package = pkgs.docker_25;   # (Optional) Pin Docker 25+, which added the CDI device support used by the NVIDIA toolkit
};
virtualisation.oci-containers.backend = "docker"; # Use Docker (not Podman) for declarative OCI containers
# Next enable the NVIDIA Container Toolkit so Docker can see the GPU.
hardware.nvidia-container-toolkit.enable = true;
users.users.nmaximo7.extraGroups = [ "docker" ];
# Allow running Docker without sudo (REPLACE 'nmaximo7' WITH YOUR USERNAME!)
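# Apply the configuration before testing (log out and back in, or use newgrp,
# so the new docker group membership takes effect):
sudo nixos-rebuild switch
newgrp docker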
# You can verify Docker by running a test container with GPU access (CUDA + GPU visibility), for example:
docker run --rm --device nvidia.com/gpu=all nvidia/cuda:12.9.0-base-ubuntu24.04 nvidia-smi
# Image not found
Unable to find image 'nvidia/cuda:12.9.0-base-ubuntu24.04' locally
# Docker starts downloading the image from the NVIDIA repository.
12.9.0-base-ubuntu24.04: Pulling from nvidia/cuda
0622fac788ed: Pull complete
2b2da1c48640: Pull complete
b2276ea4fcfd: Pull complete
2311d82dd6d8: Pull complete
1bba15468fcc: Pull complete
Digest: sha256:48e21b10467354655f5073c05eebdeaac9818c6b40d70f334f7ad2df000463d8
# Confirms that the image is now available locally.
Status: Downloaded newer image for nvidia/cuda:12.9.0-base-ubuntu24.04
Tue May 20 03:51:59 2025
# Displays the current status of the NVIDIA GPU, including driver and CUDA versions.
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.127.05             Driver Version: 550.127.05     CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce GTX 970         Off |   00000000:01:00.0  On |                  N/A |
| 32%   44C    P8             23W /  170W |     366MiB /   4096MiB |      3%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
+-----------------------------------------------------------------------------------------+


Here’s a breakdown of the Docker command:

  1. docker run: Runs a command in a new container.
  2. --rm: Automatically removes the container when it exits.
  3. --device nvidia.com/gpu=all: Grants the container access to all NVIDIA GPUs on the host (see the note after this list).
  4. nvidia/cuda:12.9.0-base-ubuntu24.04: Specifies the Docker image to use, which is the NVIDIA CUDA base image for Ubuntu 24.04.
  5. nvidia-smi: The command run inside the container, which queries the NVIDIA driver and GPU status.
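
Note: the --device nvidia.com/gpu=all syntax uses the Container Device Interface (CDI) specifications that the NixOS nvidia-container-toolkit module generates. On setups that rely on the classic NVIDIA container runtime instead, the GPU is usually requested with the --gpus flag; here is that alternative as a sketch, in case CDI is not enabled on your system:

docker run --rm --gpus all nvidia/cuda:12.9.0-base-ubuntu24.04 nvidia-smi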

Running the Ollama Container

Ollama provides an official Docker image that can use the NVIDIA GPU. We will run it with a named Docker volume for persistence and map the API port (11434) to the host.

# Pull the official ollama/ollama image.
docker pull ollama/ollama
# Then, run the container
docker run -d \
  --device nvidia.com/gpu=all \
  -v ollama:/root/.ollama \
  -p 11434:11434 \
  --name ollama \
  ollama/ollama

docker ps # Check if container is running
docker logs ollama # View container logs
docker exec ollama nvidia-smi # If everything is set up correctly, this will display the GTX 970 (my graphics card) and its utilization
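
You can also confirm that the Ollama API is reachable from the host. The /api/tags endpoint lists the models stored in the volume; on a fresh install it returns an empty list:

curl http://localhost:11434/api/tags
# Example response on a fresh volume: {"models":[]}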

Using Ollama via CLI and Loading Models

# Enter the ollama container to pull and run models.
# docker exec -it: executes a command in an interactive terminal inside the running container.
# The first "ollama" is the name of the running Docker container; the second invokes the ollama CLI inside it.
# run deepseek-r1:32b: tells the ollama CLI to download the model (if it is not already present) and start an interactive chat with it.
❯ docker exec -it ollama ollama run deepseek-r1:32b
pulling manifest
pulling 6150cb382311: 100% ▕███████▏  19 GB
pulling 369ca498f347: 100% ▕███████▏  387 B
pulling 6e4c38e1172f: 100% ▕███████▏ 1.1 KB
pulling f4d24e9138dd: 100% ▕███████▏  148 B
pulling c7f3ea903b50: 100% ▕███████▏  488 B
verifying sha256 digest
writing manifest
success
>>> Could you explain to me Ollama, and running it inside a Docker container

Okay, so I need to understand what Ollama is and how to run it
inside a Docker container. Let's start with the basics.

First, I know that Ollama is an AI tool, but I'm not exactly
sure what it does. The user mentioned it's for running ML
models locally, so maybe it helps in deploying machine learning
models on personal devices without relying on cloud services?
That makes sense because sometimes people want to keep their
data private or have faster access.
[...]

>>> /?
Available Commands:
  /set            Set session variables
  /show           Show model information
  /load <model>   Load a session or model
  /save <model>   Save your current session
  /clear          Clear session context
  /bye            Exit
  /?, /help       Help for a command
  /? shortcuts    Help for keyboard shortcuts

Use """ to begin a multi-line message.

>>> /bye

# Direct API call
curl http://localhost:11434/api/generate -d '{
  "model": "deepseek-r1:32b",
  "prompt": "Explain NixOS containers"
}'
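
By default, /api/generate streams the answer back as a sequence of JSON objects, one chunk per line. If you would rather get a single JSON object, the API accepts a stream parameter; the same request as a non-streaming sketch:

curl http://localhost:11434/api/generate -d '{
  "model": "deepseek-r1:32b",
  "prompt": "Explain NixOS containers",
  "stream": false
}'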

docker exec -it ollama ollama run deepseek-r1:671b # Pull and run a different model
docker exec -it ollama ollama rm deepseek-r1:671b
# Removes the specified model (deepseek-r1:671b) from Ollama's local model store inside the container, freeing disk space.

docker stop ollama # Stop the container
docker rm ollama # Remove the container
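
Note that stopping and removing the container does not delete the downloaded models, which live in the named volume created earlier. To reclaim that disk space as well (this wipes every model you have pulled):

docker volume rm ollama # Deletes the 'ollama' volume and all models stored in it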
