"I have yet to see any problem, however complicated, which, when you looked at it in the right way, did not become still more complicated." (Poul Anderson)
"Computers are like Old Testament gods; lots of rules and no mercy." (Joseph Campbell)
Ollama is an open-source tool for running large language models locally, offering a ChatGPT-like experience without relying on cloud-based services. It makes it easy to get up and running with models on your own hardware, which is attractive for users who want privacy or offline access. We are going to configure a NixOS environment for running Ollama with GPU support using Docker.
To leverage GPU acceleration for running Ollama in Docker containers on NixOS, you need to configure the system properly. Here’s a step-by-step breakdown of the settings in configuration.nix that enable graphics support and the proprietary NVIDIA drivers:
# Allow proprietary NVIDIA drivers (non-free license).
nixpkgs.config.allowUnfree = true;
# OpenGL Settings
hardware.opengl = {
enable = true; # Activates OpenGL support, essential for graphical applications and GPU computations.
# GPU compute: CUDA (a parallel computing platform and application programming interface (API) developed by NVIDIA) and OpenCL (Open Computing Language, an open standard for parallel programming across heterogeneous systems).
driSupport = true; # Enables Direct Rendering Infrastructure (DRI) to allow direct access to the GPU.
driSupport32Bit = true; # Provides support for 32-bit applications, which is important if you run legacy or specific software that requires it.
# Needed for Steam/Proton/WINE compatibility
setLdLibraryPath = true; # Ensures that the library path is set correctly for OpenGL drivers.
};
# NVIDIA GPU Configuration
hardware.nvidia = {
modesetting.enable = true; # Required to load the NVIDIA kernel module.
open = false; # Indicates that proprietary (closed-source) NVIDIA drivers should be used instead of the open-source alternatives.
nvidiaSettings = true; # Enables the NVIDIA settings utility for managing GPU settings (not needed for CLI).
package = config.boot.kernelPackages.nvidiaPackages.stable;
# Specifies the version of the NVIDIA driver package to use.
};
services.xserver.videoDrivers = [ "nvidia" ];
# Specifies that the NVIDIA driver should be used for the X server, enabling graphical applications to utilize the GPU.
# Required for X11-based containers (omit if pure Wayland)
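After editing configuration.nix you can apply the changes and confirm the driver loads; a reboot is sometimes needed for the kernel module to switch over. The commands below are a minimal sketch, assuming the standard /etc/nixos/configuration.nix setup. (Depending on your NixOS release, the rebuild may warn that the hardware.opengl options have been renamed under hardware.graphics; follow the warning if it appears.)
sudo nixos-rebuild switch # Apply the new configuration
lsmod | grep nvidia # Check that the proprietary kernel module is loaded
nvidia-smi # Should report the GPU, driver version and CUDA version on the host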
Next, let’s enable Docker in NixOS (edit configuration.nix again, then rebuild):
virtualisation.docker = {
enable = true; # Install Docker service
enableOnBoot = true;
package = pkgs.docker_25; # (Optional) Pin Docker 25+, which supports the CDI device syntax (--device nvidia.com/gpu=...) used below
};
virtualisation.oci-containers.backend = "docker"; # Use Docker as the backend for declaratively managed OCI containers
# Next enable the NVIDIA Container Toolkit so Docker can see the GPU.
hardware.nvidia-container-toolkit.enable = true;
users.users.nmaximo7.extraGroups = [ "docker" ];
# Allow running Docker without sudo (REPLACE 'nmaximo7' WITH YOUR USERNAME! Log out and back in for the group change to take effect.)
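Since we set virtualisation.oci-containers.backend above, you could alternatively declare the Ollama container itself in configuration.nix instead of starting it by hand with docker run later on. The snippet below is only a sketch using the NixOS oci-containers module; the container name, port and volume mirror the imperative commands further down, so use one approach or the other, not both.
# (Optional) Declarative alternative to the docker run commands below.
virtualisation.oci-containers.containers.ollama = {
  image = "ollama/ollama";
  autoStart = true;
  ports = [ "11434:11434" ]; # Expose the Ollama API on the host
  volumes = [ "ollama:/root/.ollama" ]; # Named volume for models and state
  extraOptions = [ "--device=nvidia.com/gpu=all" ]; # GPU passthrough via CDI
};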
# You can verify Docker by running a test container with GPU access (CUDA + GPU visibility), for example:
docker run --rm --device nvidia.com/gpu=all nvidia/cuda:12.9.0-base-ubuntu24.04 nvidia-smi
# Image not found
Unable to find image 'nvidia/cuda:12.9.0-base-ubuntu24.04' locally
# Docker starts downloading the image from the NVIDIA repository.
12.9.0-base-ubuntu24.04: Pulling from nvidia/cuda
0622fac788ed: Pull complete
2b2da1c48640: Pull complete
b2276ea4fcfd: Pull complete
2311d82dd6d8: Pull complete
1bba15468fcc: Pull complete
Digest: sha256:48e21b10467354655f5073c05eebdeaac9818c6b40d70f334f7ad2df000463d8
# Confirms that the image is now available locally.
Status: Downloaded newer image for nvidia/cuda:12.9.0-base-ubuntu24.04
Tue May 20 03:51:59 2025
# Displays the current status of the NVIDIA GPU, including driver and CUDA versions.
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.127.05 Driver Version: 550.127.05 CUDA Version: 12.4 |
|-----------------------------------------+------------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 NVIDIA GeForce GTX 970 Off | 00000000:01:00.0 On | N/A |
| 32% 44C P8 23W / 170W | 366MiB / 4096MiB | 3% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=========================================================================================|
+-----------------------------------------------------------------------------------------+
nmaximo7 on nixos ~ took 6s
❯
Here’s a breakdown of the Docker command:
docker run --rm: creates and starts a container, and removes it automatically when it exits.
--device nvidia.com/gpu=all: exposes all NVIDIA GPUs to the container through the Container Device Interface (CDI) provided by the NVIDIA Container Toolkit.
nvidia/cuda:12.9.0-base-ubuntu24.04: the official NVIDIA CUDA base image used for the test.
nvidia-smi: the command run inside the container; if it prints the GPU table above, the container can see the GPU.
Ollama provides an official Docker image that can use the NVIDIA GPU. We will run it with a named Docker volume for persistence and map the API port (11434) to the host.
# First, pull the official ollama/ollama image.
docker pull ollama/ollama
# Then, run the container
docker run -d \
--device nvidia.com/gpu=all \
-v ollama:/root/.ollama \
-p 11434:11434 \
--name ollama \
ollama/ollama
docker ps # Check if container is running
docker logs ollama # View container logs
docker exec ollama nvidia-smi # If everything is set up correctly, this will display the GTX 970 (my graphics card) and its utilization
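docker exec ollama ollama list # (Optional) List models stored in the ollama volume; empty on a fresh install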
# Enter the ollama container to pull and run models:
# docker exec -it: executes a command in an interactive terminal inside the running container.
# ollama (first occurrence): the name of the running Docker container in which the command will be executed.
# ollama run: Ollama's run command, which downloads the model if it is not already present and opens an interactive chat session.
# deepseek-r1:32b: the model to run (about 19 GB, as the pull output below shows).
❯ docker exec -it ollama ollama run deepseek-r1:32b
pulling manifest
pulling 6150cb382311: 100% ▕███████▏ 19 GB
pulling 369ca498f347: 100% ▕███████▏ 387 B
pulling 6e4c38e1172f: 100% ▕███████▏ 1.1 KB
pulling f4d24e9138dd: 100% ▕███████▏ 148 B
pulling c7f3ea903b50: 100% ▕███████▏ 488 B
verifying sha256 digest
writing manifest
success
>>> Could you explain to me Ollama, and running it inside a Docker co
... ntainer
Okay, so I need to understand what Ollama is and how to run it
inside a Docker container. Let's start with the basics.
First, I know that Ollama is an AI tool, but I'm not exactly
sure what it does. The user mentioned it's for running ML
models locally, so maybe it helps in deploying machine learning
models on personal devices without relying on cloud services?
That makes sense because sometimes people want to keep their
data private or have faster access.
[...]
>>> /?
Available Commands:
/set Set session variables
/show Show model information
/load Load a session or model
/save Save your current session
/clear Clear session context
/bye Exit
/?, /help Help for a command
/? shortcuts Help for keyboard shortcuts
Use """ to begin a multi-line message.
>>> /bye
# Ollama also listens on the mapped port (11434); you can call its HTTP API directly:
curl http://localhost:11434/api/generate -d '{
"model": "deepseek-r1:32b",
"prompt": "Explain NixOS containers"
}'
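By default /api/generate streams the answer back as a series of JSON objects. For a single JSON reply, disable streaming; you can also ask the server which models it has. Both requests below follow Ollama's documented HTTP API and reuse the model pulled earlier.
# Non-streaming request: returns one JSON object with the full response
curl http://localhost:11434/api/generate -d '{
"model": "deepseek-r1:32b",
"prompt": "Explain NixOS containers",
"stream": false
}'
# List the models known to the server
curl http://localhost:11434/api/tags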
docker exec -it ollama ollama run deepseek-r1:671b # Pull and run the full-size 671B model (a very large download, far beyond what a 4 GB GTX 970 can handle)
docker exec -it ollama ollama rm deepseek-r1:671b
# Removes the specified model (deepseek-r1:671b) from the ollama container, freeing the disk space it used in the ollama volume.
docker stop ollama # Stop the container
docker rm ollama # Remove the container
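Stopping and removing the container does not delete the named volume, so the downloaded models stay on disk. To reclaim that space as well, remove the volume too:
docker volume ls # The 'ollama' volume is still listed
docker volume rm ollama # Delete the volume and every model stored in it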