Kubernetes + NVIDIA on K3S
Goal: Set up a Kubernetes node to expose an NVIDIA GPU so that GPU workloads (AI, crypto, etc…) can run on Kubernetes.
Platform:
- Debian 12
- AMD64/x86_64
- NVIDIA RTX 3070
- Kubernetes (K3S)
What are we trying to do?
How do we do it?
K3S update/gotcha
You must run an up-to-date Kubernetes. K3S versions older than this blog post will error on an NVIDIA-enabled node:
executing \"compiled_template\" at <.SystemdCgroup>: can't evaluate field SystemdCgroup in type templates.ContainerdRuntimeConfig"
This is already fixed upstream: https://github.com/k3s-io/k3s/issues/8754
The workaround from the ticket is to install a specific commit:
curl -sfL https://get.k3s.io | INSTALL_K3S_COMMIT=1ae053d9447229daf8bbd2cd5adf89234e203bcc sh -s - --disable traefik --disable servicelb
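Not sure which version you are running? A quick check:
k3s --version
kubectl get nodes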
Zero to hero
Bare Metal
- Install the NVIDIA CUDA drivers
- Install the NVIDIA Container Toolkit
- Install/upgrade K3S
Reboot as needed. If the installation was successful,
sudo grep -i nvidia /var/lib/rancher/k3s/agent/etc/containerd/config.toml
will match lines: K3S detects the NVIDIA container runtime automatically and adds it to the containerd config it generates.
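On Debian 12 the whole sequence looks roughly like the sketch below. The repo setup and package names are assumptions based on NVIDIA's public install guides, so check the current CUDA and Container Toolkit documentation before copy/pasting:
# NVIDIA driver + CUDA via NVIDIA's network repo for Debian 12
wget https://developer.download.nvidia.com/compute/cuda/repos/debian12/x86_64/cuda-keyring_1.1-1_all.deb
sudo dpkg -i cuda-keyring_1.1-1_all.deb
sudo apt-get update && sudo apt-get install -y cuda-drivers

# NVIDIA Container Toolkit
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | \
  sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg
curl -sL https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
  sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
  sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
sudo apt-get update && sudo apt-get install -y nvidia-container-toolkit

# install/upgrade K3S (same flags as the workaround above)
curl -sfL https://get.k3s.io | sh -s - --disable traefik --disable servicelb

# reboot, check the driver is loaded, then run the grep above
nvidia-smi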
Kubernetes Node
To expand on the K3S notes above:
1. RuntimeClass
Deploy the RuntimeClass (from the K3s docs - essential! This selects the GPU-enabled container runtime that K3S sets up for us):
apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
  name: nvidia
handler: nvidia
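Assuming the manifest above is saved as runtimeclass.yaml (any filename works), apply and check it with:
kubectl apply -f runtimeclass.yaml
kubectl get runtimeclass nvidia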
2. Device Plugin
Deploy the NVIDIA device plugin via Helm and use the RuntimeClass configured above.
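If the nvdp chart repository isn't registered yet, add it first (repo URL as per the device plugin's README):
helm repo add nvdp https://nvidia.github.io/k8s-device-plugin
helm repo update
Then install/upgrade the chart: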
helm upgrade -i nvdp nvdp/nvidia-device-plugin \
  --namespace nvidia-device-plugin \
  --create-namespace \
  --version 0.14.2 \
  --set runtimeClassName=nvidia
# tricks to expose NVIDIA device nodes under /dev. Not normally needed:
# --set deviceListStrategy=volume-mounts
# --set compatWithCPUManager=true
The NVIDIA device plugin should now have registered the GPU as an allocatable nvidia.com/gpu resource on the node:
kubectl describe node | grep nvidia.com/gpu
3. GPU Feature discovery
Not all GPUs are created equal. This extra pod labels Kubernetes nodes with the GPU features they support. Deploy NVIDIA GPU Feature Discovery via Helm, again setting the RuntimeClass.
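As with the device plugin, register the nvgfd chart repository first (repo URL as per the gpu-feature-discovery README):
helm repo add nvgfd https://nvidia.github.io/gpu-feature-discovery
helm repo update
Then install the chart: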
helm upgrade -i nvgfd nvgfd/gpu-feature-discovery \
  --version 0.8.2 \
  --namespace gpu-feature-discovery \
  --create-namespace \
  --set runtimeClassName=nvidia
Test labelling by looking for a bunch of new nvidia.com labels on the node:
kubectl describe node | grep nvidia.com
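For a quick spot-check of a single label, nvidia.com/gpu.product is one of the labels GPU Feature Discovery applies (used here purely as an example):
kubectl get nodes -L nvidia.com/gpu.product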
4. Deploy the benchmark pod (sample workload from the K3s docs)
apiVersion: v1
kind: Pod
metadata:
  name: nbody-gpu-benchmark
  namespace: default
spec:
  restartPolicy: OnFailure
  runtimeClassName: nvidia
  containers:
  - name: cuda-container
    image: nvcr.io/nvidia/k8s/cuda-sample:nbody
    args: ["nbody", "-gpu", "-benchmark"]
    resources:
      limits:
        nvidia.com/gpu: 1
    env:
    - name: NVIDIA_VISIBLE_DEVICES
      value: all
    - name: NVIDIA_DRIVER_CAPABILITIES
      value: all
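Assuming the manifest is saved as nbody-benchmark.yaml:
kubectl apply -f nbody-benchmark.yaml
kubectl get pod nbody-gpu-benchmark --watch
kubectl logs nbody-gpu-benchmark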
If everything is working:
- The pod will run and go to state Completed
- Pod logs should look something like:
Run "nbody -benchmark [-numbodies=<numBodies>]" to measure performance.
-fullscreen (run n-body simulation in fullscreen mode)
-fp64 (use double precision floating point values for simulation)
-hostmem (stores simulation data in host memory)
-benchmark (run benchmark to measure performance)
-numbodies=<N> (number of bodies (>= 1) to run in simulation)
-device=<d> (where d=0,1,2.... for the CUDA device to use)
-numdevices=<i> (where i=(number of CUDA devices > 0) to use for simulation)
-compare (compares simulation results running once on the default GPU and once on the CPU)
-cpu (run n-body simulation on the CPU)
-tipsy=<file.bin> (load a tipsy model file for simulation)
NOTE: The CUDA Samples are not meant for performance measurements. Results may vary when GPU Boost is enabled.
> Windowed mode
> Simulation data stored in video memory
> Single precision floating point simulation
> 1 Devices used for simulation
GPU Device 0: "Ampere" with compute capability 8.6
> Compute 8.6 CUDA device: [NVIDIA GeForce RTX 3070]
47104 bodies, total time for 10 iterations: 38.352 ms
= 578.534 billion interactions per second
= 11570.683 single-precision GFLOP/s at 20 flops per interaction
Deploy GPU workloads
The previous step proved that CUDA pods are working, so now it's time to create your own CUDA containers. Some hints:
- The container needs to embed a CUDA runtime. The easiest way to do this is to use the cuda base image from NVIDIA
- GPU not working in the container/strange errors? The first step is to run nvidia-smi inside the container (see the sketch after the example deployment below). It should give the same output as on the node
- Strange errors running GPU workloads? Check you are using the “right” CUDA version, eg the NVIDIA base image matches what the app was compiled against
- Example Dockerfile (GPU mining)
- Example k8s deployment
---
apiVersion: apps/v1
kind: Deployment
metadata:
  namespace: crypto
  name: bzminer
spec:
  replicas: 1
  selector:
    matchLabels:
      app: bzminer
  template:
    metadata:
      name: bzminer
      labels:
        app: bzminer
    spec:
      hostname: bzminer
      runtimeClassName: nvidia
      containers:
      - name: bzminer
        image: quay.io/declarativesystems/cryptodaemons_bzminer:17.0.0
        imagePullPolicy: Always
        args:
        - "-a"
        - "meowcoin"
        - "-w"
        - "MGq7UPAASNwzTKWPKjrsrJxyDxpwdvdTr5"
        - "-r"
        - "cloud"
        - "-p"
        - "stratum+tcp://stratum.coinminerz.com:3323"
        - "--nc"
        - "1"
        env:
        - name: NVIDIA_VISIBLE_DEVICES
          value: all
        - name: NVIDIA_DRIVER_CAPABILITIES
          value: all
        resources:
          limits:
            nvidia.com/gpu: 1
      restartPolicy: Always
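To roll it out and sanity-check the GPU from inside the running container (the namespace and deployment names are the ones from the manifest above; the filename is an assumption):
kubectl apply -f bzminer.yaml
kubectl -n crypto get pods
kubectl -n crypto exec deploy/bzminer -- nvidia-smi
kubectl -n crypto logs deploy/bzminer --tail=20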
Improvements
These examples dedicate one GPU to one workload. It's possible to do time-sharing as well so that one GPU can be shared between a bunch of apps. This is left as an exercise for the reader ;-)