# GPU Passthrough in Proxmox LXC Containers: Plex and Ollama
GPU passthrough in LXC containers isn't as straightforward as it sounds. After plenty of trial and error, I've got two NVIDIA GPUs running in Proxmox LXC containers — a GTX 1060 6GB for Plex hardware transcoding and a GTX 1050 Ti 4GB for Ollama LLM inference. Here's everything I learned along the way.
## Why LXC Instead of VMs?
Proxmox supports both VMs and LXC containers. For GPU workloads, you might expect VMs with PCIe passthrough to be the obvious choice. But LXC containers offer some compelling advantages:
- Lower Overhead: No hypervisor layer between the GPU and the application
- Shared GPU: The host and container can share the same GPU simultaneously
- Simpler Management: LXC containers are lighter weight and faster to start
- Better Integration: Direct access to host kernel and drivers
The trade-off? You need privileged containers and careful cgroup configuration.
## The Hardware
| Host | GPU | VRAM | Driver | Kernel | Use Case |
|---|---|---|---|---|---|
| pve01 | GTX 1060 6GB | 6 GB | 580.126.09 | 6.14.11-5-pve | Plex transcoding |
| pve02 | GTX 1050 Ti 4GB | 4 GB | 550.163.01 | 6.14.11-5-pve | Ollama inference |
Both GPUs are consumer-grade cards — nothing exotic. The GTX 1060 handles Plex transcoding effortlessly, and the GTX 1050 Ti is enough for smaller LLM models with GPU acceleration.
## Step 1: Install NVIDIA Drivers on the Proxmox Host
Before containers can access the GPU, you need working NVIDIA drivers on the Proxmox host itself. I followed the standard approach of installing from NVIDIA's .run package rather than Debian packages, since Proxmox's kernel versions don't always match Debian's repos.
After installation, verify with `nvidia-smi`:

```
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 580.126.09    Driver Version: 580.126.09    CUDA Version: 13.0               |
|-----------------------------------------------------------------------------------------|
| GPU  Name ...                                                                           |
|   0  NVIDIA GeForce GTX 1060 6GB                                                        |
+-----------------------------------------------------------------------------------------+
```
### Driver Compatibility Warning
This is the gotcha that cost me hours: NVIDIA driver 550.x requires kernel 6.14.x. When Proxmox updated to kernel 6.17.x, the driver broke completely. I had to pin the kernel at 6.14.11-5-pve until NVIDIA released a compatible driver.
If you see errors about missing kernel modules after a Proxmox update, check your kernel version first. Rolling back to a compatible kernel is usually the fastest fix.
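On recent Proxmox releases, `proxmox-boot-tool` can pin a boot kernel directly, which is how I keep a known-good version in place. A minimal sketch — wrapped in a guard so it no-ops on a non-Proxmox machine; the version string is the one I pinned:

```shell
# pin_kernel <version>: pin the Proxmox boot kernel so a routine apt upgrade
# can't boot you into an incompatible one. Requires proxmox-boot-tool.
pin_kernel() {
    if command -v proxmox-boot-tool >/dev/null 2>&1; then
        proxmox-boot-tool kernel pin "$1"
    else
        echo "proxmox-boot-tool not found (not a Proxmox host?)" >&2
        return 1
    fi
}

# On the host:
# pin_kernel 6.14.11-5-pve
```

Run `proxmox-boot-tool kernel list` afterwards to confirm the pin took effect.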
## Step 2: Configure Privileged LXC Containers
GPU passthrough requires privileged containers. This is non-negotiable — unprivileged containers can't access GPU device nodes.
If you're converting an existing unprivileged container:
```shell
# In the Proxmox web UI or via CLI:
pct set <ctid> -unprivileged 0
```

### Container Configuration
The magic happens in `/etc/pve/lxc/<ctid>.conf`. Here's what I added for the Plex container (CT 113 on pve01):
```
# NVIDIA GPU passthrough - cgroup permissions
lxc.cgroup2.devices.allow: c 195:* rwm
lxc.cgroup2.devices.allow: c 234:* rwm
lxc.cgroup2.devices.allow: c 226:* rwm

# NVIDIA device mounts
lxc.mount.entry: /dev/nvidia0 dev/nvidia0 none bind,optional,create=file
lxc.mount.entry: /dev/nvidiactl dev/nvidiactl none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-uvm dev/nvidia-uvm none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-uvm-tools dev/nvidia-uvm-tools none bind,optional,create=file
lxc.mount.entry: /dev/dri dev/dri none bind,optional,create=dir

# NVIDIA binaries and libraries from host
lxc.mount.entry: /usr/bin/nvidia-smi usr/bin/nvidia-smi none bind,optional,create=file
lxc.mount.entry: /usr/lib/x86_64-linux-gnu/libcuda.so.580.126.09 usr/lib/x86_64-linux-gnu/libcuda.so.1 none bind,optional,create=file
lxc.mount.entry: /usr/lib/x86_64-linux-gnu/libnvidia-ml.so.580.126.09 usr/lib/x86_64-linux-gnu/libnvidia-ml.so.1 none bind,optional,create=file
```

Let me break down what each section does:
**cgroup2 device permissions** (`c 195:* rwm`, etc.): These grant the container read/write/mknod access to NVIDIA character devices. Major number 195 covers the core NVIDIA device nodes (`/dev/nvidia0`, `/dev/nvidiactl`), 234 is nvidia-uvm (unified virtual memory), and 226 is the DRI (Direct Rendering Infrastructure) subsystem.

**Device mounts**: Bind-mount the actual GPU device files from the host into the container. The `optional` flag prevents the container from failing to start if a device is temporarily unavailable.

**Library mounts**: Bind-mount the host's NVIDIA libraries directly into the container. This is critical — the container doesn't need its own NVIDIA driver installation. It shares the host's driver binaries and libraries. Note the version-specific paths (`580.126.09`) — these must match your installed driver version exactly.
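Because a host driver upgrade silently invalidates those versioned paths, a small check script is handy. This is my own sketch, not part of Proxmox: it flags any versioned `.so` path in a container config that doesn't match a given driver version string:

```shell
# check_driver_match <lxc-conf> <host-driver-version>
# Flags versioned NVIDIA library paths in the config that don't match the
# driver version actually installed on the host.
check_driver_match() {
    conf="$1"; host_ver="$2"
    # Pull out full x.y.z versions (skips compat symlink targets like .so.1)
    grep -Eo 'so\.[0-9]+\.[0-9]+\.[0-9]+' "$conf" | sed 's/^so\.//' | sort -u |
    while read -r v; do
        [ "$v" = "$host_ver" ] || echo "MISMATCH: config has $v, host driver is $host_ver"
    done
}

# On the host (nvidia-smi's query flags are standard):
# check_driver_match /etc/pve/lxc/113.conf \
#     "$(nvidia-smi --query-gpu=driver_version --format=csv,noheader)"
```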
## Step 3: Plex GPU Transcoding
With GPU access configured, enabling hardware transcoding in Plex is straightforward:
- Plex Settings → Transcoder
- Enable "Use hardware acceleration when available"
- Set Hardware transcoding device to the NVIDIA GPU
### Verifying Hardware Transcoding
Start a transcode (play something that requires transcoding) and verify GPU usage:
```shell
# Inside the Plex container
nvidia-smi
```

You should see a Plex process using GPU memory. During a 1080p → 480p transcode, I see about 67 MiB of GPU memory allocated.
For programmatic verification, query the Plex API:
```shell
curl -s -H 'X-Plex-Token: YOUR_TOKEN' \
  'http://localhost:32400/transcode/sessions'
```

Look for these indicators in the response:
- `transcodeHwRequested="1"` — Hardware was requested
- `transcodeHwEncodingTitle="Nvidia ()"` — NVIDIA encoder active
- `speed` > 2x — Hardware acceleration confirmed (software is typically 0.5–1x)
My test results: 1080p H.264 → 480p transcode at 2.6x realtime with NVENC encoding confirmed. Without the GPU, that same transcode crawls at 0.8x.
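That API check is easy to script for monitoring. A minimal sketch — the attribute name comes from the response above, and the token and URL are placeholders for your own setup:

```shell
# hw_transcode_active: reads Plex session XML on stdin; succeeds only if a
# hardware transcode was requested in at least one active session.
hw_transcode_active() {
    grep -q 'transcodeHwRequested="1"'
}

# Inside the Plex container (YOUR_TOKEN is a placeholder):
# curl -s -H 'X-Plex-Token: YOUR_TOKEN' \
#     'http://localhost:32400/transcode/sessions' | hw_transcode_active \
#     && echo "hardware transcode in progress"
```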
### Permission Fix After Privilege Change
If you converted from an unprivileged to privileged container, you'll likely need to fix file ownership:
```shell
pct exec 113 -- chown -R plex:plex /var/lib/plexmediaserver/
```

Without this, Plex can't read its own database and configuration files.
## Step 4: Ollama GPU Inference
Ollama (CT 114 on pve02) uses the same GPU passthrough approach but with different VRAM considerations. The GTX 1050 Ti has only 4 GB of VRAM, which severely limits model selection.
### The VRAM Problem
Not all models fit in 4 GB of VRAM. When a model exceeds available VRAM, Ollama falls back to CPU-only inference. This sounds harmless, but it's actually dangerous in my setup:
- pve02 has only 7.7 GB total RAM shared across docker02, Zabbix, Pi-hole, and other services
- CPU-mode models consume system RAM instead of VRAM
- Large models + existing workloads = memory exhaustion = system freeze
I learned this the hard way when llama3:8b (4.7 GB on disk, ~5.3 GB in RAM) caused pve02 to freeze solid. No SSH, no console — had to hard-reboot from the Proxmox UI.
### Model Selection Guide
| Model | Disk | RAM Loaded | GPU Offload | Safe? |
|---|---|---|---|---|
| `mistral:7b-instruct-q4_0` | 4.1 GB | ~5.5 GB | 75% GPU / 25% CPU | Yes (recommended) |
| `starcoder2:3b` | 1.7 GB | ~2 GB | 100% GPU | Yes |
| `llama3:8b` | 4.7 GB | ~5.3 GB | 0% GPU (100% CPU) | No — causes freezes |
Rules of thumb:
- Models with disk size ≤ 3.5 GB should fully offload to the 4 GB GPU
- `mistral:7b-instruct-q4_0` is the sweet spot — mostly GPU-accelerated with manageable CPU spillover
- Anything larger than ~4.5 GB on disk will run entirely on CPU and risk memory exhaustion
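Those rules of thumb can be wrapped in a small guard to run before pulling a new model. The thresholds below are just the cut-offs from the table converted to MiB — my heuristic for this 4 GB card, not anything Ollama enforces:

```shell
# safe_to_load <model-size-MiB>: classify a model against a 4 GB GPU using
# the disk-size rules of thumb above (3.5 GB full offload, ~4.5 GB partial).
safe_to_load() {
    if [ "$1" -le 3584 ]; then      # <= 3.5 GB: should fit entirely in VRAM
        echo "full GPU offload expected"
    elif [ "$1" -le 4608 ]; then    # <= 4.5 GB: partial offload, CPU spillover
        echo "partial GPU offload; watch system RAM"
    else                            # larger: CPU-only, memory-exhaustion risk
        echo "CPU-only; risk of memory exhaustion"
        return 1
    fi
}

# safe_to_load 4198   # mistral:7b-instruct-q4_0 at 4.1 GB on disk
```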
### Keep-Alive Configuration
To prevent the GPU from being occupied by idle models, I set `OLLAMA_KEEP_ALIVE=5m` in the systemd service. Models unload from VRAM after 5 minutes of inactivity, freeing resources for other workloads.
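For reference, here's what that looks like as a systemd drop-in. The exact unit path depends on how Ollama was installed, so treat this as a sketch:

```ini
# /etc/systemd/system/ollama.service.d/override.conf
[Service]
Environment="OLLAMA_KEEP_ALIVE=5m"
```

Apply it with `systemctl daemon-reload && systemctl restart ollama`.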
## Lessons Learned
### 1. Version-Lock Your Kernel
GPU drivers are tightly coupled to kernel versions. A routine `apt upgrade` can break your GPU setup if it pulls a new kernel. Pin your kernel version or test upgrades on a non-GPU node first.
### 2. Driver Versions Must Match Everywhere
The library paths in the LXC config (`libcuda.so.580.126.09`) are version-specific. When you update the host driver, you must update the container config to match. Forget this, and `nvidia-smi` inside the container will fail silently or report mismatched versions.
### 3. VRAM Is Your Hard Limit
With consumer GPUs, VRAM is the constraint that matters most. Plan your workloads around it. My 6 GB card handles Plex transcoding with room to spare, but the 4 GB card requires careful model selection for Ollama.
### 4. Privileged Containers Have Security Implications
Privileged LXC containers have broader access to the host system. Keep them on trusted networks and limit what runs inside them. Don't put untrusted workloads in a privileged GPU container.
### 5. Test After Every Host Update
Proxmox updates, kernel upgrades, and driver updates can all break GPU passthrough. After any system update on a GPU host, verify `nvidia-smi` works both on the host and inside each container before calling it done.
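I keep that post-update check as a small helper. `pct exec` and `nvidia-smi -L` (list GPUs) are standard commands; the CT IDs are the ones from this setup:

```shell
# check_gpu_everywhere <ctid>...: verify nvidia-smi answers on the host and
# inside each listed container, failing fast on the first broken node.
check_gpu_everywhere() {
    echo "== host ==" && nvidia-smi -L || return 1
    for ctid in "$@"; do
        echo "== CT $ctid ==" && pct exec "$ctid" -- nvidia-smi -L || return 1
    done
}

# After any update on a GPU host:
# check_gpu_everywhere 113 114
```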
## What's Next
The current setup handles my needs — Plex transcoding and small LLM inference — but I'm watching the market for affordable GPUs with more VRAM. An 8 GB or 12 GB card on pve02 would open up larger language models without the memory-exhaustion risk.
For now, the GTX 1050 Ti running Mistral 7B with partial GPU offload is a solid workaround. It powers the AI-driven Zabbix alert analysis workflow through n8n, providing meaningful analysis of infrastructure problems without breaking the bank.
For a broader view of how these GPUs fit into my overall infrastructure, check out Building My Homelab.