GPU Passthrough

Request GPUs through Quilt's explicit scheduling contract.

GPU support in Quilt is an explicit platform feature, not a mount workaround.

If a workload needs GPU capacity, ask for it through Quilt’s scheduling contract from the start. Retrofitting raw device access later is not the supported path.

Use GPUs When

  • creating containers that need NVIDIA GPU access
  • placing workloads on GPU-capable nodes
  • reporting node GPU inventory during registration or heartbeat
  • debugging why a workload did or did not receive GPU assignment

Example Container Request

{
  "name": "gpu-demo",
  "image": "prod",
  "gpu_count": 1,
  "gpu_ids": ["nvidia0"],
  "command": ["/bin/sh", "-lc", "nvidia-smi"]
}

Important Rules

  • raw /dev/nvidia* bind mounts remain blocked
  • gpu_count is the main request field
  • gpu_ids is optional and must exactly match gpu_count when supplied
  • invalid requests are rejected before an operation is created
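
The rules above can be mirrored as a client-side pre-check before the request is ever sent. This is a minimal sketch, not Quilt's actual validator (which runs server-side and is authoritative); the helper name and error messages are hypothetical.

```python
def validate_gpu_request(spec: dict) -> list[str]:
    """Hypothetical pre-check mirroring the rules above.

    Quilt rejects invalid requests before an operation is created;
    this sketch lets a client fail even earlier, locally.
    """
    errors = []
    count = spec.get("gpu_count")
    # gpu_count is the main request field
    if not isinstance(count, int) or count < 0:
        errors.append("gpu_count must be a non-negative integer")
    ids = spec.get("gpu_ids")
    # gpu_ids is optional, but must exactly match gpu_count when supplied
    if ids is not None:
        if not isinstance(ids, list):
            errors.append("gpu_ids must be a list when supplied")
        elif isinstance(count, int) and len(ids) != count:
            errors.append("gpu_ids length must exactly match gpu_count")
    return errors

# The example request above passes; a mismatched one does not.
ok = validate_gpu_request({"gpu_count": 1, "gpu_ids": ["nvidia0"]})   # []
bad = validate_gpu_request({"gpu_count": 2, "gpu_ids": ["nvidia0"]})  # one error
```

A pre-check like this does not replace the server-side rejection; it only shortens the feedback loop for obviously malformed payloads.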

Expected deterministic failures:

  • 400 invalid GPU request shape
  • 403 plan gating
  • 503 CAPACITY_FULL
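
Because the failures are deterministic, a caller can branch on the status code alone. A sketch, assuming the client can see the HTTP status; the handling labels and retry policy shown are illustrative, not mandated by Quilt.

```python
def classify_gpu_failure(status: int) -> str:
    """Illustrative handling of the deterministic failures listed above."""
    if status == 400:
        return "fix-request"    # invalid GPU request shape: correct the payload
    if status == 403:
        return "upgrade-plan"   # plan gating: retrying the same call will not help
    if status == 503:
        return "retry-later"    # CAPACITY_FULL: capacity may free up
    return "unknown"
```

The useful property is that only 503 is worth retrying unchanged; 400 and 403 require a different payload or a different plan.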

Cluster Tie-In

Node GPU inventory is agent-reported control-plane state and is exposed as gpu_inventory on node list and detail responses. Scheduler placement must satisfy that inventory before assigning workloads.
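
The placement constraint can be sketched as a filter over agent-reported inventory. A minimal illustration only: gpu_inventory comes from the doc, but the node shape, field names, and helper are assumptions.

```python
def gpu_capable_nodes(nodes: list[dict], gpu_count: int, gpu_ids=None) -> list[str]:
    """Return names of nodes whose reported gpu_inventory can satisfy
    the request, mirroring the constraint that placement must satisfy
    inventory before a workload is assigned. (Hypothetical helper.)
    """
    matches = []
    for node in nodes:
        inventory = node.get("gpu_inventory", [])  # agent-reported device ids
        if len(inventory) < gpu_count:
            continue  # not enough GPUs on this node
        if gpu_ids and not set(gpu_ids).issubset(inventory):
            continue  # a specifically requested device is missing
        matches.append(node["name"])
    return matches

nodes = [
    {"name": "node-a", "gpu_inventory": ["nvidia0", "nvidia1"]},
    {"name": "node-b", "gpu_inventory": []},
]
# gpu_capable_nodes(nodes, 1, ["nvidia0"]) keeps only node-a
```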

Rule of Thumb

Request GPUs declaratively in the create or scheduling payload. If you find yourself thinking about device mounts, you are probably leaving the supported Quilt path.