GPU Passthrough

Request GPUs through Quilt's explicit scheduling contract.

GPU support in Quilt is an explicit platform feature, not a mount workaround.

If a workload needs GPU capacity, ask for it through Quilt’s scheduling contract from the start. Retrofitting raw device access later is not the supported path.

Use GPUs When

  • creating containers that need NVIDIA GPU access
  • placing workloads on GPU-capable nodes
  • reporting node GPU inventory during registration or heartbeat
  • debugging why a workload did or did not receive GPU assignment

Example Container Request

{
  "name": "gpu-demo",
  "image": "prod",
  "gpu_count": 1,
  "gpu_ids": ["nvidia0"],
  "command": ["/bin/sh", "-lc", "nvidia-smi"]
}

Important Rules

  • raw /dev/nvidia* bind mounts remain blocked
  • gpu_count is the main request field
  • gpu_ids is optional and must exactly match gpu_count when supplied
  • invalid requests are rejected before an operation is created
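
The rules above can be mirrored as a client-side pre-check before the request is ever sent. This is a minimal sketch, not Quilt's actual validator (which runs server-side and is authoritative); the helper name and error messages are hypothetical.

```python
def validate_gpu_request(spec: dict) -> list[str]:
    """Hypothetical pre-check mirroring the rules above.

    Quilt rejects invalid requests before an operation is created;
    this sketch lets a client fail even earlier, locally.
    """
    errors = []
    count = spec.get("gpu_count")
    # gpu_count is the main request field
    if not isinstance(count, int) or count < 0:
        errors.append("gpu_count must be a non-negative integer")
    ids = spec.get("gpu_ids")
    # gpu_ids is optional, but must exactly match gpu_count when supplied
    if ids is not None:
        if not isinstance(ids, list):
            errors.append("gpu_ids must be a list when supplied")
        elif isinstance(count, int) and len(ids) != count:
            errors.append("gpu_ids length must exactly match gpu_count")
    return errors

# The example request above passes; a mismatched one does not.
ok = validate_gpu_request({"gpu_count": 1, "gpu_ids": ["nvidia0"]})   # []
bad = validate_gpu_request({"gpu_count": 2, "gpu_ids": ["nvidia0"]})  # one error
```

A pre-check like this does not replace the server-side rejection; it only shortens the feedback loop for obviously malformed payloads.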

Expected deterministic failures:

  • 400 invalid GPU request shape
  • 403 plan gating
  • 503 CAPACITY_FULL
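
Because the failures are deterministic, a caller can branch on the status code alone. A sketch, assuming the client can see the HTTP status; the handling labels and retry policy shown are illustrative, not mandated by Quilt.

```python
def classify_gpu_failure(status: int) -> str:
    """Illustrative handling of the deterministic failures listed above."""
    if status == 400:
        return "fix-request"    # invalid GPU request shape: correct the payload
    if status == 403:
        return "upgrade-plan"   # plan gating: retrying the same call will not help
    if status == 503:
        return "retry-later"    # CAPACITY_FULL: capacity may free up
    return "unknown"
```

The useful property is that only 503 is worth retrying unchanged; 400 and 403 require a different payload or a different plan.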

Cluster Tie-In

Node GPU inventory is agent-reported control-plane state and is exposed as gpu_inventory on node list and detail responses. Scheduler placement must satisfy that inventory before assigning workloads.
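
The placement constraint can be sketched as a filter over agent-reported inventory. A minimal illustration only: gpu_inventory comes from the doc, but the node shape, field names, and helper are assumptions.

```python
def gpu_capable_nodes(nodes: list[dict], gpu_count: int, gpu_ids=None) -> list[str]:
    """Return names of nodes whose reported gpu_inventory can satisfy
    the request, mirroring the constraint that placement must satisfy
    inventory before a workload is assigned. (Hypothetical helper.)
    """
    matches = []
    for node in nodes:
        inventory = node.get("gpu_inventory", [])  # agent-reported device ids
        if len(inventory) < gpu_count:
            continue  # not enough GPUs on this node
        if gpu_ids and not set(gpu_ids).issubset(inventory):
            continue  # a specifically requested device is missing
        matches.append(node["name"])
    return matches

nodes = [
    {"name": "node-a", "gpu_inventory": ["nvidia0", "nvidia1"]},
    {"name": "node-b", "gpu_inventory": []},
]
# gpu_capable_nodes(nodes, 1, ["nvidia0"]) keeps only node-a
```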

Rule of Thumb

Request GPUs declaratively in the create or scheduling payload. If you find yourself thinking about device mounts, you are probably leaving the supported Quilt path.