Topograph Node Labels and Annotations
Topograph enriches Kubernetes nodes with labels and annotations that describe their physical network topology. This reference covers every label and annotation key written by Topograph, how values are derived, and how to configure them.
Labels
Labels are set by the Kubernetes engine (engine: k8s) and the Slinky engine (engine: slinky). They are intended for use by workload schedulers (e.g. KAI Scheduler, gang-scheduling plugins, topology-aware bin-packers) and observability tools to reason about network locality.
Default label keys
Labels are additive: a node that belongs to both a block topology (NVLink domain) and a tree topology (switch fabric) carries both `accelerator` and `leaf`/`spine`/`core` simultaneously.
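For instance, a node that sits in both an NVLink domain and a three-tier switch fabric could carry all four keys at once (the identifier values below are invented for illustration):

```yaml
# Illustrative node label set — values are placeholders, not real identifiers
metadata:
  labels:
    network.topology.nvidia.com/accelerator: cluster-uuid-1.4   # block topology (NVLink domain)
    network.topology.nvidia.com/leaf: leaf-sw-017               # tree topology, first tier
    network.topology.nvidia.com/spine: spine-sw-03              # tree topology, second tier
    network.topology.nvidia.com/core: core-sw-01                # tree topology, third tier
```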
Not all providers produce both topology types:
Relationship to `nvidia.com/gpu.clique`: The GPU Operator device plugin sets `nvidia.com/gpu.clique` on nodes with Multi-Node NVLink (MNNVL) GPUs. The `infiniband-bm` and `infiniband-k8s` providers derive their `accelerator` value from the same `ClusterUUID.CliqueId` hardware identifiers, so the values are directly comparable. The `netq` provider uses a `DomainUUID` from the NMX management API — a different identifier that refers to the same physical domain but cannot be compared as a string.
NVIDIA Fabric Manager runs at node init on MNNVL-capable hardware, discovers the NVLink fabric across GPUs, and registers each GPU with NVML (NVIDIA Management Library — a C API that exposes per-GPU state). The GPU Operator’s IMEX labeler writes nvidia.com/gpu.clique only once NVML reports the node’s fabric state as GPU_FABRIC_STATE_COMPLETED — meaning Fabric Manager finished initialization successfully and the node is part of an NVLink domain.
On non-MNNVL systems (e.g., DGX B200, B300), the GPU fabric never reaches GPU_FABRIC_STATE_COMPLETED, so nvidia.com/gpu.clique is not set at all. On these systems, Topograph with an InfiniBand provider is the only source of network topology for scheduling decisions.
Choosing between `accelerator` and `nvidia.com/gpu.clique` for scheduling
Workload schedulers consuming topology labels may need to choose between Topograph’s network.topology.nvidia.com/accelerator and the NVIDIA GPU Operator’s nvidia.com/gpu.clique. The right choice depends on the provider and the desired granularity:
- MNNVL hardware + Fabric Manager completed + NVL Partition granularity desired: prefer `nvidia.com/gpu.clique`. On the AWS provider this is finer granularity than `accelerator` (which carries the CapacityBlockId, i.e., the NVL Domain). On the DRA, InfiniBand, and Lambda AI providers the two labels carry the same value.
- MNNVL but Fabric Manager not yet completed, or non-MNNVL hardware: `nvidia.com/gpu.clique` is absent. Use `network.topology.nvidia.com/accelerator`.
- Slurm clusters (no Kubernetes node labels): neither label applies. Consumers read Slurm's `topology.conf` directly.
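The decision rules above can be sketched as a small helper. This is an illustrative sketch, not Topograph or scheduler code; the function name is invented, and only the two label keys documented here are assumed:

```python
ACCELERATOR = "network.topology.nvidia.com/accelerator"
CLIQUE = "nvidia.com/gpu.clique"

def preferred_topology_key(node_labels):
    """Pick the finest-grained topology label available on a node.

    gpu.clique is only present once Fabric Manager has completed on
    MNNVL hardware, so its presence is the signal to prefer it;
    otherwise fall back to Topograph's accelerator label. Returns
    None when neither is present (e.g., a Slurm-only cluster).
    """
    if CLIQUE in node_labels:        # MNNVL, fabric state COMPLETED
        return CLIQUE
    if ACCELERATOR in node_labels:   # Topograph-labeled node
        return ACCELERATOR
    return None
```

A scheduler plugin would call this per node and group nodes by the value under the returned key.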
Caveats when preferring `nvidia.com/gpu.clique`:
- The label encodes node identity within MNNVL domains, not fabric proximity between them. NVL Partition is encoded as the full `<ClusterUUID>.<CliqueID>` value; NVL Domain is encoded as the `ClusterUUID` prefix. A scheduler can therefore distinguish racks — two nodes with different `ClusterUUID` values are in different NVL Domains — and act on that distinction (same-Domain affinity to pack a job onto a single rack, cross-Domain anti-affinity to spread independent jobs across racks). What the label does not encode is the physical proximity between Domains: `ClusterUUID`s are opaque identifiers, so the label cannot tell a scheduler which racks share a top-of-rack switch, an aggregation tier, or a core. For cross-rack proximity-aware placement, Topograph populates the following labels from the InfiniBand or NetQ providers regardless of whether `gpu.clique` is present:
  - Same top-of-rack switch (cross-rack within a first-tier fabric) — Topograph's `leaf` label.
  - Same second-tier aggregation (typically Scalable-Unit / pod-scale grouping above individual racks) — Topograph's `spine` label.
  - Same third-tier aggregation (present in large three-tier fabrics — typically cross-SU grouping in multi-SU SuperPOD deployments) — Topograph's `core` label.

  These labels are also relevant for mixed-workload fragmentation avoidance (see `docs/engines/k8s.md` § Mixed Workload Considerations).
- The label is refreshed by GPU Feature Discovery at its configured interval (the k8s-device-plugin default is 60s) rather than propagated instantly. Fabric-state changes in the window between refreshes are not yet reflected in the label.
- Persistence of `ClusterUUID`/`CliqueID` across node reboots is administratively controlled via Fabric Manager's `FABRIC_MODE_RESTART` configuration (default: preserve partition configurations). Deployments that disable preservation may see identifiers change across restarts, which can invalidate scheduler state cached on those values.
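Because NVL Domain is the `ClusterUUID` prefix of the full `<ClusterUUID>.<CliqueID>` value, same-Domain (same-rack) checks reduce to a prefix comparison. A minimal sketch; the function names are illustrative, and only the value format described above is assumed:

```python
def nvl_domain(clique_value):
    # "<ClusterUUID>.<CliqueID>" -> ClusterUUID, i.e. the NVL Domain (rack).
    # ClusterUUID contains no dots, so splitting on the first "." is safe.
    return clique_value.split(".", 1)[0]

def same_nvl_domain(a, b):
    # True when two nodes' gpu.clique values share a ClusterUUID,
    # meaning both nodes sit in the same NVL Domain.
    return nvl_domain(a) == nvl_domain(b)
```

Note that nothing finer than equality is possible: comparing two different `ClusterUUID`s yields no proximity information, which is exactly the gap the `leaf`/`spine`/`core` labels fill.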
Label value behavior
Label values are used as-is when they are 63 characters or shorter (the Kubernetes label value limit). Values longer than 63 characters are replaced with their FNV-64a hash rendered as an `x`-prefixed lowercase hex string (e.g., `x3e4f1a2b3c4d5e6f`) to stay within the limit. This means two nodes with the same long switch identifier will carry the same hash value — locality is preserved, but the original identifier is not recoverable from the label alone.
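The fallback can be reproduced in a few lines. This is a sketch of the behavior described above, not Topograph's source; the exact hex formatting (e.g., zero-padding to 16 digits) is an assumption:

```python
def label_value(raw):
    """Return raw unchanged if it fits the Kubernetes 63-character
    label value limit; otherwise return an x-prefixed lowercase hex
    FNV-64a hash of the value."""
    if len(raw) <= 63:
        return raw
    h = 0xCBF29CE484B21325                            # FNV-64 offset basis
    for byte in raw.encode("utf-8"):
        h ^= byte                                     # FNV-1a: XOR first,
        h = (h * 0x100000001B3) & 0xFFFFFFFFFFFFFFFF  # then multiply mod 2^64
    return "x" + format(h, "016x")                    # padding width assumed
```

Because the hash is deterministic, equality comparisons between nodes still work after truncation, even though the original identifier cannot be read back from the label.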
Configuring label keys
The default `network.topology.nvidia.com/` prefix is configurable via the Helm `topologyNodeLabels` value. If you need to map Topograph's topology layers to a custom label schema, override the keys at deploy time. The label values (topology identifiers) are always derived from the provider's topology discovery and cannot be configured.
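A deployment could remap the layer keys at install time along these lines. This is a hypothetical override: the `topologyNodeLabels` value name comes from the text above, but its exact internal shape is an assumption and should be checked against the chart's values file:

```yaml
# values-override.yaml — hypothetical shape; verify against the Helm chart
topologyNodeLabels:
  accelerator: example.com/topology-accelerator
  leaf: example.com/topology-leaf
  spine: example.com/topology-spine
  core: example.com/topology-core
```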
Relationship to upstream standardization (KEP-4962)
An active Kubernetes Enhancement Proposal (KEP), KEP-4962: Standardizing the Representation of Cluster Network Topology (draft in PR #4965), advocates reserved label keys under the `topology.kubernetes.io/` namespace for a standardized representation of cluster network topology. The KEP is pre-GA and still under upstream review. Topograph's current `network.topology.nvidia.com/*` keys predate any potential upstream standard and are presently vendor-scoped — the KEP's framing allows vendor prefixes and standard labels to coexist rather than replace one another. If KEP-4962 reaches GA with stable keys, Topograph will evaluate aligning or providing both; for now, the `network.topology.nvidia.com/*` keys remain authoritative for Topograph-deployed clusters.
Without Topograph
When Topograph is not deployed, the labels commonly available for topology-aware scheduling are:
These labels are set by cloud provider integrations and the NVIDIA GPU Operator’s GPU Feature Discovery (GFD) component — not by Topograph.
Annotations
Topograph sets the following annotations on nodes as internal bookkeeping metadata. These are not intended for scheduler use but may be useful for debugging and observability.
Additional annotations are set on topology ConfigMaps (used by the Slinky engine):
Integration with NVSentinel
NVSentinel’s Metadata Augmentor enriches health events with node labels from a configurable allowedLabels list. As of NVSentinel #1226 (merged 2026-04-23; shipping in the next NVSentinel release), the four network.topology.nvidia.com/* labels are included in the default allowedLabels — so on clusters where Topograph is deployed, NVSentinel propagates topology into health event metadata automatically, with no operator configuration required. Downstream consumers — fault-quarantine CEL rules, remediation custom resources, dashboards, blast-radius analysis — can then reason about topological locality at NVL Partition, NVL Domain, or switch-hierarchy level.
NVSentinel's Metadata Augmentor skips labels that are not present on a node, so nodes without Topograph labels (or without the MNNVL-only labels on non-MNNVL hardware) behave cleanly; no configuration conditionals are needed.
Operators on earlier NVSentinel versions, or those running a customized `allowedLabels` list, can add the Topograph labels explicitly in `distros/kubernetes/nvsentinel/values.yaml`:
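A sketch of such an addition follows. The four label keys are the ones documented above, but the placement of `allowedLabels` in the values file (here under a `metadataAugmentor` section) is an assumption; check the NVSentinel chart for the actual location:

```yaml
# Hypothetical fragment of distros/kubernetes/nvsentinel/values.yaml —
# the surrounding section name is assumed, the label keys are Topograph's.
metadataAugmentor:
  allowedLabels:
    - network.topology.nvidia.com/accelerator
    - network.topology.nvidia.com/leaf
    - network.topology.nvidia.com/spine
    - network.topology.nvidia.com/core
```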
See NVSentinel's `docs/INTEGRATIONS.md` § Topology Awareness (Topograph).