reproducible-deployment


NVIDIA's AI Cluster Runtime is an open-source project that provides validated, reproducible Kubernetes cluster configurations for GPU-accelerated AI workloads through layered recipes, CLI tooling, and validation mechanisms. It enables consistent deployment across different cloud environments and hardware by capturing exact component versions, dependencies, and configuration parameters.

AI Cluster Runtime · NVIDIA · Kubernetes · Amazon EKS · Kubeflow Trainer · NVIDIA Dynamo · NVIDIA GPU Operator · NCCL · CNCF Certified Kubernetes AI Conformance Program · H100 · Blackwell · ArgoCD · Mark Chmarny · Nathan Taber
developer.nvidia.com · mchmarny · 1 day ago