Engineering Architecture Brief
NVIDIA UFM® (Unified Fabric Manager) is a fabric management and monitoring platform for InfiniBand and high-performance Ethernet AI networks. It provides real-time visibility, telemetry, analytics, and automation for large-scale AI factories, HPC clusters, and cloud data center fabrics. With UFM, administrators can discover the fabric, monitor and optimize network performance, detect congestion hotspots, automate provisioning, and troubleshoot issues across thousands of nodes and interconnects. UFM integrates with NVIDIA Quantum InfiniBand switches and supports predictive analytics, fabric-wide orchestration, and AI-driven optimization for mission-critical workloads.
NVIDIA UFM® (Unified Fabric Manager)
Advanced InfiniBand and AI fabric management platform providing real-time monitoring, automation, and performance optimization for large-scale HPC and AI networks.
