News
Accelerate Token Production in AI Factories Using Unified Services and Real-Time AI
5+ hour, 26+ min ago (356+ words) Operations teams and administrators need more than dashboards. They need flexibility and foresight. In one example where NVIDIA had the MAX-Q profile in operation, the domain power service allowed the data center to run at 85% power with only 7% throughput loss. It was…...
NVIDIA Extreme Co-Design Delivers New MLPerf Inference Records
5+ hour, 18+ min ago (671+ words) Co-designed hardware, software, and models are key to delivering the highest AI factory throughput and lowest token cost. Measuring this goes far beyond peak chip specifications. Rigorous AI inference performance benchmarks are critical to understanding real-world token output, which drives…...
CUDA Tile Programming Now Available for BASIC!
4+ hour, 26+ min ago (949+ words) CUDA 13.1 introduced CUDA Tile, a next-generation tile-based GPU programming paradigm designed to make fine-grained parallelism more accessible and flexible. One of its key strengths is language openness: any programming language can target CUDA Tile, enabling developers to bring tile-based…...
Stream High-Fidelity Spatial Computing Content to Any Device with NVIDIA CloudXR 6.0
1+ day, 2+ hour ago (675+ words) At NVIDIA GTC 2026, NVIDIA CloudXR 6.0 introduced a universal OpenXR-based streaming runtime that works across headsets, operating systems, and browsers, including native visionOS integration. This post walks through how the CloudXR 6.0 architecture works and how to start building today. The release focuses…...
Build and Stream Browser-Based XR Experiences with NVIDIA CloudXR.js
1+ day, 2+ hour ago (1103+ words) Delivering high-fidelity VR and AR experiences to enterprise users has typically required native application development, custom device management, and complex deployment pipelines. Now, with the new JavaScript SDK NVIDIA CloudXR.js, developers can stream GPU-rendered immersive content directly to a…...
NVIDIA Grace CPU Delivers High Bandwidth and Efficiency for Modern Data Centers
3+ mon, 3+ week ago (372+ words) In this blog post, we'll explore the advantages of the Grace Non-Uniform Memory Access (NUMA) monolithic architecture. We'll dive into memory bandwidth per core, scalability, and efficiency, and compare its design approach to traditional x86 chiplet-based CPUs. Figure 1 below shows the NVIDIA…...
Maximize AI Infrastructure Throughput by Consolidating Underutilized GPU Workloads
1+ week, 3+ hour ago (706+ words) Solving this isn't just about cost reduction; it's about optimizing cluster density to serve more concurrent users on the same world-class hardware. This guide details how to implement and benchmark GPU partitioning strategies, specifically NVIDIA Multi-Instance GPU (MIG) and time-slicing to…...
How Centralized Radar Processing on NVIDIA DRIVE Enables Safer, Smarter Level 4 Autonomy
1+ week, 4+ hour ago (841+ words) The real 3D/4D "image" signal is instead processed inside the edge device. The radar outputs objects, or in some cases point clouds, which is similar to a camera outputting a classical CV Canny edge-detection image. In this blog, we explain how…...
Designing Protein Binders Using the Generative Model Proteina-Complexa
1+ week, 7+ hour ago (895+ words) To address these challenges, NVIDIA has released Proteina-Complexa, a generative model that designs de novo protein binders and enzymes. In this post, we detail the key technologies behind Proteina-Complexa, explore primary use cases, and highlight the extensive experimental validation of…...
Building NVIDIA Nemotron 3 Agents for Reasoning, Multimodal RAG, Voice, and Safety
1+ week, 1+ day ago (923+ words) Agentic AI is an ecosystem where specialized models work together to handle planning, reasoning, retrieval, and safety guardrailing. As these systems scale, developers need models that can understand real-world multimodal data, converse naturally with users globally, and operate safely across…...