NVIDIA GTC Keynote: Huang Unveils “AI Factories,” Blackwell-Rubin Roadmap, and $1T Demand View

NVIDIA (NASDAQ:NVDA) founder and CEO Jensen Huang used his GTC keynote to frame the company as a “platform company” built around three core offerings: its CUDA-X software stack, NVIDIA systems, and a newer concept he described as “AI factories.” Huang repeatedly returned to a central theme: ecosystem breadth, with deep vertical integration in NVIDIA’s own stack paired with open, horizontal integration across clouds, OEMs, and enterprise platforms.

CUDA at 20 years and the “flywheel” of installed base

Huang said GTC marked the 20th anniversary of CUDA, which he described as foundational to NVIDIA’s strategy and installed base. He argued that CUDA’s growth created a reinforcing cycle: a large installed base attracts developers, developers create new algorithms and breakthroughs, those breakthroughs open new markets, and markets expand the installed base.

He also emphasized that NVIDIA supports “every single phase of the AI life cycle,” and suggested that the breadth of CUDA-enabled applications extends the useful life of deployed GPUs. As one example, he said cloud pricing for Ampere-based instances has been rising even years after launch, attributing it to continued software optimization and broad application support.

Accelerating structured and unstructured data with cuDF and cuVS

Huang highlighted what he called the “ground truth” role of structured data in enterprise computing and argued that AI systems and agents will increasingly rely on both structured databases and unstructured data sources such as PDFs, video, and audio. NVIDIA, he said, built two “foundational libraries” for this shift:

  • cuDF for accelerating data frames and structured data processing
  • cuVS for vector search and semantic access to unstructured data
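
The division of labor between the two libraries can be illustrated with a CPU-only sketch, using NumPy as a stand-in for the GPU-accelerated kernels. The column values and document vectors below are invented for illustration; this is not the cuDF or cuVS API.

```python
import numpy as np

# Structured side (the kind of work cuDF accelerates): aggregate
# order totals per customer. All values are invented placeholders.
customer_ids = np.array([0, 1, 0, 2, 1])
order_totals = np.array([100.0, 250.0, 50.0, 75.0, 25.0])
per_customer = np.bincount(customer_ids, weights=order_totals)

# Unstructured side (the kind of work cuVS accelerates): top-k
# nearest-neighbor search over document embeddings by cosine similarity.
rng = np.random.default_rng(0)
doc_vecs = rng.normal(size=(1000, 64))
doc_vecs /= np.linalg.norm(doc_vecs, axis=1, keepdims=True)
query = doc_vecs[42] + 0.01 * rng.normal(size=64)   # a slightly perturbed copy of doc 42
query /= np.linalg.norm(query)
top3 = np.argsort(doc_vecs @ query)[::-1][:3]       # indices of the 3 most similar docs

print(per_customer)   # per-customer order totals
print(top3)           # doc 42 should rank first
```

In the accelerated versions, the same two access patterns (grouped aggregation over tables, approximate nearest-neighbor search over embeddings) run on GPU memory bandwidth rather than CPU cores, which is where the claimed speedups come from.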

He announced several examples of adoption and integrations discussed during the keynote:

  • IBM is accelerating watsonx.data SQL engines with NVIDIA GPU computing libraries. In a case study cited in the presentation, IBM said Nestlé ran a specific “order to cash” data mart refresh workload five times faster at 83% lower cost using accelerated watsonx.data on NVIDIA GPUs compared with CPUs.
  • Dell worked with NVIDIA on the “Dell AI Data Platform,” integrating cuDF and cuVS, with an example referencing work with NTT DATA.
  • Google Cloud work included accelerating Vertex AI and BigQuery; Huang cited an example with Snapchat that he said reduced computing cost by nearly 80%.

Huang argued that with “Moore’s Law” slowing, accelerated computing and ongoing algorithm optimization are key to improving performance and lowering costs.

Cloud partnerships, confidential computing, and “token factories”

Huang described NVIDIA’s role with major cloud providers as deeply integrated across services, while also acting as a conduit for customers and developers to land workloads in those clouds. He cited work across Google Cloud, AWS, Microsoft Azure, Oracle, and AI-native cloud providers such as CoreWeave.

He also pointed to confidential computing as a capability he said NVIDIA GPUs pioneered, emphasizing use cases where operators should not be able to view customer data or models.

A major portion of the keynote focused on what Huang called the “inference inflection”—the shift from AI being primarily training-driven to inference-driven as models reason, plan, and perform tasks. He said growing agentic use cases increase both token volume and compute requirements dramatically, describing a surge in demand he characterized as orders of magnitude higher over the past two years.
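
The mechanics behind that claim are multiplicative: an agent that reasons and calls tools consumes many times the tokens of a single-shot answer. A back-of-envelope sketch, where every multiplier is an invented placeholder rather than a figure from the keynote:

```python
# Why agentic use compounds token demand. All numbers are invented
# placeholders for illustration, not NVIDIA or keynote figures.
base_tokens_per_query = 1_000   # a single-shot chat answer
reasoning_multiplier = 10       # chain-of-thought / planning steps
tool_call_steps = 5             # each step re-reads context and generates again

tokens_per_agentic_query = (
    base_tokens_per_query * reasoning_multiplier * tool_call_steps
)
print(tokens_per_agentic_query)  # 50_000, i.e. 50x a single-shot answer
```

Two or three such multipliers stacking over successive model generations is how per-query demand reaches the “orders of magnitude” growth Huang described.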

Huang said that at last year’s GTC he saw roughly $500 billion of “very high confidence demand and purchase orders” for Blackwell and Rubin through 2026. He added that, as of this keynote, he now sees through 2027 “at least $1 trillion,” while cautioning he believes demand could be higher.

Roadmap: Blackwell, Vera Rubin, and a Groq integration

Huang detailed NVIDIA’s evolution from individual GPUs to rack-scale systems, emphasizing NVLink-based scale-up designs and tighter hardware-software co-design. He cited an inference analysis he attributed to SemiAnalysis, focusing on “tokens per watt” and interactivity as constraints for AI factories that are power-limited by design.
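
For a power-limited facility, that framing reduces to simple arithmetic: token throughput is capped by the site power budget times efficiency (tokens per second per watt, i.e. tokens per joule). A back-of-envelope sketch with invented numbers:

```python
# Back-of-envelope for a power-limited AI factory. Every number here
# is an invented placeholder, not an NVIDIA or SemiAnalysis figure.
site_power_watts = 100e6   # a 100 MW facility
tokens_per_joule = 5.0     # efficiency: tokens/second per watt of site power

tokens_per_second = site_power_watts * tokens_per_joule
tokens_per_day = tokens_per_second * 86_400

print(f"{tokens_per_day:.2e} tokens/day")
```

Under this framing, doubling tokens per joule doubles the factory’s revenue-generating output at the same power bill, which is why efficiency rather than peak FLOPS becomes the headline metric.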

He discussed NVIDIA’s transition to new systems architecture, including:

  • Grace Blackwell NVLink 72, which he said involved major re-architecture and new capabilities such as NVFP4 and the Dynamo software layer
  • Vera Rubin, which he described as a full-system platform designed for agentic workloads, including heavy memory, storage, and tool-use demands
  • New rack and infrastructure elements including co-packaged optics in the Spectrum-X switch and liquid cooling, which he said reduces installation time and improves data center efficiency

Huang also described an integration with Groq technology for what he positioned as ultra-low-latency token generation. He said NVIDIA “acquired the team that worked on the Groq chips and licensed the technology,” and outlined a “disaggregated inference” approach using Dynamo to split workloads between Vera Rubin and Groq hardware. He said the Groq LPX system is in production and indicated shipments in the second half of the year, “probably about Q3 timeframe.”
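
Dynamo’s internals were not shown, but the “disaggregated inference” idea — routing the compute-heavy prefill phase and the latency-sensitive decode phase to different hardware pools — can be sketched as follows. The class and pool names are hypothetical, not Dynamo’s API.

```python
from dataclasses import dataclass, field

@dataclass
class Pool:
    """A toy hardware pool. Names are hypothetical, not Dynamo's API."""
    name: str
    assigned: list = field(default_factory=list)

    def submit(self, request_id: str, phase: str) -> str:
        self.assigned.append((request_id, phase))
        return f"{request_id}:{phase}->{self.name}"

def route(request_id: str, prefill_pool: Pool, decode_pool: Pool) -> list:
    # Prefill (ingesting the prompt) is throughput-bound; decode
    # (emitting tokens one at a time) is latency-bound. A disaggregated
    # scheduler sends each phase to the pool best suited for it.
    return [
        prefill_pool.submit(request_id, "prefill"),
        decode_pool.submit(request_id, "decode"),
    ]

rubin = Pool("rubin")        # stand-in for the throughput-oriented system
groq_lpx = Pool("groq_lpx")  # stand-in for the low-latency system
plan = route("req-1", rubin, groq_lpx)
print(plan)
```

The payoff of the split is that each pool can be provisioned and batched independently: large batches on the prefill side, small latency-optimized batches on the decode side.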

He presented NVIDIA’s forward roadmap including Rubin Ultra and the next platform “Feynman,” referencing continued annual architectural updates and scaling via both copper and optical technologies.

DSX for AI factory design, OpenClaw and enterprise security, and robotics partnerships

Huang introduced NVIDIA DSX as an “AI factory platform” built on Omniverse digital twins to design and operate data centers for maximum token throughput and energy efficiency. He described components including simulation, operational data exchange, power management with the grid, and dynamic optimization (Max-Q), alongside an ecosystem of partners and tools referenced in a supporting video segment.

He also described a new open-source agentic framework called OpenClaw, calling it a major industry development and comparing its importance to earlier software standards. Huang said NVIDIA is supporting OpenClaw and worked with its creator to make it “enterprise secure and enterprise private capable,” referencing a set of additions he called OpenShell and a reference design called NemoClaw, including network guardrails and privacy routing for corporate environments.

On models, Huang highlighted NVIDIA’s “Open Model Initiative,” describing families including Nemotron, Cosmos, GR00T, BioNeMo, Earth-2, and Alpamayo. He also announced a Nemotron Coalition with partners he listed including Black Forest Labs, Cursor, LangChain, Mistral, Perplexity, Reflection, Sarvam, and Thinking Machines Lab, aimed at advancing Nemotron 4.

Finally, Huang outlined NVIDIA’s physical AI and robotics efforts, describing a stack spanning training, simulation/synthetic data generation, and onboard compute. He announced four new automotive partners for NVIDIA’s “robotaxi-ready platform”—BYD, Hyundai, Nissan, and Geely—and also described a partnership with Uber to deploy and connect robotaxi-ready vehicles into its network across multiple cities. He also cited robotics partners including ABB, Universal Robots, and KUKA, and referenced ongoing work in AI-RAN with telecom partners such as Nokia and T-Mobile.

Huang closed by reiterating that AI factories, agentic systems, and physical AI are converging into what he framed as a new computing platform shift, with NVIDIA positioned to deliver chips, systems, software libraries, models, and factory-scale design tools through a broad ecosystem.

About NVIDIA (NASDAQ:NVDA)

NVIDIA Corporation, founded in 1993 and headquartered in Santa Clara, California, is a global technology company that designs and develops graphics processing units (GPUs) and system-on-chip (SoC) technologies. Co-founded by Jensen Huang, who serves as president and chief executive officer, along with Chris Malachowsky and Curtis Priem, NVIDIA has grown from a graphics-focused chipmaker into a broad provider of accelerated computing hardware and software for multiple industries.

The company’s product portfolio spans discrete GPUs for gaming and professional visualization (marketed under the GeForce and NVIDIA RTX lines), high-performance data center accelerators used for AI training and inference (including widely adopted platforms such as the A100 and H100 series), and Tegra SoCs for automotive and edge applications.