Nvidia's Vera Rubin Chips Enter Full Production

Nvidia says its next-generation Vera Rubin platform is now in full production, with volume shipments to data centers run by Meta, Microsoft and Google due in the second half of 2026.

Jul 4, 2026 | Source: Nvidia

Nvidia said its next-generation Vera Rubin platform has entered full production, with volume shipments to the largest cloud operators due in the second half of 2026. The platform, first unveiled at CES in January, is the successor to Blackwell and is built for the agentic AI workloads now driving demand.

A big jump in performance-per-dollar

Vera Rubin pairs the 336-billion-transistor Rubin R100 GPU with a custom 88-core Vera CPU. Nvidia says the platform delivers roughly five times the inference performance of Blackwell at about a tenth of the cost per token, the efficiency gain that matters most as companies move from training models to running them at scale.

Hyperscalers including Meta, Microsoft and Google have already secured allocation and are expected to deploy Blackwell and Vera Rubin side by side into 2027, with Blackwell handling existing workloads while Rubin targets the largest new models. Analysts estimate 2026 Rubin output at roughly 200,000 to 300,000 GPUs, most of them going to the biggest cloud buyers first.

Demand measured in trillions

The ramp underscores how central Nvidia remains to the AI buildout. Chief executive Jensen Huang has said he expects combined orders for Blackwell and Vera Rubin to reach $1 trillion through 2027. With production now running at volume, the constraint shifts from whether Nvidia can ship Rubin to how quickly power and data-center capacity can be built to house it.

Why It Matters

Rubin's efficiency gains land as the industry's bottleneck moves from training to inference: the cost of actually serving models to users and agents. A platform that cuts cost per token to roughly a tenth changes the math for every company running large models, and deepens Nvidia's grip on the layer everyone else builds on. The remaining limit is physical: with chips in full production, power and data-center capacity, not silicon, increasingly gate how fast AI can scale.
