NVIDIA Unveils Rubin CPX, Targeting Ultra-Long Context AI Inference Tasks

2025-09-10


U.S. chip giant NVIDIA on September 9 (local time) unveiled its next-generation GPU, “Rubin CPX,” at the AI Infrastructure Summit. The new chip targets applications that require ultra-long context processing, such as video generation, software development, and large-scale inference. It is built on the next-generation Rubin architecture, the successor to the Blackwell series, and is scheduled for official release at the end of 2026.

The Rubin CPX is designed to handle contexts of millions of tokens, and it integrates video decoding, video encoding, and AI inference on a single chip to deliver significantly improved performance. The rack-scale platform built around it, the Vera Rubin NVL144 CPX, can integrate 144 GPUs and 36 CPUs in a single rack and deliver up to 8 exaflops of computing power, more than seven times the performance of the current GB300 system. Its specialized design lets it handle both large codebases whose dependencies span many files and the data requirements of parsing videos up to one hour long.

According to NVIDIA, the Rubin CPX adopts a disaggregated inference architecture that splits inference into a context-processing phase and a generation phase and optimizes each one separately. Combined with 100 TB of high-speed memory and the latest InfiniBand interconnect technology, this greatly improves both computational and memory efficiency. Startups such as Cursor, Runway, and Magic have already announced plans to adopt the system for code generation, creative workflows, and the construction of ultra-large foundation models.
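
For readers unfamiliar with the term, the idea behind disaggregated inference can be illustrated with a toy sketch. It is not NVIDIA's implementation: the names `KVCache`, `context_phase`, `generation_phase`, and `serve` are hypothetical, and the "model" is a stand-in arithmetic rule. It shows only the division of labor, in which one stage reads the entire prompt once and hands its state to a second stage that produces output tokens one at a time.

```python
from dataclasses import dataclass, field

VOCAB_SIZE = 1000  # hypothetical vocabulary size for the toy model


@dataclass
class KVCache:
    """Stand-in for the per-request state passed between the two phases."""
    entries: list = field(default_factory=list)


def context_phase(prompt_tokens):
    """Context stage: process the whole prompt in a single pass.

    In a real system this is the compute-heavy step; here it only
    copies the prompt tokens into the cache.
    """
    return KVCache(entries=list(prompt_tokens))


def generation_phase(cache, max_new_tokens):
    """Generation stage: emit tokens one at a time from the cached state.

    The toy rule (sum of the cache modulo the vocabulary size) replaces
    a real model's forward pass.
    """
    output = []
    for _ in range(max_new_tokens):
        next_token = sum(cache.entries) % VOCAB_SIZE
        cache.entries.append(next_token)  # the cache grows with each new token
        output.append(next_token)
    return output


def serve(prompt_tokens, max_new_tokens):
    """Run both stages back to back; in a disaggregated deployment each
    stage could run on hardware tuned for its own workload."""
    cache = context_phase(prompt_tokens)
    return generation_phase(cache, max_new_tokens)


if __name__ == "__main__":
    print(serve([1, 2, 3], 3))
```

Because the two stages exchange nothing but the cache object, each can in principle be scheduled and scaled independently, which is the property the disaggregated design relies on.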

NVIDIA CEO Jensen Huang emphasized that the Rubin CPX is the first CUDA GPU designed specifically for “massive-context AI.” He stated, “Just as RTX revolutionized graphics computing, CPX will transform inference, enabling models to simultaneously process knowledge across millions of tokens.”

Industry analysts believe this chip not only showcases NVIDIA’s keen grasp of AI application needs but also strengthens its dominance in the data center and AI infrastructure markets. Forecasts suggest that NVIDIA’s data center revenue could exceed $180 billion this fiscal year, far surpassing its competitors.