    Benjamin Franklin Institute
    Nvidia Rubin’s Network Doubles Bandwidth

    By Team_Benjamin Franklin Institute | January 10, 2026

    Earlier this week, Nvidia surprise-announced their new Vera Rubin architecture (no relation to the recently unveiled telescope) at the Consumer Electronics Show in Las Vegas. The new platform, set to reach customers later this year, is advertised to offer a ten-fold reduction in inference costs and a four-fold reduction in how many GPUs it would take to train certain models, as compared to Nvidia’s Blackwell architecture.

    The usual suspect for improved performance is the GPU. Indeed, the new Rubin GPU boasts 50 quadrillion floating-point operations per second (petaFLOPS) of 4-bit computation, as compared to 10 petaFLOPS on Blackwell, at least for transformer-based inference workloads like large language models.

    However, focusing on just the GPU misses the bigger picture. There are a total of six new chips in the Vera-Rubin-based computers: the Vera CPU, the Rubin GPU, and four distinct networking chips. To achieve performance advantages, the components have to work in concert, says Gilad Shainer, senior vice president of networking at Nvidia.

    “The same unit connected in a different way will deliver a completely different level of performance,” Shainer says. “That’s why we call it extreme co-design.”

    Expanded “in-network compute”

    AI workloads, both training and inference, run on large numbers of GPUs simultaneously. “Two years back, inferencing was mainly run on a single GPU, a single box, a single server,” Shainer says. “Right now, inferencing is becoming distributed, and it’s not just in a rack. It’s going to go across racks.”

    To accommodate these hugely distributed tasks, as many GPUs as possible need to work effectively as one. This is the aim of the so-called scale-up network: the connections between GPUs within a single rack. Nvidia handles these connections with its NVLink networking chip. The new line includes the NVLink6 switch, with double the bandwidth of the previous version (3,600 gigabytes per second for GPU-to-GPU connections, as compared to 1,800 GB/s for the NVLink5 switch).
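    To put the quoted figures in perspective, here is a back-of-the-envelope sketch of what the doubled link bandwidth means for transfer time. The 90 GB payload is a made-up illustrative number, not an Nvidia specification, and real transfers carry protocol overhead that this ignores.

```python
# Back-of-the-envelope: time to move data GPU-to-GPU at the quoted
# per-GPU bandwidths (illustrative only; ignores protocol overhead).

def transfer_time_ms(size_gb: float, bandwidth_gb_s: float) -> float:
    """Milliseconds to move size_gb of data at bandwidth_gb_s."""
    return size_gb / bandwidth_gb_s * 1000.0

# A hypothetical 90 GB payload:
t_nvlink5 = transfer_time_ms(90, 1800)  # previous generation
t_nvlink6 = transfer_time_ms(90, 3600)  # Rubin generation

print(f"NVLink5: {t_nvlink5:.1f} ms, NVLink6: {t_nvlink6:.1f} ms")
```

    At double the bandwidth, the same payload moves in half the time, which is exactly the kind of gain that compounds across every GPU-to-GPU exchange in a training step.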

    In addition to the bandwidth doubling, the scale-up chips also include double the number of SerDes (serializer/deserializer circuits, which allow data to be sent across fewer wires) and an expanded set of calculations that can be done within the network.

    “The scale-up network is not really the network itself,” Shainer says. “It’s computing infrastructure, and some of the computing operations are done on the network…on the switch.”

    The rationale for offloading some operations from the GPUs to the network is two-fold. First, it allows some tasks to be done only once, rather than having every GPU perform them. A common example is the all-reduce operation in AI training. During training, each GPU computes a mathematical operation called a gradient on its own batch of data. In order to train the model correctly, all the GPUs need to know the average gradient computed across all batches. Rather than each GPU sending its gradient to every other GPU, with every one of them computing the average, it saves computational time and power for that operation to happen only once, within the network.
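    The saving can be sketched with simple message counting. The code below is an illustration of the idea, not Nvidia's actual protocol: with naive all-to-all averaging, each of n GPUs sends its gradient to every other GPU, while with an in-network reduction each GPU sends one message up to the switch, which averages once and broadcasts the result.

```python
# Illustrative comparison of communication cost for gradient averaging.

def all_to_all_messages(n: int) -> int:
    """Naive scheme: every GPU sends its gradient to every other GPU."""
    return n * (n - 1)

def in_network_messages(n: int) -> int:
    """Switch-based scheme: n uploads to the switch plus n broadcast copies."""
    return 2 * n

def in_network_average(gradients: list[float]) -> float:
    """The single reduction the switch performs instead of every GPU."""
    return sum(gradients) / len(gradients)

# For a hypothetical 72-GPU rack:
print(all_to_all_messages(72), in_network_messages(72))
print(in_network_average([0.2, 0.4, 0.6]))
```

    The message count drops from quadratic to linear in the number of GPUs, and the averaging arithmetic itself runs once in the switch instead of n times on the GPUs.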

    A second rationale is to hide the time it takes to shuttle data in-between GPUs by doing computations on them en-route. Shainer explains this via an analogy of a pizza parlor trying to speed up the time it takes to deliver an order. “What can you do if you had more ovens or more workers? It doesn’t help you; you can make more pizzas, but the time for a single pizza is going to stay the same. Alternatively, if you would take the oven and put it in a car, so I’m going to bake the pizza while traveling to you, this is where I save time. This is what we do.”
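    Shainer's analogy maps onto simple scheduling arithmetic. In the sketch below (illustrative timings, not measured numbers), a step that runs compute and transfer back-to-back costs their sum, while overlapping them, i.e. computing on the data while it is in flight, costs only the larger of the two.

```python
# The pizza-in-the-car idea as scheduling arithmetic (illustrative timings).

def sequential_step(compute_ms: float, transfer_ms: float) -> float:
    """Compute, then transfer: the costs add up."""
    return compute_ms + transfer_ms

def overlapped_step(compute_ms: float, transfer_ms: float) -> float:
    """Compute while the data is in flight: only the longer phase shows."""
    return max(compute_ms, transfer_ms)

print(sequential_step(8.0, 5.0))  # the transfer adds to the step time
print(overlapped_step(8.0, 5.0))  # the transfer is fully hidden
```

    As long as the transfer takes no longer than the computation, the communication time disappears from the critical path entirely.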

    In-network computing is not new to this iteration of Nvidia’s architecture; it has been in common use since around 2016. But this iteration adds a broader swath of computations that can be done within the network, to accommodate different workloads and different numerical formats, Shainer says.

    Scaling out and across

    The rest of the networking chips included in the Rubin architecture comprise the so-called scale-out network. This is the part that connects different racks to each other within the data center.

    Those chips are the ConnectX-9, a networking interface card; the BlueField-4, a so-called data processing unit, which is paired with two Vera CPUs and a ConnectX-9 card to offload networking, storage, and security tasks; and finally the Spectrum-6 Ethernet switch, which uses co-packaged optics to send data between racks. The Ethernet switch also doubles the bandwidth of the previous generation, while minimizing jitter, the variation in arrival times of information packets.

    “Scale-out infrastructure needs to make sure that those GPUs can communicate well in order to run a distributed computing workload and that means I need a network that has no jitter in it,” he says. The presence of jitter implies that if different racks are doing different parts of the calculation, the answer from each will arrive at different times. One rack will always be slower than the rest, and the rest of the racks, full of costly equipment, sit idle while waiting for that last packet. “Jitter means losing money,” Shainer says.
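    The cost of jitter can be made concrete with a toy model: a distributed step finishes only when the slowest rack’s result arrives, so every faster rack idles for the difference. The rack timings below are hypothetical.

```python
# Toy model of straggler cost: one jittery rack idles all the others.

def step_time(per_rack_ms: list[float]) -> float:
    """The step completes only when the slowest rack finishes."""
    return max(per_rack_ms)

def idle_ms(per_rack_ms: list[float]) -> float:
    """Total rack-time wasted waiting on the straggler."""
    slowest = max(per_rack_ms)
    return sum(slowest - t for t in per_rack_ms)

racks = [10.0, 10.0, 10.0, 14.0]  # hypothetical: one rack delayed by jitter
print(step_time(racks), idle_ms(racks))
```

    A single 4 ms delay stretches the whole step to 14 ms and wastes 12 ms of rack-time across the cluster, which is the sense in which, as Shainer puts it, jitter means losing money.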

    None of Nvidia’s host of new chips is specifically dedicated to connecting data centers to one another, termed “scale-across.” But Shainer argues this is the next frontier. “It doesn’t stop here, because we are seeing the demands to increase the number of GPUs in a data center,” he says. “100,000 GPUs is not enough anymore for some workloads, and now we need to connect multiple data centers together.”
