E12: NVIDIA'S HUGE AI Chips Will Change Everything (Watch Before Earnings!)

Nvidia Grace-Blackwell Compute Nodes

Alex: This is a really exciting episode of Funding Awesome. Most people know that Nvidia builds AI chips for data centers and supercomputers, but I'm going to give you an inside look at their next-generation Blackwell GPUs, which are about to change everything. While I was at Nvidia GTC, I had the privilege of interviewing Dion Harris, Nvidia's Director of Accelerated Computing Go-to-Market. I used that time to dive into every part of the Blackwell compute stack: Nvidia's Blackwell GPUs themselves, the Grace CPU and Grace-Blackwell superchips, how they come together to make a supercomputer in a single rack, and the insane amounts of compute Nvidia's Blackwell architecture can achieve. Your time is valuable, so let's get right into it.

Dion: Hi, I'm Dion Harris, Director of Accelerated Computing Go-to-Market here at Nvidia. Today we're going to start with the Blackwell GPU and work all the way up to the GB200 NVL72, the full rack-scale Grace-Blackwell-powered system recently announced here at GTC.

Alex: I'm super excited for this.

Dion: Absolutely. It starts with our Blackwell GPU. That's 20,000 teraflops of AI performance, and it allows us to have 192 GB of HBM3e memory per GPU, all accessed via our NVLink-C2C across both of these GPUs and connected with the Grace CPU.

Alex: There's one Grace and there are two Blackwells. What's the reasoning behind that?

Dion: When you look at the Grace-Blackwell superchip, it involves two Blackwell GPUs connected via NVLink-C2C to one Grace CPU. The reason is that when we look at our most demanding AI workloads, they're really heavy on the GPUs, so we saw the balance of the workload shifting toward the GPUs and wanted to build a superchip that reflects that. Rather than customers having to buy a superchip with a CPU that isn't fully utilized, we could leverage this architecture.

Alex: So we started with Blackwell. Can you tell us a bit about Grace?
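As a quick back-of-the-envelope check, the superchip figures quoted in this conversation (two Blackwell GPUs per superchip, 20,000 teraflops and 192 GB of HBM3e per GPU, plus the 480 GB of Grace CPU memory mentioned later) can be tallied in a few lines. This is an illustration of the arithmetic, not an official spec sheet.

```python
# Back-of-the-envelope tally of the Grace-Blackwell (GB200) superchip,
# using the figures quoted in the conversation. Illustrative only.

BLACKWELL_GPUS_PER_SUPERCHIP = 2
TFLOPS_PER_GPU = 20_000        # 20 petaflops of AI performance per Blackwell GPU
HBM3E_GB_PER_GPU = 192         # HBM3e capacity per GPU
GRACE_CPU_MEMORY_GB = 480      # CPU memory visible to the GPUs over NVLink-C2C

superchip_tflops = BLACKWELL_GPUS_PER_SUPERCHIP * TFLOPS_PER_GPU
superchip_hbm_gb = BLACKWELL_GPUS_PER_SUPERCHIP * HBM3E_GB_PER_GPU
total_memory_gb = superchip_hbm_gb + GRACE_CPU_MEMORY_GB

print(f"Superchip AI compute:  {superchip_tflops:,} TFLOPS")  # 40,000 TFLOPS
print(f"Superchip HBM3e:       {superchip_hbm_gb} GB")        # 384 GB
print(f"HBM3e + CPU memory:    {total_memory_gb} GB")         # 864 GB
```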
Dion: Absolutely. This Grace CPU is the same one we highlighted for Grace Hopper, and the incredible thing about it is that it's extremely energy efficient, about two times as energy efficient as traditional x86 CPUs. A lot of that comes from the LPDDR5X memory we use with this CPU. It gives the GPUs access to over 480 GB of memory, so your high-end LLMs and AI models that require lots of data can access that data over NVLink-C2C, with the CPU and GPU essentially acting as a seamless combination.

Alex: Seamless CPU-GPU, so you can just go back and forth depending on the workload? These chips are connected somewhere under this board, and it's automatically load balancing: this workload is better for the GPU, this one's better for the CPU, all on the fly?

Dion: What happens is that the application itself usually determines that. Everything is connected, like I mentioned, via NVLink-C2C, a connection that's about seven times faster than traditional PCIe, which allows data movement between the CPU and GPU to happen very seamlessly. To the application, it's almost as if you had 480 GB of memory running on the GPU itself.

Alex: Got it, and that's just this half, right? Are these two halves connected, or do they just run in parallel?

Dion: This is one Grace-Blackwell superchip module, and this is another Grace-Blackwell superchip module, and they're connected via NVLink; there's an NVLink connection from this side to that side. To talk to the other GPUs, they typically go out of the system: that's where the NVLink connections go out and feed into the NVLink spine that we'll talk about a little later, which lets them all talk to each other, all-to-all, at the full 1.8 terabytes per second line rate. It's pretty incredible in terms of all the data flowing across the system.

Alex: The numbers are mind-boggling. So: Blackwell, Grace. What is the rest of this board?

Dion: The rest of this board is mostly about getting the data out of these compute modules and into that NVLink spine. You have your NVLink connections here, which go out and connect to our NVLink switch, which gives you connectivity across the full network. We also have multiple NVLink chips within the system that connect to that NVLink spine and let you communicate at full line rate. When you look at this entire system, you have 80,000 teraflops of performance within a single compute node: incredible compute density, incredible performance and efficiency, all delivered in one server node that you can then plug into the overall NVLink system to create the full rack-scale system we're building up to. One thing I'll highlight that's not shown here is the liquid cooling. That's another key piece of the architecture we had to implement to deliver the compute density required for this system. Liquid cooling gives you incredible density, because you don't need the heat sinks that create additional rack requirements, and it lets you pack everything tighter, which gives you the networking efficiencies we talked about in terms of being able to use copper instead of fiber optics. It's all purpose-built to be highly performant, very dense, and very compact.

Alex: Which is why cooling is so important, right?

Dion: Absolutely. The energy efficiency of this system is something we really thought about going in: how do we maximize the power going to compute rather than to powering and cooling the system?
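Two of the figures in this section can be sanity-checked with simple arithmetic: the per-node teraflops (four Blackwell GPUs per compute node) and the "about seven times faster than PCIe" claim for NVLink-C2C. The per-GPU teraflops come from the conversation; the 900 GB/s and 128 GB/s bandwidth figures are assumed ballpark numbers, not quotes from the interview.

```python
# Quick checks on figures from this section. The per-GPU teraflops are from
# the conversation; the interconnect bandwidths below are assumed ballparks.

GPUS_PER_NODE = 4              # two Grace-Blackwell superchips per compute node
TFLOPS_PER_GPU = 20_000
node_tflops = GPUS_PER_NODE * TFLOPS_PER_GPU
print(f"Compute node: {node_tflops:,} TFLOPS")  # 80,000 TFLOPS

NVLINK_C2C_GB_S = 900          # assumed NVLink-C2C CPU<->GPU bandwidth (GB/s)
PCIE_GEN5_X16_GB_S = 128       # assumed PCIe Gen 5 x16 bandwidth (GB/s)
speedup = NVLINK_C2C_GB_S / PCIE_GEN5_X16_GB_S
print(f"NVLink-C2C vs PCIe Gen 5 x16: ~{speedup:.0f}x")  # ~7x
```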
Nvidia NVLink Switch Trays

Alex: Oh, that's awesome. So this tray looks pretty different. What is this, as opposed to the Grace-Blackwells we just saw?

Dion: This is basically what connects all of those systems together. These are the NVLink switch chips we announced in the keynote, and they allow all of those NVLink-connected GPUs to talk to each other. We have nine of these trays that connect all 72 of those GPUs over NVLink.

Alex: These cables look pretty special too. Can you tell me a little about them, as opposed to the ones we saw on the other side?

Dion: We built these to allow the chips to talk to each other at 1.8 terabytes per second across these four channels going out, and they represent connections to four of the GPUs on the other tray we talked about.

Alex: Can you help me comprehend how fast 1.8 terabytes per second is? That sounds blazing fast.

Dion: When you aggregate it at the full system level, it's about 130 terabytes per second, which is more than the entire internet's traffic. You could stream the entire internet in about a second.

Alex: That's mind-blowing. That's insane.

Dion: It is, but it's definitely needed given the requirements of today's models and workloads. This is what we saw as the requirement to deliver the performance that gets you to the next level in servicing next-generation AI models.

Alex: Sure. So this isn't just about today's workloads; it's also about tomorrow's workloads and future-proofing for those.

Dion: Absolutely. As we said, it's for trillion-parameter-scale models. When you think about how models are evolving over time, they're becoming more complex and, in many cases, larger, which requires more memory, more compute, and more bandwidth, so we've optimized the full system to enable those models today.
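The aggregate figure follows directly from the per-GPU bandwidth quoted above: 72 GPUs each talking over NVLink at 1.8 TB/s. The "entire internet traffic" comparison is a loose, order-of-magnitude keynote claim, so treat this as an illustration of the arithmetic only.

```python
# Aggregating the per-GPU NVLink bandwidth across the full NVL72 system,
# using the numbers quoted in the conversation.

NUM_GPUS = 72
NVLINK_TB_S_PER_GPU = 1.8

aggregate_tb_s = NUM_GPUS * NVLINK_TB_S_PER_GPU
print(f"Aggregate NVLink bandwidth: ~{aggregate_tb_s:.0f} TB/s")  # ~130 TB/s
```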
Alex: That's wild. That's awesome. So far we've seen Grace-Blackwell, and these are the NVLink trays for networking. What's next?

Dion: Let's go take a look at how this comes together in a rack-scale system.

Nvidia GB200 NVL72 Full Rack

Dion: Here we're standing in front of the Nvidia GB200 NVL72. What's incredible about this system, like I said, is that you have 18 compute nodes, each with four Blackwell GPUs and two Grace CPUs. In combination, that's 72 Blackwell GPUs and 36 Grace CPUs, all interconnected and behaving as one single compute system.

Alex: Whoa.

Dion: To break that down a little, these are the compute trays we covered over there. You have ten on the top and eight on the bottom. And the NVLink switch trays we just talked about a second ago: you have nine of those connecting here.

Alex: Oh, I see. The interfaces on the outside are actually different: the networking trays are in the middle and the compute trays are on the top and bottom.

Dion: Exactly. Ten compute trays here, eight there. What we're looking at is basically an interface for the front end, because, like we talked about before, the NVLink connections all go out through the back. In fact, we can walk around the back and see how those cables connect it all. It's an incredible sight to see.

Alex: Sure, let's take a look.

Dion: This is the back, and this is the NVLink spine. It's a brand-new invention, an architectural innovation that gives you instant connectivity and cabling. Think of it like a docking station: you literally take those compute trays and plug them in. They have industrial connection points that automatically latch in and connect to the wiring and cabling that's already installed.
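The rack composition Dion describes can be written down as a small data structure, which also makes it easy to verify that the tray counts and the 72-GPU/36-CPU totals are consistent. All counts are from the conversation.

```python
# Sketch of the GB200 NVL72 rack composition described above,
# using the tray and chip counts from the conversation.

NVL72 = {
    "compute_trays": 18,         # 10 on the top, 8 on the bottom
    "nvlink_switch_trays": 9,    # in the middle of the rack
    "gpus_per_compute_tray": 4,  # two Grace-Blackwell superchips per tray
    "cpus_per_compute_tray": 2,
}

total_gpus = NVL72["compute_trays"] * NVL72["gpus_per_compute_tray"]
total_cpus = NVL72["compute_trays"] * NVL72["cpus_per_compute_tray"]
print(f"{total_gpus} Blackwell GPUs, {total_cpus} Grace CPUs")  # 72 GPUs, 36 CPUs
```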
Alex: This cabling looks pretty special too. Can you tell me a little more about the actual cabling we're seeing?

Dion: Absolutely. These are the NVLink connection points. You have over 5,000 NVLink connections and over two miles of cabling.

Alex: Two miles?

Dion: Yes. Think about what's happening here: it's connecting every GPU to every other GPU in this system, and that's exactly what allows it to behave as a single GPU. Every single GPU can access the memory on every other GPU at that full 1.8-terabytes-per-second line rate. So when we say this is one GPU, we truly mean it: when you're building an application or deploying models, it doesn't care whether data lives on GPU 1 or GPU 72; it can access it at the same speed.

Alex: Okay, so what are these thick cables here? Is that power? Cooling?

Dion: This is the liquid cooling. One of the things that allows us to create such compute density, to put over an exaflop of compute in a single rack, is liquid cooling. It's a unique opportunity for us to advance computing and compute density while also reducing cost. When you look at how we're using liquid cooling and this copper-based cabling system, it's all designed to deliver optimal performance at the lowest cost with the best efficiency.

Alex: And I'm noticing cables on the top there as well. How do they connect to multiple racks?

Dion: Let's go back around the front. Now let's look at some of the ports coming in and out of the system on the front tray. We have our InfiniBand switches, which will typically be top-of-rack switches. Like we talked about, NVLink goes out the back and is all connected on that NVLink spine; however, when you start to put multiple of these racks together, you can still leverage InfiniBand.

Alex: InfiniBand, yep.

Dion: We have our latest InfiniBand switches that we highlighted in the keynote.
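The "over an exaflop of compute in a single rack" claim can be checked against the per-GPU figure quoted at the start of the conversation. Simple arithmetic, using only numbers from the interview.

```python
# Checking the "over an exaflop in a single rack" claim using the
# per-GPU teraflops figure quoted earlier in the conversation.

NUM_GPUS = 72
TFLOPS_PER_GPU = 20_000                 # 20 petaflops of AI performance per GPU

rack_tflops = NUM_GPUS * TFLOPS_PER_GPU
rack_exaflops = rack_tflops / 1_000_000  # 1 exaflop = 1,000,000 teraflops
print(f"Rack AI compute: {rack_exaflops:.2f} exaflops")  # 1.44 exaflops
```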
Dion: They're pushing over 800 gigabits per second, and you can see the links here: there are four on the top side, coming out of each GPU, each connecting to its own individual port. So you have 72 ports coming out of this front end that connect to the top-of-rack switch, which then goes into your factory network to connect all of those racks, all of those GPUs, together.

Alex: Got it. I know Nvidia has two networking products at this level: one is InfiniBand and one is Ethernet. Can you help me understand the difference between those two?

Dion: It's really a matter of choice. Some customers want to deploy Ethernet in their data center for seamless management if they're running most of their systems on Ethernet; hyperscalers will often do that for simplicity. So we developed our Spectrum-X platform, which gives you an AI-capable network. What that means is that IP-based networking often hasn't been well suited to AI, because you get congestion and noisy-neighbor issues, and given the performance characteristics of AI, it doesn't really tolerate that. So we built a number of resiliency capabilities into our IP-based network, like packet reordering, adaptive routing, and in-network computing. Now you get the manageability benefits we have on the InfiniBand side on the Spectrum side as well, so it's really just a matter of choice. The other thing we've done is include ports for our DPUs. When you think about how your system interfaces with the other systems in your data center, there's east-west traffic connecting to other compute nodes, but there's also north-south traffic connecting to storage, for example. So we implemented our BlueField DPUs, and you can see those connection points here on the right. They handle a lot of that north-south traffic management. What's really cool is that this gives you security and isolation in addition to performance, because you're able to offload tasks that perform best in the network, like the all-reduce calculations, into the switching network, taking them off your compute network while also providing security isolation and performance isolation.

Alex: Got it. So again we have our CPUs, our GPUs, and our DPUs all helping to deliver the solution. And DPUs are data processing units, correct? That's all about data transfer, right? I know moving data from chip to chip is expensive.

Dion: Right. The DPU basically acts as an interface to bring data in; it interacts with the CPU complex to move the data in, and that's how it gets routed through to the GPUs. Your DPU acts as a gateway, if you will, to ensure security and optimize performance by isolating and centralizing some of those data-movement functions.

Alex: Got it, awesome. One last thing: this is one rack, and I know what's behind us are the Hopper racks. Can we go talk a little about rack scale with those racks?

Dion: Yeah, absolutely. Let's go take a look at Grace Hopper.

Nvidia AI Datacenter Scaling

Dion: This is our Grace Hopper MGX-based system. Back at Computex last year we announced MGX, which is basically a reference architecture that allows all of our OEM and ODM partners to build systems and servers optimized for accelerated computing.

Alex: And that's really at the tray level, right?

Dion: At the tray level. In this case we're actually showing Grace Hopper. Let's see if we can pull these trays out and take a look.
Dion: When you look at what's in here, you have two Grace Hopper superchips, also liquid-cooled, just like the Grace-Blackwell system we're highlighting, giving you incredible performance density and compute density as well as efficiency. You have your two Grace Hopper superchips, and in this case they're connected via InfiniBand. You don't have the NVLink spine providing the all-to-all communication we were describing, but this is still an incredible way to scale and get great compute density, using our MGX form factor, which allows all of our partners to quickly ramp up on our latest and greatest GPUs. We'll have an MGX platform for Blackwell as well, which will let these partners instantly embrace the latest and greatest.

Alex: So you can literally take this tray out and slide in a Blackwell version of the tray we saw earlier?

Dion: Yes. It makes it easier for our partners and our customers to adopt our latest technology.

Alex: And whether you're using the InfiniBand solution or the Ethernet solution, up here in the Blackwell version, that's how you connect all these racks together, right?

Dion: Yes, and you can see here, like I said, the connection points going out to those top-of-rack switches up there.

Alex: And we saw the equivalent of these on the Blackwell system over there, correct?

Dion: Correct.

Alex: So how many of these can you stitch together? How big can this get?

Dion: As big as you like, quite honestly. When you look at how this system is architected, it really becomes a matter of how you've architected your data center, because some of this is going to be power dependent. The compute density and power density require people to be very thoughtful about how they build out their data centers: how they cable their racks, how they power their racks.

Alex: And how they cool them, too, right?

Dion: Exactly. This is where we're working with a lot of our data center infrastructure partners to build reference designs to help people as they architect their data centers. They can look at this rack and say: this is the amount of power I require, this is the cabling I require, this is the space I have, this is the weight loading on the floor. All of those dimensions are taken into account, and that tells you how to build out your data center. In terms of scale, you can go to multiple layers of fat-tree networking topologies to get thousands, even tens of thousands, of these GPUs all interconnected via InfiniBand and/or Spectrum networking.

Alex: That's awesome. So what's the idea here? Will we see a new supercomputer built on the Blackwell versions of these, like we saw with Eos?

Dion: Nvidia is building our own data center right now, as we speak. We plan to deploy over 32,000 Grace-Blackwells into a single data center, and we're building that out now. During the keynote, as we were doing the build-up, that last shot showing the full data center view wasn't just a cartoon; it's a digital twin of the data center we're actually building.

Alex: Oh wow.

Dion: We're leveraging all the technologies we talked about: the Blackwell GPU, the Grace CPU, the Grace-Blackwell superchip module, and of course the compute tray, all building up and connected via that NVLink network infrastructure, which is then connected via our Quantum-X800 InfiniBand switches to give you the full 32,000-Blackwell-GPU system.
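The scaling discussion above (NVL72 racks stitched into a 32,000-GPU data center over a fat-tree network) can be sketched with simple arithmetic. The rack count follows from numbers in the conversation; the switch radix and the fat-tree capacity formula are generic textbook assumptions, not Nvidia's actual network design.

```python
# Scaling sketch for the 32,000-GPU deployment mentioned above. Rack count
# uses the NVL72 size from the conversation; the fat-tree math is a generic
# illustration with an assumed 64-port switch radix.

import math

TARGET_GPUS = 32_000
GPUS_PER_RACK = 72            # one GB200 NVL72 rack

racks = math.ceil(TARGET_GPUS / GPUS_PER_RACK)
print(f"NVL72 racks needed: {racks}")  # 445

# A t-tier fat tree of k-port switches supports up to k * (k/2)^(t-1) endpoints.
SWITCH_RADIX = 64             # assumed ports per switch
tiers = 1
while (SWITCH_RADIX // 2) ** (tiers - 1) * SWITCH_RADIX < TARGET_GPUS:
    tiers += 1
print(f"Fat-tree tiers at radix {SWITCH_RADIX}: {tiers}")  # 3
```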
Alex: Can you tell us a little about what you plan to use all of that for? It sounds incredible.

Dion: Well, Nvidia is very much a practitioner of AI. We don't just build compute; we actually build models and train models. We use them for our self-driving car model development and for a lot of LLM development. All the work we do with our customers and partners, we're doing ourselves, and that gives us an informed view of how we should build and optimize the infrastructure we build. It also allows us to make real contributions beyond just the hardware.

Alex: You get a whole different level of insight when you're a user, not just a builder, right?

Dion: Exactly, for sure.

Alex: This has been amazing. Thanks for walking us through Blackwell, Grace, the NVLink switches, the full rack, and the multi-rack scale, all the way up to supercomputers. I really appreciate your time.

Dion: Absolutely. Thank you so much.

Alex: A huge thank-you to Dion Harris for walking us through Nvidia's entire Blackwell architecture: the GPUs, CPUs, and DPUs, the Grace-Blackwell superchips, NVLink and Quantum InfiniBand, and how it all comes together to create mind-blowing compute power for the next generation of AI applications. Another big thank-you to Nvidia for inviting me to GTC to learn everything I can in person and share it with all of you. And of course, thank you for supporting the channel. Thanks for watching, and until next time, this is Ticker Symbol: You. My name is Alex, reminding you that the best investment you can make is in you.
