NVIDIA'S HUGE AI Chip Breakthroughs Change Everything (Supercut)

Nvidia H100 GPU for AI - Overview

This is the new computer industry. Software is no longer programmed just by computer engineers; software is programmed by computer engineers working with AI supercomputers. We have now reached the tipping point of accelerated computing. We have now reached the tipping point of generative AI. And we are so, so, so excited to be in full volume production of the H100. This is going to touch literally every single industry. Let's take a look at how H100 is produced.

There are 35,000 components on that system board, and eight Hopper GPUs. Let me show it to you. I would lift this, but I still have the rest of the keynote I would like to give; this is 60, 65 pounds. It takes robots to lift it, of course, and it takes robots to insert it, because the insertion pressure is so high and has to be so perfect. This computer is two hundred thousand dollars, and as you know, it replaces an entire room of other computers. It's the world's single most expensive computer about which you can say: the more you buy, the more you save. This is what a compute tray looks like; even this is incredibly heavy. This is the brand-new H100, the world's first computer with a Transformer Engine in it. The performance is utterly incredible.

Accelerated Computing for AI

There are two fundamental transitions happening in the computer industry today. All of you are deep within it, and you feel it. The first trend: CPU scaling has ended. The ability to get ten times more performance every five years at the same cost is the reason why computers are so fast today, and that trend has ended. It happened at exactly the time when a new way of doing software was discovered: deep learning. These two events came together and are driving computing today: accelerated computing and generative AI. This new way of doing software, this new way of doing computation, is a
reinvention from the ground up, and it's not easy. Accelerated computing has taken us nearly three decades to accomplish.

This is how accelerated computing works, applied to large language models, basically the core of generative AI. This example is a ten-million-dollar server. Ten million dollars gets you nearly a thousand CPU servers, and to train, to process, this large language model takes 11 gigawatt-hours. And this is what happens when you accelerate this workload with accelerated computing: with the same ten million dollars you buy 48 GPU servers. It's the reason why people say that GPU servers are so expensive. However, the GPU server is no longer the computer; the computer is the data center. Your goal is to build the most cost-effective data center, not the most cost-effective server. Back in the old days, when the computer was the server, that would have been a reasonable thing to do, but today the computer is the data center. So for ten million dollars you buy 48 GPU servers; it consumes only 3.2 gigawatt-hours, with 44 times the performance. Let me show it to you one more time: this is before, and this is after. We want dense computers, fast computers, not big ones.

Let me show you something else; this is my favorite. If your goal is to get the work done, and this is the work you want to get done (iso-work), then look at this: before, and after.

How Nvidia Builds AI Factories

You've heard me talk about this for so many years; in fact, every single time you saw me, I've been talking to you about accelerated computing. So why is it that we have finally reached the tipping point? Because we have now addressed so many domains of science, so many industries; in data processing, deep learning, and classical machine learning, there are so many different ways for us to deploy software, from the cloud to enterprise to supercomputing to the edge.
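The before-and-after comparison above can be checked with a quick back-of-envelope calculation. This is only arithmetic on the figures quoted in the keynote; the derived ratios are my reading of those totals, not official benchmarks:

```python
# Figures quoted in the keynote: a $10M budget spent on CPU servers
# vs. GPU servers for the same large-language-model workload.
budget_usd = 10_000_000

cpu_servers = 1_000        # ~1,000 CPU servers
cpu_energy_gwh = 11.0      # 11 GWh to process the workload

gpu_servers = 48           # 48 GPU servers for the same budget
gpu_energy_gwh = 3.2       # 3.2 GWh
gpu_speedup = 44           # 44x the performance

energy_ratio = cpu_energy_gwh / gpu_energy_gwh      # ~3.4x less energy
work_per_gwh = gpu_speedup * energy_ratio           # ~151x more work per GWh

print(f"Energy reduction:        {energy_ratio:.1f}x")
print(f"Work per gigawatt-hour:  {work_per_gwh:.0f}x")
```

So at iso-budget, the accelerated configuration does 44 times the work on roughly a third of the energy; per gigawatt-hour, those factors compound to roughly 150x.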
There are so many different configurations of GPUs, from our HGX versions to our Omniverse versions to our cloud GPU and graphics versions, that the utilization is now incredibly high. The utilization of Nvidia GPUs is so high that almost every single cloud, almost every single data center, is overextended; there are so many different applications using them. So we have now reached the tipping point of accelerated computing; we have now reached the tipping point of generative AI.

People thought that GPUs would just be GPUs. They were completely wrong. We dedicated ourselves to reinventing the GPU so that it's incredibly good at tensor processing, and then built all of the algorithms and engines that sit on top of these computers. We call it Nvidia AI, the only AI operating system in the world that takes you end to end, from data processing to training to optimization to deployment and inference. It is the engine of AI today. We connected GPUs to other GPUs with NVLink to build one giant GPU, and we connected those GPUs together using InfiniBand into larger-scale computers. The ability for us to drive the processor and extend the scale of computing made it possible for the AI research community to advance AI at an incredible rate. Every two years we take giant leaps forward, and I'm expecting the next leap to be giant as well.

This is the new computer industry. Software is no longer programmed just by computer engineers; software is programmed by computer engineers working with AI supercomputers. These AI supercomputers are a new type of factory. It is very logical that the car industry has factories: they build things you can see, cars. It is very logical that the computer industry has computer factories: they build things you can see, computers. In the future, every single major company will also have AI factories, and you will build and produce your
company's intelligence. It's a very sensible thing. We are intelligence producers already; it's just that today the producers of intelligence are people. In the future, we will be producers of artificial intelligence. Every single company will have factories, and the factories will be built this way, using accelerated computing and artificial intelligence.

Why AI is the Next Era of Computing

We accelerated computer graphics by 1,000 times in five years. Moore's Law is probably currently running at about two times in five years. A thousand times in five years is one million times in ten, and we're doing the same thing in artificial intelligence. Now the question is: what could you do if your computer were one million times faster? It turns out that we can now apply the instrument of our industry to so many fields that were impossible before. This is the reason why everybody is so excited. There's no question that we're in a new computing era; there's just absolutely no question about it. In every computing era you could do things that weren't possible before, and artificial intelligence certainly qualifies.

This particular computing era is special in several ways. One, it is able to understand more than just text and numbers; it can now understand multimodality, which is the reason this computing revolution can impact every industry. Two, this computer doesn't care how you program it; it will try to understand what you mean, because it has this incredible large-language-model capability. So the programming barrier is incredibly low; we have closed the digital divide. Everyone is a programmer now: you just have to say something to the computer. Third, this computer is not only able to do amazing things for the future; it can do amazing things for every single application of the previous era.
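The scaling claim above is straightforward compounding, and it is worth spelling out. The "two times in five years" pace is the rate the talk itself attributes to Moore's Law today, not a measured figure:

```python
# 1,000x per five years compounds to 1,000,000x over a decade;
# a ~2x-per-five-years pace compounds to only 4x in the same time.
accel_per_5yr = 1_000
moore_per_5yr = 2

print(f"Accelerated computing, 10 years: {accel_per_5yr ** 2:,}x")
print(f"Moore's-Law pace, 10 years:      {moore_per_5yr ** 2}x")
```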
That is the reason why all of these APIs are being connected into Windows applications, into browsers, into PowerPoint and Word. Every application that exists will be better because of AI. This computing era does not need new applications; it can succeed with old applications, and it's going to have new applications as well. The rate of progress, because the technology is so easy to use, is the reason it's growing so fast. This is going to touch literally every single industry.

Nvidia Grace Hopper AI Superchip

At the core, just as with every computing era, it needs a new computing approach. For the last several years I've been talking to you about the new type of processor we've been creating, and this is the reason we've been creating it. Ladies and gentlemen, Grace Hopper is now in full production. This is Grace Hopper: nearly 200 billion transistors in this computer. Look at this. This processor is really quite amazing, and there are several characteristics about it. It is the world's first accelerated-computing processor that also has a giant memory: almost 600 gigabytes of memory that is coherent between the CPU and the GPU, so the GPU can reference the memory, the CPU can reference the memory, and any unnecessary copying back and forth can be avoided. That amount of high-speed memory lets the GPU work on very, very large data sets. This is a computer, not a chip; practically the entire computer is on here. It uses low-power DDR memory, just like your cell phone, except this has been optimized and designed for high-resilience data center applications.

Nvidia GH-200 AI Supercomputer

So let me show you what we're going to do. The first thing, of course, is to put the Grace Hopper Superchip into a computer. The second thing we're going to do is connect eight of these together
using NVLink. This is an NVLink switch: eight Grace Hoppers connect through three switch trays into an eight-Grace-Hopper pod, with each Grace Hopper connected to the others at 900 gigabytes per second. Eight of them connect together as a pod, and then we connect 32 of those pods together with another layer of switches, in order to build this: 256 Grace Hopper Superchips connected into one exaflops. You know that countries and nations have been working on exaflops computing and only recently achieved it; 256 Grace Hoppers is one exaflops for deep learning with the Transformer Engine, and it gives us 144 terabytes of memory that every GPU can see. This is not 144 terabytes distributed; this is 144 terabytes connected. Why don't we take a look at what it really looks like.

This is 150 miles of cables, fiber-optic cables; 2,000 fans moving 70,000 cubic feet per minute, which probably recycles the air in this entire room in a couple of minutes; forty thousand pounds, four elephants; one GPU. This is actual size. This is our brand-new Grace Hopper AI supercomputer. It is one giant GPU, utterly incredible. We're building it now, and we're so excited that Google Cloud, Meta, and Microsoft will be the first companies in the world to have access; they will be doing exploratory research with us on the pioneering front, the boundaries of artificial intelligence. So this is the DGX GH200. It is one giant GPU.

Nvidia MGX - Next Generation Servers for AI

I just talked about how we are going to extend the frontier of AI. Data centers all over the world will, over the next decade, be recycled and re-engineered into accelerated data centers, generative-AI-capable data centers. But there are so many different applications in so many different areas: scientific computing, data processing, cloud, video and graphics, generative AI for enterprise, and of course the edge.
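The topology numbers just described are internally consistent, which a short sketch can verify. The implied per-chip memory below is my derivation from the quoted totals, not an official specification:

```python
# DGX GH200 topology as described in the keynote: 8 Grace Hopper
# superchips per NVLink pod, 32 pods, and 144 TB of memory visible
# to every GPU.
chips_per_pod = 8
pods = 32
total_memory_tb = 144

total_chips = chips_per_pod * pods
memory_per_chip_gb = total_memory_tb * 1_000 / total_chips

print(f"Superchips: {total_chips}")
print(f"Implied memory per superchip: {memory_per_chip_gb:.1f} GB")
```

This works out to 256 superchips at roughly 560 GB each, which matches the "almost 600 gigabytes" figure quoted for Grace Hopper earlier.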
Each one of these applications has different configurations of servers, a different focus of applications, and different deployment methods; security is different, the operating system is different, how it's managed is different. This is just an enormous number of configurations. So today, in partnership with so many companies here in Taiwan, we're announcing the Nvidia MGX. It's an open, modular server design specification, designed for accelerated computing. Most servers today are designed for general-purpose computing; their mechanical, thermal, and electrical design is insufficient for a very highly dense computing system. Accelerated computers, as you know, take many servers and compress them into one: you save a lot of money and a lot of floor space, but the architecture is different. We designed MGX to be multi-generation standardized, so that once you make an investment, our next-generation GPUs, CPUs, and DPUs will continue to configure easily into it, giving you the best time to market and the best preservation of your investment. Different data centers have different requirements, and we've made this modular and flexible so that it can address all of these different domains.

This is the basic chassis; let's take a look at some of the other things you can do with it. This is the Omniverse OVX server: x86, four L40s, BlueField-3, two CX-7s, six PCI Express slots. This is the Grace Omniverse server: Grace, the same four L40s, BlueField-3, and two CX-7s. This is the Grace cloud graphics server; this is the Hopper NVLink generative AI inference server; and of course Grace Hopper, liquid-cooled, for very dense servers. And this one is our dense general-purpose Grace Superchip server. It is CPU-only, and it can accommodate four Grace CPUs, or two Grace Superchips, for enormous amounts of performance. At iso-performance, Grace consumes only 580 watts for the whole server, versus the
latest-generation x86 CPU servers at 1,090 watts. It's basically half the power at the same performance; or, put another way, at the same power, if your data center is power-constrained, you get twice the performance. Most data centers today are power-limited, so this is really a terrific capability.

Networking for AI - The Devil is in the Details

We're going to expand AI into a new territory. If you look at the world's data centers, the data center is now the computer, and the network largely defines what that data center does. There are two types of data centers today. There's the hyperscale data center, where you have application workloads of all different kinds: the number of GPUs you connect is relatively low, the number of tenants is very high, and the workloads are loosely coupled. And there's another type of data center, the supercomputing data center, the AI supercomputer, where the workloads are tightly coupled and the tenants are far fewer, sometimes just one; its purpose is high throughput on very large computing problems. So supercomputing centers and AI supercomputers, and the world's hyperscale clouds, are very different in nature.

The ability of Ethernet to interconnect components from almost anywhere is the reason why the world's internet could be created; if it had required too much coordination, how could we have built today's internet? That is Ethernet's profound contribution: its lossy, resilient design means it can connect almost anything together. However, a supercomputing data center can't afford that. You can't interconnect random things together, because on a billion-dollar supercomputer, the difference between achieving 95 percent networking throughput and 50 percent is effectively 500 million dollars. Now, it's really important to realize that in a high-performance computing application, every single GPU must finish its job so that the application can move on.
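The "500 million dollars" figure can be read as a utilization argument: if the network sustains only 50 percent of achievable throughput instead of 95 percent, roughly half of the machine's effective capacity sits idle. The exact accounting below is my interpretation of that claim, not Nvidia's:

```python
# A $1B supercomputer whose network sustains 50% throughput
# instead of 95% loses almost half of its effective capacity.
machine_cost_usd = 1_000_000_000
good_throughput = 0.95
poor_throughput = 0.50

wasted_fraction = (good_throughput - poor_throughput) / good_throughput
wasted_usd = machine_cost_usd * wasted_fraction

print(f"Effective capacity lost: {wasted_fraction:.0%}")
print(f"Investment sitting idle: ${wasted_usd / 1e6:.0f}M")
```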
In many cases, where you do all-reductions, you have to wait for the results of every single node; if one node takes too long, everybody gets held back. The question is: how do we introduce a new type of Ethernet that is of course backwards-compatible with everything, but engineered to achieve the capabilities that let us bring AI workloads to any of the world's data centers?

First, adaptive routing. Adaptive routing basically says: based on the traffic going through your data center, depending on which port of a switch is over-congested, the switch will tell BlueField-3 to send to another port; the BlueField-3 on the other end reassembles the data and presents it to the GPU without any CPU intervention. Second, congestion control. It is possible for certain ports to become heavily congested, in which case each switch sees how the network is performing and communicates to the senders: please don't send any more data right away, because you're congesting the network. Congestion control requires an overarching system, including software, in which the switch works with all of the endpoints to manage the overall congestion, traffic, and throughput of the data center. This capability is going to increase Ethernet's overall performance dramatically.
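The two mechanisms just described (steer traffic away from congested ports, and ask senders to back off when queues build) can be illustrated with a toy model. Everything here, the queue counts, the threshold, the `ToySwitch` class itself, is invented for illustration and is unrelated to the actual Spectrum-X implementation:

```python
from collections import defaultdict

class ToySwitch:
    """Toy model: adaptive routing plus threshold-based congestion control."""

    def __init__(self, ports, pause_threshold=8):
        self.ports = list(range(ports))
        self.queues = defaultdict(int)      # outstanding packets per port
        self.pause_threshold = pause_threshold

    def route(self, packet_id):
        """Adaptive routing: send each packet to the least-loaded port."""
        port = min(self.ports, key=lambda p: self.queues[p])
        self.queues[port] += 1
        return port

    def congestion_notifications(self):
        """Congestion control: list ports whose senders should back off."""
        return [p for p in self.ports if self.queues[p] > self.pause_threshold]

switch = ToySwitch(ports=4)
placement = [switch.route(i) for i in range(12)]
print(placement)                          # packets spread evenly across ports
print(switch.congestion_notifications())  # no port over threshold yet
```

Even this crude model shows the idea: per-packet decisions keep every uplink evenly loaded, and the pause notifications give the endpoints a feedback signal before queues overflow.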
IT'S OVER! GPUs Will Replace CPUs for AI

Now, one of the things that very few people realize is that today there's only one software stack that is enterprise-secure and enterprise-grade, and that software stack is CPU. The reason is that to be enterprise-grade, it has to be enterprise-secure, enterprise-managed, and enterprise-supported. Over 4,000 software packages is what it takes for people to use accelerated computing today, from data processing and training and optimization all the way to inference. So for the very first time, we are taking all of that software, and we're going to maintain it and manage it the way Red Hat does for Linux: Nvidia AI Enterprise will do that for all of Nvidia's libraries. Now enterprises can finally have an enterprise-grade, enterprise-secure software stack. This is such a big deal; otherwise, even though the promise of accelerated computing is available to many researchers and scientists, it is not available to enterprise companies.

So let's take a look at the benefit for them. This is a simple image-processing application. If you run it on a GPU with Nvidia AI Enterprise instead of on a CPU, you get 31.8 images per minute, basically 24 times the throughput, or you pay only five percent of the cost. This is really quite amazing; this is the benefit of accelerated computing in the cloud, but for many enterprises it is simply not possible unless you have this stack. Nvidia AI Enterprise is now fully integrated into AWS, Google Cloud, Microsoft Azure, and Oracle Cloud. It is also integrated into the world's machine-learning operations pipelines. As I mentioned before, AI is a different type of workload, and this new type of software comes with a whole new software industry; a hundred of those companies we have now connected with Nvidia AI Enterprise.

I told you several things. I told you that we are going through two simultaneous computing-industry transitions: accelerated computing and generative AI. Two, this
form of computing is not like traditional general-purpose computing. It is full-stack, it is data-center scale (because the data center is the computer), and it is domain-specific. For every domain, every industry you go into, you need to have the software stack; and if you have the software stack, the utilization of your machine will be high. So, number two: it is full-stack, data-center scale, and domain-specific. We are in full production of the engine of generative AI, and that is HGX H100. Meanwhile, this engine for AI factories will be scaled out using Grace Hopper, the engine we created for the era of generative AI. We also took Grace Hopper, connected 256 nodes with NVLink, and created the largest GPU in the world, the DGX GH200.

We're trying to extend generative AI and accelerated computing in several directions at the same time. Number one, we would like to extend it in the cloud, so that not just AI factories but every hyperscale data center can be a generative AI data center, and the way we do that is Spectrum-X. It takes four components to make Spectrum-X possible: the switch, the BlueField-3 NIC, the interconnects themselves (the cables are so important in high-speed communications), and the software stack that goes on top. We would like to extend generative AI to the world's enterprises, with their many different configurations of servers, and the way we're doing that, in partnership with our Taiwanese ecosystem, is the MGX modular accelerated-computing systems. And we put Nvidia AI into the cloud, so that every enterprise in the world can engage us to create generative AI models and deploy them in an enterprise-grade, enterprise-secure way, in every single cloud. I want to thank all of you for your partnership over the years. Thank you.
