08/28/24 | Cerebras' New AI Wafer, A Game-Changer in AI Hardware | Daily AI News by GAI Insights

[Music] Morning, everybody. John Sviokla here for GAI Insights with our gen AI news briefing on the essential, important, or optional news. Today we're going to go through a number of things: Anthropic publishing its system prompts; some fantastic new stuff from NIM at Nvidia in terms of building agents; zero-based redesign, and how you get the real cost savings out once you put these things in; DeepMind and UC Berkeley doing a fantastic new job on trading off inference against the creation of better models; Cerebras, which has an unbelievable new AI wafer that is about ten times cheaper than the options; and OpenAI racing to put out Strawberry. So those are the things we're covering today. Adam, could you start us off with Anthropic?

Yeah. Are you going to share the screen, or who is doing that? Yes, I'll be right there. So the headline there just says Anthropic publishes the system prompts that make Claude tick, and it's cool; I think it's good they're doing it. Basically, as it says, they're publishing the system prompt, and it's something they're going to keep doing in the spirit of transparency, trying to model what they believe is the right approach to bringing these models to market. The fact is these models have already been jailbroken; people have published their system prompts. It's not a secret per se; it's just them making it more apparent that they're being transparent, I guess you could say. But I consider it optional news.

Yeah, I really do think it's optional, but it's interesting to know how, through the prompts, Anthropic and other vendors are putting an envelope around the LLM, telling it what to do, and how they're infusing the character of the LLM. So it's an interesting find: you look at the prompt and see how LLMs think, right? From that point of view it's interesting, but quite optional. Paul, did you have a feeling on this one? We'll go with optional too.
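The "envelope" Luda describes is concrete: vendors prepend a standing system message to every conversation before the user's turns reach the model. A minimal sketch of the chat-message shape most LLM APIs share; the prompt text here is an invented stand-in for illustration, not Anthropic's actual published prompt:

```python
# Sketch: a system prompt "wraps" every user turn before it reaches the model.
# The system text below is a made-up stand-in, not Anthropic's real prompt.
SYSTEM_PROMPT = (
    "You are a helpful assistant. Answer concisely, admit uncertainty, "
    "and refuse harmful requests."
)

def build_messages(user_turns):
    """Prepend the standing system instructions to the conversation."""
    messages = [{"role": "system", "content": SYSTEM_PROMPT}]
    for turn in user_turns:
        messages.append({"role": "user", "content": turn})
    return messages

msgs = build_messages(["What's a wafer-scale chip?"])
# msgs[0] is the invisible "envelope"; msgs[1:] is what the user typed.
```

Jailbreaks that "leak" the prompt are essentially the model echoing `msgs[0]`, which is part of why publishing it openly costs Anthropic very little.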
Yeah, I think this is going to continue: all these LLMs are going to have incremental feature improvements over the next decade, and this is one interesting one, but not notable for our AI leaders. If you're a practitioner on the ground, you'll want to look at this, but not the leader. All right, so we've got the big O, optional, starting us off here on a Wednesday morning. Excellent.

Next we go to Nvidia and their NIM Agent Blueprints, and how they're trying to create better fundamental frameworks, if you will, for the creation of agents. Luda, do you have a feel for this? Yeah, this is interesting. Nvidia is making a real push to get enterprises onto their platform, building stacks of capabilities to enable them to do it faster. They introduced the NIM architecture, not the blueprints, earlier this year, I think in June; that's the inference layer that speeds up inference, right? And now they're applying it to agents, with what they call flywheels: simplified versions in the three areas you see on the screen, customer service being one, pharmaceutical discovery the second, and the third, I can see, multimodal PDF data extraction. That's what we were just talking about before the call: you can give it a picture, you can give it a script, audio. So this is all very good. Given Nvidia's position in the market, I'd suggest we call it important. In general the news is an incremental improvement, so the news itself is optional, but I think important because of Nvidia's presence in the market and the issue of runtime and the ability to make it more efficient.

Okay, let me make the case for important. Everyone's breathlessly waiting for Nvidia's earnings call this week, but the reality is that Nvidia is its own Death Star exploding here. Nvidia's market cap by itself is bigger than the
entire financial industry: Goldman Sachs, Wells Fargo, UBS, all of them added together are still smaller than Nvidia. And they're taking this money and innovating in so many different layers. It's not just some chip company over in Taiwan shipping something here; they have their own large language models, they're innovating on the robotics side. They are profoundly changing, in my view, a whole bunch of dimensions, and I think it's really important for the AI leader to view Nvidia as that, and not just an Intel-like chip company from the old PC CPU era. So in that sense, Luda, I'll support, I agree with your important. Yeah, that's what I thought: it's necessary for AI leaders to know about this, but it's an incremental step; they certainly haven't perfected it yet, it seems. Yeah, I'm totally comfortable with important. This is a huge area, and they've been working for a decade or more on building out the ecosystem; they do an amazing job. All right, so that one's going to be important.

Let's go to our next one: zero-based redesign, which I guess is a first cousin to zero-based budgeting, from four partners at Bain, three of them from Boston, talking about how you're going to get real AI cost savings. Paul, what did you think of this one? You know, it's a long article, and it's a bit self-serving because it's from a vendor. I mean, Bain's doing good research here, but I found the article itself pretty long and not particularly new ground, so I had it as optional. What do you think, Luda? Yeah, we've talked quite a bit about this: hey guys, early on, get a strategy. Don't just rush into the stuff and experiment, because it takes money, resources, and the like, and then you have this little dinky application that doesn't get ROI, and then you'll get questioned. So many
people have not done this, and now it's time to step back. And another point we've always made: this is a process-reengineering thing, right? We've always said that. It's important to step back, look at where AI can fundamentally change what you're doing, and implement it in the areas that give you the best bang for the buck. So a year later we have this article. Optional, but it captures all the things we've been talking about.

One interesting thing: yesterday we had our very first AI leadership networking call. This is a peer networking program we're now offering for AI leaders, and we had eight AI leaders, Chatham House rules, so what's said in there stays in there, but big companies. What was very interesting, and Luda, coming back to your point, is that this is just digital transformation on steroids. All the basics haven't gone away. Project selection: we had a really good discussion on that. If you're a big company, how many rocks and pebbles do you have to do? If you don't get singles, you don't have the right to do the home runs; but if you only do the home runs, you don't have enough smaller projects. So, a really good discussion, but the basics haven't changed: you need leadership and strategy. If AI is interesting but not linked to strategy, why are we even having a conversation? And if you can't scale it across the organization, why are we having the conversation? Correct, totally. I remember, back when the internet was really getting started, the CEO of a public company in the hardware space lamenting that a little startup using the internet much better was higher valuation-wise than his twenty- or thirty-year-old company. And I said, well, you can now come back and do it better, because you have more knowledge. This was really dawning on a lot of leaders; it's a similar shot across the bow.

I want to lament about this Bain article for a second. There's this part towards the end
where, excuse me, but if we aren't critical, what are we? They say these are key questions to ask yourself: do we have a path to translating generative AI experiments into scale efficiencies and tangible financial return on investment? Do we have a clear perspective on how our organization will evolve as generative AI's capabilities rapidly advance? It's like, nobody can answer these questions; of course they just want people to flock to them for the help they need, right? Because this is big and scary and confusing and there's a lot going on, so just be aware of all the pitfalls. Yeah, again, I think a robot could have done a better job. This is platitudes: platitudes meet the obvious, you know, "have a baby," okay. Anyway, there are a lot of smart people at Bain, but whoever wrote this wasn't paying attention. All right, definitely optional.

Next one coming up: DeepMind and UC Berkeley showing how to make the most of LLM inference. Let me go ahead and share this. Adam, what do you think here? Yeah, I'll kick us off. Probably the most notable part of this for me is seeing that Berkeley, who are on the forefront at least in terms of coining terms around compound systems, and are certainly notable for their research, are in partnership here with Google DeepMind, because you can imagine the sort of things that are going to come out of them doing research together. That, for me, was probably the most important thing to note. It's pretty cool what they're doing: they're going to be able to train models with much higher efficiency. They're finding ways to understand what the compute is doing, and when, and how they can leverage that understanding to train future models much more efficiently. That's probably going to be some great news; it
just speeds up the already accelerating timeline of how all these things are going to go. But it's research, and it's something whose impact is on a pretty long-term horizon, so I considered it optional news. Interesting; I was going to make the case for important, not for the article itself but for the underlying paper, which you can click through to, I think over here on the right-hand side. I only got a chance to look at it quickly, but you have this lovely trade-off between model size and inference: you can see that with the smaller models you're better off basically giving them multiple prompts, multiple easy questions, as opposed to training a bigger model. So I was going to make the case for important, because as the mature AI shops start to think about true AI ops, this is the kind of stuff they should plan for. Not only are we going to have trade-offs between large and small models, we're going to have trade-offs between training and inference, and when you put that up against what we'll talk about later, Cerebras, with these massive wafer chips, then how you think about where you make your investments is going to change. So I was going to make the case for important, but not essential. Really? Yeah, I'm not sure it's this article, John. I agree with you. Oh yeah, I'm talking about the underlying research paper; it's still a little researchy, and long. But it is interesting what's happening: the center of gravity is moving to inference optimization, and we're seeing more of it. Even today we're seeing three articles on inference and how you make it more efficient. And the problem is that LLMs take one token at a time and predict the next thing; they can't do it in parallel.
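The trade-off John points at, many cheap samples from a small model versus one pass of a big one, is easy to see in a toy repeated-sampling calculation. The accuracy numbers below are invented for illustration, not the paper's actual results:

```python
from math import comb

def majority_vote_accuracy(p: float, n: int) -> float:
    """Chance that a majority of n independent samples is correct,
    if each sample is right with probability p (n odd to avoid ties)."""
    assert n % 2 == 1
    return sum(comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range(n // 2 + 1, n + 1))

# Toy budget: a big model answers once at 75% accuracy; for the same
# compute, a small model answers 15 times at 60% accuracy per sample.
big_once = 0.75
small_voted = majority_vote_accuracy(0.60, 15)
# On this toy task the small model's majority vote overtakes the single
# big pass, which is the flavor of result that makes inference-time
# compute allocation interesting.
```

The crossover obviously depends on the task and the per-sample accuracies; the point of the sketch is only that spending the budget at inference time can beat spending it on a bigger model.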
So what people are now doing is trying to solve for that at runtime, in inference, and there are several different approaches. What is clear is that this will change; there are already good approaches, and trade-offs between pre-training versus inference-time optimization, getting the evaluation done at runtime and only serving the right answer. It's the familiar set of memory-versus-inference trade-offs. So it is an important development; I'm just not sure it's this article. But Paul, Adam? All right, we're headed toward optional. Any disagreements on this one? Can't hear you, Paul; you're muted. Sometimes my wife would love to have a mute button on me in real life. But this is optional, I think. I do want to encourage the audience to subscribe and like us on YouTube, and share some comments if you agree or disagree; we'd love to understand what your views are. But my view: this is optional. All right, heading for optional, but an important area to pay attention to.

Okay, we are now looking at Cerebras and this whole new inference capability they've put together on their massive wafer chip. They're claiming that they are, as you can see here, twenty times faster, and I think a fifth of the cost, if I remember right from down below. The basic argument is that to do inference you have to pass the whole model over, and it's a bandwidth issue: how fast you can move the model from memory to the compute cores, as you can see here. What did you think, Luda? This is where I actually thought it is essential. So Cerebras, I'm not sure how you say it, yes, Cerebras, wafer, okay. They're a company whose charter is essentially to go head to head with Nvidia, and they've raised 720 million so far, through the Series F, I think in 2021, I want to say. And now they've come up with this completely new approach, a new chip. It's a hardware
solution, but it goes at inference, and memory, and speed, right? That's just the discussion we had, so this is kind of a breakthrough. I realize it's hardware, but still, it's a real breakthrough: much less cost, and some unbelievable speedup. This is what we need, and there's a whole bunch of stuff in production already. That is essential for people to know is happening, even though they might not switch to it.

Do you know how far along they are with production, capacity, and availability? Capacity, I'm not sure; I've tried to look up what they have and where. They have locations in Korea and Japan, but I don't know their production plans. I'm saying it's essential not because everybody should jump, but so people understand what's happening: know that Nvidia is not the only game, and moreover, pricing-wise, cost-wise, and speed-wise, this is happening. So we have the case for essential, as a clear alternative to the market leader, though we're not sure exactly how widely it's been deployed. What do you think here, Adam? You know, I could be swayed to important. I was initially on the optional end of things. It's cool that they're taking a different approach, they have big chips, and it's going to be a lot cheaper. But unless someone's literally looking to make purchasing decisions for H100s, or they're looking at a vendor offering them some sort of hosted GPU compute, it's mostly worth thinking about. Although, if you're thinking about getting access to H100s, this is something you should know about, and even as I say it, I think that means it's essential. So I started on the optional end and said I could go up to important.
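The bandwidth argument on the screen can be put in rough numbers. For single-stream autoregressive decoding, every generated token has to stream essentially all the weights from memory to the cores, so memory bandwidth sets a hard ceiling on tokens per second. The figures below are ballpark assumptions (a 70B-parameter model at 16 bits, HBM bandwidth in the low thousands of GB/s), not vendor specs:

```python
def tokens_per_sec_ceiling(params_b: float, bytes_per_param: int,
                           mem_bw_gb_s: float) -> float:
    """Upper bound on single-stream decode speed: bandwidth / model size.
    Assumes every token requires reading every weight from memory once."""
    model_gb = params_b * bytes_per_param  # e.g. 70B params * 2 bytes = 140 GB
    return mem_bw_gb_s / model_gb

# Ballpark: a 70B 16-bit model against ~3,300 GB/s of HBM bandwidth
# gives a ceiling in the low tens of tokens per second per device.
hbm_ceiling = tokens_per_sec_ceiling(70, 2, 3_300)
```

Cerebras' pitch is that keeping the weights in on-chip SRAM raises the effective bandwidth by orders of magnitude, which is why the headline claims are about per-token speed as much as cost.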
I understand there is an API, so your application can actually call those servers they have deployed. I don't know how much capacity they have, but it is available, so I really think people need to know about it; look at this. Yeah, you convinced me. I was going to put it as important, but I think you're right, it's essential. It's such a different approach, if it really is a fifth of the cost, and it's in production; we don't know how many people have deployed it. Even Groq is in here, which is already way faster and more capable on inference, and this is claimed to be nearly double that. So, Paul, are you okay putting this as essential? Sure, it's certainly essential. And, I won't say the venue, but there were a couple of events we were working on with MIT and Nvidia, and there were one or two companies that could not be on the same panel in the same room, per Nvidia's rules, and this was one of those companies. So they're on the radar screen at Nvidia, and that's another reason to rank them; important or essential, I'll support either one. I think the cost thing matters. It's very interesting: one of the things that has come up a lot in conversations I've had with big companies is, where do we stop thinking about model optimization and move to cost optimization, for API costs and everything else? This is one of the many things driving those cost curves down over the next two years. But it's not just cost, Paul, it's inference speed. Correct, right. If you have a customer-service application with latency, because a complex issue requires pulling a lot of information, people will get tired of it. And as I said, there are serious cases deployed out there, so people will start looking; they're expensive, but the speed is also an issue. So this is a capability, maybe not immediately available, but one that would address that.
And so people will be less scared to put these production cases out there. I just really think this is great. The other thing is, we're doing a lot more planning for our October 7th and 8th conference, GenAI World. It's very clear much of that agenda is about helping the AI leader think through their 2025 plan and budget, and in that light, this is essential to know. All right, we have essential. Sounds like a good one: fun, interesting, fast.

All right, let's head to our last article, which is from The Information, about OpenAI pushing on its Strawberry reasoning engine, which I think is an interesting name, but whatever. Paul, what do you think of this one? You know, The Information continues to get little scoops from companies, talking to people under NDA, or breaking NDAs; there's a separate article about OpenAI running around trying to figure out where all these leaks are coming from. This is an article about some of the advanced research they're doing, particularly around math. I thought it was mildly interesting; I think it's optional. It falls into the "of course everyone's trying to improve everything" bucket. The idea that one solution set is to give the model more and more thinking time for these really complex things is an interesting angle, but optional in my view. Okay, we have optional here. Sorry, go ahead. One more point about the inference issue: in Strawberry, what they're doing is similar to what we saw with DeepMind and the research article, right? They're evaluating the result before they serve it, and that reduces the hallucinations and makes it much more precise. So it's interesting how there's almost a thread running through the articles today: it's all inference-based.
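The "evaluate the result before you serve it" pattern Luda mentions is easiest to see on a checkable task: sample several candidate answers, then only serve one that passes a verifier. A toy sketch of the pattern, with nothing to do with OpenAI's actual Strawberry internals:

```python
def verify(candidate: int) -> bool:
    """Toy verifier: does the candidate actually solve x^2 - 5x + 6 = 0?"""
    return candidate**2 - 5 * candidate + 6 == 0

# Pretend these are three sampled model answers; two are hallucinations.
candidates = [4, 2, 5]

# Serve the first candidate that survives verification, else abstain.
answer = next((c for c in candidates if verify(c)), None)  # -> 2
```

Spending inference-time compute on generate-then-verify is exactly the thread connecting the DeepMind/Berkeley story and the Strawberry story.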
All righty, so we're landing on optional here. Adam, any problem with that one? Okay. So today we have: important, sorry, on Nvidia's NIM Agent Blueprints; optional on zero-based redesign; optional on DeepMind, UC Berkeley, and their inference-compute work; optional on the OpenAI Strawberry effort; and essential on Cerebras' new AI wafer. That's where we ended up today. Any announcements for our audience? Two things: one, continue to like and share; second, if you're an AI leader interested in our peer networking group, let me know. It's invitation-only, VP and above, but we had a fabulous meeting yesterday with cohort one; it meets again at the end of September, and we're actively taking interest in cohort two, which will start in two to four weeks. Right, thanks everybody, we'll see you tomorrow.
