RadialSpark Radio - EP 10: Discussing The Microsoft Outages

Published: Jul 24, 2024 Duration: 00:48:42 Category: Science & Technology

Emergency Episode Introduction

Hey guys, and welcome to an emergency episode of RadialSpark Radio. A special edition, I guess we'd call it, a breaking news special edition, however you want to put it.

The Major IT Outage Explained

Unless you've been living under a rock the last couple of days, you know there was a major outage in the IT ecosystem. CrowdStrike turned off 8.5 million Windows machines, which had dramatic effects across the industry. And you can check my numbers on this, because I think you originally found the article and pulled this up: about 5.4 billion dollars of damages, right?

Yeah. I'll be honest, if it weren't so terrifying, it would be a really fun and interesting case study in what happened here.

Well, I think we're going to talk about this kind of like a case study and, frankly, as a wake-up call for a lot of people who have Salesforce, who have Microsoft systems, and who haven't yet looked at Heroku, AWS, or other platforms and maybe should start looking. That's going to be the point of this discussion. But before we get into the solution, let's talk about what happened. So Justin: CrowdStrike, something something Blue Screen of Death, my Windows machine exploded, I couldn't do my job, my flight was delayed. What happened? Give us the lowdown.

CrowdStrike Falcon: What Went Wrong

So yeah, in a nutshell, an update was pushed to a CrowdStrike product called Falcon sensor that caused an operating-system-level failure. Effectively, as a computer would boot up and try to load the CrowdStrike software, it would keep crashing to the blue screen, so it could never fully initialize. To explain very briefly how or why something like this happens, it helps to understand what CrowdStrike Falcon is. In a nutshell, it's edge device protection, like antivirus software.

So before we
go in: what's an edge device, so everyone knows what we're talking about?

We're talking about the devices that your business users, the boots on the ground, would use. And honestly, as I say that, I realize I should probably correct myself, because CrowdStrike Falcon also protects servers and core infrastructure. But it's like this: you have a Windows laptop, you need antivirus on it, and you need to detect threat vectors that could occur on that device. That's what CrowdStrike Falcon is monitoring and trying to intercept and handle. So the five-year-old, child-friendly version of this is that CrowdStrike Falcon is equivalent to a business-specialized Norton Antivirus or McAfee. In fact, Falcon sensor, the specific piece of software that failed, is the kernel-level antivirus that's actually looking for the threats.

So very briefly, the update that happened. I'm sure some people who are familiar with IT infrastructure are thinking, well, there should have been all kinds of quality checks, things like this. When we're talking about antivirus software, there's something you have to keep in mind. A lot of antivirus specialists will talk about zero-hour and zero-day attacks. When a malicious actor releases a virus or a cyberattack, this is not a company releasing a software update. They're not saying, "Hey, check out our new cyber threat coming out on Thursday, here are the release notes, this is what you need to know." It just happens. And odds are a successful attack is going to get through on a couple of machines. Somebody says, "Wait, this isn't working, what's happening?" and it ends up being manually reported. The symptoms are identified, it gets sent to security specialists who are constantly monitoring for this kind of stuff, and they reverse engineer the attack. They ask: what are we looking for, what are the behavioral patterns, what is this doing? And then they're trying to
push out updates describing this threat to their product, for all their subscribers, all their customers. So when all these antivirus companies say, "Hey, we protect against zero-day attacks," what they're effectively saying is that they're monitoring for new threats every day, and when these get reported by monitoring agencies or by companies, they create antivirus definitions. You might have seen this even in Microsoft's antivirus. A definition says: this is what the new threat looks like, this is what you should be looking for, here's how you handle it, and this is how you log and report it back to us so we know it happened.

So there's a need for some expediency, right? Because a new attack is coming out, it's trying to go as far as it can, and these guys are trying to lock it down. This is not like a software update where we've got days, weeks, or months to run quality checks. This is, "We found it, and we want to get this out to our customers, potentially within hours if possible, to protect as many people as possible." And antivirus software is actually written in such a way that you have your core thing, the actual software, and then you basically have a catalog of all the threats that tells the software what to look for and how to handle it. So the update was an update to the catalog. It wasn't actually changing the software; it was just adding new configuration, new things to look at.

Right, and what it did was misidentify a core piece of functioning Windows architecture and bring the house down.

Close, but not quite. For anybody who's a Salesforce developer: null pointer exception, basically. It was pointing to something that didn't exist, and the code was like, "We don't know how to handle this." And because it's antivirus software at the
operating system level, it has to initialize as part of boot-up, so an unhandled error at that level prevents boot-up. What was insidious, what made this so costly and so damaging, is that the antivirus software is designed to let you push out these updates in near real time from the cloud: we've got new definitions, go. But your computer has to be up and connected to the internet to receive them. So when you have a bug like this that prevents your computer from fully booting, it crashes before connecting to the internet. You have to fix the Blue Screen of Death locally on the computer so that it can reconnect to the internet and then get a healthy version of the update, or fix itself. So it was, call it, a rather innocuous mistake of a kind that happens all the time in programming, but it prevented computers from booting. So now IT specialists the world over had to go machine by machine and fix the BSOD so that the computers could boot and the antivirus could then be fixed. That's what happened.

Impact on Various Industries

Yeah, and so a bunch of industries were affected. Obviously Delta is still reeling from this. A lot of what they had to deal with is all the logistics, because if Delta's logistics operation, or any airline's logistics operation, gets completely upset for a day, it has ripple effects throughout. Anyone who's in any sort of supply chain management or logistics management, you all know what an outage will do to your entire week, month, or year. If your product is time sensitive, this kind of thing is dangerous, right? Because if you can't launch a flight, all those passengers now need to be rebooked on other flights, and so it shifts your entire schedule.

And you can't even rebook them, because the machines don't work and the server is down and everything. It's all that.
Yeah, oh, it's devastating. So again, this hit airlines, finance, healthcare, retailers, supermarkets; I think those were the big ones. A bunch of Fortune 500 companies and other smaller companies down the road. The common thread between all of them is that none of them are IT security companies. None of them are software development companies. None of them are network and security infrastructure management firms. That is not their core competency. They're airlines, they're finance, they're Wall Street, they're healthcare providers. None of these companies make their money on IT.

Yeah.

So the IT arms of these firms all represent a fraction of their year-to-year spend. They're a cost center at worst, and maybe in some cases some of the companies are doing some innovation or whatnot. But for the most part, especially at the larger Fortune 500 companies, unless you're working on bringing yourself into the AI age or the IoT age or the big data age, any of those ages that we've seen come through, you're not innovating. That includes the cloud age, which we're going to talk about in a second, because a lot of these guys weren't in the cloud age yet, which is why they were affected so drastically. We'll also talk about some mitigating factors for how to avoid some of this. But their main job isn't to detect threats. They've outsourced that to a company like CrowdStrike, to that software, to handle it for them. So this type of failure, Justin, was unprecedented worldwide. Nothing like this has brought down this many computers ever before.

Single largest IT outage.

And I think the thing that people really need to get their heads around here is that this won't be the last time, and it won't be the last time for a couple of reasons. So I want to talk a little bit about why Windows was affected, why this was the thing that happened, because
this CrowdStrike software, the Falcon software, is not just for Windows. It also goes on Linux and iOS as well, correct?

Yeah, and macOS and iOS. It's multiplatform. Not the full feature set, but some parts of Falcon will even support ChromeOS, like Chromebooks. But here's what you have to remember. For each operating system, when you're developing that driver, the thing that ties in at the kernel level of the operating system, each of those is going to be platform specific. So you're developing your Windows driver using the optimal language for Windows, and you're developing your macOS driver in the proprietary languages required for macOS development. The drivers are being shipped separately for each platform, and then these configuration files, the definitions of the threats, get distributed out to all of them. So in this case, and again, I playfully called it a null pointer exception, it had to do with the way the Windows driver was interacting with the new configuration file that was coming in. That was causing the misbehavior in that operating system only. The other drivers were handling it correctly, and it wasn't throwing this unhandled exception.

Right. So I think the point that I want to drive to is that this seems to be a one-off issue. It seems like someone made an oopsie and there was insufficient testing. But the problem is that vendors like CrowdStrike are in a really tough bind where, like you mentioned with zero-days, the virus makers aren't going, "Hey, here are the patch notes, here are all the systems we're about to compromise, good luck." And then there's a race. It's like, boom, they did it, something happened, we need to fix it today, now, now, now. It's always in response to an emergent threat, so you're going as fast as you can, and
sometimes you get white hat hackers who spot some of these threats earlier, and the vendors are able to incorporate that. But something like this was an emergency update. You don't have a lot of testing time. Systems are already compromised, and you need to push the patch out now.

The Future of IT Security

And this is why I wanted to talk about why this isn't going to be the last time something like this happens. In the age of AI and large language models, for any of our technical people who watch this program, let's say you're familiar with Salesforce and Apex: go ask ChatGPT to write a bit of Apex. Give it some parameters, give it some information, a little bit of training, and have it pull together some Apex or some Lightning web components, and you'll find that it does a pretty decent job. The publicly available, super cheap or free-to-use trial version of ChatGPT will do a reasonable job of building some code.

And the thing that you have to understand is that Windows is the number one operating system throughout the world. It's the most attacked one because it is number one. A lot of people have the misconception that the Mac is somehow more secure, and they're slowly finding out that that's not the case. It's not that the Mac was special, that because macOS was built on a Unix foundation it was somehow more secure. No, it was security through obscurity. No one cared about it because no one was really using Macs in business and personal computing. That has shifted, and now you're seeing more attacks go in that direction. But Windows is still by and large the market leader when it comes to operating systems, so it's going to continue to be attacked. But now, with LLMs, we've democratized the ability to write basic code, to build
viruses, to do attacks like this through malicious use of large language models. I'm not saying someone can just go in and tell ChatGPT, "Build me a virus." A lot of these large corporations with LLMs are going to shut that down. But you can't tell me with a straight face that malicious actors out there in the world aren't going to try this, whether it be different countries (people like to use scapegoats like Russia or China, things like that) that are going to put together their own large language models to build up hacking groups. Instead of just really smart people, now you have really smart people plus the power of AI and computation to quickly develop this, and it's going to be an arms race. And if you're looking at this arms race that is happening right now, you need to understand that you're in it. You're not just watching from the side going, "Oh, luckily this didn't affect me." Or, if it did affect you, "Oh, this is a one-time thing." This is the beginning of the arms race.

Why Cloud Migration is Crucial

And this is what I want to talk about to the customers out there who were affected by this Microsoft outage, who have Salesforce, and who have looked at migrating away from this client-server paradigm and going to the cloud. It's probably now the time to start looking at Heroku. You need to start migrating some of these applications over to Heroku, and if you do, you do it with the understanding that you're not there to fight the battle by yourself. If you're sitting there going, "I need to have a death grip on this, I need full control over my hardware, over my network, over my infrastructure, I need to be the guy," this should be the wake-up call for everyone who is not in some regulated business that requires it, or who doesn't have some very specific use case that absolutely needs it. Anyone who's sitting there thinking, yeah, it isn't necessary for me to do it this
way, we've just always done it this way: now it's time to really think about making the switch. Now it's time to start migrating, especially if you already have Salesforce. And this is what we talk about on the program a lot: Heroku has native data integration and tightly coupled tools for your Salesforce ecosystem. If you're already using Salesforce, and you've tied into these client-server paradigms, and you have edge devices deployed with Windows frameworks on them, and you're using CrowdStrike or a similar program, and you dodged the bullet here, ask yourself: do those edge devices need to be Windows-based? Do you need to have that client-server setup, or can you migrate that to the cloud? Can you build your Heroku app and integrate your data there? Because when you do that, you're no longer on the front lines. You now have Amazon Web Services, a multi-billion-dollar company whose sole purpose is to make sure stuff like this pretty much doesn't happen, because if it does happen, they lose business to Google Cloud, to Microsoft Azure, to everyone else out there that's ready to take down Amazon. They spend more on IT security than a bunch of these affected Fortune 500 companies spend on their entire IT combined. It's not even close. They'll spend billions of dollars to prevent this stuff from happening, and to have risk mitigation in place when there is a breach, to make sure they get back online and something like this doesn't completely shut them down. You have that ally in your corner. So if you're not seriously looking at moving some of your key mission-critical apps over to Heroku, you're making a mistake.

And now, Justin, because people are going to say, "Hey, wait a minute, if I just put all my eggs in the Salesforce and Heroku basket, how does that solve my problem? What if they get compromised? What if this happens?" I'm going to pass the ball back to you, and so you
could talk a little technical, without going too deep, about why this whole setup, I think you called it vertical integration for IT, made the failure of the systems that went down so dramatic, and why they weren't able to recover quickly, whereas on a Heroku and Salesforce solution they could. Justin, go.

Understanding Core Competencies in IT

So, rather playfully, I'm going to say this comes down to understanding your core competency as a business, and I'm going to call it the Spider-Man principle: with great power comes great responsibility. A lot of traditional IT mindsets come from this idea of, "I want to be able to control and micromanage every layer of my IT." And this plays into CrowdStrike, because CrowdStrike is a security vendor. As an example, a CrowdStrike customer would have laptops and desktop computers with this CrowdStrike Falcon platform deployed on them. Then they might have their identity servers, for single sign-on and their Active Directory and everything, deployed with the Falcon platform on them. Then their custom-built applications and servers. And let's assume this is all on Azure, running on Windows virtual machines that each have these things installed. What you've done is set this up where you say, "Oh, I need identity for my employees," but rather than just having an identity solution, you set up a Windows machine, with full-on OS-level access and control, with security that you've had to implement, and then with the identity solution deployed on it. You have the same security provider on your servers and on your clients, so this Windows-specific failure is now causing a server outage and a client outage. So it's not just, "Oh, specific people can't access your thing through specific devices." It's, "The service is down."

Well, it's so down because, and sorry to interject, normally what
would happen, let's say a bunch of clients get wiped out by this, right? Not that it would be no big deal. It would still be a 911 fire alarm, and the IT guys are going to be up until midnight troubleshooting and whatnot. But what would happen is, "All right, we're just going to reimage these machines with the most recent backup. Boom, it's solved. We're just going to take care of it." That's what you'd do, and you would do that mostly remotely. You'd have low-level remote access before the machine boots to Windows, and you'd just get in there, disable stuff, and reimage the whole thing. But they couldn't do that, because the servers were down too. So they had to deploy their entire IT security staff to go in and fix the servers first, before they could even get to the clients. They had to deploy everyone around the clock to unravel this problem, once they figured out what had actually happened. It wasn't just a quick workaround.

So the industries affected by this, like you said, are airlines, banks, Wall Street. These people don't make money on Windows IT. They're not making their money deploying these servers and deploying these devices. This is operational support, and why that matters is that the IT teams they've staffed and deployed are scaled for operational support: helping business users go about their day-to-day, and dealing with the random errors or issues that come up. So for all these deployed devices and servers, you don't have an IT person on staff for every single one of your business people. It might be one to 10, or one to 100. Big, big numbers. So when you say "deploy everyone," you're still talking about deploying a small fraction of your workforce to solve a
problem that's crippled your whole business. And none of the stuff that's caused the problem, or that you're having to do to fix it, is remotely related to the thing you're good at. The airline IT people, the banking IT people, are there to help solve pseudo-tech, pseudo-business problems related to airline or banking things that come up on the computers, not to do some crazy low-level OS reconfiguration and mess with kernels and blah, blah, blah.

Which, we've already lost half the audience. They're like, "I don't even know what that is."

The Importance of Vertical Integration

Yeah, yeah, so we're going to stop that, because this is where the vertical integration piece comes into play. If you're an airline, if you're a bank, should you care about Windows versus Linux versus macOS versus whatever? You don't make money on the operating-system-level decision, and you don't make money on setting up your own security and all your own stuff on top of the operating system. So say you bring in somebody like Heroku, as you mentioned, and you say, "I'm going to deploy my services, my cloud-based infrastructure, there."

Cloud Platforms vs. Traditional IT

And I want to be careful here, because we talked about server-client and how you really get to the cloud, and people say, "Well, Azure is the cloud." But if you're deploying Windows images, fully manually configured Windows operating systems with your own stuff on them, you're not really using a cloud platform. You're just using cloud-hosted infrastructure. And when you do that, these kinds of failures end up having to be fixed on your end, because your provider can't fix them from their end. That becomes your responsibility, and that's where this insane amount of cost, this 5.4 billion dollars, comes from. It's completely unmitigated, and there's nothing you can do to handle it.

The Role of Service Providers in IT Failures

But let's say you bring in somebody like Heroku, and you say, "Okay, we're going to use Heroku for all of our server-based stuff. We're just going to build our services for Heroku, on top of AWS." And you think, "Well, that's still a single point of failure, if something goes down at AWS or at Heroku." But the difference between them having a failure that they have to fix on their end and you having a single point of failure with your security provider is that you don't make your money on the IT side. AWS and Heroku do. So when an outage happens, their product teams, their whole engineering staff, not one tenth or one hundredth of their staff, get rallied to fix the problem. We're talking about the entire fulfillment team getting brought in, because the thing that makes them money is your IT.

Disaster Recovery and Specialized Processes

Not only that, but they're going to have specialized disaster recovery processes in place, like rolling back images platform-wide. Because this is the crazy thing: anybody who's familiar with the concept of a null pointer exception, and what that means as a programming thing, probably understands that it can actually be an insanely easy thing to fix.

Challenges with CrowdStrike's Fix

So on the CrowdStrike side, this update gets pushed out with the break, and it breaks a bunch of systems. They say, "Oh no, what happened?" They do the root cause analysis, they find it, and then they fix that issue. The problem is that their fix doesn't restore the distributed systems. The damage has already been done.

The Advantage of Cloud Providers

When your platform provider does a fix, they relaunch everybody's services. So an AWS fix or a Heroku fix brings you back live. You're not responsible for relaunching your infrastructure with a healthy security platform; the entire platform is being relaunched by your vendor. And because they make their money on it, they are very invested in doing it like that.

Well, and I think that's the key thing we want to talk about, why there's a key difference here when people say, "Wait, I'm just moving all my eggs from one basket to the other." The error that happened with CrowdStrike is that CrowdStrike doesn't control or have access to your IT infrastructure. They're just software, security software.

Yep.

Right? That's it, that's all they do. So you have to have a specialist, whether it be at Microsoft Azure, or, let's say you've offshored that to a partner like RadialSpark who's managing some of your stuff, or your internal IT team, to say, "Okay, CrowdStrike had an outage, they pushed a fix, so why isn't this working again?" You're a business user going, "Oh my God, why aren't we live again?" and they just don't understand. It's like, "Yeah, they pushed a fix, but I can't even log into my machine. I don't even have access to the ones that are blown up. I can't just reset it. I need to go talk to Microsoft and say, 'Hey, my entire server is gone. I need you to take the old images and move them over here. It's a totally exceptional process. Please help me.'" And they're like, "Something happened?" Or you have your own servers internally, so you're not in the
cloud at all. You're not even on Azure, you just have your own server farms, and you're dealing with the same thing. Now you have to go in and say, "Oh wait, we've got to get the whole IT team, we've got to wipe all the servers, start from scratch, go get the backups, and do all this other stuff." You've got to mobilize all that before you can even start fixing the problem, whereas all of that happens immediately over at an AWS or a GCP. They're sitting there going, "Oh goodness, this has happened, we've got to get on top of this, where are the backups, let's go."

This is what you see with Salesforce when they roll out different updates and security updates. It's not their entire platform, all their server farms, at once. They say, "Hey, we're going to do the North America region, then we're going to do this other region," because if one of them blows up, they can say, "All right, we're just going to redirect traffic to this other region, just in case." Whereas even if you're a Delta and you have a data center, you don't have 80 data centers. You don't have 10 data centers. You don't have five data centers. You have one, maybe: "I've got one here, and maybe one in Europe." You've got just as much as you need. Whereas Salesforce or Amazon have data centers everywhere. They've got regions in Canada, Washington, Indianapolis, New York, Virginia, all over the place. Heroku, same thing; they're all over the country. And one of these systems can get borked by something like this. Don't get me wrong, this CrowdStrike thing is scary and could hit the likes of AWS and Heroku too.

The Case for Cloud Migration

But the thing that I'm trying to draw attention to here is that when you are part of that Salesforce trust layer, when you're in that circle and you deploy your mission-critical apps there, you have that entire team mobilized. Like Justin just mentioned, their sole purpose in life is to make sure Salesforce works and
to make Heroku work, and also to sell it. That's it, that's all they do. They market it, they sell it, and they make sure the stuff works. So if you go pull up Salesforce's last quarterly report and look at their expenses for all the IT and product staff they have, that number they spend, I guarantee you, is probably bigger than most IT budgets at the companies that were affected, and probably bigger than several of them combined. Their sole purpose in life is to make sure stuff like this does not happen, and to have disaster mitigation plans and rollback plans and backups on top of backups, tertiary backups and so on.

So that's the thing for people who are sitting here, who were affected by this and are really asking, "Man, do I need to have this vertical integration?" I don't want to tick people off and say you're idiots for not being on AWS. There are a bunch of good reasons to have your systems set up like this. This was just a catastrophe, and you're going to be crawling up CrowdStrike's you-know-what to make sure something like this never happens again, and then setting up alternative backups and whatnot to avoid this. Everyone's going to be studying how to deal with this, so this type of situation is mitigated if you're stuck here. But if you're not tied to this, if you're just using it out of convenience, or if you bought this solution from another company through some M&A and they're using this type of paradigm because that's what they've done for the last 20 years, it might be time to start making the switch.

Because, as an example, let's take your edge devices. You're Company A, and you say: we're all Windows, we're using all Windows laptops because we use Excel and Teams and whatnot, but we've got Salesforce and we've got some Azure stuff, and we built
some custom apps to do whatever it is, right, and we access them through our Windows machines. No one's got Windows phones anymore, but everyone has iOS and Android phones, right? And if you're trying to make a native, client-based Windows app accessible on mobile, it is kind of a pain to do. You want to first do a progressive web app, or do some Electron thing, or build something on Ionic that's cross-platform capable and is also deployed to Windows: something built on web-based technology, so that you can at least use it outside of that, before you even start having this conversation. A lot of these Windows applications are all built on C#, and yeah, you can access them through the internet, but only insofar as I have my Windows machine and it talks to the server.

Mobile-First Paradigms and Cross-Platform Solutions

If you're on Heroku, it is mobile. Anyone who's building on Heroku or AWS or Salesforce, they're all mobile-first paradigms. Lightning is a mobile-first paradigm. The work that we do here at RadialSpark is mostly Node.js focused: Node, Angular, Nest, things like that. We are using JavaScript, we are mobile first. Your app, when it's deployed on your monitor, is also going to work on your cell phone. That's the whole idea. So if I'm on a Windows machine right now doing this broadcast and boom, I get hit with this and it goes down, guess what? I'm logging into Google Meet on my phone. I'm pulling up my app on my phone that I built, that's running on Heroku, and I'm continuing about my day while I'm calling Justin: "Justin, my computer's down."

So when his computer breaks, he dials back in on his phone while I quietly jump off the broadcast to go pull my hair out.

Well, I mean, I would, but there's not much left to pull out. But we keep operating. We're not dead in the water. And the thing that you've got to ask yourself, if you're in this position, if you have Salesforce and you're like, man, I've been thinking about moving more of this stuff to the cloud, I've got this legacy stuff but it works, it's okay: you should really be like, this is the wake-up call, I should not be this vertically integrated, I need to start offloading. Maybe not all of your stuff, but start thinking about some mission-critical apps that you want to offload to a firm like Salesforce, like Heroku, whose IT budget dwarfs, by orders of magnitude, anything that you could possibly muster and still be financially viable. You need to be like: let's go with an easier paradigm, and let's go with all the creature comforts the cloud brings: the DevOps pipeline that Heroku brings, the fact that you don't have to manage your network infrastructure, the fact that you don't have to manage your security because you have Private Spaces out of the box, the fact that you don't have to manage HIPAA encryption concerns because you're using Shield, and that data can then get directly integrated through to your Salesforce platform to do
your CRM processes through a secure, HIPAA-compliant, encrypted connection, out of the box, on one platform where you know that if there's going to be an incident, they're going to be able to pivot you quickly. This has happened before: I think several years ago they had that DNS outage that took out several services, and they started moving people to services that were not on that same DNS. You're able to do that, whereas if you're just sitting there with a server farm and Windows apps, you're at the mercy of how fast your IT team can move, and I guarantee you it's not as fast as you need. It's $5.4 billion slow. That's how slow it is.

Well, and I feel like this is probably an eye-opener for several organizations. You have to look at what this cost you. Look at what you are currently spending on your IT, your staff, your infrastructure, all of that, and what your cost from this incident was. You look at that dollar figure, and then you ask: am I holding on to all of this because I need to hold on to all of this? And when I say need, I'm talking about regulatory requirements, things where, in order to operate in this industry and handle this kind of data, there is a requirement to do this in a way that you are in full control over. "I need to do it because it's the way we're all doing it" is not a need. That's "I don't want to spend money on change." So if you don't need to do this, you look at what this cost you in this instance, and then you compare that to what it would cost to start moving something like this over here. Odds are, when you look at infrastructure pricing, these platforms, these services like Heroku, are going to cost you a little bit more than doing it all yourself. But again, Spider-Man principle: with great power comes great responsibility. If you don't want your smaller, leaner IT
team to be responsible for fixing this when this happens, you're like: okay, Heroku's IT spend is this, my IT spend is this, and my cost goes up this much, but my incident was...

Well, I wouldn't say... that might be the case for some customers, some of you out there that are looking at this, because you look at the list price of Heroku versus raw Azure or AWS or doing it on your own server farms, and you're like, ah, I don't want to pay that monthly spend. Yeah, they've got the security and all the benefits we just talked about, but it's so expensive. But the thing is that you're now freeing up IT resources to mobilize them, not at managing your network, not at managing your security, not at the low-level, day-to-day maintenance of just keeping the machines running, because that's all paid for by that spend. All those people you've hired, I guarantee you, aren't cheap. They're hundreds of thousands of dollars per person per year that you're spending, or maybe not hundreds of thousands, but at least $100,000. You're in that six-figure range for a lot of these individuals to manage that kind of stuff. Even if you try to offshore, outsource, whatnot, these network and security specialist engineers do not come cheap. They straight up don't. And you don't want the cheap ones, because then you're really at risk. So it's like, when something... yeah, exactly.

So when you look at that and you're like, hey, there's an incremental increase in monthly licensing cost, it's like, yeah, but look at all the money I save by not spending on payroll for people that are not advancing my business.

Reallocating IT Resources for Business Growth

And I'm not saying you lay those people off, but you're able to reprioritize those people and focus them on making your business run better, whether that's day-to-day operational improvements or actually innovating, actually doing the thing, like building the customer
facing apps that make your business spark, that really make you guys shine, integrating and migrating and buying different companies and consolidating those, all those problems that businesses have where you're trying to get two different organizations to work together, and all the IT struggles with that. They're focused on that, not: "Yo, Justin, yeah, so the DNS over here was having an issue, and I tried to update Windows, and I was in regedit, and we were having some kernel... ah, dude, this just crashed, I can't even get to BIOS." You just send a bunch of words, and you're like, I don't know what that has to do with how do I get this quote out quicker, how do I make this happen faster, how do I make this customer experience better. You need to focus on the stuff that makes you better, and I think this is the time to do it. I would just straight up say, if you're on Salesforce in particular, you need to go to Heroku. Look, you need to be like, how do I get my apps over here so that I can focus on that, so I don't have a bunch of DevOps people, because it's done for you. It's all run through GitHub. You don't have to deal with it. They've solved all these problems, and you take that money and either just straight up save it, or you reallocate those people to making your business better, faster, and quicker.

The Future of IT and Security

Because again, like I mentioned, we're in the age of AI, and it's not going to get any easier with these types of security threats. So you've either got to buckle up and start really investing in that, or find partners that have a proven track record of doing it right, which is Salesforce and Heroku, in my opinion.

Justin, I think I had a good sign-off there, but if you want to add anything before we call it quits: did we miss anything? Were we dramatic enough? Have we scared everyone?

Well, I was going to say, I think maybe the only thing
we haven't touched on, just to make sure people understand: this won't be a one-off, and it already isn't a one-off. You do need to take these kinds of instances very seriously, because back in 2010 there was an update from McAfee, actually, that came out and incorrectly recognized a core Windows file as a virus and deleted it, and it stalled hundreds of thousands, if not millions, of Windows machines. That was 14 years ago, not quite at this scale, but still massive. And I don't call this out to be like, hey, operational failures, problems. When you're fighting this cybersecurity battle and you're trying to handle zero-hour, zero-day threats, you're getting updates out to your customers to secure their businesses. Because I think here's the one thing to call out, right? This outage that BSOD'd 8.5 million machines and did $5.4 billion of damage, it just took computers offline. Actual cyber threats, malicious actors, they do this for profit: ransomware, corporate espionage, international espionage. If these businesses don't prioritize speed and get the security out to protect you from people that will legitimately steal your stuff, that 5.4 would be even bigger. So they're doing the best they can, but this has happened before, because the nature of this business, the need for rapid response, does leave a degree of exposure on the quality control side. You can't test something for months if you need to release it in hours. So it can happen, it has happened, it did happen, it will happen again.

So you have to figure out as a company: when it does happen, what do I want my IT exposure to be? What am I willing to invest in handling this? When you understand that and you bring that in, then you think about the partners. I've been saying this since we started working together, Mike: know your core competency, lean into your core competency. If it's not a thing that makes you money, if it's not the thing you're
strongest at, don't waste your energy on it. Find somebody that makes their money solving that problem for you and make it their problem. Because I guarantee you, Heroku and AWS (I don't know, I haven't actually checked, they might use the Linux installations of CrowdStrike internally), when things like this happen, they are staring at it and they're like, okay, what do we have to do to make sure we never go down like this?

I'm pretty sure there's a couple of people on the product team that we met at TDX that are already sweating bullets, going, "oh my..." But the thing is, that's what they're there for. That's the only reason they're there, and they have way more infrastructure and way more money behind them to handle that problem, whereas you at Hospital A or Wall Street Company B straight up don't. You just don't, and you won't, and you shouldn't.

I mean, not to gloat, but for those of you who haven't guessed, we use Heroku over here at RadialSpark, and there's a reason I've been smiling and laughing as we talk about this. It's because I haven't been crying, and I haven't had to hand Michael my resignation or have a panic attack about this. He's already lost his hair, though, so he might be hiding something.

No, the hat's hiding something. I'm not. Okay, the point is, if we're going to say the proof is in the pudding: yeah, we are a smaller consultancy. We're not a Delta, we're not a whatever. But this happened, we are an organization that prefers Windows machines ourselves, and I haven't had a heart attack in the past couple of weeks. My sailing has been smooth. Our websites haven't gone down. Our services haven't gone down. My customers, because we do custom development as a consultancy for other companies, are not calling me freaking out about this outage. Having Heroku as a partner has been a godsend for us during this whole thing and
so, you know, I guess as a real-life case study, here's the proof. Having the right people on your team that look out for this, that know how to respond to this, whose whole product strategy is around making sure that I'm not calling them freaking out because people are calling me freaking out, having that horsepower on your defensive line is huge. I honestly would say you probably can't put a price tag on it.

I would agree.

So since we both agree with each other...

Conclusion and Final Thoughts

Yeah, that will conclude this special episode of RadialSpark Radio. This is a bit of a weird one, but we live in a weird age. We'll be back in a week or two with some more articles that we'll be reviewing, and some more exciting things coming up with our traditional stuff, talking about fun, exciting things like how to build cool and better apps and integrate them with your IT infrastructure, not "why is the magic smoke coming out of my Windows machine right now?"

Yeah, so we'll leave it there, guys. Thanks so much for joining us. If you guys are looking at trying to make a switch and asking, hey, is this feasible, how much money is this going to cost, or if you're one of those companies for whom this is a wake-up call and you want to start migrating some of this stuff over to the cloud, we'll give you guys a free health check. Give Pton a call, shoot him an email; links will be in the description below. Give your Salesforce rep a call as well. They'll point you to the right place, and hopefully that'll also be us. And that's it, guys. Take care, and we'll catch you in a couple of weeks.

Yeah.
