DevOpsDays Seattle 2024: Kurt Andersen - Implementing a learning team

Published: Jun 17, 2024 Duration: 00:18:26 Category: Science & Technology

On a completely random note to start: has anyone noticed the dragon that's lurking in the mountain over here? I was just seeing that today while we were waiting, looking at the background. Godzilla? OK, fair enough.

I want to thank the organizers for scheduling a triple play here. We started with Nathan, and he covered the history of DORA and underlined the importance of culture and community in all of that, and then Mandy talked about the importance of culture. Guess what? I'll give you a spoiler alert: I'm going to talk about the importance of culture and conversation in making a learning team. You can use this to change from the grassroots, as Nathan pointed out, some of the ways that maybe your company does or doesn't do things effectively, or improve the effectiveness at the very least.

As engineers, we often have an imbalance when we approach a situation. We love to think about the technical side. I want to encourage you not to neglect the social side of a sociotechnical environment.

So this is a case study. When I worked at LinkedIn, we had key business metrics. We'd get this report, updated every hour, that gave you the last 24 hours. The numbers are mocked up, I will tell you that. Essentially it gave you how things were performing this week, how they were performing last week, and then the relative delta, either increase or decrease, on an hourly basis. The challenge with this metric from the engineers' point of view is that it lagged real time by several hours. Part of this was because of complexities that went into refining the numbers from a business point of view: removing some invalid traffic, removing attacks, so that this represented to the business the key measures of what was going on with the site.

Engineering, on the other hand, looked at something like this, and maybe you've seen something like it as well. This data is on one-minute intervals. It lags real time by maybe five minutes, maybe less. There are multiple contributing factors that lead to this kind of a change. It
shows you week over week, the prior week being the red line, the current week being the green line, and the operational question being: why are they different?

Now, the challenge that we faced in this case is that we had two teams: one team that was US-based and one team that was based in India. Guess what time these kinds of changes would always kick in? Inevitably, it seemed, and I realize this is a little bit of perception, it always happened right at shift change. One team would be wanting to get out of the office, go home, have their dinner. The other team would be coming in with no context; they have just walked in off the freeway or are just connecting on their laptops. And inevitably we'd get a question all the way from the CEO. If any of you have worked with Jeff Weiner or have encountered him by reputation: we had in the engineering teams a sneaking suspicion that he had an army of elves who would look at this data and spot discrepancies and then say, "Hey Jeff, why don't you ask the team why this thing is discrepant?" So we'd get this question from the CEO's office, actually from Jeff directly, saying, "Hey, what's going on with X?", one of these key business metrics. So the teams would scramble around. On top of the shift change, they'd be scrambling to answer the question quickly for the CEO.

So there had to be a better way. Nobody liked this scenario. We were engineers: the answer has to be more data, more statistics, and, with appropriate echo effects, triple exponential smoothing. We resorted to what is called the Holt-Winters seasonality model, and you can look this up. Needless to say, I'll cut to the chase and say it didn't help.

So, having realized that more statistics and more data weren't solving our problem, we went old school. We adopted a model of flying lessons. I don't know how many of you here fly, but even if you don't, you may have heard that you start off with ground school, and then when you get into the plane you start
with an experienced pilot sitting by your side. You actually start with the experienced pilot doing all the hard stuff, and then over time you learn the individual skills, or to use the DORA term, the capabilities, of maintaining a plane in calm weather, maybe landing, maybe taking off. All these things factor in. You start with the experienced engineer leading while the learners follow and ask questions. You evolve to the learners leading, with the experienced engineer there to take the stick just in case. With time, the experienced engineer has to take the stick less and less often. Ultimately the learners have full flight certification and can go solo.

So we used this in our situation. We created a regular weekly meeting involving all the practitioners. We pulled in people from both geographies, which made scheduling a bit of a challenge. Nobody likes working West Coast time against India: the people in India don't like it, and the people on the West Coast don't like it. But we got together weekly. We reviewed last week's data, we looked at all the instances where there were discrepancies, we looked at all the instances where we got inquiries from the CEO, and we built common ground amongst everybody, the experienced engineers and the less experienced engineers. We looked at what was going on and we had a conversation about it.

We focused on learning what systems were first- and second-order contributors to the discrepancies: how did changes in our systems, the broad system context of LinkedIn, affect those key business metrics? And then what other environmental aspects needed to be considered? Because in some of these cases we were looking at the rate of signups or the rate at which messages were being answered, and holidays, surprisingly enough, change people's behavior. Even things as minor or major as the Cricket World Cup finals would have measurable impacts on our signup rates and engagement rates. And so, thanks to Time and Date, we have these exhaustive lists
of world holidays. Now, honestly, Eritrea's Independence Day, which I know you can't actually read on there, and which was on May 24th in case you were wondering, didn't have a big effect on LinkedIn's traffic, but that's most likely because we didn't have a lot of people in Eritrea. But the Queen's birthday would, for England, as an example.

I have two quotes here that I'm not going to read to you, but I'll give you a few moments to read them. We found in this conversation, and in this analysis that we did jointly, that we had to understand the problem, because we couldn't solve the problem without understanding it. And we'll get a little more Zen here with Bruce Lee: again, you need to understand the problem. Part of the problem was a lack of context, and so this weekly meeting amongst all the participants built that context. It also built a practice amongst the teams of sharing information, which had been one of the shortcomings previously.

Now, subsequently, over the following three years, maybe four years, from the time we did this learning team, which took about three months of weekly meetings, more tooling got built. We ended up with tooling called ThirdEye; you can find information about it on the LinkedIn engineering blog if you're curious. It gave the teams ways to slice and dice data more readily. It gave them a way to drill in and understand: oh hey, we did a release on iOS, there was a problem with it, and all of a sudden our traffic from iOS devices is down. That additional detail could be surfaced in a response, or even an anticipatory response, to the CEO. We were able to provide data before he had to ask and tell him: hey, we noticed that we're down on messaging response rates because we had a glitch in the system and the normally scheduled 6:00 a.m. sends got delayed till 8:00 a.m.,
and so from 6 to 8 we had a discrepancy compared to last week. It gave us instrumentation to know when those data pipelines were lagging and helped to surface the likely suspects. We could split by countries, we could split by platforms. But again, it took three or four years to get there. In the meantime, we were able to give good answers to the CEO quickly and avoid that scramble between the teams.

So here are the key elements that I want to encourage you to think about as you adopt learning team approaches amongst your team. Work as imagined versus work as done: it's important to understand, for the people at the coalface, to use one term, or at the keyboard, or the people with eyes on glass, what is their lived experience? What information do they actually have, and do they need more information to do their job effectively? Don't get distracted by how you imagine things are done. Actually talk to people and find out how they're doing things. Groups can do better than individuals at problem identification and solving. Again, conversation is key, and conversation is the key to culture. As I've already mentioned, the people at the sharp end, or at the keyboard, have the greatest knowledge about problems, and the more effort you put into understanding the problem, the better your solutions or outcomes are likely to be.

And I want to highlight this bold text at the bottom: it isn't just a matter of talking, it's a matter of thinking and reflecting. If you look at some of the studies about brainstorming and how it isn't necessarily the best approach, it turns out that shifting your approach just a little bit to provide that time for reflection changes the efficacy tremendously. So as you're at a conference like this, don't just listen to a talk and then go grab a coffee and forget about the talk. Spend a minute thinking about what takeaway you can apply from this talk tomorrow when you go back to your normal job. It will make your takeaways from the conference, and the efficacy
of this conference, tremendously stronger.

I also want to highlight this model of learning because it's useful to think about. You want to align the group together, gather insights from everybody in the group, go through sensemaking, which is the term of art in the learning field, develop plans for how you're going to respond, and then carry out the plans. But pay attention to what happens: refine your plans, distill them, collaborate on better approaches. Don't just make this a one-shot "oh, we're going to jump in, we're going to do this, everything will be great afterwards." Look at how it actually turns out, and then share, not only amongst the participatory team but amongst the wider organization, because they may not know that you and six other colleagues have come up with this great solution, or adopted a learning team approach and found, wow, this actually makes a big difference. Share it in an all-hands, share it in a staff meeting, share it at lunch when you're talking to other people who aren't necessarily part of your project or your problem group.

And then you can do learning team cycles on different cadences. You can do them every day. In this case we had a mixture between an everyday and an event-triggered model: we used a weekly cadence to meet, but we were focused on identifying events and how we could respond to them. You can also do them periodically; you could do a learning team every quarter. At LinkedIn, we would take all the events, all the incidents that we had had, separate from this kind of an event, and do retrospectives on them, and then once a week we had a roll-up of all of our retrospectives where we tried to identify the main learning points that could be taken out and disseminated more broadly. So think about distillation of distillation of distillation. You can go to the whiskey model: triple distillation gets you really potent stuff.

I want to encourage you that the resilience is among you. Your team is the core of your resilience. As Mandy and
Nathan pointed out, the generative model of leadership, where people approach problems with curiosity and are willing to talk about these things because they're not shot down, where you're not in a toxic environment where people are squashed, gives you an environment where people can learn. Doing this kind of a learning team, even if it's only a protected group of six or eight people, can help to bring some air into the room, and you can push that toxicity out. As Nathan pointed out, we are, by nature of being at this conference, all of us leaders, and we can make a difference within our work communities and our broader social communities by the practices that we enable amongst the people we interact with.

If you haven't heard about high reliability organizations, I'm just dropping this in here as a tickler for you, but learning team approaches exemplify a number of key attributes of high reliability organizations. My wife and I have recently been binge-watching NCIS: New Orleans, so I will leave you with an admonition from the key character: go learn things. And do it in conversation; do it with the people you work with.

I'm open to questions. I have actually saved a little bit of time here. And I'm happy to say that we are hiring in the infrastructure team at Clari, so if you're interested, please talk to me afterwards or drop onto this link. We're hiring both in the US, fully remote, and also in Poland, if any of you should happen to want to work from Poland.
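
Editor's note: the hourly business report described early in the talk (this week's value, last week's value, and the delta between them) can be illustrated with a small Python sketch. Everything here is an assumption made for illustration: the function names `week_over_week` and `flag_discrepancies`, the sample numbers, and the 10% threshold are hypothetical and are not taken from LinkedIn's actual reporting or from ThirdEye.

```python
from typing import List, Tuple

Row = Tuple[float, float, float]  # (this week, last week, percent change)


def week_over_week(current: List[float], previous: List[float]) -> List[Row]:
    """Pair each hourly value with the same hour last week and compute the percent change."""
    if len(current) != len(previous):
        raise ValueError("both weeks must cover the same number of hours")
    rows: List[Row] = []
    for cur, prev in zip(current, previous):
        # A zero baseline has no meaningful percent change, so report NaN.
        pct = (cur - prev) * 100.0 / prev if prev else float("nan")
        rows.append((cur, prev, pct))
    return rows


def flag_discrepancies(rows: List[Row], threshold_pct: float = 10.0) -> List[int]:
    """Return the indexes of hours whose change meets or exceeds the (hypothetical) threshold."""
    # NaN never satisfies the comparison, so hours with a zero baseline are not flagged.
    return [i for i, (_, _, pct) in enumerate(rows) if abs(pct) >= threshold_pct]


if __name__ == "__main__":
    # Made-up hourly signup counts, for illustration only.
    this_week = [110.0, 80.0, 100.0]
    last_week = [100.0, 100.0, 100.0]
    rows = week_over_week(this_week, last_week)
    for hour in flag_discrepancies(rows):
        cur, prev, pct = rows[hour]
        print(f"hour {hour}: {cur:.0f} vs {prev:.0f} last week ({pct:+.1f}%)")
```

A report like this only says that an hour differs from last week. As the talk argues, explaining why (a delayed send, a bad iOS release, a holiday) still depends on the shared context the learning team built.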
