Las Vegas 2020
Deploy more. Sleep better. The Walmart DevOps Journey.

We've been on an exciting journey at Walmart for the past few years as we continue to find better ways to meet the needs of our customer. We've also learned quite a few things and have had some remarkable successes.


Join us as we share what we've learned and the outcomes of a dedicated strategy to improve the world's largest retailer.

Full transcript

The complete talk, organized by section.

Bryan Finster

Hi, everybody. Thanks for coming. My name is Bryan Finster, and I lead the DevOps Dojo for a small retailer in Northwest Arkansas.

Walmart is the biggest retailer in the world, the biggest company in the world, with 2.2 million associates, over half a trillion dollars in sales, and 11,700 stores spread all over the world.

As you can imagine, our delivery platform has to scale to that as well. We have thousands of development teams worldwide with a global customer base. We deploy to the edge, private cloud, public cloud, with embedded systems. And we have everything from legacy to the latest tech.

And I want to talk to you today about what we've been doing for the past several years, the journey we've been on, to be able to support what we do today.

Now, anybody who's in my industry will tell you that Black Friday didn't start around Thanksgiving this year. Peak started in March, and we were ready for it. But it's been a journey, and I want to talk to you a little bit about that journey, because it hasn't been an overnight thing, and it has been work.

So back in 2015, I was still in supply chain, and we were focusing on a challenge. Our leadership gave us a challenge that they wanted us to deliver a main warehousing system much more frequently. An order of magnitude faster than it was being delivered. And we had a DevOps Day in early 2015 where Gene Kim, Gary Gruver, and Damon Edwards came to visit, and it was really inspiring. We met with Damon and Gary later, just with our area, and had conversations with them. And it changed a lot of things that I had thought were true.

And as a group of senior developers, we got together and said, "Okay, well, what we need to do is we need to make sure that we're laying out the product domains to allow us to go faster, that we're aligning product teams to product domains to stop doing projects." We designed an architecture to allow those teams to be completely independent of each other and deploy in any sequence. We put together a platform team that all they did was build the delivery platform, and we were using Jenkins at the time. They could template pipelines and make it easy for the teams to deliver.

And then we started on the pilot teams, and we focused very heavily on continuous integration. We thought that if we could understand the problems with continuous integration, everything else would kind of fall out. We'd find all the constraints in the flow. And so that's what we did. We started asking ourselves every single day, "Why can't we get to trunk today? Why can't we deliver today?" And nothing was sacrosanct. We just found the things that were stopping that from happening. And then we started working on the things. And sometimes it was how we were developing, sometimes it was knowledge of tests, and sometimes it was just process, internal and external, that we had to go and clear, working with other teams, doing all sorts of things, building relationships so that we could deliver. And sometimes we broke things, but we always tried to break things small, and then we fixed it and moved forward, just like Sam would do.

And we had a lot of good takeaways from this. The first was we really should have done a better job measuring up front. It's really important to have measurable outcomes, and we really didn't know how to measure it at the time. We felt it was faster, and it was. And the very first DevOps Enterprise Summit I came to was to learn how to measure. That's my main focus. And I brought back some ideas, and we've been expanding on that ever since. But that's very important.

The other thing we learned was that if you make the easy way the way that's the good way, then teams can kind of flow downstream to success. And when the platform team was building templates, they were building templates that really supported the idea of we want to do continuous delivery. We want continuous integration happening. And then we pressed that as a way of working.

And we also learned that it was really effective to grow the platform and the team behaviors using the platform together. We've talked about this many times since, about how that was so important: if we'd pushed CI behavior without the tooling, it would've been frustrating for the teams. If we'd pushed the tooling without the behavior, all we would've done is push garbage to production really fast. But bringing them along together allowed them to grow and allowed a partnership.

And continuous delivery is the catalyst for culture change. When you take this approach that CD is our goal, all other things are secondary to our ability to improve the flow of delivery. It's exciting. People are doing new things, and they're delivering faster, which makes people so much happier.

And that's the other thing, is that teams are happier. We learned how much happier it is to be on a team that's delivering better value sooner and safer. A teammate and I talked about this experience, this journey I just told you about, back in 2017 at this conference. And we had a slide that says, "We love development again." And it was true. And we became a really tight team. And every team that I've worked with since that's focused on this, they start really gelling as a team, and they just love working this way.

But we also knew that replicating these outcomes couldn't be done with a cookie cutter, that teams each have their own context. And so when there was a push to do this for the entire enterprise, I was recruited to move up to the platform area to help in this effort.

And we decided we needed a deliberate strategy for transformation, and that's really what I want to talk to you about today. We've been on this journey, and I want to talk to you about what we've been doing for the last several years. So it was establishing clear goals, making sure we had a solid platform that everyone could use, building communities to help each other, gamifying delivery to encourage people passively so we don't have to go and talk to them all individually, and developing the technical coaching skills we need to help where people have questions or struggles.

So first, goals. We really need to define goals. And we had a challenge coming from the CTO. He said, "I want every team to be able to deliver at least once a day." Now, that's a big ask, especially on teams that are dealing with old software, but that's a challenge. Engineers love challenges. And it caused a lot of motion, right? We also want to make sure we were doing it at higher quality. It doesn't make any sense to deliver fast if it's garbage. So we really had to dig in on what causes quality and get there.

But we really wanted happiness. We want engaged teams. Happy developers deliver more secure, more stable applications more frequently. And if we don't have happy developers, the quality will suffer as well.

We also wanted to make sure we had a common context. Gary Gruver wrote a book recently and produced some training that we think is really important for setting that context, where everybody's aligned on improving the flow of value and on making sure we're getting good, tight feedback loops. And educating everybody on those concepts, so that we're bridging the gap to the larger audience that may not have dug in the way we have.

And make sure we have shared values and measures. That we're aligned on metrics. We have a common glossary of metrics. One of the things we did was we created a glossary of testing terms because strangely there's no domain language for testing. But we also wanted to create a glossary for metrics, so that we were all looking at them, and also training around those to teach everybody how to use those metrics to encourage the right things and not discourage the right things.

The next thing was we really need to build a delivery platform that was not only easy to use, right, but it also allowed people, like I said, to flow downstream. Now, at the time, we had platform areas all over the enterprise. And we had all these different areas that were paying teams just to support their own bespoke platforms.

So we had a deliberate strategy to consolidate these things. First thing we did was we said, "We will take on support for your existing tools. You don't have to pay for that support anymore. You can still use them while we're building things out, but we will support and pay for them." And then we started building out our platform, working with those teams as customers to replace the capabilities those other tools were providing with tools that were honestly easier to use. Then we started deprecating those old platforms, reducing the number of tools that we had, getting it pared very much down. And we would help teams migrate from their platform to ours. We didn't just say, "Hey, you need to migrate." We had teams dedicated to helping them come onto our platform.

And as we removed that duplication, we started growing a larger group of people who were using the same platform, and that allowed us to find other holes in the platform that we didn't know existed. It also meant those people could help each other with any sort of struggles they had, instead of coming to us for everything. We had training for it, but for their edge cases we could say, "I don't know. That's kind of an edge case that I haven't covered before, but I'm pretty sure that this team over here has done that." And we started building those connections across the enterprise.

And we made sure that the training emphasized continuous delivery, that the tools emphasized continuous delivery. And we are focusing on this irresistible developer experience. And here you've got actually a picture of a tool that one team built, that using our tools and the APIs built in our tools, they could safely deploy to production with a button press. And if they didn't have a green pipeline, it just wouldn't deploy, which I thought was really super cool.
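The behavior described above, a deploy button that simply refuses to act unless the pipeline is green, can be sketched in a few lines. This is only an illustration under assumptions, not the tool that team built: `PipelineStatus`, `Pipeline`, `can_deploy`, and `deploy` are hypothetical names, and the real tool called the platform's own APIs, which the talk does not describe.

```python
from dataclasses import dataclass
from enum import Enum


class PipelineStatus(Enum):
    """Hypothetical pipeline states; the real platform's states are not given in the talk."""
    GREEN = "green"
    RED = "red"
    RUNNING = "running"


@dataclass(frozen=True)
class Pipeline:
    repo: str
    status: PipelineStatus


def can_deploy(pipeline: Pipeline) -> bool:
    # Only a fully green pipeline is allowed through the gate.
    return pipeline.status is PipelineStatus.GREEN


def deploy(pipeline: Pipeline) -> str:
    # The "button press": it does nothing unless the gate passes.
    if not can_deploy(pipeline):
        return f"blocked: {pipeline.repo} pipeline is {pipeline.status.value}"
    # A real implementation would trigger the production release here.
    return f"deployed: {pipeline.repo}"
```

The point of putting the check inside `deploy` rather than leaving it to the caller is the one made in the talk: the safe path is the only path, so nobody has to remember it.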

And we got feedback from developers. This is a literal quote, that it's magic. That he merges a pull request, magic happens, and it goes to production. And it takes all the toil away. And by doing that, the teams were really able to focus on what they were doing instead of how they're going to deliver it. And at this point, we have probably nearly 90% of the enterprise is using this common tool set, which allows so much ability to inject compliance, inject security, and make it things that you can't forget. They just happen transparently, and you don't know.

The other thing we thought was really important, as I said, was growing communities. And we started with a continuous delivery community. Actually, my wife, Dana Finster, started it, called Continuous Chai, because there's other talks we've done about that. And that community continues to grow, continues to be active. But on top of that, we've started testing communities. We've started communities focused on particular platforms. And we're also starting communities around engineering excellence to start growing these things and building people together, getting them together to talk, building these brown-bag lunches and engaging people in Slack rooms and those sorts of things.

But the main thing is that the communities are the owners. They define the standards; they drive the standards because they should own the outcomes.

We also work to gamify the metrics, and this was one of my favorite things, playing this game: how do I adjust the score? So what we did was we brought in Hygieia, and then for all of the different things around the code-change frequency and build metrics and Sonar metrics, we gave star ratings on those things by repository. And then we created roll-up dashboards so the team could see what their aggregate score was across all the repositories. And we didn't do anything except automate the onboarding of these dashboards when they built using our tools, send them an email saying, "Here's your dashboard," and then we didn't say anything.

And then teams would come to us: "How do I improve my score?" Like, "Well, let's look." And then we'd start talking to them. "Well, if you use trunk-based development and you start integrating code daily, you'll get a five over here. If you stabilize your build, this will get much better. If you deploy more frequently, this will get much better." And then teams would go and try to solve that problem, the early adopters especially. People would actually start competing with each other over who had the highest score. Then we started doubling down on that. We've built other dashboards that take the scores and use those metrics to point them to playbooks to help make those metrics better so they don't have to come talk to us. It's been very effective.
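The scoring idea above (star ratings per repository, a roll-up per team, and pointers from weak scores to playbooks) can be sketched roughly as follows. Everything specific here is an assumption made for illustration: the talk does not say how stars were computed, so the thresholds, the metric keys, and the playbook titles are invented, and none of this reflects Hygieia's actual API.

```python
# Hypothetical playbook titles keyed by hypothetical metric names.
PLAYBOOKS = {
    "integration": "Trunk-based development playbook",
    "build": "Build stability playbook",
    "deploy": "Deployment frequency playbook",
}


def integration_stars(days_between_merges: float) -> int:
    """Map how often code reaches trunk to 1-5 stars (invented thresholds)."""
    if days_between_merges <= 1:
        return 5  # integrating daily earns the top score
    if days_between_merges <= 3:
        return 4
    if days_between_merges <= 7:
        return 3
    if days_between_merges <= 14:
        return 2
    return 1


def team_rollup(repo_stars: dict[str, int]) -> float:
    """Aggregate score for a team: the mean star rating across its repositories."""
    if not repo_stars:
        raise ValueError("team has no repositories to score")
    return round(sum(repo_stars.values()) / len(repo_stars), 1)


def suggest_playbooks(metric_stars: dict[str, int], target: int = 4) -> list[str]:
    """Point the team at a playbook for every metric scoring below the target."""
    return [
        PLAYBOOKS[metric]
        for metric, stars in metric_stars.items()
        if stars < target and metric in PLAYBOOKS
    ]
```

For example, a team merging to trunk every ten days would get two stars for integration and be pointed at the trunk-based development playbook, without anyone from the platform team having to reach out.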

The one side effect of gamifying those metrics, though, is we did have a lot of people coming to us for support, not for the tools, but for how to improve. And so we created a dojo. This is something that I really admired: I saw Ross Clanton build these, and Capital One talking about dojos, and I kept asking for dojos ever since 2015. And finally, they just told me that it was my responsibility because I kept asking for it, so they had me go and build a dojo.

And I think there's been a lot of misunderstandings we've seen in the broader dojo community around what a dojo is. And number one, it's very contextual to your company, but it needs to be not a center of excellence. It doesn't need to be those people that tell you what to do, and it definitely doesn't need to be the place where you take teams and run them through the grinder. A dojo is there to help teams solve problems, to help them with immersive learning so that they can use their work to learn how to improve their work by doing their work.

Provide really good examples. We spend a lot of time showing examples of different kinds of testing patterns, showing different pipeline examples, just little things to help people get started that may need a little bit more help there. We also do a lot of work on evangelism. Since we're very closely aligned to the platform, we'll see some of the requests coming in the platform where people are struggling to use the tools because they're not doing continuous delivery. They're trying to do a legacy delivery flow using our tools, and it's harder. And so we look for these problems and we say, "Hey, can we help you start working towards a continuous delivery flow? We'd love to talk to your team, see what your struggles are, and maybe give some ideas of how to make your lives better so you can sleep better at night." And we do a lot of that.

And at this point, we're the people pointed to. We don't want to be a constraint. So we're working very hard right now to push that out into the community. We really want people who know how to solve that problem out there, and we're working right now to train those people, certify that you know how to run an improvement project in an area, that you've shown results, that you have the right mindset, and build a community of those. A guild of value stream architects is what we're trying to build right now.

And our leadership has bought in. We have executives who are asking the right questions. They are pushing the right things. They're supporting things. If somebody wants to start a DevOps Day or get some advertising out around continuous delivery, they'll have executives jumping up to try to help, to push. You've got a lot of partnership from the top and the grassroots going on. And you see teams being recognized for executing well. And you see teams excited about it.

And we see right now a big push around looking at those delivery metrics, pulling those metrics that we've been pushing so hard for so long, making them global, looking at them during planning meetings, looking at them constantly to see if anybody needs any help, and tracking improvement and seeing where we are instead of just do it harder.

And platform, we also — I said before how important it is for teams and platform to go together, and we really work on that partnership. When we first started building out the platform, we had early adopters coming in who really wanted to drive it and wanted to help us improve what we were doing.

And this is some user feedback, right? Is that they can go and extend our platform easily. We've built something that's pretty simple for teams to do simple things on and easy for them to do complex things on. And that we have an open-source mindset, that we want contribution. If you want to build something to help us improve, build something, help us improve. And it's like it says here: they're not just consumers, they're partners. And that they've gotten a lot of benefit from the documentation we put together, the playbooks and examples and things, that they don't have to come talk to us. Just go to our website, look at our resources, get the help they need, and then come to us when what they need just isn't there.

And this is from an area for Walmart Canada. This was feedback we got from them. And using our platform, working with us, and focusing daily on continuous delivery as a way of working has created better outcomes. They deploy 72 times more frequently than they did before. It costs 93% less to do a deploy. That's staggering. They reduce lead time by 92%. They can make changes really fast.

And we saw the results of this recently with — well, I'll just show you because it's so cool. It enables better customer satisfaction. This is literally from Reddit, where, honestly, somebody who isn't necessarily our normal Walmart shopper, they came to us because everything else was breaking. Xbox released, and everybody crumbled but us. And we saw all of these things flooding in from all these different media outlets where people were saying, "Well, Walmart's up. Walmart's up." And that's because areas like this have been focusing on how do we build resiliency, how do we get changes out reliably, consistently, use the platform, use the process, focus on CD, and stop running fire drills.

It also allowed us to respond to COVID. We were able to rapidly implement touchless checkout so people didn't have to touch the screens. We rolled out express delivery and broadened our home delivery rapidly so people didn't have to go to the stores. And this is a quote from one of our earnings calls that we had massive growth in orders per minute. And if you saw our earnings calls from the last two quarters, we had, I think, over 90% growth in dotcom last quarter, from quarter to quarter — or I'm sorry, for comp. And we stayed up. We were stable. And that's not because of massive heroics, it's because we've been focusing on it.

So one of the things we've learned is that why is so much more important than what. If you explain to people why they're doing something instead of telling them what to do, they get really bought in. We've learned that helping gives much better outcomes than directing. If you're just showering people with love and "How can we help you? These are the goals that we're going after, and we want to partner with you and help you get it done," and not "You must do it this way," people are bought in.

Ownership is better than accountability. Now, I've heard people say many times in my career, "We need to hold the teams accountable." But no, you don't, because if you just hold them accountable, it's just demoralizing. But if you give teams ownership where they not only get to make decisions, but they're responsible for the outcomes of those decisions, and they are invested in the goals of their product, and they push those things forward, you get much better outcomes. And if you give them clear goals plus that ownership, you get improvement all the time.

Make sure the goals are aspirational. Let them know they're aspirational. Make it a challenge. Give them the ownership to meet that challenge, and they'll surprise you.

Engineers want to solve problems. You just need to give them the right problems to solve. It's like I always rant about people giving engineers the wrong metric to solve a problem for code coverage. They'll solve that problem. Give them the right problems.

So the other thing is you don't want to grow a big dojo team, in my opinion. You want to spread it across the entire enterprise, right? So make sure that change is grown as a capability that people are focusing on instead of something you're imposing on them. That everybody has common principles they're going after, not best practices. I only ever use best practice ironically. There are none.

Recognize those who try. They don't have to succeed. Someone tries something new, celebrate the fact that they did. It encourages them and it encourages others.

And really, we want to embed improvement into the culture. We want to make it everyone living and breathing, "How do I do better?" To be slightly dissatisfied with where they are today.

Now, like I said, it's been a challenging year for everybody, and recently, a store associate showed some feedback that they got from a customer. And it's Lee and Olivia thanking us for keeping the stores open, for making sure they had the ability to stay healthy, that we were looking out for them. That's why I'm a developer.

So we always want to know — Gene always asks, what is it you want to know? I'm very interested in how you measure the effectiveness of your transformation. There's many things that we could point to, but I'd like to know: how do you know you're doing well? And I'm going to be on Slack. Please come back to me and talk to me about it because I'm very curious.

And thank you very much for your time. I hope this was useful. If you want to reach out to me, I've got a series of articles about some of the things I really believe in on Medium, but you can always reach me on LinkedIn. I'm occasionally on Twitter, and I will absolutely be on Slack during this entire conference. Please hit me up with anything you'd like to talk about. I love this topic. Thank you very much.