San Francisco 2017
Continuous Delivery: Solving the Talent Problem

As businesses attempt to accelerate their software delivery, they look to tools and practices that enable teams to ship faster. Often misunderstood is the culture required to foster dedicated talent.


Walmart prides itself on having a culture of grassroots-driven improvement, and the approach to Continuous Delivery has been no different.


Hear from members of a Walmart development team as they describe the actions they took to move towards CD: what worked, what didn't work, and what they've learned. They'll explain why driving to CD builds the kind of teams that are a strategic advantage to the enterprise.

Full transcript

The complete talk, organized by section.

Brent Pendergraft

My name is Brent Pendergraft. I'm a senior developer.

Bryan Finster

My name's Bryan Finster. I'm a technical expert.

Brent Pendergraft

We work for a small retailer in northwest Arkansas, and we're here to talk to you a little bit about how we're working to solve the talent problem and get feedback.

You don't stand up here. There we go. No. So we'll...

Bryan Finster

We'll move. We'll move over here. Okay. We'll be nimble about this.

Brent Pendergraft

We'll do what we can.

So here's a few facts and figures about Walmart that you may or may not be aware of. A couple things we wanted to call out: first of all, 250 million customers, and that's a week. Also, we have 11,000 stores in 28 different countries.

In order to run an operation like that, as you can imagine, you need a pretty robust supply chain. We work in distribution systems, where we're responsible for supporting, it's actually near 300 distribution centers with a volume of three million cases an hour.

And what that works out to is about a billion pounds of bananas every year. Or, to put that in a metric people can understand better, it's 11.2 blue whales a day.

Bryan Finster

Back in 2014, we were given a challenge by management. We've had distribution systems for a long time, and they're old legacy systems. They wanted us to be able to deploy every two weeks with no outages. Considering the state of the systems at the time, it was really quite a challenge.

To give you an idea: multi-million line, and I'm talking tens of millions of lines of code, with infrequent releases, a few a year, long-duration installs, inconsistent deployment targets. If you can imagine 300-odd distribution centers, each one of those is a micro data center, right? And it would take weeks or months to level changes.

Fast-forward three years, we're in an environment where we've got teams building microservices. We're deployable on demand, require no outages for application changes, installs. We're piloting Kubernetes clusters throughout our distribution chain, and we're able to level within hours to our entire supply chain instead of weeks or even months.

Brent Pendergraft

With no outage, which is super cool.

Bryan Finster

Yeah.

Brent Pendergraft

As you can imagine, that did not come without pain. We have lots of pain we could share, but we only have 30 minutes. So we'd like to talk to you about the three points that really caused us the most pain.

The first one was the extended cycle times. It would take a long time to get an idea through from the business all the way to production. Lots of analysis to get that done. We were waterfall back, well, a while back.

And organizational silos. We had become a matrix organization, which meant that to get anything done with the database team or operations required going up a silo and down another silo. That definitely impacted speed.

If you listened to Gene's presentation this morning, where he talked about the speed at which retail is moving, you can understand the need for large enterprises like ourselves to move quickly. It's a volatile space. You think about how you shopped five years ago versus how you shop today. There's a lot of change.

Bryan Finster

So we tried to make it better. First thing we did was we had enterprise buy some deployment tooling for all the teams to use. We modernized our tech stack. Tech stack modernization is really something you should do if you want to speed up, considering the age of our systems.

And we leveraged our relationships that we had. I'm a short-time associate. I've been there for 15 years. Over that time, you build up relationships. So we had friends in the other silos, and we would just bypass the organizational friction by building up advocates in those areas, helping to elevate them and get them on board with what we needed to do to go fast.

Brent Pendergraft

It turned out that deployment tooling on its own wasn't really a solution to our problems. And while it was really effective to modernize our tech stacks and leverage those relationships across silos, what we found to be the most effective was actually setting a defined goal for what continuous integration was.

Bryan used this phrase that was really interesting: continuous integration as a lever for change. So it's a big question to ask, why can't we deliver faster, right? But to ask a question like, well, why can't we integrate with master daily? Why can't I deploy to master daily? Oh, well, it's because every feature we have has a feature branch that lasts four to five days.

It allows you to break the big problem down into smaller problems and gives you something more manageable to solve.
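To make that smaller question concrete, here is a minimal sketch of how a team might measure it. It is not from the talk: the branch names, dates, and the one-day threshold are all hypothetical, and a real team would read branch creation times from its version control system rather than a hand-written dictionary.

```python
from datetime import datetime, timedelta

# Hypothetical threshold: "integrate with master daily" read as a 24-hour limit.
MAX_BRANCH_AGE = timedelta(days=1)


def stale_branches(branches, now, max_age=MAX_BRANCH_AGE):
    """Return (name, age) pairs for branches open longer than max_age, oldest first."""
    overdue = [
        (name, now - created)
        for name, created in branches.items()
        if now - created > max_age
    ]
    return sorted(overdue, key=lambda pair: pair[1], reverse=True)


# Made-up sample data, for illustration only.
now = datetime(2017, 11, 14, 9, 0)
branches = {
    "feature/receiving-labels": datetime(2017, 11, 9, 9, 0),   # open 5 days
    "feature/dock-scheduling": datetime(2017, 11, 13, 15, 0),  # open 18 hours
    "feature/order-filling": datetime(2017, 11, 10, 9, 0),     # open 4 days
}

for name, age in stale_branches(branches, now):
    print(f"{name}: open {age.days} days")
```

A report like this turns "why can't we deliver faster?" into a short list of specific branches to ask about.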

Bryan Finster

I actually grabbed the definition of CI from the Continuous Delivery book. Jez is an awesome advocate for developers. Printed it out and stuck it on the pods, right? We are going to do this. We don't know how yet, but we're going to figure it out, and we're going to hold ourselves accountable to awesome and not good enough.

Really what that does, we found, is you have to get teamwork to get that done. You have to figure out what you don't know. You have to admit the things you don't know. You have to be trusting and honest with each other, and it helps everybody grow because we're all solving problems together.

Brent Pendergraft

Some things we learned from that part of the journey were, ultimately, tools don't solve problems. They enable problem solvers.

We've had quite a few conversations with people that have said, "Well, I'm doing continuous delivery because I have an automated deployment pipeline." Obviously, that's only a small part of what it means to be doing continuous delivery. Your automated pipeline is really kind of useless if you don't have automated tests integrated with that.

And absolutely defining things in measurable terms. Remove the buzzwords. If you talk to managers who've only heard the terms CD, CI, and DevOps, these all sound like buzzwords, right? And while DevOps can be hard to define, for CD and CI there are hard, objective metrics that you can use. You drive down to what those are, and you hold yourselves accountable to measuring them.

We also found that these shorter cycle times, decreasing the cycle times, actually tied us as developers a lot more closely in with our users. So we're able to see how a change we did today or yesterday or even just a couple of days ago, see how that impacts our users in a really short amount of time. It's actually allowed us to understand the business problem at a deeper level and have a deeper level of empathy for our users.

Bryan Finster

Yeah, even to shorten those cycle times more, we would pilot changes and throw Slack out in the distribution centers and have the end users, not the business buffer in between us, but the people suffering through our code, communicating directly back with us with either suggestions or bugs. And we could react sometimes within hours to go and get things changed.

So, overwhelming instability. When you have aged systems that have just been built up over time to answer problems, with no direction toward quality, you've got massive technical debt you have to overcome. We had multiple paths to production because we had a complex branching structure, where you had a release branch after you got to dev. You had hotfix branches off of release branches. You had people helping the business by doing manual installs.

You had a culture of prioritizing timelines over quality, which would just increase whatever technical debt we had. And of course, all that support burden would just keep impacting more and more our ability to deliver until we had to go and tell the business, "Hey, we have to take three months and just work on tech debt." Not a popular answer.

Brent Pendergraft

No.

And so when we looked at this from a program level, we realized we needed to do a couple of things. First of all, we had to create a single automated path to production. We did that with some custom as well as some open source tooling that was owned by a platform team.

But we also knew we had to have a metric, some sort of metric to help us understand the stability and quality of a given product. And for us, that metric was test coverage.

Bryan Finster

Absolutely. Building up a platform team in distribution systems that owned the single way for us to deliver to production was a key strategic advantage. All we as dev teams had to do was integrate with it correctly. They held us accountable to that. We spun up Hygieia dashboards that were plugged into the pipelines, and that really helped quite a bit.

We also discovered that test coverage is a terrible metric. Any metric in isolation is a terrible metric. I actually got my eval that said you must have 90% test coverage.

Brent Pendergraft

99%, yeah.

Bryan Finster

Or 90% test coverage, which we had. I guarantee we had, and I've read those tests, and they did cover 90% of the code. That is true.

But what we discovered was that testing was really a culture. You had to live and breathe test every single day in everything you did to make it a habit to get testable code.

It's not enough to tell people to do test-driven development if they don't know what the feature is they're trying to test. You have to have stories with testable outcomes, and those stories have to test whether the feature has value. Each feature has to test whether the epic and the vision have value, whether or not you need to not do this thing, because no is an answer.

I even challenge people, and I do it: test your meetings. If you have no agenda, then you have no goal. And if you don't have any meeting notes, then you have no way to test whether you met your goal. Just live and breathe test every single day.
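The earlier point about hitting a coverage number with tests that prove nothing can be shown in a few lines. This is an illustrative sketch, not code from the talk: `pallets_needed` and both checks are hypothetical, and the function contains a deliberate bug.

```python
def pallets_needed(case_count, pallet_capacity):
    """Return how many pallets are needed to hold case_count cases.

    Deliberate bug: floor division drops a partially filled pallet.
    """
    return case_count // pallet_capacity


def check_without_assertion():
    # Executes every line of pallets_needed, so a coverage tool would count
    # the function as fully covered, but nothing about the result is verified.
    pallets_needed(45, 20)


def check_with_assertion():
    # 45 cases at 20 per pallet need 3 pallets; the buggy function returns 2.
    assert pallets_needed(45, 20) == 3


def run(check):
    """Run one check and report whether it passed or failed."""
    try:
        check()
        return "passed"
    except AssertionError:
        return "failed"


print("no assertion:  ", run(check_without_assertion))
print("with assertion:", run(check_with_assertion))
```

Both checks produce the same coverage figure; only the second one notices that the function is wrong, which is why the number alone says little about quality.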

Brent Pendergraft

We learned that ultimately quality is a complex thing, and you have to measure it that way. You can't use a single metric. Test coverage was helpful, but it was kind of like looking at only one plane of a 3D shape, right? We realized that we had to look at it more holistically. And so a few of the metrics that we found to be really helpful you'll see up here.

Bryan Finster

Yeah. And like I said, any one metric in isolation can be gamed. If you measure three or four, very difficult to game.

Also, we had to sell testing to the developers. We spun up a team in the architecture area just to study tests because we weren't good at it. Then we told the teams, "You must test this way," which they refused to do because developers are really independent and don't like being told what to do.

How many developers do we have in this room? Yes. Yeah. Don't tell me what to do, right? Give me a problem to solve.

Brent Pendergraft

Give me a problem.

Bryan Finster

And I made this mistake. We had to back up and say, "Okay, let's sell the benefits. Let's show them what can happen. Let's work in the teams, spin up some test frameworks, get some tests embedded, and see just a little taste of success to get them on board."

Brent Pendergraft

When you convince developers and teams of the benefits of testing, you start to see some really interesting conversations happen. So over the past several months, we've been talking about things like spinning up Docker containers to have a known set of data that we can test against, and then they go away. Eliminating the need to do manual data setup or even kind of fragile scripts in order to get that done.
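The idea of a known data set that exists only for the duration of a test can be sketched without any infrastructure. The example below swaps the Docker container described in the talk for an in-memory SQLite database so it runs anywhere; the table, the seed rows, and the helper names are hypothetical.

```python
import sqlite3
from contextlib import contextmanager

# Hypothetical seed rows; a real suite would load whatever its tests depend on.
SEED_ROWS = [("DC-6001", 120), ("DC-6002", 75)]


@contextmanager
def disposable_inventory_db():
    """Yield a freshly seeded database and discard it when the block exits."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE inventory (center TEXT PRIMARY KEY, cases INTEGER)")
        conn.executemany("INSERT INTO inventory VALUES (?, ?)", SEED_ROWS)
        conn.commit()
        yield conn
    finally:
        conn.close()  # the in-memory database disappears with the connection


def total_cases(conn):
    """Sum the case counts across all distribution centers."""
    return conn.execute("SELECT SUM(cases) FROM inventory").fetchone()[0]


# Each block starts from the same known state, even after a destructive test.
with disposable_inventory_db() as conn:
    conn.execute("DELETE FROM inventory")
    assert total_cases(conn) is None  # empty table: SUM returns NULL

with disposable_inventory_db() as conn:
    assert total_cases(conn) == 195
```

Because every test gets its own seeded copy, there is no manual data setup to repeat and no cleanup script to keep working.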

Bryan Finster

Yeah, it's just a lot more fun when you don't have to worry about the last problem.

Brent Pendergraft

Ultimately, we realized that this kind of test automation that you have throughout your products creates these guardrails that actually make it really fun to do development. And you can try out things that you never tried out before without a fear that they're going to destroy your systems in production.

And so now we can talk about problems like what is machine learning, or artificial intelligence, or Internet of Things? What does that look like in our warehouses? Being able to have those conversations instead of conversations about, well, how do we make sure it's not broken? How do we know that it's working?

Bryan Finster

Or how do we protect ourselves from an architect getting on the team and learning a brand-new tech stack, right? The fact that we had a good, solid test framework when I started having to get in and actually, like, I should write code, it protected the team from me learning the tech stack. It gave me the confidence that I could onboard rapidly while trying to ship value to the business, and it made me a much more productive developer in the new stack much faster.

Brent Pendergraft

Yeah. It's huge.

So, Jenga architecture. You can Google this. Jenga-driven design is a thing that you should avoid. It's not designing. It's just being tactical about how you're implementing code with no thought to how to make it sustainable in the future.

We had a massively entangled monolith from years of doing this with no logical separation of business concerns, which meant that if we made a change somewhere in receiving, it could impact shipping or order filling with no way to really predict that without a lot of heavy analysis, and even that wasn't really a safeguard.

Bryan Finster

As an architecture area, we said, "Okay, let's decompose this thing. Let's slay the monolith," right? So we started with domain-driven design to really isolate what the business capabilities were, then we tied those to teams.

Then the next thing that we did that was really powerful was we stopped having a centralized application architecture team. We drove them into the teams. I went and joined a team as a developer, as a tech lead.

Brent Pendergraft

This is a really important point, because you can read the books, you can read the blog articles, you can read everything there is to read about continuous delivery, and all of that is really important.

But honestly, what we found to be the biggest motivating factor for our teams is people who had done it a different way and were living on our teams telling us what it was like to go down that road. And that actually inspired the rest of the team to avoid practices and patterns that we knew were really anti-patterns.

Bryan Finster

Yeah. We'd be going down a path and say, "Guys, if we keep going this way, you're not optimizing for my sleep, and that's not really going to fly. I've been carrying a pager for 20 years, and we're going to be safe, and we're going to be stable."

So, some takeaways. We got the domains wrong. You refactor code, that's sure, but you're going to refactor the domains, too. You're going to refactor microservices. Everything is scaling up and down that way.

We found in some cases that we had to take domains and break them up. We also found that we didn't get the architects on the team as fast as we needed to because the teams were just building micro-monoliths, and so it became very difficult to decompose those domains and to hand them out to smaller teams. So we absolutely have to build to split.

Brent Pendergraft

Because of that, your architecture should really be, or we found that our architecture really needs to be, able to be split at any point in time.

We've had several experiences where we've built up a service on our team, only to give that service away to another team because the domain boundaries were drawn even more closely, taking one product and turning it into maybe three.

And so that's going to be a really painful transition if the applications aren't architected in such a way that it's easy to do that with.

Bryan Finster

Yeah. Microservices help protect you from being wrong.

The product mindset, having the teams tied to the product and being heavily involved with a certain part of the business, instilled a massive sense of pride in the application and what they were doing, a sense of mission.

Brent Pendergraft

Yeah. We found that teams are starting to create their own branding. They're making T-shirts. You have the ability, once you identify with your product, to really get behind it, and it's driven a level of ownership that we hadn't seen before.

So we talked a lot about our journey. We talked a little about fun. But what's that really have to do with solving the talent problem?

I've talked to many people this week, and they all have the question of how do I solve that gap in talent, right? Who do I hire? What do I do?

Well, what we found is that going through and solving the technical challenges, and even some organizational challenges, on the way to continuous delivery has actually transformed our development culture. It's transformed our development environment, and we're not afraid to modify our code.

We're able to focus on some really interesting and larger problems instead of the same smaller problems. We're not afraid of going to production. In fact, we're really excited about it.

Bryan Finster

Solving those problems grows the team. You have people that have been coming to work, and it's just been drudgery, and you give them challenges, and they start learning new things. The learning becomes addictive. You build that learning culture, and everybody on the team wants to contribute, and you get just a massive sense of learning.

Brent Pendergraft

Ultimately, being able to work more closely with users has created a sense of real belief in what we're doing. People understand the purpose. People understand the why. And that changes the way people come to work in the morning.

Ultimately, we're happier, right?

Bryan Finster

Yeah.

Brent Pendergraft

We love development again. The things that you tend to hate about development, those things we don't do as much anymore. And we get to focus on the stuff that we really love to do, the stuff that got us into the industry to start with.

This has had some really interesting results. It's made our area within Walmart really attractive internally. We've seen a lot of people transfer to our area from other areas within the company. But it's also created a little bit of a splash externally.

Keep in mind, we're an internal-facing team. All of the products that we build are for Walmart associates. And yet, we've actually made a splash in the tech community with some of the advances that have happened within distribution systems.

You may or may not have seen some of the press that went out about Walmart's usage of Kubernetes. This blog article was picked up and retweeted by the Kubernetes team at Google.

You might think, okay, well, this is really related more to Kubernetes and infrastructure, right? But the point I want to stress here is that our journey to continuous delivery is the reason we even tried to solve any of this stuff to begin with. And so you can imagine what this has done to our posture in terms of recruiting. It's created an environment that people are starting to be aware of outside the company.

Bryan Finster

But make sure you push through the pain. Like we said, this was a lot of pain. Brent and I have argued over architecture. We've argued about ways of working. The team has done a lot of growing. It's tearing down walls to trust, and there's a lot of pain there, and you have to push through the pain.

How many managers, leaders, do we have in here? That's awesome. We built this deck for you.

Brent Pendergraft

With love.

Bryan Finster

Yes, with love.

Absolutely push through the pain. The reason I'm so passionate about this internally, and I tell people this, is I've been a developer for 20 years. I thought that pain was normal. I thought that the drudgery, the hell we go through, is just normal. And then I found out it wasn't normal, and I decided that we have to change Walmart.

And when you take developers and you show them this path, and you give them the tools to do it, and the permission to do it, the freedom to say, "Just go make it better," we'll start down that path.

And then if you get afraid, if you decide it's too hard, if you decide, hey, this timeline's too critical, we've got to go back to the old way. We've got to just push features, and you can get quality later. Whatever it is that you want to do, those developers will keep driving toward continuous delivery. They absolutely will. Maybe not with you, but they will do it.

Let them do it because it's so much better for you. It's so much better for them. Like Gene says, it's so much more humane. And it creates an environment that people want to be a part of.

Brent Pendergraft

Some stuff we're still trying to work through. This is probably a pretty common thing, I would imagine. In fact, I think there's a track this year about business buy-in. This is essentially where we're at right now.

Continuous delivery is not just a technical problem. It's an organizational change that starts with your stakeholders and goes all the way down to your development and closes that gap, closes that loop.

And so we are still working through how we build those relationships in such a way that even the people that are not necessarily on the technical side will be bought in and really want to be a part of this movement.

Bryan Finster

Because everything we do, we do as a team. Brent and I did not put this deck together. We are a development team. We developed this deck together. We've just been fortunate enough to present it.

And we really want to make a call-out to the other developers on the team that helped with this. A special shout-out to our product designer, Kristen Aya, for designing our slide deck for us.

If you don't have UX resources, I highly suggest you get UX resources. You should see this deck before UX got their hands on it.

Brent Pendergraft

Yeah.

Bryan Finster

If you know anything about developers and graphic design, that should give you an idea.

Q&A

Brent Pendergraft: We have a little bit of time. Anybody have any questions? Yes.

Q: My question is, did you ever tackle the question of career paths now that your team's gone through all this transformation? How do you retain that talent and career paths as you move forward?

Bryan Finster: So how do you retain the talent and give them a career path?

Q: Yeah. What do the career paths look like?

Bryan Finster: You give them the freedom to go out, like I was given the freedom to do, and help other teams. Right? You give them the freedom to go do more interesting things. You do more than just say, "You're staying on this team, and you're just pushing code."

Brent Pendergraft: And it's still development, right?

Bryan Finster: Absolutely.

Brent Pendergraft: I think it's development as it should be. It's a development career path. The DevOps model is the development career path kind of as it should be. And so we found that kind of development is more what developers want to do anyway.

Bryan Finster: But you absolutely push them out to go help other teams. I tell everybody this all the time: if our team wins and everybody around us is slow, who cares? Right? Everybody has to win. We have to go out and educate anybody we possibly can on how to move faster so we can all move faster.

Brent Pendergraft: Yeah. Yes, ma'am.

Q: Did you achieve this despite what's going on in the enterprise, or are there enterprise-level things going on at Walmart that help with this?

Bryan Finster: Oh, I'm glad you asked that. So this effort, for the last few years, has been in pockets around the enterprise. I recently moved to the area that's doing it for the enterprise. So we are driving this hard in the enterprise.

Brent Pendergraft: Part, yeah. We're educating everybody. I've heard people say before, grassroots and management. I think that's pretty much been our story.

Bryan Finster: Yeah, and believe me, it wasn't entirely grassroots. We did have executive air cover. Very important. I think you've all heard that. We still do, and it's being pushed from the top down and the bottom up right now.

Q: Contractor versus full-time employee.

Bryan Finster: Sorry?

Q: What was your contractor versus full-time employee mix?

Bryan Finster: Contractor versus employee mix. So the question was, what was our contractor versus full-time employee? We have contractors on our team. I don't care. They're teammates, and that's just the way it is. You're either a team or you're not. You don't treat your contractors differently.

Practically speaking, I am not 100% sure. We didn't bring those numbers.

Brent Pendergraft: Yeah. Sorry.

Bryan Finster: I'm not in hiring. I'm a software developer.

Brent Pendergraft: It's not super helpful, sorry.

Bryan Finster: Any other? Yes.

Q: Were there skill set gaps in the beginning as you tried to get knowledgeable, and how did you solve them?

Brent Pendergraft: Yeah, so the question was, were there skill set gaps in the beginning and how did we solve them?

Bryan Finster: Yes, but the interesting thing is, as we started down this path, people got really excited about learning it. Once we started to recognize how much better it was going to make things, there are a lot of tools and resources out there to level up and skill up. I think the motivation to do so, from what we've seen, has actually been really high.

Plus, Brent has done a really good job of pushing knowledge out into the team. Including me. I was not a JavaScript developer when I got on the team. I am now. Right? And it's just mentoring each other, and it fills the knowledge gap.

Brent Pendergraft: Again, tests are super important here. The ability to basically make changes, experiment, without thinking about the risk of destroying your application in production, that really gives you the freedom to try out new things and really is the best way to learn.

Bryan Finster: Any other questions? Okay. Thank you very much.

Brent Pendergraft: Thanks, everybody.

Bryan Finster: We'll be around if anybody has anything.