DevOps Dojo: A Massive Internal Coaching Program
We are 3+ years into our DevOps journey. With pockets of awesome in 2014, and an important DevOps rollout across the early adopters in 2015, where we measured the benefits, we came to the point where we needed to bring the rest of the organization on board.
The talk goes through the state we reached with the early adopters in 2015; how we came up with an initial DevOps Dojo setup, which was largely inspired by the Target talk at DOES 15; how we composed a scalable coaching program with a mix of white/green/black belt DevOps dojos, and what they are; how we use the Dojo program to identify change agents, who in turn can coach others and grow the "can do" crowd; and finally the benefits we have seen as we scale this.
We will share a few slides, lots of pictures, a-ha moments as we iterated in creating such a program, actual facts, and how we are trying to find the right balance between culture, processes and tools.
Full transcript
Olivier Jacques
December 2014. We are here with our CIO in the room, and we are about to start our massive DevOps pilot, trying to figure out, with the 5,000 to 6,000 dev and ops population, with the 1,400 applications that we operate for our own internal IT organization, if DevOps was going to be a thing.
Fast-forward, autumn 2015, and we attend the DevOps Enterprise Summit in San Francisco. And on the main stage, we share again our experience with our DevOps transformation that we just started a year back. And then there was a presentation from Target. And Ross Clanton, at the time, mentioned in one of their slides something about what they call the DevOps Dojo.
So I didn't know really what it was about, but I really loved the term, because the dojo in the martial arts is where you practice your art. So I think this would be something really interesting for us to leverage.
I'm Olivier Jacques. I'm an IT Distinguished Technologist in DXC.
Slawek Zachcial
And I'm Slawek Zachcial. I'm IT Manager, and my team is responsible for software engineering assets in R&D IT.
Olivier Jacques
So DXC, who are we?
DXC, we have 170,000 employees. We actually have 6,000 clients, but we also are two months old. Yeah, we are very successful. Actually, DXC comes from the merger of CSC, the consulting company, and HP Enterprise Services. So we merged to form this global consulting and advisory company.
DevOps transformations. This is our own journey of DevOps. This is actually our own unicorn, which is an inflatable unicorn I bought in Austin, and it's been hanging on the wall in one of our offices. They are awesome. Go get one.
But what we found back in 2014 is that we actually had teams that were really doing already continuous integration, automated testing. They were ready to do automated deployments and continuous monitoring, et cetera. So we really had pockets of awesome, and I'm sure many of you do have that as well.
We started our pilot, as I said, back in late 2014. Global pilot program, and trying to figure out with a set of applications. So we took less than 15 applications towards this pilot, and this is how we got started.
And since then, from 2015 through to now in 2017, we have been in this scale-up mode.
So talking about the people, because I totally agree with the IT Skeptic, it's all about the people. In those pockets of awesome, we have gurus, right? They are the ones that are really pushing everything. Sometimes they even contribute to Docker, let alone actually doing Docker. So we got that.
Global pilot program, I think one of our dangers for us was to make sure that we were not creating a cool club. The guys that are awesome and, yeah, they are right, but we are outside of this circle or this DevOps church, right?
And now, in this scale-up mode, is how do we make sure that... Earlier in the general session, there was, how do we transform cats to dogs? And I think this is the other way around, because I have three cats. So how do we transform dogs to cats? And with the existing workforce that we have, with the existing people capital that we have, how do we enable them? How do we unlock them? And how do we inspire them?
Because funny enough, again, it's really a cultural and a little bit of a technology transformation that we are facing at this time.
So really, and this is the topic of this talk, is how do we internally enable our people? How do we spread the culture, and how do we spread also the technology practices for real?
Slawek Zachcial
All right. So the way that we envision it is, as Olivier mentioned, we call it dojo. And we envisioned it in three pillars: white, green, and black belts. So they're not really, as you'll see in a second, they are not really about the level of proficiency, but rather they are geared towards different audiences. So we'll talk some more about it in a sec.
And then in addition to the belts, we also set up a couple times a year internal DevOps Days. So the goal here is to talk about technology, what kind of technology is available to the teams for use, but also show some success stories so teams can share, okay, what worked, what didn't, and create this community of practice and this connection. So those who are embarking on the journey, they know where to go, how to find the experts.
And then hackathons. Hackathons is another way. They're more kind of around some specific topics. For example, last year we did a hackathon about Docker and how to use Docker for the R&D solutions. Not only how to use Docker to build the applications or to deploy the applications, but also for our teams that work on R&D products, how they can leverage this technology to basically make their products better, faster, cheaper, and bring all the potential that the technology such as Docker can bring.
So now, the different belts that we have.
The first one is the white belt. It's a one-day training, and it's really like two to three hours. It's geared towards executives and leaders. And the goal is really to explain to them what DevOps is, how does it break the barrier between dev and ops to kind of help them project.
I think today, a lot of leaders, they're already familiar with Agile. They understand that, okay, Agile is really about breaking the barrier between the business and IT. So we try to explain that it goes even farther when you manage to break the barrier also with the operation and security people.
We also help them to understand how is it different from the current way of doing. And then there were a lot of talks in this conference about the culture, about the transformational leadership, how to really bring people along through these changes. So they are also coached on that topic.
Then another one is the one-week Green Belt. So it's a scripted training, and I'll go into more details, but it's a one-week training. We bring to this training developers and operations. We present them with some theory. Okay, again, remind again, what is Agile? What is DevOps? What is continuous deployment? What is continuous integration? What is the test-driven development? All of these practices that they may need to face while working on implementing this DevOps transformation.
We combine it with serious games so they can really see in a smaller context how the barriers prevent them from working effectively and fast. And then through those games, we manage to show them how changing very few parameters, they can really improve this collaboration. And this is all about exposing them to a different culture and also give them insight into some technology that can enable them to move through this journey.
And the last is the Black Belt. So it's a completely different setup. Here, we really work with the application teams. We work on hands-on problems related to those applications. Again, we bring developers, operations, and we try to tackle some big problems that this application may be facing, or the big rocks, in order for them to make this first step to this DevOps journey.
Olivier Jacques
All right. So before going to the next slide, I would like to roll up a little video with music courtesy of Paul Muller.
All right, so you got really, in a glimpse, the Green Belt Dojo.
Slawek Zachcial
Yeah. So you've certainly noticed the screenshots from the application.
As I said before, Green Belt Dojo is really like a scripted training. So there is not much really room for surprises, which is kind of slightly different than the Black Belt Dojo, when you tackle real problems and you may run into real problems.
So here what we have, we have a sample application. It's an open source application. We'll give you the reference to it afterwards. It's basically a shopping website. In addition to this, we have already built a continuous delivery pipeline, which is able to basically take the code from GitHub, build it, test it, run different kinds of tests, eventually deploy it to the environment.
The pipeline itself is also available as code, so teams can really see what does it mean to do infrastructure as code. The infrastructure to which we deploy this is provisioned also through code, so they can see all these different concepts.
And then we have a story, and the story is really about the team responsible for this application. They get an email saying that, okay, we want to test this discount. But the first thing that we want to do is to implement it using this pipeline. And then, so what they would practice here is the test-driven development. So they would implement the test to show that, okay, if some configuration is present, the discount is expected. They would see that running this test, it will fail. Then they will implement the actual functionality. They will see that now it works, right?
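A minimal sketch of the test-first step in this story, in Python. The talk does not show the sample application's code, so the function name `price_with_discount`, the `discount_enabled` configuration key, and the 10% rate are all assumptions made for illustration.

```python
# Hypothetical names throughout: the talk does not show the sample application's code.
DISCOUNT_RATE = 0.10  # assumed 10% discount, for illustration only


def price_with_discount(price: float, config: dict) -> float:
    """Return the price, discounted only when the feature is switched on in config."""
    if config.get("discount_enabled", False):
        return round(price * (1 - DISCOUNT_RATE), 2)
    return price


# Tests written first (pytest style). Before price_with_discount is implemented,
# running them fails; once the function above exists, they pass.
def test_discount_applied_when_enabled():
    assert price_with_discount(100.0, {"discount_enabled": True}) == 90.0


def test_no_discount_by_default():
    assert price_with_discount(100.0, {}) == 100.0
```

Keeping the discount behind a configuration key is one way to let the same build behave differently per environment, which is what makes the later traffic split possible.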
And this way they get to the point where this gets deployed to the test environment. But it goes even farther, because the marketing guys, they say, "We are not quite sure yet whether this is the thing to do. So let's just route 20% of the traffic to this discount version of the application and 80% through the application that was there before."
And actually, here again, they are faced with the infrastructure as code. They're able to modify the configurations to say, oh yeah, 20%, 80%. So they can see this A/B testing really in action. It's not the magic anymore.
And then, once this gets deployed, then we also show them the analytics, the monitoring, so they can see that really there is like a 20% and 80% going through the different versions of the application.
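The 20%/80% split can be sketched as a small routing rule kept as data. This is only an illustration under stated assumptions: the talk does not say how the Green Belt environment routes traffic, and the version names `discount` and `baseline` and the hashing scheme are invented here.

```python
import hashlib
from collections import Counter

# Assumed routing rule, kept as data so that changing the split is a config change.
ROUTING = {"discount": 20, "baseline": 80}  # percentages, must sum to 100


def choose_version(user_id: str, routing: dict = ROUTING) -> str:
    """Map a user to a version; the same user always gets the same version."""
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100  # stable bucket in 0..99
    threshold = 0
    for version, percent in routing.items():
        threshold += percent
        if bucket < threshold:
            return version
    raise ValueError("routing percentages must sum to 100")


def observed_split(user_ids, routing: dict = ROUTING) -> Counter:
    """Count how many users land on each version, as a monitoring view would."""
    return Counter(choose_version(u, routing) for u in user_ids)
```

Calling `observed_split` over a large set of user identifiers gives counts close to, but not exactly, 20% and 80%, which is the kind of picture the analytics step is meant to show.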
And the last thing is about we simulate a failure. So it turns out that this feature is such a success that it basically overwhelmed the existing infrastructure. We run into capacity issues. So here again, they're able to fix it by changing the configuration of the provisioning and say, "Okay, instead of 10, we need 20 VMs or 20 containers," or whatever. And basically, they're able to redeploy this or reprovision this infrastructure, and they can really see that it's not anymore going ordering the servers, installing them manually, but really do all of this through code automatically.
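The capacity fix amounts to editing a number in the desired state and letting tooling work out what to change. A minimal sketch, assuming a hypothetical `Environment` record and `plan` function; the real provisioning code used in the training is not shown in the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Environment:
    """Desired state of an environment, as it would be kept in version control."""
    name: str
    instances: int


def plan(current: Environment, desired: Environment) -> list:
    """Return the actions needed to move from the current state to the desired one."""
    delta = desired.instances - current.instances
    if delta > 0:
        return [f"create {delta} instance(s) in {desired.name}"]
    if delta < 0:
        return [f"destroy {-delta} instance(s) in {desired.name}"]
    return []  # already at the desired size, nothing to do
```

For example, `plan(Environment("shop", 10), Environment("shop", 20))` returns a single action to create 10 more instances, and running it again once the environment has 20 returns no actions.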
Olivier Jacques
Yeah. So a number of concepts that we use there, right? A/B testing, infrastructure as code, blue-green deploys. Those are all of the concepts that we use.
Slawek Zachcial
Yeah. And also one thing that I haven't mentioned is the chat. So we also promote the use of chat as a collaboration mechanism. And so we can bring different...
Olivier Jacques
ChatOps.
Slawek Zachcial
...players to the same table, if you will. And then we also insist a lot on bots. So we provide also a bot, which is able to deploy the application, to do some ad hoc checks with the application health, so they can see this really in action.
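A minimal sketch of what such a bot does with a chat message. The command wording and the `deploy` and `health` handlers are hypothetical; a real bot would call the pipeline and the monitoring system instead of returning strings.

```python
# Hypothetical commands and handlers, for illustration only.
def deploy(app: str, env: str) -> str:
    return f"Deploying {app} to {env}..."


def health(app: str) -> str:
    return f"{app}: health check requested"


def handle_message(text: str) -> str:
    """Turn a chat message such as 'deploy shop to test' into an action."""
    words = text.strip().split()
    if len(words) == 4 and words[0] == "deploy" and words[2] == "to":
        return deploy(words[1], words[3])
    if len(words) == 2 and words[0] == "health":
        return health(words[1])
    return "Sorry, I only understand: 'deploy <app> to <env>' and 'health <app>'"
```

Because every command and its result appear in the shared channel, developers and operations see the same actions at the same time.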
Olivier Jacques
Right. Yeah, so those are the few pictures that you can see there. Again, a mix of formal conceptual trainings, very short. Serious games. The tower is one of the things.
What I really like about the tower exercise, so it's really the exercise where we iterate on... We start first with a monolithic tower. Everybody tries to have great plans about the tower, how we are going to move from dev to staging and then to production. And then they go like that with the tower, and it falls apart.
And then we start with iterations. We start with actually the technique. We can even do blue-green deploys with this mechanism. But what it does is that it introduces the mechanisms even to non-technical people, because sometimes we have project managers, Scrum masters, and the like. And also it gives the vocabulary to be able to talk about the rest of the week.
So during those sessions, we have, "It worked for me. This is your fault. You have not followed the procedure," et cetera, those things. And the actual pipeline, which is live.
What you mentioned, Slawek, and I'm just going to reiterate it, is that at some point during this one-week Green Belt, people realize that the entire infrastructure that they are using, so we have actually a set of infrastructure for each one of them, has been built with code. So what we are teaching or preaching is also what we are doing. So they realize, well, not only the application is code, but also the pipeline and the infrastructure, and they can see the source code of all that. So that's very powerful.
Slawek Zachcial
Yeah. Also, something that we do is we try to not overwhelm them with, like, oh, you need to install this tool and that tool and that tool. We try to really keep it not very deep, but broad. So they are basically able to use their browser and basically commit to GitHub using their browser. And all the training is really insisting on seeing things at breadth rather than in depth.
Olivier Jacques
Yeah. So it's a training based on this story, a number of things and all of the concepts that we touched: continuous delivery pipeline, continuous testing, blue-green deploys, automated deployment, ChatOps, which you heard, and the DevOps Kaizen, which we are going to talk more about.
I think this is the feedback that we got from one of our last sessions in Houston in February. I think what really moved me is on the far right there: "The best course I have attended in my 19 years here." That's the kind of reward that we are talking about.
Slawek Zachcial
Okay, now Black Belt, as I mentioned, is a slightly different beast. And of course, it requires much more preparation.
So we would kind of run it in two-plus phases, or it's three-plus phases really. We would start with the preparation work. So again, we would bring development and operations team together to explain, like, okay, what we're trying to accomplish, what's the context?
We try to assess a little bit, okay, how far they are in the DevOps transformation journey. Do they do continuous integration? Do they do continuous deployment? Do they even automate the deployment, or maybe not? At the end of the day, it doesn't matter. The idea is really to assess, okay, where they are.
And then also what we do is that we do the value stream mapping to better understand, okay, how they could move from idea to production, and where the waste happens, and where we have a lot of wait time. And this will then help us to define the improvement themes.
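The arithmetic behind a value stream map can be sketched as follows. The `Step` record and the `summarize` function are assumptions for illustration; the talk does not describe a specific tool or data format.

```python
from dataclasses import dataclass


@dataclass
class Step:
    name: str
    process_hours: float  # time someone actively works on the item
    wait_hours: float     # time the item sits idle at this step


def summarize(steps):
    """Return total lead time, flow efficiency, and the step with the longest wait."""
    if not steps:
        return 0.0, 0.0, None
    process = sum(s.process_hours for s in steps)
    wait = sum(s.wait_hours for s in steps)
    lead_time = process + wait
    efficiency = process / lead_time if lead_time else 0.0
    worst = max(steps, key=lambda s: s.wait_hours)
    return lead_time, efficiency, worst.name
```

With invented numbers such as develop (16 h work, 8 h wait), test (8 h, 40 h) and deploy (4 h, 24 h), the lead time is 100 hours, of which 28 are actual work, and the test step has the longest wait, so it is a natural candidate for an improvement theme.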
So we don't try to do like, oh, let's change everything. Instead, we would identify two, three things. We would document them prior to the workshop and would say, "This is something that we'll try to tackle during the workshop."
Then we have the dojo, the Black Belt Dojo workshop. So this is where we now work on the stuff that we prepared. And what's interesting here is that upfront, we would set up two demo sessions for the stakeholders of the application, so they can see that really when the team gets together, they can really accomplish things faster, better. One would be in the middle of the week. Another one would be at the end of the week.
The Black Belt Dojo, the participants of this would be the application team, the operations teams, or at least their representative, and then also we'll bring some DevOps experts. Say, I don't know, if one of the teams was to containerize the application, we'll bring some folks who have experience with that. If the team was to automate the deployment using Ansible, we would bring some folks that know that. If the team was to, oh, how do we use the pull request in order to do the code reviews, and then how do we do continuous integration, we'd basically bring the experts who have already that experience.
And then, we would do the workshop. And often what happens is that we won't be able to complete the improvement themes. But I think that the goal is really to equip the team with what they need in order to continue this journey on their own and give them any help that we can on the way.
So once the workshop is done, then we would put in place this continuous improvement framework that we call DevOps Kaizen. And here, basically, we would set some goals every month or every month and a half or something. And then we would meet regularly with the team to understand, okay, where they are struggling, what works, what doesn't, and help them basically along the way to make sure that this thing continues.
And then also what we're looking for is to, in these dojos, we bring into those dojos folks from different teams. So we have the application team, but then the experts, we try to bring folks from different teams that already went through this kind of training so they can contribute. And this is how we plan to spread this knowledge.
Olivier Jacques
Exactly. So the value stream mapping exercise, I don't know if you have attended Alex's talk earlier, yesterday actually, on value stream mapping. So discovering what matters, what is the issue. Getting data, getting small improvement themes that can be understood. Then getting them started, and then having this continuous monthly cadence, really, of improving. The best way to eat the elephant is one bite at a time. So this is exactly what we are doing.
Slawek Zachcial
Yeah. Now, as a result of each of those workshops, we try to capture the themes and document the themes that are the most recurring. So a lot of this information is kind of available on the web. But people often don't know where to start.
So what we did is we created a collaborative website. We call it DevOps Central, and this is where we ask teams to go: okay, if you need information, go there. If you find it, great. If not, let us know. We'll figure out how to bring you the best.
So here we have some examples of things that we talk about. So for example, how do you do infrastructure as code with Helion OpenStack? So in this case, I believe we use Ansible. So we help the team to kind of get them started and documented how to get them started using Ansible, using Helion in the company.
So as I'm sure you know, if you use private cloud, or actually any infrastructure, you use the kind of product, but there's always something which is kind of very special to your company, that every team that's trying to use whatever article they found on the web, they're struggling with. So we try to capture...
Olivier Jacques
I can give you an example.
Slawek Zachcial
Yeah. The proxy...
Olivier Jacques
The proxy setting.
Slawek Zachcial
HTTP proxy.
Olivier Jacques
Who has issue with the company proxies here?
Slawek Zachcial
Yeah. Exactly. So we've seen this millions of times, and actually that's something that we try to document in this topic. And this is a growing knowledge base.
Olivier Jacques
And just for the record, that website is also as code. So every time there is code, I mean check-in there, there is a pull request, it gets checked. So it gets checked against English syntax. So I make mistakes. It gets checked against dead links and everything. So we actually also practice what we preach as well.
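A minimal sketch of a dead-link check that could run on each pull request. It assumes the site's pages are written in Markdown and only checks links between pages; the talk does not say which format or tools the DevOps Central checks actually use.

```python
import re

# Matches Markdown links of the form [text](target); Markdown is an assumption here.
LINK = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")


def dead_links(pages: dict) -> list:
    """Return (page, target) pairs whose internal target is not a known page.

    `pages` maps a page path to its text. External links and in-page anchors
    are skipped; checking external links would need network access.
    """
    broken = []
    for path, text in sorted(pages.items()):
        for target in LINK.findall(text):
            if target.startswith(("http://", "https://", "#", "mailto:")):
                continue
            if target.split("#")[0] not in pages:
                broken.append((path, target))
    return broken
```

A pull request check would fail whenever `dead_links` returns a non-empty list, so a broken link is caught before it is merged.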
So I think one of the very interesting things, if you just look at the three pillars there, the white, the green, the black belt, is that we, as a central team, are really trying very hard not to be the bottleneck. And this is the deal that Slawek was talking about. When you come to one of our Black Belt Dojo, the deal is that we may pick some individuals to become coaches, DevOps coaches, for the next future rounds. So we are trying to generate lots of DevOps coaches. And at Orange, they are doing the same. And for that, what's better than a sticker? So we distribute stickers for them.
All right. So...
Slawek Zachcial
One last thing. So as I mentioned, the application is available on GitHub. It's an open source application. Actually, it's not something that we created. Somebody else did it. We forked it, we improved it, we fixed it, and all contributions are welcome. You guys are free to just take it, use it, contribute back.
The pipeline that we use in this Green Belt training is available as code as well. It deploys on Azure, on OpenStack, and in Vagrant.
Olivier Jacques
VirtualBox.
Slawek Zachcial
Yeah. And so does the application itself. Then the DevOps game is documented at this URL. And if you move to the next slide.
Olivier Jacques
Yeah. Help we are looking for. So definitely trying something. We were inspired with this reference implementation of the application, which is both a monolithic and microservice application with a pipeline that is really end to end. Really interesting pipeline.
So we would love to add more tools there. We leverage GitHub, but what about adding GitLab? That may make sense as an option. But there are many other things. So any contributions or discussion or fork, I mean, just take the code. It's all yours.
And really something we are struggling with is that our company is global. I mean, our people are global. And traveling to run those trainings or those Black Belt Dojos or those Green Belts is a struggle. So is there anything that you guys do to do this virtually or to avoid as much as possible travel? We are thinking, actually, we started to look at Moodle for online courses to develop. But okay, anything that's going to help there, please reach out because we are looking for ideas.
Thank you.
Slawek Zachcial
Thank you.
Q&A
Olivier Jacques: General session now. Any question? If we have one minute or two minutes. Yes.
Q: So they go into this kind of mini immersion, but what happens when they're done? Do they go back into a collaborative environment, or do they go back to where they were?
Olivier Jacques: Yeah. So the question is, okay, they go to this place and what happens when they are done with going there, and do they go back and just go to a specific environment, collaborative environment you mentioned?
So I think for us, on the outside, organization model is very classical. On the inside, we are trying really hard to rearrange with collaborative teams, et cetera. So we don't have places where people, I mean, where it's Post-its on the walls and things like that. It's still very classical.
What we do for collaboration is we use the pipeline as the collaboration mechanism.
Slawek Zachcial: ChatOps as well, very much so.
Olivier Jacques: So it's still extremely classical in the way we... They are not in specific rooms or specific setups in this area.
Q: You addressed some Agile principles when you were talking about this. Do you assume as a prereq that they're already kind of practicing Agile tendencies and, obviously with a complement, but what level of prereq would you say you need? Black Belt, you said, hands-on.
Olivier Jacques: Well, Black Belt is really hands-on practice. We don't have any prereq. We are just trying to make progress. We are not the extremists of Agile or DevOps. We are just very pragmatic and trying to make progress. So no, Agile is not a prereq, but this is definitely something that we explain.
Slawek Zachcial: Yeah. We definitely talk quite a bit about Agile to make sure that everybody's on the same page, that they understand when we talk about it. But it's not a prerequisite.
Q: I take it doing the White and the Green are prereqs to Black?
Olivier Jacques: No.
Slawek Zachcial: No.
Olivier Jacques: The question is, is there a White and Green, is it prereq to the Black? No.
Slawek Zachcial: So again, I think that they are really geared towards different audiences. White, which is really geared towards executives, is about how to help drive the change, how to manage the change. And for the teams, this may be useful, but it is not really what they are interested in.
Olivier Jacques: True.
Slawek Zachcial: So this is where I think that they are really for different audiences.
Between Green and Black, it doesn't necessarily have to be the case. Definitely for teams that never did anything like that, starting with Green and then attacking some real problem on their application in Black definitely works. But we've also had situations where, for example, we had teams that started to experiment a little bit with various techniques, and they ran into some kind of roadblocks, and they figured, "You know what? We cannot understand it, but we would like to address this problem in particular." Then they would go to the Black Belt Dojo, and we would work with them on this.
Q: Would the content for the Green be in the Black?
Slawek Zachcial: No.
Olivier Jacques: Really, the difference is that Green Belt is completely scripted. It's a training. So it's a sample application, sample pipeline. Everything is a sample. The Black Belt, you work on the application. So you really say, if the topic for the workshop is to automate the deployment, you will automate the deployment for your application, not some sample application, your application. And basically, the goal of the workshop is really to work through the problems that you will face on the way.
Q: So the Green's a training and then the Black's more of a coaching workshop?
Olivier Jacques: Yeah, exactly.
Slawek Zachcial: Exactly.
Olivier Jacques: Green training, Black is more coaching workshop. Yeah, exactly.
Q: And what's the duration for Black again?
Olivier Jacques: So Black, we did mostly one week. We got two weeks, but again, it's something that we continuously do with the team. So we just put them on their way for continuous improvement with those teams. And if there is a new Black Belt that needs to be done again after some time, then we'll do it again.
Slawek Zachcial: Yeah. For practical reasons, I think it's easier to just get a team out of their day-to-day for a week than for two weeks. Those are the teams that really work on those applications and operate them. Of course, when they are in this workshop, they don't do whatever they would be doing otherwise. So I think it's an easier sell. And I think it also helps to just make the problem smaller...
Olivier Jacques: Yeah.
Slawek Zachcial: ...and attack just a smaller problem, and then come back and attack the next small problem.
Olivier Jacques: Right.
Q: Does that get to a point of low level of competency with some of the tools that you're promoting inside of the Black Belt training? You literally have people come in without Git experience, without any exposure to things?
Olivier Jacques: Yeah, absolutely. So what we do, we try also to inject mini trainings.
Slawek Zachcial: Right.
Olivier Jacques: So we've had teams that were always working with Subversion, and now we tell them, "Look, there's this great..."
Slawek Zachcial: This is it.
Olivier Jacques: "...GitHub. It has this code review capability through pull requests." And they'll say, "Yeah, we want to do it," but they've never done it. So we will just help them to onboard, and again, maybe not migrate the entire code base, but maybe one module, and get them through the motions of the code reviews and things like that.
Right. Thank you very much. Heading to the general session, and if you have more questions, happy to.
Slawek Zachcial: Okay. Thank you.