Red Hat IT’s DevOps Journey
Red Hat is the world’s leading provider of open source enterprise IT products and services, with over 6,000 associates globally and annual revenues exceeding $1.5B. As of October 2014, Red Hat IT will be nearly a year into its own internal enterprise IT DevOps transformation.
Learn how Red Hat IT’s DevOps transformation tiger team, “Inception,” was chartered and staffed, and its successes, failures, and lessons learned to date. Understand Inception’s continuous integration and delivery (CI/CD) strategy, the team’s partnership across specific groups and applications in Red Hat IT’s operations, development, and governance functions, and the IT re-org that helped set the stage for DevOps. Finally, preview where the Inception team is headed in its second, and final, year.
Full transcript
The complete talk — auto-generated from the talk's captions.
Good morning. Everybody hear me in the back? Thumbs up? All right, thank you.
Welcome. Thank you for joining me. My name is Bill Montgomery. I'm an IT manager at Red Hat.
I am currently leading our DevOps change initiative in Red Hat IT. I've worked in development and operations roles in the e-commerce and software industries. Been doing mostly operations for the last 10 years. And my goal for today is to share with you some of our thought processes, how we approach this, some of our successes and challenges, and maybe just provide a little bit of a relatability for any of you going through a similar change initiative in your organization.
So any Red Hat customers in the room? All right, great. That's wonderful. I promise no sales pitches.
We have some of our friends from our product and marketing arms of the company in the booth upstairs. I'm in the back of the house in IT. I met a lady at a neighborhood kid's birthday party this summer and introduced myself, said I worked at Red Hat. She said, "Oh, what do you do there?" I said, "I work in IT." And she said, "I know Red Hat.
I know they do IT, but what do you do? Are you sales or finance?" I said, "No, no, no, we have an IT department, too, just like everybody else, and so that's what I do." So I assure you, we're a horse like everybody else. We've got the HR and finance and expense reports and all of these boring but important systems, as well as the customer-facing systems of engagement, things like that. So some stats on us.
We've been around 20 years. We're an open source company, which is a very important aspect of our culture that permeates down into IT and really all aspects of how we run the business. So Red Hat IT. What Red Hat IT does is, like I said, run the back office systems as well as some of the things that customers interact with, like our customer portal, our marketing website.
What we don't do is develop the products like Red Hat Enterprise Linux and our cloud and middleware products that all of you Red Hat customers download and use. We basically support the business that allows them to develop and market and support those things. So about 270 people around the world. We're mostly based out of Raleigh, in IT, that's where I'm from.
But about a third of us are kind of smeared around the globe. We adopted Agile about four or five years ago and use that in the majority of our projects. So you can kind of get a feel for about the size of the department and how many things we have going on at once. So how did I get here?
Other than the great existential question about that, more specifically, this DevOps initiative. About a year ago, our CIO initiated a reorg of IT. This was fairly impactful. About 70% of our associates moved around in one way or another.
Some of the goals of that reorg were to improve overall process flow through IT, take better advantage of the cloud, move more workloads to cloud, which aligns well with Red Hat's corporate strategy as a technology vendor, and improve customer engagement. So one of the specific things that he wanted out of this was IT's ability to, quote, "scare the business." So what does this mean? Well, traditionally, like I think probably most of your organizations, the business comes to you with a change. For Red Hat, subscription management is one of the key functions that we do: you download Red Hat Linux, you need to license it and activate the subscription. So the business says, "Hey, we want to make a change to that for this new cloud product line."
IT will say, "Well, it's going to take us six months to do because we've got all these impediments to making that change." And so then the business is waiting on us. What he wants to be able to do is tell them, "Okay, we can have that done next week." And the business says, "Whoa, hold on. We're not quite ready for it that fast." So basically shift the bottleneck back to the business to help Red Hat be as agile as Red Hat needs to be in order to compete in a really competitive industry. So anyway, end of the summer, fall of 2013, CIO has the all-hands, doing a PowerPoint presentation like this, gets to the slide with the new org chart on it, and under IT enablement, which is our word for IT operations, is DevOps enablement.
And so there's a lot of excitement, a lot of eyes glazed over, a lot of FUD around that. Developers saying, "Oh, hell no. I'm not going to wear a pager. What are we doing here?" But overall, a lot of excitement, a lot of wonder, a lot of interest.
And so I was fortunate enough to be asked to lead that. So got to work recruiting a team, putting together a charter, figuring out, okay, I'm supposed to do DevOps and be able to scare the business. What does that actually mean? What the heck are we going to do?
So to give you a little more context about how this team is structured, because I want to give you some insights into how we've done it and some of the strengths and challenges with that. We are a dedicated team. We call ourselves Inception. We sit inside IT enablement, which is pretty much your traditional IT operations functions, but also with the governance functions of security and architecture.
And again, we report up through IT enablement and kind of sit alongside some of the other development and support functions. We have a two-year charter. So this is not a new DevOps team. This is a temporary DevOps enablement team that has two years to move the needle and then integrate back into the organization.
So, in terms of who we brought onto the team, I was fortunate to be able to assemble a really great team of people. Jen and Ryan are here with me today. I'm very glad to have them here. And recruited for very diverse functions.
One of the speakers earlier this week talked about the need to have people involved in your DevOps initiative who understand how the sausage is made. And so we wanted to make sure that we had all the different functions in the organization, to the best that we could, given our small size, represented within the team so that we had the credibility, had the relationships, and just know how things work so that we can try to change them effectively. So, we've got folks from operations, program management, information security, development, system administration, release engineering. These folks have all done these jobs inside of Red Hat IT for some amount of time.
Ryan, we recruited from the outside and it's been great. He's brought a kind of a new, fresh perspective, and so we thought that was important too, but we wanted the balance of the team to have the experience and relationships and credibility inside the organization already. So, all right. We've got a team.
We know that we're doing DevOps, but what is actually our mission? DevOps, as we've heard this week, is an incredibly broad category that covers a lot of different aspects of culture and automation and sharing, and all those great things. So, here's our mission statement. This is not the one that we started with.
The one that we started with was very hand-wavy. It sounded great. We would've won DevOps buzzword bingo with it, but it didn't really say exactly what it is that we're trying to do. So we started out with, "enable Red Hat IT to effectively deliver high-quality solutions the open source way by cultivating a collaborative culture." So those are great things. I don't think anybody would argue that that's a bad thing to do.
But when we only have two years to effect change in a fairly sizable organization, that doesn't give us nearly enough specificity. And we had three to four months of kind of flailing about at the beginning, doing work and working hard, and producing some good small wins, but not with a whole lot of focus or direction in order to really move the needle in what's really a pretty short 24-month period. So, this is where we're at. We're still noodling on it a little bit, but that's effectively what we're trying to do.
Halve cycle time. Let me go back. We want to halve cycle time and double release frequency. Those are specific, measurable things.
We want to do that by the end of our charter. And it gives us something to rally around and a tool by which we can say no to things. There are so many good ideas to implement under the umbrella of DevOps. If we try to do all of them, we'll be successful at none of them.
This helps give us some focus. So that's why I think it's important. So now that we know what we're trying to do, how are we going to do it? Well, DevOps, right?
Again, very broad. DevOps by itself is not a strategy. And again, as I go through these slides, this is all kind of retrospectively presenting all this in a very organized, clear kind of a way. The actual sausage-making of doing a change initiative like this, we don't start with a PowerPoint and say, like, "This is how we're going to do our strategy, and we have these very clear thought processes, and all of our conversations are very structured." That's not true at all.
It's a great way to put together a PowerPoint and sum it up for you guys, but the actual sausage-making is, that's why we use the metaphor sausage-making. So, I sometimes sit in these and look at like, "Boy, wow, they're so organized and have their thoughts so clearly laid out." I just want to be transparent that that's not actually how it went down. That's just how I get to present it to all of you looking back over the last year. So anyway, getting back to our strategy, we start with DevOps.
Great. That encompasses a lot, but how do we actually want to build a DevOps culture? So DevOps is culture. I agree with that.
I believe that statement, but I don't know anybody who can just directly go into an organization, no matter how many relationships and credibility you've got, and just say, "Hey, you guys, change the way that you interact with each other. Change the way that you have conversations. Change the way that you respond to incidents." You can't do it, I don't believe. I haven't seen it.
What we can do is implement practices and procedures and processes and tools that support those, that then lend themselves to creating the desired sort of culture. So that's what we've done. People say, "Oh, DevOps is not about tools." That's true, but also you can use tools to help influence the development or cultivation of a DevOps culture. So that's where we're going with this.
Continuous delivery provides a great set of practices that you can apply fairly prescriptively. Adapt it for your environment. But has everybody read, or has anybody read the Jez Humble and Farley book on continuous delivery? Yeah.
In my opinion, it's required reading if you're working in this domain. Great book. They present a maturity model. It's in the back of the book; it should be in the front. It's great.
And it gives you six or seven different axes to look at your organization and evaluate your maturity in terms of continuous delivery. So we knew we had six or seven different areas we could work in, and needed to figure out, okay, we're not going to fix all of test automation and configuration management and environment provisioning and data management. We can't work on all of these at once. Where are we going to start?
We took a couple different approaches. One, we turned that maturity model into basically a SurveyMonkey and sent it out to the organization and said, "Hey, tell us what team you're on and self-evaluate." So that gave us some data. Not super scientific, but gave us a general compass heading. And then we did another thing that a couple of my team members said, "Hey, let's do a complaint fest.
We think that we should grab everybody in IT, pull them together in a room and just tell us what sucks about IT and what are all the dumb things that we're doing and processes and tools that are broken." I'm a very kind of optimistic, cheerleader, glass-half-full kind of guy. So, this made me all kinds of uncomfortable, and I was like, "Oh, God, a complaint fest sounds so negative. I don't want to do this." But I said, "You know what? We're early on.
I don't have any better ideas, so great. Let's go ahead and do it." And we did it, and I was completely wrong about the outcome and tone of the conversation. It was incredibly positive. We got a ton of good information out of it.
I think 600 lines in a notepad that took a lot of distilling down afterwards to kind of pull out the themes. But great information that came out of that. We got really candid conversations. It was fantastic.
So, between those two things and working with the team that we partnered with, coming up in a slide or two here, we decided to start on release automation. Environment management and release automation were our two biggest pain points, generally, across the organization, and it was kind of a coin toss, and release automation is where we ended up. So we know we have a mission to cut down our cycle time and increase release frequency. We know that we want to work on release automation, but we've got a baker's dozen, at least, of development teams, and where are we going to start with that?
We've got a lot of different technologies and lines of business in play here. So we knew that we wanted to create repeatable patterns for doing continuous delivery. So, we looked at our teams in terms of three different technology buckets. While our development teams are aligned with lines of business, not with technologies, the fact is, each one of them kind of tends to work between 80% and 100% in a given technology domain.
So, we broke that down into ERP apps. So we've got a big ERP system from a vendor who starts with a big O that probably everybody has somewhere in their organization. We have SaaS applications, so teams that build on top of platforms like salesforce.com, and we have packaged apps. And at Red Hat, when we say packaged apps, we mean Java code, Python code, Ruby code that's been written in-house or comes from open source upstream, gets built, tested, packaged, and rolled out on infrastructure that we manage.
Whether it's cloud or in our data center, that's what we mean when we say packaged apps. So, we evaluated those on two different axes of readiness. So how ready are the teams and the technology to implement continuous delivery and release automation, and how much value is there in doing that? For the ERP guys doing the HR and finance and really back office kind of systems, we heard about some of the different tribes that people have in IT.
Figure out the mainframe guys that are still using the 30-year-old green screen text editors. I wouldn't say that our ERP teams are there, but they're definitely towards that end of the spectrum. So not real ready for change. Like we heard earlier this morning, PeopleSoft, how do you do continuous delivery of PeopleSoft?
Our system isn't far off technology-wise. And the value, those systems don't need to iterate as much. They're the systems of record, not the systems of engagement that you're going to do a lot of experimentation with. Our SaaS apps, a lot of value there.
A lot of systems of engagement, a lot of need to iterate, but they're all very specific to that technology stack. So Salesforce is going to have the Salesforce way of doing testing and deployment, and if we figure it out really well for Salesforce, there's not going to be a lot of repeatability in those processes, so not a great place to start. Packaged apps, tons of value. These are the things that tend to be unique to Red Hat, like our subscription management and customer portal, the things that really make Red Hat, Red Hat.
A great need to go fast there, and those teams are willing and ready to engage with us and to try new things, so that's where we decided to start. So that was about a half dozen teams that are kind of working that technology domain. And so, we had to choose who to work with. And who we chose, and I apologize, this is the alphabet soup here, but I'll try to walk you through it.
Our SSE team. This is SOA Services and Enterprise Service Bus. So SOA is our service-oriented architecture, basically a set of APIs to Red Hat's business. So, if you're developing an app that needs to pull information on a customer's subscriptions and pull out SKU pricing data and use that to present something to somebody, you're going to interact with our SOA tier.
Enterprise Service Bus is basically the backbone of communication for all of our enterprise apps. That allows the CRM system to talk to the order-to-cash system, to talk to the customer support system so that the systems are all aware of each other. And because of what this technology does in terms of our overall architecture, they're at the intersection of every major program. We've had about one major program every year.
This year it's Customer 360. So, we want to be able to give salespeople insight into what's going on with the customer from a support perspective, so that if they walk into a sales meeting, they understand that the customer has a bunch of unresolved tickets, or vice versa. On the support side, they want to be able to understand if a customer has a big opportunity on the table. All of that is an integration problem, and increasingly the types of projects that we're doing are integration projects, and SSE is at the intersection of all of those.
So they're always the bottleneck on the IT side. One of our highest performing IT teams, but just architecturally, they're at the crossroads. So if we can make them go faster, then we have a great network effect of making everything else go faster. So, we developed this great partnership, started going to each other's stand-ups, worked with both SSE and our PlatOps team.
So PlatOps does actually execute hands-on-keyboards releases within Red Hat IT, and we spent about four months developing some tooling for release automation that we built in-house and open sourced. It's called Wintermute. We've got a slide with the GitHub link and stuff coming up here. We got to a successful proof of concept in July, where we actually deployed a new set of SOA Tier nodes to the QA, the canonical managed QA environment, with no human hands on keyboard other than just to say, "Process start." So that was very exciting. It was a great success, a great milestone for us.
But like Gene said the other day, "Don't tell us all about sunshine and rainbows. Tell us about the parts where you're stuck in the mud and where you had challenges." So the flip side of that is that while we had a great, enthusiastic senior manager and a very capable and willing team with SSE, they're an extremely mission-critical part of the architecture. We can't have issues with what they're responsible for. They were under a lot of pressure from a very high profile and date-driven program with Customer 360.
We had one of the developers on that team, Vertant, at one point when we were in a hack session, saying, "I just want to do this the old way. I know it stinks, but I know how to do it. I know how long it's going to take." And so that was the feeling leading up to this POC, and I think late June, we were in a planning meeting trying to figure out when are we going to do this release, what preparation do we need to do for it?
It was us and the SSE team, and the manager of the SSE team says, "Well, I just got an email this morning that the QA release for this program is going to happen on the Fourth of July weekend." And we all said, "What the... Really? We're not going live for months and months. This is for the beginning of a multi-week initial integration testing cycle.
We really can't do it the following Monday?" And so that's when we realized the impedance mismatch between a big program with waterfall program management coming from an outside big four systems integrator firm, and us trying to tack on continuous delivery practices on the end with Agile going on in the development team in the middle, and they're only one of many development teams involved. That was a turning point for us. And so we got through a successful proof of concept, but at this point, we've branched out. We've briefed at least six or seven additional teams now on the release automation tooling that we're using, most of our packaged software teams.
And so we're kind of doing what another speaker referenced with hedging our bets and working on not spreading ourselves too thin, but working on continuous delivery with more than one part of the organization. So, some numbers. I won't get into this real deep except to say, yes, we think measurement is important. If you're going to communicate the value of what you're doing up the chain, managers that are two or three levels removed don't feel any of the improvements that you're making on the ground, because they're two or three levels removed.
So we think that metrics are an important way to communicate, and also just to prove to ourselves that not only does this feel better, but we're getting the results that we want of shorter cycle times and more releases. Again, it's not all sunshine and rainbows. I ran the numbers quarter to date for this a week or two ago, and they appear to have slid back to where we were four quarters ago. So, there's a little bit of two steps forward, one step back here, and that's just reality.
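As a rough illustration of how the two mission metrics (cycle time and release frequency) could be tracked quarter over quarter, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the `Release` record, its field names, the choice of the median as the summary statistic, and the sample dates are all made up, and this is not how the Inception team's numbers were actually computed.

```python
from dataclasses import dataclass
from datetime import date
from statistics import median


@dataclass(frozen=True)
class Release:
    """One production release; field names are illustrative assumptions."""
    work_started: date  # when work on the change began
    released: date      # when the change reached production


def cycle_time_days(releases):
    """Median number of days from work start to production release."""
    return median((r.released - r.work_started).days for r in releases)


def goals_met(baseline, current):
    """True if cycle time is at most half and release count at least double.

    Assumes both lists cover periods of equal length (for example, one quarter),
    so the number of releases stands in for release frequency.
    """
    halved = cycle_time_days(current) <= cycle_time_days(baseline) / 2
    doubled = len(current) >= 2 * len(baseline)
    return halved and doubled


# Made-up sample data for two quarters of equal length.
baseline_quarter = [
    Release(date(2013, 10, 1), date(2013, 11, 30)),
    Release(date(2013, 11, 1), date(2013, 12, 21)),
]
current_quarter = [
    Release(date(2014, 7, 1), date(2014, 7, 21)),
    Release(date(2014, 7, 15), date(2014, 8, 9)),
    Release(date(2014, 8, 1), date(2014, 8, 31)),
    Release(date(2014, 8, 20), date(2014, 9, 14)),
]

if __name__ == "__main__":
    print("baseline median cycle time (days):", cycle_time_days(baseline_quarter))
    print("current median cycle time (days):", cycle_time_days(current_quarter))
    print("halved cycle time and doubled releases:",
          goals_met(baseline_quarter, current_quarter))
```

Comparing each quarter against a fixed baseline, rather than only against the previous quarter, is one way to make a slide back to earlier numbers visible.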
I wanted to share a few thoughts on just how this journey has been for me personally. I know a lot of you are leading or are considering leading, excuse me, a similar initiative in your own organizations. And some of you may be coming from an operations background like myself, but one of the things that surprised me was the stress of going from a 24 by 7, there's always something on fire, you're always dealing with the production issue du jour, and doing that for a decade, and then transitioning into an innovation and change mission. So that's counterintuitive, and it was counterintuitive for me, too.
But when you've always got a fire on your plate, and you're just triaging and picking which one you're going to do first, you can have a high level of confidence that you're doing the right things, that you're helping the organization, that you're keeping the lights on. With the innovation and change stuff, there's a lot of room for self-doubt on like, "God, are we even working on the right things? How are we going to prove that we're working on the right things to ourselves and the people that are sponsoring us? How are we going to measure that?
And does any of this really matter?" So, there was a surprising amount of just personal growth I had to do to get used to working in a completely different mode from the operations, go, go, go, pagers going off, et cetera. The other thing that was a challenge for me and might be for you if you're getting into an initiative like this is, again, coming from the ops background, being used to resolving issues in minutes, hours, maybe days if it was particularly hairy, and then again, transitioning to something where we're trying to take a fairly decent sized organization of 270 people and realign the way that they're all working within their teams and with each other. It takes time. A year into that actually isn't a tremendous amount of time.
So, just shifting my own expectations on how fast this stuff is going to go, again, that's been a challenge and a growth opportunity for me. Just a quick plug for this book. Gene recommended this to me a few weeks ago. I wish I'd read it a year ago.
Provides a great framework. It's not about DevOps, it's about doing innovation and change initiatives inside existing enterprises, and provides some great frameworks, recommendations, really compelling case studies. Again, all not DevOps, but it will resonate with you if you're trying to do this. It's called "The Other Side of Innovation." So we're still trying to figure out how to integrate those ideas, but a lot of good things there.
Wintermute is the tooling that we've developed and open-sourced in Red Hat IT. This isn't a saleable Red Hat product, but like I said, open source permeates the way that we do everything. So a lot of the tools that we develop in Red Hat IT for our own operations, we like to open source those. And if we can get a community built around this, then that would be fantastic.
So check it out. We're in the process of renaming from Release Engine to Wintermute, so just go to this GitHub link and you'll find it. So, some lessons learned, hopefully some things that you can build on from what we've discovered. Having a dedicated team has been totally critical here.
It's a luxury, but a necessity at the same time. And again, I want to thank Jen, Ryan, and all the guys back in Raleigh and New York. Any of the success that we've had to date has been entirely thanks to these folks. Are your teams as excited as those kids in the picture?
Only when we're playing with blue chemicals. Exactly. Yeah. So, taking an agile and iterative approach, not just to how we're influencing work, but how we're actually planning and executing our own work.
This is all a learning and experimentation thing, so trying to put down any kind of a project plan more than a couple months out into the future seems like a waste of time. I believe this is a way to do it, for sure. And using continuous integration, continuous delivery to build a DevOps culture, even though DevOps isn't about tools, this is still a way to help you get there, I believe. If I had a time machine, I would've clarified our mission statement earlier.
Like I said, it took us almost six months to get to the point where we figured out, like, "Oh, we're going to do release automation, and we're going to work with SSE, and we're going to start building this tool." Might've been able to shrink that down some with a little bit more clarity on the mission from the get-go. Hypothesis of record, this is a concept from the "Other Side of Innovation" book. Read about it. It resonated a lot with me.
And then attaching to a lower-risk project to start. There are probably some folks on my team who are too polite to say, "I told you so," but we did start swinging for the fences with SSE, and we might be a little bit farther today, armchair quarterbacking, but we might be a little bit farther today had we attached to something lower risk and then built off those successes, instead of swinging for the fences and then now coming back and saying, "Okay, let's try something a little more achievable to start with." So, to spark some of those exothermic reactions that Gene keeps talking about all week, something that I really need help with is how do we take the progress that we've made with this team, which has a finite lifespan, and make sure that we've got the appropriate level of ongoing investment into DevOps, and that the changes that we effect are lasting. The worst thing that could happen would be we put some new tools and process into place, and the day that this team disbands, it starts to rot on the vine and we stop making progress, because if you're not continually improving on this stuff, then yeah, it's going to rot on the vine.
So, I hope that we have just a minute or two for any questions. You have one minute. We have one minute. Time for one question.
Yes. You talked just a little bit about how you convinced, or how you worked with other folks in the organization that... It seems this common theme I hear through a lot of these things is just doubt. Yeah.
The people that are here are sort of the converted already. Yeah. Right? Yeah.
Great question. So, first the question is how do we work with the people in the organization who have a lot of doubt? Because the folks here in this room are kind of the converted, preaching to the choir here. I think the technique that we've used over and over is just to listen to all the criticism and take those people and get as close as we can to them, like literally take them out to lunch when they're in town and talk to them.
So by being really open and transparent about what we're doing, a lot of public demos, all of our stuff in Rally is completely public and transparent. And yeah, just engaging with those people, whether it's in meatspace, on IRC, or on our wiki site, just engaging with those people and listening to what they have to say because they have legitimate concerns, and so they need to be considered. All right. Thank you all.
Appreciate the time.