DevOps Who Does What
Within the IT organizational structures that have dominated the last several decades, roles and responsibilities are fairly standardized. But with the dramatic changes that DevOps practices and supporting toolsets bring, many are left feeling a bit off balance: it’s no longer clear who is responsible for even things as “straightforward” as development or operations.
In this talk I will take traditional roles that are distributed across fairly standard IT structures and sort them into a new organizational context.
What is the role of the Enterprise Architect? Who does capacity planning and how? How can change management step out of the way while still satisfying the requirements of safe deployments? How do agile teams interface with personnel responsible for maintaining legacy systems?
I’ll leave the audience with a blueprint for a new organizational structure.
Full transcript
The complete talk, organized by section.
Cornelia Davis
My name is Cornelia Davis, senior director of technology at Pivotal.
I've had the great opportunity of working with very, very large enterprises across all verticals in helping them make that transformation. I'm a technologist by background. I'm a computer scientist, and initially spent a lot of time talking tech at the whiteboard, but then realized that there was so much other stuff that needed to change around that.
That's what I'm going to be talking about a bit today, more around the organizational change that supports our technology needs.
My first question to you is: Is this your reality today? We have different business silos, different silos across the organization, and along the bottom there, we've got different individuals that are coming from those silos. When we have a new idea for a product, what we do is we kick off a project. And what we do with that project is individuals go into the project and they do some work.
Now, I worked with one large enterprise where I asked them about this. I was talking with somebody from their product management organization, and I said, "Well, tell me about that kickoff process around the projects." And they said, "Oh, well, the first step is we have to identify all of the project managers that are going to be working on this project."
And I said, "Okay. How many is that?"
And they said, "Oh, anywhere from 12 to 15."
And I said, "Okay. And how long does that take?"
"Two to four weeks."
Two to four weeks to figure out who the project managers are, get them allocated to the project, just so they can start this particular process.
So the first individuals from the first silos come in, they generate their artifact, and what do they do? They throw it over the wall to the next step. And notice that they leave the project. So if we have to go backwards, we have to figure out how to get them back into the project. And so it goes.
And we all recognize that this is a slow and challenging process, and if it only moved this linearly, that would be okay, but we all know that it goes backwards and forwards and circular and all of that stuff. So we all know the challenges of that. If we didn't know before yesterday, we certainly have heard that in spades while we're here.
But that's not even the biggest problem of these things. The biggest problem is that each one of these organizations is incentivized differently. My favorite example, being a technologist, is to talk about app dev and QA.
Application development is almost always incentivized: Did you release the features that you promised on time and, ideally, on budget? And if you release the features on time, way to go. You achieved your goals.
Then it moves over to QA. Well, QA, what are they incentivized on? Well, they're responsible for quality, right? So they are generally incentivized by the number of bugs that they have found and fixed.
Well, let's look at these things in combination. What happens when the application development process starts to fall a little bit behind? Well, developers start working late into the evenings. They work on weekends. They start working very unsustainable hours. And what happens? Quality suffers. But they hit their features on time.
Well, when they throw that over the wall to QA, what's going to happen now? QA is going to find more bugs. Way to go.
So we've got locally optimized metrics that do not create a globally optimized solution. That's an even bigger problem than the waterfall methodology that we saw in the previous slide.
Well, the answer's really simple, right? The answer is balanced teams. We've all heard that term, right? What we're going to do is we're going to center things around a product.
In yesterday's DevOps workshop, we talked about that project-to-product orientation, and we had great conversations around all of the different things that are actually enabled when you create these product teams. That product and the product team is incentivized to deliver value to a customer, to deliver value to some constituency.
For example, this product team, they are incentivized to provide the best access to my prescriptions when I access my pharmacy website. Or if I'm in an e-commerce scenario, I have a product team that is really about the best experience around showing product images or recommendations or soliciting reviews. Or it could be some back-office product that is enabling your suppliers. These are all the different product teams.
There's been a lot of research and a lot of discussion and a lot of proof points that product teams are really the way to go.
But this is what we have today. We don't have product teams. We have these different organizations. So these, you'll notice here, are organizations that are involved in kind of the SDLC, so very IT-centric: enterprise architecture, chief security office, infrastructure, middleware, app dev, data team, enterprise applications, those types of things.
The goal now is to take this high-level, hand-wavy thing of, "Hey, you're going to create product teams of these different disciplines coming together into a product," and what I want to talk about to you today is how we take this and reorganize it into those new structures.
So we're going to put things through the Sorting Hat.
If you have been living in a cave for the last 10 years and you don't know what the Sorting Hat is, this comes from Harry Potter. And for those of you who maybe haven't read Harry Potter or haven't seen the movies, when new students come to Hogwarts, the school of wizardry, on their first day, each one of those students places the hat on their head and they get sorted into one of four houses. And that's the house that they live in for the next seven years.
So we're going to take those roles and we're going to sort them into houses. But the question then is: What are the houses that we're going to sort into?
Let's take a little bit of a tangential ride over to the side and think about a couple of houses. I'm going to end up with four houses in the end, but I want to start with two.
This, again, is a slide that you've all seen for the last several years. This is where we were maybe 15 years ago. IT was responsible for the entire stack, from the hardware all the way up through the application.
Then VMware came along and really made infrastructure as a service a viable option, or I should say virtualized infrastructure was what VMware did. And then a whole host of people made infrastructure as a service available, Amazon Web Services, of course, being kind of the behemoth of that. That made it so that we could just get machines, EC2 machines, for example, and then we could stand up everything that we needed on those machines. Getting machines was easy.
Then, in the last five years or so, we've taken that abstraction up another level and we've created application platforms where we have individuals who can be building applications and the only thing that they need to worry about is their application code.
What's important about that application platform is that it generates a new set of abstractions. Those abstractions are at a higher level. They are fundamentally the application, or maybe some services that are in support of that application. And it allows us now to not do things like implement security by creating firewall rules at machine boundaries, but it instead allows us to implement security at the application boundary.
So this is one of the key things that's happened in platforms over the last five years, this new abstraction. That new abstraction has given us something really interesting and really important. It's allowed us to define two different teams, and it's defined a contract between those teams that allows these teams to operate autonomously.
When we hear about all of the different goals of an enterprise, they all talk about needing to bring software solutions to market more quickly and more frequently. So agility and autonomy in teams is incredibly important. We're always looking for those boundaries where we can create more autonomy, and you'll see that come throughout the rest of the presentation.
Now the application team, the team that's going to create, let's say, the next mobile app or the next web app or even some analytics app, they can focus on building that application and don't need to worry about even the middleware that sits below it.
They're responsible for creating the artifact. They're also responsible for configuring the production environment, deploying to production. They are doing dev and ops. It's not necessarily the same person, but it is the same team. They're deploying to production. They're monitoring. When they notice that they need more capacity, they're scaling so that they can achieve better performance. They deploy new versions when they need to. It's entirely up to them.
Now, there's another product team, and that is the platform team. That's the team that's providing the platform. And notice that they're doing exactly the same things. They are deploying the platform. They're configuring it. They are monitoring it. They are upgrading it when they need more capacity, or upgrading it to the next version.
So they're doing the same things, but they have their own products that they're working on. The product orientation is really key. This separation gives us the first two houses that we're going to sort into: the app team and the platform team.
Now let's take all of these roles that come from traditional organizations and start sorting them. Here's our two houses, the app team and the platform team.
We're going to do this piece by piece, and I'll explain the steps as we go along. We're going to start with the purple bubble there. Notice something before I sort them: this middleware and app dev team is both. It's actually taking care of both the middleware and the application development.
In retrospect, having worked in this new world for the last five years, I find this kind of counterintuitive because why would somebody who's creating an application, i.e., using the middleware, be in the same group as the middleware itself? And to a large extent, it's because in the past, middleware required a great deal of expertise. You had to know a lot about the middleware to be able to effectively program against it. That's something that we're trying to move away from, and we're having more agile middleware platforms and so on.
So notice what happens here. We've got middleware and we've got app dev, and we break those apart. On one side are the ones responsible for producing and maintaining the middleware where my applications are going to run. And that's changed a bit, right? That's changed from J2EE servers to container-based systems like Kubernetes or Cloud Foundry.
So we put the middleware engineers inside of the platform team. They're part of that team providing the capabilities that the app team can then use. And then we take kind of a full-stack application development team and put them up in the app team. We've got front end, we've got back end, all of those individuals are there. So that one's pretty straightforward.
The next one that's also pretty straightforward is we're going to pull some of the folks out of the infrastructure team, the folks responsible for building out the servers and the networks.
You might have noticed a few slides ago that I put virtualized infrastructure and platform together in one team. Many of our customers actually keep those as separate, but in this case, it really wasn't important to make that separation in this talk.
So you could separate the platform team into two teams as well. The thing that I would caution you about is that you then need a very crisp contract between the platform team and the infrastructure team. And I'll be honest with you, that's a little bit harder to find at the moment. So that's part of the reason I put them together.
Again, server build-out, network build-out, they are part of the platform team, providing kind of the view of the infrastructure up to the app team.
The next ones that we'll talk about here are what I like to call the control functions. So there's information security, for example, and change control.
Why did I move them at the same time? Change control was usually coming out of the infrastructure team and information security out of the chief security office. I moved them at the same time because they share a common characteristic. They are functions that today can stop a deployment. They are functions that, on every release into production, need to give their blessing.
We've seen this in talks here over the last two days. We even saw it on stage yesterday from HP: when we get to the very end and only then find problems in information security, or any other type of security, it can actually stop things. And there's a great huge ball of things that we need to check off.
There was another talk from Mika Denon yesterday, which was really great because she talked about becoming close partners with the individual in change control. What she was doing was exemplifying the behavior that I'm talking about here. She worked with that change control individual and said, "If I show you how I am doing automated deployments, and I show you and I work with you to demonstrate that your concerns are being met through those automated deployments, will you empower us to do our own?"
And the answer was yes.
So these functions here, information security and change control, should engage with your teams that are providing the platforms and the automation around the deployments to ensure that their concerns are satisfied. Their concerns are not wrong. It's just the way that we've been solving them is something that's in need of transformation.
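One way to picture concerns being satisfied through automation is a deployment gate that the control functions help define and the pipeline evaluates on every release. The following is only a minimal sketch under assumptions: the `Release` fields, the three checks, and the function names are all hypothetical and are not taken from the talk or from any particular tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Release:
    """Facts a pipeline might record about a release candidate (hypothetical fields)."""
    tests_passed: bool
    security_scan_clean: bool
    rollback_exercised: bool


# Each entry stands in for a concern that a control function would
# otherwise verify by hand before every production release.
CHECKS = {
    "automated tests pass": lambda r: r.tests_passed,
    "security scan is clean": lambda r: r.security_scan_clean,
    "rollback has been exercised": lambda r: r.rollback_exercised,
}


def unmet_concerns(release: Release) -> list[str]:
    """Return the names of the concerns this release does not yet satisfy."""
    return [name for name, check in CHECKS.items() if not check(release)]


def may_deploy(release: Release) -> bool:
    """The team may deploy on its own only when every concern is met."""
    return not unmet_concerns(release)
```

In this sketch the information security and change control people own the contents of `CHECKS`, while the product team owns running the gate, so nobody has to bless each release by hand.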
All right. Next, let's talk about ops.
I have talked to countless organizations where operations is in the infrastructure group, and they're the run part of plan, build, run. They're part of the run part of the organization. And they run everything. They run the platform. They run the infrastructure. They run the middleware. They run the applications.
Who's a developer that likes creating runbooks?
Oh, I see one hand.
Okay. Who is an operations person that loves the runbooks that the developers create for them?
Okay, I saw another hand, but very, very few.
What we're talking about here is really DevOps. Let's make operations part of the product teams. Again, it doesn't have to be the same exact individual. It has to be the team, though. Instead of having one operations group, let's put operations capabilities into each of the product teams so that the people who are experts in operating the platform product can operate the platform product.
And we empower the teams in the application team. We give them the right abstractions so that they can do their own operations. That doesn't mean that they have to learn the entire stack down to the infrastructure. For goodness' sake, no. We don't all become experts at everything. But we give them the tools and the empowerment to do their own operations.
So we take a function that was one function, and we split it out over the different product teams.
The next one, very similar, is capacity planning.
I was working with a very large automotive manufacturer in the United States, and I was talking with somebody from their ops team, and I was kind of poking at these roles and was trying to understand exactly what theirs looked like.
I said, "Who's responsible for capacity planning?"
And I kid you not, the individual from this organization pulled up the IT manual and said, "See? Says right here we're responsible for capacity planning." It was that rigid. There was one group that was responsible for capacity planning across this entire spectrum.
That's pretty normal. So what happens? Well, the capacity planning process goes something like this. Really early on, well before production, we have to come up with some estimate of how much capacity you're going to need. And you know what? We're lousy at that. It's impossible to come up with a really good prediction of what the capacity is that we're going to need.
And we know that we're lousy at it. So the worst thing that would happen is if we underestimate. So we overestimate, we end up over-provisioning, and we have resources that are underutilized.
The answer here is to put capacity planning in both of the places. Now, it's not as easy as that. It comes back to the contract that's sitting between the platform team and the application team.
You cannot, for example, have the app team doing their capacity planning and, remember, doing the scaling, without any limits. Capacity planning is not just an estimation function now. It is really capacity management. If I need more, I get more. But how do I keep the application teams from exhausting the resources that are in the platform? Well, we do that with contracts. Simple things like quotas.
Even if you're using GCP or Azure or EC2 or any of the AWS capabilities, you have quotas. Yes, it's very simple to get more, but you have that contract with AWS that says, "Here's the amount of capacity that I need." AWS, or whoever your platform team is, is going to use those quotas to estimate the actual capacity that they need to provide from the platform.
So we come up with that contract, and then each of the teams comes up with the processes that they're going to use both to provide enough capacity to their consumers and to estimate their own capacity needs going down the stack. So capacity planning, or really capacity management, changes as well.
All right. The next ones, you'll notice I pulled from the data team. Now, I will confess to you right now, having worked on Cloud Foundry for just about the last five years, that we as an industry have made a lot of progress on breaking things up and figuring out how to reorganize groups when it comes to application capacity, when it comes to compute. But we haven't done as well on the data side.
In most cases, we're seeing organizations creating microservices-based architectures where, if you peek behind the covers just a little bit, you notice they're all tied to the same database. The same very large monolithic database.
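The quota contract can be sketched in a few lines. This is an illustration under assumptions, not how any real platform implements quotas: the `Platform` class, its method names, and the team names are hypothetical. App teams scale themselves freely up to their quota, and the platform team plans against the sum of the quotas instead of asking each team for an up-front estimate.

```python
class QuotaExceeded(Exception):
    """Raised when a team asks for more than its agreed quota."""


class Platform:
    """Hypothetical platform that enforces a per-team instance quota."""

    def __init__(self, quotas: dict[str, int]):
        self._quotas = dict(quotas)
        self._in_use = {team: 0 for team in quotas}

    def scale(self, team: str, instances: int) -> None:
        """Set a team's running instance count; the team decides when to call this."""
        if instances < 0:
            raise ValueError("instance count cannot be negative")
        if instances > self._quotas[team]:
            raise QuotaExceeded(f"{team} is limited to {self._quotas[team]} instances")
        self._in_use[team] = instances

    def committed_capacity(self) -> int:
        """Upper bound the platform team plans for: the sum of all quotas."""
        return sum(self._quotas.values())

    def used_capacity(self) -> int:
        """What the app teams are actually running right now."""
        return sum(self._in_use.values())
```

For example, with `Platform({"pharmacy": 10, "reviews": 4})` the pharmacy team can call `scale("pharmacy", 6)` on its own, while a request for five instances from the reviews team raises `QuotaExceeded` and becomes a conversation about changing the contract.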
And from an organizational perspective, we've seen very little movement on the way that the data team is reorganized. The data team that I'm talking about here is the one that's responsible for providing any kind of database capacity into the organization. I sometimes like to say this is the group that you go to and they say, "Hi. Oracle's the answer. What's the question?"
Right?
Well, we want to break that apart as well. So whether these terms are the exact right terms or not, you'll notice that I moved the DBA, and I'm considering the DBA the individual who's responsible for providing the database servers, the database clusters, for providing that capacity. They belong as a part of the platform team.
Now, the platform team, I will tell you right now, can in fact be subdivided into smaller two-pizza teams. So you might still have a team that specializes in providing relational database capacity, another team that specializes in providing compute capacity, another one that specializes in providing graph database capacity, and so on, key-value store. But we have that team that's responsible for providing those services as a part of the platform substrate.
But then what we want to do is, just like we have enabled our application teams to build and deploy their own applications, we want to give them control over their databases and their schemas. Let them evolve their schemas. Let them version those schemas. Let them figure out how they can have multiple schemas running in parallel, all of those types of patterns. So we want to break up data into the right groups as well.
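As a rough sketch of what an app team versioning its own schema can look like, the example below keeps an ordered list of migrations alongside the application and records the applied version in the database itself. It uses SQLite's `PRAGMA user_version` from the Python standard library purely for illustration; the table, the columns, and the `migrate` helper are assumptions, and real teams typically reach for a dedicated migration tool.

```python
import sqlite3

# Ordered schema changes owned by the app team; position N holds the
# statement that takes the schema from version N to version N + 1.
MIGRATIONS = [
    "CREATE TABLE prescriptions (id INTEGER PRIMARY KEY, drug TEXT NOT NULL)",
    "ALTER TABLE prescriptions ADD COLUMN refills INTEGER NOT NULL DEFAULT 0",
]


def schema_version(conn: sqlite3.Connection) -> int:
    """Read the schema version that SQLite stores in the database file."""
    return conn.execute("PRAGMA user_version").fetchone()[0]


def migrate(conn: sqlite3.Connection) -> int:
    """Apply any migrations not yet applied and return the resulting version."""
    current = schema_version(conn)
    for version, statement in enumerate(MIGRATIONS[current:], start=current + 1):
        conn.execute(statement)
        # Record progress after each step so a rerun resumes where it stopped.
        conn.execute(f"PRAGMA user_version = {version}")
        conn.commit()
    return schema_version(conn)
```

Running `migrate` a second time is a no-op, which is what lets the team run it as part of every deployment. The platform team still provides the database capacity underneath; only the schema moves to the app team.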
All right. Next, I want to talk a little bit about product teams needing product managers.
I'm going to bring another organization into the picture here, and that's the business.
What we've had in the past, you'll notice there under enterprise architecture that we have business analysts, and business analysts have generally been in the business of taking the requirements from the business and translating them into something that can start to launch the rest of the IT process, the development process.
What I want to do here is I want to take your business analyst and not make them the product manager. I want to pair them with somebody from the business. Because if you don't pair them with somebody from the business, then you're still throwing things over the wall. Remember the picture at the very beginning. I'm still starting with the business who's throwing things over the wall. So you're still going to have that conflict, that tension, that finger-pointing that happens when we do scope creep and all of those types of things.
Make them part of the product management team. They're now responsible for the scope themselves. There's no longer change orders.
Now, of course, it's not just the application, the consumer-facing application team, that needs product managers. The platform team needs product managers as well. Who exactly that individual is will vary. Your mileage will vary. But what we've found is really, really helpful is that the folks from your enterprise architecture group are really good candidates for becoming the product managers for the platform team, or maybe pairing with somebody from the infrastructure teams in that role.
And I'll say more about enterprise architecture in just a moment. You'll notice that I've put enterprise architecture as a part of the platform team, but I've left them over there in their bubble as well. You're going to see that the enterprise architects continue to be very busy in this new world.
Okay, so let's shift things around a little bit. I need a little more screen real estate. I've got two things left. I've got some enterprise architecture roles, and I've got enterprise application roles.
So DCTM, you're probably wondering what that is. That is homage to my background. I came to Pivotal from EMC. I came to EMC from Documentum. I came to Documentum from eRoom. Anybody know eRoom?
Oh, yes, a couple of hands. All right.
So that was my long path to Pivotal, from eRoom all the way through. The Documentum enterprise application was built 30 years ago. It's one of those monolithic applications built on top of kind of a three-tier architecture. It's got that big old Oracle database or SQL Server database at the bottom, very resilient storage systems. It's got a big old thick tier in the middle, and then it started out actually with a desktop client application, and this was pre-web, of course.
So you all have these types of enterprise applications in your organization. We have to continue to deal with that.
The first thing that I'll tell you is that I want you to start thinking about your Documentum team, or your enterprise application team, whatever enterprise applications you have, as product teams.
Now, they are not going to be able to move with as much agility as some of those other organizations. However, we have also heard here and at other DevOpsDays from people like Rosalind Radcliffe, who is applying DevOps principles to the mainframe. So the lessons that you're learning from her are the lessons that you should be applying here. You can do product management. You can do DevOps in these settings as well.
Now, these systems, however, will tend to still move, particularly while you're still making the DevOps transformation, at a pace that maybe is not quite the same cadence as the daily or multi-daily releases that are happening by the app teams.
So what we want to do now is we're going to create multiple product teams, multiple application teams across the top that are leveraging both the platform, the new platform, as well as connecting into the enterprise systems.
And I call that the legacy service team here. This is the team that is generating the interface that's going to mediate between the application teams on the left-hand side and the enterprise system on the right. Notice that it's a product team, just like any of the other product teams, with a product manager, capacity planning, all of those types of things.
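The mediating interface that the legacy service team owns can be thought of as a small facade. The sketch below is illustrative only: `LegacyDocumentStore`, `fetch_record`, and the field names `TITLE_TXT` and `OWNER_CD` are made up for this example and do not describe Documentum's actual API or data model.

```python
from typing import Protocol


class LegacyDocumentStore(Protocol):
    """Whatever client the enterprise system exposes (hypothetical shape)."""

    def fetch_record(self, object_id: str) -> dict: ...


class DocumentService:
    """Interface owned by the legacy service team and consumed by app teams."""

    def __init__(self, legacy: LegacyDocumentStore):
        self._legacy = legacy

    def get_document(self, doc_id: str) -> dict:
        """Translate a legacy record into the stable shape app teams rely on."""
        raw = self._legacy.fetch_record(doc_id)
        return {"id": doc_id, "title": raw["TITLE_TXT"], "owner": raw["OWNER_CD"]}


class FakeLegacyStore:
    """Stand-in for the enterprise system, useful for testing the facade."""

    def fetch_record(self, object_id: str) -> dict:
        return {"TITLE_TXT": "Q3 report", "OWNER_CD": "cdavis", "LOCK_FLG": ""}
```

Because app teams only see `get_document`, they can keep releasing daily while the enterprise system behind the facade changes at its own cadence.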
All right, coming down to the final couple of roles here. Let's talk a little bit more about enterprise architecture.
I was at a conference a couple of years ago, and I was having breakfast with a number of individuals that I didn't know. There was somebody from actually one of the home shopping networks. I think it was QVC. There were two individuals from that organization there, and one of the individuals was telling a story where he said, "Last year when I was at the conference, I was part of one of these teams over here. This year, I'm in enterprise architecture. So last year, I was policed. This year, I am the police."
And I thought, "Ooh, remind me not to go work there."
What we want to move away from is this notion of enterprise architecture being the ivory tower. And this has been one of the most successful things that I've seen with the organizations that I've been working with. By and large, the enterprise architects love this transformation.
Here's what we're doing. I'm going to introduce a new house. By the way, the third house that I added was your enterprise applications house. The last house that I'm adding here is what I'm calling the enablement house.
And it says: take some of those functions that are in enterprise architecture and, instead of having them be the ivory tower that says, "We'll figure it all out. You then just follow the practices," have them in the role of enabling teams. And one of the best ways that you can have them in a role of enabling teams is actually to make them part of the team.
So you notice here that I didn't remove them from the enablement organization. These are individuals that can still stay as a part of a matrixed organization like enterprise architecture, but they actually spend part of their time pairing in the teams. They become team members. They're measured on that team's success.
It's important, though, that they still have this broad view across the different projects, because that's where you start to see opportunities for reuse.
One final thing that I want to add into this picture is that enterprise architecture is not the only organization that I'm suggesting stays together as an organization and is then paired into the product teams. Other organizations like that would be things like information security.
And that, in fact, is exactly what Tomer was talking about from HP yesterday, where he said, "We have our security folks, in fact, going down and working with the product teams, addressing security concerns through the entire life cycle."
So that is the final sorting that I want to do. And I will just say, let the wizarding begin.
Thank you so much.