Las Vegas 2022

The Rise and Fall of DevOps

Are you focusing on the wrong DevOps Toolchain? The foundations of DevOps lie throughout the organization and are not limited to “Dev” and “Ops”. Successful organizations forge 6 critical links in the chain, and if just one link breaks, improvement stalls and then crashes. The secret to staying on the rise? Don’t lose sight of the real DevOps Toolchain: Mission, Structure, Ownership, Platform, Learning, and Trust. Join us as we look at how these critical links work together to lead to successful outcomes. We’ll also tell stories from the trenches about how losing focus in just one of these areas can cause the transformation to stagnate, or even reverse success in higher-performing organizations.


Full transcript

The complete talk, organized by section.

Bryan Finster

Right. Yes. I am Bryan Finster. I'm a distinguished engineer and value stream architect with Defense Unicorns. We deliver platforms to help teams in some of the most challenging environments -- think submarines.

Dana Finster

I'm Dana Finster. I'm a developer. I'm currently a technical lead for development in cybersecurity, and I've also had experience leading grassroots transformation through building communities.

Bryan Finster and Dana Finster

And we realized as we were getting ready for this talk that between us we have 50 years of experience delivering software in large enterprises, both badly and sometimes really well -- occasionally well.

Today we're talking about the rise and fall of DevOps. We're here to share experiences and observations about how DevOps rises and then falls, and hopefully we do more of the former.

We've both witnessed and lived this in multiple organizations. A leader presents a mission: we need to improve our ability to deliver working solutions to end users daily.

Broad changes occur to help the mission. Platforms are built to smooth the way. Communities of practice grow to share how we do things. Hands-on training helps teams with good practices around things like continuous delivery. And then pride and ownership grow as people start seeing the outcomes and start delivering better.

And then things change. Complacency sets in, and engineering excellence and the focus on improvement are lost. And then detractors in the organization gain momentum: I mean, it's just another fad, another buzzword. The change boards and after-hours installs are reinstated because quality has been lost. Regression causes high performers and the frustrated to leave the organization. And then the organization is worse off than when we began.

The fall of DevOps results from the failure to sustain the rise of DevOps.

And when the DevOps momentum's going really strong, it feels good. We start thinking, we've got this. But sustaining that change is really difficult, and it can be very fragile. It's fragile when we start thinking we might be done and the efforts to continuously improve the process start to slow down. If the improvement mindset is not baked into everything we do, then culture starts to regress.

We start to revert back to old ways. But we still use engineering-focused DevOps tools to build and deploy faster, just with lower quality, because those tools are not what really empowers DevOps.

Today we're here to share with you what we want to represent as a better DevOps toolchain, what really holds transformation together, and that's mission, structure, ownership, platform, learning, and trust.

The links in the chain are the foundations of DevOps, and all of them are necessary for success. It's imperative to continuously monitor which of these links are missing or maybe just starting to break, and we're going to talk about each of these ways to break them as well as ways to forge them, because the first step to improve an organization must be to identify the weakest links.

That's not me, by the way. We can't start finding ways to improve the system and deliver faster without knowing what we need to deliver and why. What's the mission?

Now, you can have missions, and they can be really crappy missions, and they really screw up your chain. Improve DORA metrics. I've literally seen this as a goal in an organization. It does nothing to help the customer. All teams should use machine learning because machine learning solves everything. Or we need 100% utilization. It'd absolutely be a shame if all the coders weren't heads down typing every minute.

The mission should make us better at what we do. It should inspire action and provide better value to our customers. They shouldn't be buzzwords or internally focused.

And even a good mission, though, is meaningless if the organization isn't aligned to it. We had a mission across tech to deliver business value daily, and this was announced by the CTO just as I was ramping up for an internal DevOps day, planning to launch a continuous delivery user community.

As it turned out, this was perfect timing. Everything took off quickly. Everyone was excited about the DevOps. They're excited about continuous delivery. And it allowed developers to work together across teams and areas because they're all trying to solve the same problem together.

And by building this grassroots community, we achieved a reduction in silos and a more open, trusting environment with better communication, which allowed us to make progress toward our mission.

Because zero defects was the mission that got us to the moon, right? No, that's just another crappy mission. Why deliver business value daily? Because it's hard. As Kennedy said, because that goal will serve to organize and measure the best of our energies and skills. That forces us to elevate the entire organization.

Of course, a mission is just a dream if we just write it down and don't have the structure to achieve it. The design of the organization has a direct impact on the ability to deliver. Ultimately, we're trying to solve a communication supply chain problem, and an efficient supply chain requires good structure.

We need a structure that reflects the outcomes we want. Conway's Law says that systems will resemble the communication structure of the organizations that build them, and I've lived this. I've lived in areas where we have random teams building features that delivered random results and terrible architecture. I've also been in organizations where we had deliberate team structure deliver better architecture. I guarantee that that allowed us to achieve greatness.

Teams should be deliberately architected for improved communication, aligned to business capabilities so that they become business domain experts, not just coders, and decoupled from each other to allow independent value delivery.

In one organization, we tried using an agile scaling framework to deliver better for years. All that resulted was added overhead. We replaced that with teams aligned to business capabilities. We finally achieved our goals for improved delivery. De-scaling and reducing complexity with engineering and architecture was far more effective than trying to manage complexity with process.

And accidental structure leads to accidental architecture, and accidental structure is not sustainable. It builds tech debt into the organization. We get this accidental structure and accidental architecture when we try to implement feature teams that are accountable for delivering features but don't have the context of the overall business capability, when we have functional silos that are often driven by the reporting structure. They result in heavy handoffs and overhead.

Individual responsibility for a product makes the system unsupportable. I've seen individual responsibility given that resulted in products where the code and even the technologies used were not well understood by the teams that ultimately had responsibility for it.

Deliberate structure not only improves the supply chain of communication, but it also helps us forge the next link: ownership.

Ownership means more than being held accountable. Teams that own solutions really care about the outcomes. They want to take -- they're inspired to take action. They want to be responsible.

And eliminating ownership is super easy: dictating to teams. We're using Scrum for everybody. Everybody aligned on two-week sprints with the same cadence. Assign individuals responsibility for delivering features. Treat people as resources that can easily switch between domains because it's all Java, after all. Business knowledge doesn't really matter.

And measuring individuals on output. I worked with one team where the manager had this goal: I want to make sure I have all the information I need for annual evals. And so he was measuring people based off of how many tasks they completed in Jira. He had staff engineers and entry-level developers focused on that, which meant mentoring stopped, code reviews were slow, you didn't get help with any problems, because if I helped you it hurt me, and so you do not want to do that.

And quality is actually what you really get. That's the outcome of ownership. When teams are in touch with user feedback, when they feel responsible for their users' happiness, when they own it and they're able to innovate and solve the problems, it makes them proud of the solutions that they bring. They want to ensure what they deliver is reliable, functional, secure, and compliant.

Personally, I've had operational responsibility for my software for most of my career, except when it's been stripped away occasionally, and I assure you that ownership drives quality, because at 3 a.m. I feel that quality ownership deeply.

Ownership doesn't mean teams own everything. It means they own the problems, the solutions, the outcomes. A recent article stated devs don't want to do ops. Now leave aside some other troublesome things in that article, but DevOps doesn't mean development teams own the infrastructure to deliver. That adds too much complexity, impacts business value. We need platforms to make that easy.

And having a common platform enables us to leverage the tools that help support all the other links that hold this together.

As Bryan says, platforms should make the right things easy and make the wrong things painful. Exactly. Platforms should encourage improvement.

How we implement this in platforms can have a dramatic impact on our improvement goals, good or bad. Are we acting as gatekeepers, hindering people, making sure they do the right things the wrong way? Or are we empowering better outcomes by making it harder to make mistakes and easier to do things correctly?

Platforms can harm our efforts if the focus isn't on making delivery easier and safer, if we ignore the developer experience of our customers (other developers), or if we act as police to enforce policies instead of making policies easy to implement. If interacting with the platform can only be done with a support ticket, it will get crushed under its own overhead.

A common failure is building platforms to support how work is done today instead of leveraging them towards the workflow we need to meet our goals.

Absolutely. I was at an organization where there was a great focus on building a centralized platform in order to enable continuous delivery. That was the goal, continuous delivery, and we started defining the good practices, what we needed to enable to encourage people and allow them to do continuous delivery.

And as soon as we said we're going to build these pipelines so that we can help grow these practices, all that got heard was pipelines, and all of a sudden the goal shifted. So we need to onboard everybody to the pipelines. So stop evangelizing. Stop coming up with exactly the right way to do it. Let them come on exactly with what they're doing today. Let them use GitFlow. Let them have long release branches.

And this is not scalable. It doesn't encourage learning new skills and practices for effective continuous delivery. That platform that could have been an enticement to improve our engineering excellence became a checkbox. Yep. We're using a centralized platform, so we're good. Success.

Platforms are not just tooling. Hopefully we all know that by now. We mean a centralized self-service product that allows continuous improvement and hardening of the pipelines and allows us to build in security and compliance standards, which can give organizations confidence that every delivery is meeting their standards without teams being able to forget about it or even turn it off.

So the centralized platform that's built to meet the right outcomes bakes good engineering practices into the culture so that it becomes just the way we do things throughout the organization. And the only reason platform teams exist is to make value delivery easier and to make mistakes harder, to serve their customers.
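The idea of standards that teams cannot forget or turn off can be sketched in a few lines. This is a minimal illustration, not anything shown in the talk: the `Pipeline` class, the `MANDATORY_STEPS` tuple, and the two checks are hypothetical names, and a real platform would call actual scanners and test runners rather than inspect a dictionary.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Callable

# A step takes a build context (a plain dict here) and returns True on success.
Step = Callable[[dict], bool]


def secret_scan(ctx: dict) -> bool:
    """Hypothetical platform-owned check: fail if a secret marker is in the source."""
    return "SECRET=" not in ctx.get("source", "")


def unit_tests(ctx: dict) -> bool:
    """Hypothetical platform-owned check: the team's tests must have passed."""
    return bool(ctx.get("tests_passed", False))


# Owned by the platform team. Product teams have no API to remove or reorder these.
MANDATORY_STEPS: tuple[tuple[str, Step], ...] = (
    ("secret_scan", secret_scan),
    ("unit_tests", unit_tests),
)


@dataclass
class Pipeline:
    team_steps: list[tuple[str, Step]] = field(default_factory=list)

    def add_step(self, name: str, step: Step) -> None:
        """Self-service: teams add their own steps without a support ticket."""
        self.team_steps.append((name, step))

    def run(self, ctx: dict) -> tuple[bool, list[str]]:
        """Run mandatory steps first, then team steps; stop at the first failure."""
        executed: list[str] = []
        for name, step in (*MANDATORY_STEPS, *self.team_steps):
            executed.append(name)
            if not step(ctx):
                return False, executed
        return True, executed
```

A team would create a `Pipeline()`, add its own steps with `add_step`, and call `run`. The design choice being illustrated is that the mandatory checks live in the platform's code rather than in each team's configuration, so the right thing happens by default and skipping it is not an option the interface offers.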

And platforms enable many new capabilities if they're done right. They're changing the way that we work. So to use them, we must unlearn old habits and the way we used to do things so that we can learn new ways, because DevOps takes a different mindset.

Transformation literally means change the way you think, or a holistic change that requires learning new things.

It's really important, though, to make sure that the learning is focused on the right things. Are we collecting certifications to hang on our wall, or are we using what we learn to actually improve how and what we deliver?

We can really screw this up if we don't pay attention to it. Just a few ways to screw it up are using the excuse of being too busy to stop and learn to do work better; surveying developers to see what cool new technologies they want to learn that are not related to the work they're doing; changing tech stacks because it's cool, not because it's better for the use case, which only adds cognitive load; and not offering time and opportunities to attend training and improve relevant skills as part of the job. We cannot expect everyone to be security and quality hobbyists. I saw a team once that had five languages and was adding Python because they wanted to learn Python.

And DevOps is hard because it requires looking at things very differently. This is hard to understand if we don't unlearn the old ways. We had to set aside what we knew before and not be afraid to really change. We can't wait till our product is big and shiny and perfect to deploy because by then our competitive advantage or the needs of our customers have probably changed. There's now no end, so we can't possibly inspect for quality and security at the end. And there's no time for manual auditing.

We need to make the goals of the mission part of onboarding new people. It needs to be a continuous thing. We're always communicating goals, the mission. We can't focus learning only on individuals. Learning needs to encompass how teams work together. And everyone who wants to lead at every level needs to understand that they need to learn too. It's not just on us.

Learning to shrink feedback loops is what really drives improvement through the organization and uncovers all of the problems. And learning isn't extra work. It's the job we do every single day.

And we'll have failures along the way. We have failures now. We don't have to wait for DevOps to get failures. To stay safe, we need to learn from those failures. We need to identify the problems. Blameless postmortems, so we can find the issues of our system instead of blaming the people who work in this system. Blaming people reduces quality and makes us less secure. The pragmatic approach to improving the system is for people to feel safe so that they can be transparent while uncovering the causes of the failure. Transparency requires trust.

Trust is foundational. The first DevOps conference I came to, Gene stood on the stage and talked about trust being the first constraint to delivery. It's the first constraint to improving outcomes. It's also the most fragile link. It takes a lot of effort to build up trust, and it takes no effort to destroy it.

And when trust is missing, we get fear. Everybody's afraid. We don't trust the mission, so we don't feel the purpose in our work. When individuals fear sharing ideas, the innovation suffers. When people hide their mistakes because they're afraid to ask questions, they're afraid they're going to get blamed, progress really slows down.

So if you want to screw it up, you know, just take ownership away from the people doing the work and make them do it your way. Of course, then you own the quality. Measure individual performance rather than the health of the system, because really HR metrics are far more important than business metrics. And set arbitrary deadlines, stretch goals, to encourage maximum effort from the teams because really we want to make sure they're really focused on getting things done, none of that learning crap.

And so we're all going to be privileged to hear from Dr. Westrum tomorrow, and his research really tells us that a generative culture, one that is very performance- and mission-oriented, allows the people that are involved to put aside their personal issues because they're all focused on the mission. They're focused on what we need to get done, not who should be doing it. This allows a higher level of trust both across the organization and up and down the hierarchy.

Fix the system, not the people, and grow trust by working on the system together. People aren't afraid because they know that they're not going to be blamed.
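One way to picture measuring the health of the system rather than individuals is a report that has no per-person field at all. The sketch below is an assumption-laden illustration, not a method from the talk: the `Delivery` record, its fields, and the two signals chosen (median lead time and change failure rate) are hypothetical, and the numbers are meant as signals for the team to learn from, not as goals in themselves.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime
from statistics import median


@dataclass(frozen=True)
class Delivery:
    """One change that reached users. Deliberately has no author field."""
    team: str
    committed_at: datetime
    deployed_at: datetime
    caused_incident: bool


def system_health(deliveries: list[Delivery]) -> dict[str, float]:
    """Summarize delivery flow for the whole system, not for any individual."""
    if not deliveries:
        raise ValueError("no deliveries to summarize")
    # Lead time per delivery, in hours, from commit to deployment.
    lead_times = [
        (d.deployed_at - d.committed_at).total_seconds() / 3600
        for d in deliveries
    ]
    failures = sum(d.caused_incident for d in deliveries)
    return {
        "median_lead_time_hours": median(lead_times),
        "change_failure_rate": failures / len(deliveries),
    }
```

Because the record carries no author, the report cannot be turned into an individual scorecard, which keeps the conversation on what in the system slowed a change down or let a defect through.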

We need to avoid breaking these links, this entire system. None of these links are optional. We have to have the whole chain. The chain breaks if the mission is not aligned to business value; if structure is not aligned to our delivery goals and improved communication; if we have accountability without ownership; if the platform is infrastructure as a service desk; if learning is not focused on system improvement, if we're just learning to, I don't know, learn something; or if we have a culture of fear or bureaucracy instead of trust.

When we first build these links, they're made of paper. I mean, you can see in our graphic we use a paper chain. But we need them to be forged in steel.

And how we can forge these links is by aligning on a meaningful mission; reducing delivery friction with deliberate structure; giving teams ownership to deliver better outcomes; building centralized platform solutions that make the right thing easy and make it harder to make mistakes; improving the system by continuously learning and unlearning; and building a generative culture of trust by collaborating on a common mission.

But the thing that I found that's most difficult to overcome in developing a culture in an organization is the constant need to measure individuals for performance evaluations. The help we need from you is: how are you gathering the information that's needed to recognize that people are excelling or struggling while being able to maintain trust that is needed for teams to excel?

You can reach me on LinkedIn or Twitter. We are around. We'll be around the rest of the conference, and feel free to direct message or hit us up within the DevOps. Yeah, we're happy to talk to you, and you can reach me on LinkedIn and Twitter. You'll find me ranting somewhat when I get irritated on LinkedIn, kind of a little.

And we have a link here for some recommended references and resources, things that we found useful over the years. It has deeper information on some of these things. That blog post, by the way, we'll keep iterating on. That's an MVP. And if these links really resonated with you, come seek us out, talk to us. We actually have stickers with that logo.

Thank you so much. Thanks.