London 2020

Confession + Lightning Talks + Closing Day 1

A DevOps Confession anonymously recounts a near-catastrophic automation failure, dubbed 'Certpocalypse', in which a script under test was pointed at an enterprise's one and only production certificate store and began revoking and deleting certificates before the developer killed it. The story exposes the cultural and technical risks of moving too fast without proper environment segmentation. Two lightning talks follow. Kat Swetel challenges the conventional narrative that DevOps is an extension of Agile, arguing instead that it is a reaction to Agile's growth-at-all-costs mindset. Nate Ashford shares a deeply personal transformation story connecting DevOps principles, such as making work visible and error budgets, to a weight-loss journey and a subsequent leukemia diagnosis.


In this talk, you'll learn how the fear of blame can suppress critical lessons from production failures, why maintaining the balance between innovation and operational maintenance is central to DevOps thinking, and how the principles underpinning DevOps apply far beyond software delivery.

Full transcript

The complete talk, organized by section.

Host Intro (Gene Kim)

In my opening remarks, I mentioned Dr. Richard Cook from the safety culture community. In a previous conference, he told us something amazing. He said there are certain types of stories that you will never be able to hear on stage. Instead, you hear them after the sessions, most likely at the bar after there have been a few drinks, and that is when the real lessons are told. Great practice comes from experience, he says, and experience comes from bad practice.

This struck me as profoundly important. The only people who got to hear those stories had to be in the right time, the right place, and hanging out with the right people. So we wanted to bring those stories so that you can hear them as well. Thus, we created the DevOps Confession format. We have collected stories that we have anonymized and will share with you.

So please welcome one of our programming committee members, Cornelia Davis, CTO of Weaveworks, who will be reading one of these stories. Over to you, Cornelia.

Cornelia Davis

Hello. My name is Cornelia Davis, and I am part of the programming committee for the DevOps Enterprise Summit. And today, I am pleased to bring you this DevOps Confession. We call this one Certpocalypse.

Several years ago, after many years of operational struggles and an inability to improve our operational environment, we realigned development and level two operations into cross-functional build-run teams. That is, our developers would be taking on some of the operational responsibilities.

The existing operations leader did continue providing some operational support, including level one help desk and other ITIL processes such as change and incident management. There were many objections to this plan and a lot of background chatter about it, including, 'DevOps will not work here,' 'Engineers do not understand ops,' and, 'Engineers should never access production.'

After the dust settled from the reorg, we set out to improve operations by treating the operational processes as engineering problems. We began automating a lot of things. This included a heavy focus on infrastructure as code and using version control for all scripts and tooling that supported operations.

We started looking for easy wins to reduce toil on the teams who were in constant firefighting mode, working around the clock to keep up. We found that we had hundreds of certificates to renew each month. Renewing a single cert required a ticket. A team would process that ticket and return a certificate, after which we would manually apply it to production. Lead times were very long, and manually retrieving the cert and properly updating it in production was error-prone and dangerous, often causing an outage.

Fortunately, our certificate store had an API, and we felt that this was a great opportunity to automate a tedious and error-prone process. We had some of our best engineers design and implement a process that would check the certificate store, and if a cert was up for renewal, it would revoke and delete the old certificate and issue a new one to be installed into production. A script was built that would iterate through all certs and do the deletion and issue process.
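Editor's sketch: the renewal loop described above might look roughly like the following. This is a minimal illustration, not the team's actual script. The `InMemoryCertStore` class, its method names, the 30-day renewal window, and the `dry_run` flag are all assumptions introduced for the example; the story does not describe the real certificate store API. The point it shows is that the loop should act only on certificates that are actually due, and that a destructive script can default to reporting what it would do.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta


@dataclass
class Cert:
    name: str
    expires: date
    revoked: bool = False


@dataclass
class InMemoryCertStore:
    """Hypothetical stand-in for the certificate store API (not a real product)."""
    certs: dict = field(default_factory=dict)

    def list_certs(self):
        # Return a list copy so callers can modify the store while iterating.
        return [c for c in self.certs.values() if not c.revoked]

    def revoke_and_delete(self, name):
        self.certs.pop(name).revoked = True

    def issue(self, name, today, lifetime_days=365):
        cert = Cert(name, today + timedelta(days=lifetime_days))
        self.certs[name] = cert
        return cert


def renew_due_certs(store, today, window_days=30, dry_run=True):
    """Renew only certs expiring within window_days; return the names selected."""
    cutoff = today + timedelta(days=window_days)
    selected = []
    for cert in store.list_certs():
        if cert.expires > cutoff:
            continue  # not up for renewal: leave it untouched
        selected.append(cert.name)
        if dry_run:
            continue  # report only: nothing is revoked or issued
        store.revoke_and_delete(cert.name)
        store.issue(cert.name, today)
    return selected
```

Because `dry_run` defaults to `True` in this sketch, a first test run only lists the certificates it would renew; revocation requires passing `dry_run=False` deliberately.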

During development, the engineer went about testing this for the first time, and the script accessed the store to carry out the process. Unfortunately, the script was pointed at the one and only production instance of the certificate store. The developer watched in horror as the script began deleting and revoking every certificate in the enterprise.

Fortunately, the developer, carefully watching the output scroll by, reacted quickly, killing the script when it had only invalidated a handful of certificates. Had the developer not noticed, it would have revoked and recreated every certificate in the enterprise, including those that support our communication systems and email.

The developer quickly escalated to management, and the team worked to restore the invalidated certificates, minimizing the impact of the mistake. Following the incident, we did, of course, put proper controls into place to segment the production certificate store from that of the lower environments.
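Editor's sketch: the story does not say which controls were put in place. One simple form of environment segmentation, shown here purely as an assumption, is to make tooling resolve its target store per environment and refuse production unless the caller opts in explicitly. The endpoint URLs, the `resolve_store_url` function, and the `ProductionTargetError` exception are hypothetical names invented for this example.

```python
class ProductionTargetError(RuntimeError):
    """Raised when a run targets production without an explicit opt-in."""


# Hypothetical endpoints: one certificate store per environment.
STORE_URLS = {
    "dev": "https://certs.dev.example.internal",
    "staging": "https://certs.staging.example.internal",
    "prod": "https://certs.prod.example.internal",
}


def resolve_store_url(environment, allow_production=False, urls=STORE_URLS):
    """Return the store URL for an environment, guarding production access."""
    if environment not in urls:
        raise KeyError(f"unknown environment: {environment!r}")
    if environment == "prod" and not allow_production:
        raise ProductionTargetError(
            "refusing to target production without explicit opt-in"
        )
    return urls[environment]
```

With a guard like this, a script run during development fails fast if it is pointed at production by mistake, instead of starting to revoke certificates.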

We have never shared this story outside of a select few people. At the time, the organization was still raw from the reorg and would have looked to punish the individual and potentially unwind our move to DevOps.

Lightning Talk Intro (Damon Edwards and John Willis)

Damon Edwards: Well, hello, everybody. My name is Damon Edwards.

John Willis: Hey, I am John Willis.

Damon Edwards: You might know us from the DevOps Cafe podcast, which often features a lot of folks from the greater DevOps Enterprise Summit community. So we are happy to be hosting the Lightning Talk session. If you want to catch more of our podcast, you can go over to devopscafe.org and register to be notified of the upcoming next generation of the podcast.

But today, John, we are here to introduce some Lightning Talks. Can you tell us all what Lightning Talks are all about?

John Willis: Yeah. For those of you who do not know, the Lightning Talk format is basically 20 slides, 15 seconds a slide, and five minutes. What is interesting is usually it is very stressful live. Virtually, we have asked people to create that same stress level by automatically advancing, and I do not think we had any cheaters. So I think we are pretty good here.

Damon Edwards: Yeah. It is a virtual high wire act for the performers and it is a little intermezzo or a palate cleanser for everybody else. So hopefully you enjoy them. John, who is first?

John Willis: First, we have Kat Swetel and Nate Ashford, and I think it is going to be interesting. Their talks are around socio-technical topics, and actually, Kat has worked with Nate, so it will be interesting to see both of them tell their stories.

Damon Edwards: Yeah, I am excited. So here we go. The first two Lightning Talks of the Virtual DevOps Enterprise Summit.

Kat Swetel

Hi, I am Kat Swetel, and today I am going to be telling you an alternative DevOps origin story. I will also warn you that at any moment, my son could come busting through that door, and I probably will not even stop the recording.

So let us go back to the beginning. In the beginning, there was Agile. There were not any women, or were there? I do not know. It depends on who created the guest list. Am I right? Hashtag not my uncle.

In the beginning, there was Agile. It started with developers, but it became very clear very quickly that they would need to go out and conquer other disciplines within software development and product development. So we see the testers, the BAs, project managers, all sorts of folks join the Agile team. Why? Because we need to prioritize obtaining new information from the quickly changing market.

So we need all of these disciplines to work together to enable those ten-x developers to get that information from the market as quickly as possible. So we have to subordinate the other disciplines.

Well, what do you do after you have conquered all the other disciplines in software development, in product development? The next frontier is going to be the deployment and the operations of that software. So Agile, of course, decides now we need to further expand our Agile empire and we need to conquer ops.

Not really that difficult. You just add some words to the Agile Manifesto. It is a little bit weird because now we are doing both development and production. So we have got to come up with a new name. That is understood. That is how DevOps is born. Momentous occasion for us all, and I think that is the history that we all take for granted.

Now I am going to challenge you to consider a little bit of a different take. What if DevOps is not an extension of the Agile empire? What if DevOps is a reaction to Agile?

And to illustrate my point here, I will start by asking a question: why do companies die? If we are going to listen to our fine friends at the Santa Fe Institute, we can think about a company as an organism, a complex system that is finely balanced between the energy that it takes to metabolize new information and the energy that it takes to keep up the maintenance that keeps the company alive and keeps the lights on.

So for us working in companies, we have to balance the metabolism of new information coming from the market so we can stay alive and continue to meet the needs of the customers. We have to balance that with keeping the lights on, with maintaining our systems.

How do we, as technologists, metabolize new information? With working software. With code, right? Of course. So if we want to get lots of information, we have got to get lots of code. And then what happens to all that code? Well, my friends, we have to maintain it, and that is where this balance becomes extremely difficult to maintain.

So I guess what I am challenging you to consider is: what if DevOps is not an extension of the optimized-for-metabolism mindset of Agile? What if it is a new mindset that challenges us to consider the maintenance cost and the balance that we have to strike? We just have this one set of resources, of energy, of time to invest in both the metabolism of new information and the maintenance of what is already existing. So if we have to invest a lot in maintenance, we will not have very much left over for metabolism.

And I think this is really clear if we look at some of the principles that we hold dear in DevOps. They are all about being informed about the system and making really informed trade-offs.

And I know it sounds like I am giving the Agile folks a really hard time with this more, more, more mentality, but we do need that. In order to keep up with the market, we do need to prioritize the metabolism of new information. However, we also need to be really mindful of that balance, and we need to invest in minimizing our maintenance debt so that we can maximize our investment in metabolizing new information from the market.

And so I do not think DevOps came from Agile. I think it is a reaction to Agile. And that is my talk. Thank you.

Nate Ashford

Good evening, everybody. I am Nate Ashford, and this is my story. This is the story of how I changed the world, starting with my world.

I am a lot of things, but basically, I like to make stuff that works, and I like to celebrate it with the awesome people who do it with me. And sometimes we do not get to celebrate very often, and that is where transformation comes in.

The first few times I was involved in a transformation, I was filled with this eager idealism that things are going to be so much better and everyone is going to win. I do still believe in the winning and the better, but it can get a little bit messy, and I cannot always save everyone. But I can help someone. And if I can make a difference in the life of one person, that can be enough. And there is only one person I can actually change: me.

Two years ago, I was a mess. I was not having the impact I wanted to have, despite working long hours. I had a vision for change, but it was not going anywhere. I was spinning my wheels, and the stress was eating me up, and I was eating to match.

I peaked at 255 pounds. I loved martial arts, but I could not ever go to class because I was constantly injured. I was in pain. I was exhausted. I was depressed. I even hid behind the plant in the family photo.

Something had to change. But conveniently, I fancy myself a change agent. I help people change. I am a people!

So I took myself on as a client, and I started with what I know. Start with why. I want to live long enough to see my grandkids and to see them grow up and maybe even see their kids.

Core values: I decided that I like vegetables. Make the work visible: I started tracking my calories, my water intake, my exercise, even my sleep. Measure what matters: I monitored the outcomes that I wanted in weight, body fat, muscle mass.

I set a goal to lose a pound or two a week. I started with just taking a walk, half hour at a time, then an hour, then longer. As the weight started to come off, I started to mix in some running, and then more running.

I started working fewer hours and putting more time into self-care, like a daily routine. In the morning, it was planning and meditation. In the evening, reflection and journal writing. I experimented adding and removing things from my routine as I figured out what worked, and a year later, I had lost a third of my body weight, 85 pounds.

I was happy. I was loving my work. And after 13 years, I finally got my black belt. Life was good, although I did have a couple things I wanted to talk to my doctor about. I put off the appointment until the day after my black belt test so he could not tell me not to do it, and we did a simple little blood test.

Chronic myelogenous leukemia.

Those three words changed my life instantly and permanently. But had I not begun my transformation when I did, I would not have caught the signs, and I might not have found it in time.

The questions in my head: will I live? Yes, thanks to powerful drugs taken every day from here on out. Will I lose everything I have worked for? For that, I need another principle. Error budgets, big ones.

I cannot do karate right now, and I cannot run as often as I would like to. Other things slip as well. I do not always have my A-game, and that is okay.

Will I live with purpose? This is my most recent one, and I choose yes. I choose a life that is about more than managing my symptoms and my side effects.

Today, I am about to celebrate my one-year cancerversary on Saturday. I do not have it all figured out. Thank you. But my error budgets are smaller. I keep trying, pressing forward, and occasionally panicking and pulling back.

But I have learned this: my cancer is not a liability. It is an asset. I am a better person and a better coach because of it. It has taught me about myself, about empathy, about struggle, about limitations, about letting go. And today, I have more impact to create change in the lives of the people around me than I ever had in my previous life because I have changed, and transformation begins with me.

Closing (Jeff Gallimore)

Pretty amazing start to the DevOps Enterprise Summit this year, do you not think?

All right. We are about to head to a break, but I want to share some things to know before you go. Like I said, we are about to break now, and then the breakout talks will start again at 11:25 AM London time.

We have networking time and networking opportunities starting at 2:25 PM London time. That is Birds of a Feather, Lean Coffee, and Chat Roulette. Visit our sponsors at the expo and check out the games, and be back here for more keynote talks starting at 4:05 PM London time.

All right. Have fun. Learn lots.