DevSecOps: How to Use DevOps to Make You More Secure


London 2018


Founder/Chief Security Officer · Signal Sciences

Zane Lackey is the Founder/Chief Security Officer at Signal Sciences and serves on multiple Advisory Boards including the National Technology Security Coalition, the Internet Bug Bounty Program, and the US State Department-backed Open Technology Fund.

Prior to Signal Sciences, Zane was the Director of Security Engineering at Etsy and a Senior Security Consultant at iSEC Partners. He has been featured in notable media outlets such as the BBC, Wall Street Journal, Associated Press, Forbes, Wired, and CNET.

A frequent speaker at top industry conferences, he has presented at BlackHat, RSA, USENIX, Velocity, Microsoft BlueHat, SANS, OWASP, DevOpsDays, and has given invited lectures at Facebook, Goldman Sachs, IBM, Microsoft, Carnegie Mellon University, and the Federal Trade Commission.


Full transcript


Zane Lackey

This talk is about one of the key lessons that I learned really early on, going through the shift around DevOps and security. It's one of the core things that I wish I could've told myself on day one of that shift, to save myself a bunch of pain during it.

So my background and the context here: I've spent my career on the security side of the house. I started out in security consulting and pen testing. If you know NCC Group here in the UK, or iSEC Partners in the US and all over, I started out there for a number of years and then was given a very incredible opportunity to go to Etsy back in, what, 2010, 2011, and be their first head of security and build and run the security program from scratch at Etsy, which was an incredible experience at the time.

And then after a number of years, really seeing how, hey, the way that we're creating software is changing, the way that we're delivering software is changing, maybe the way that we need to secure and protect those web applications and APIs and microservices, that needs to change as well. And so we learned a bunch of lessons on that, and we actually, myself and my two co-founders, stepped out from Etsy, co-founded Signal Sciences, where we turned those lessons learned into a product. It's what the marketing folks call a NextGen WAF or a RASP or anything there. I'm not here to talk about Signal Sciences or that. We all get enough vendor pitches in our jobs every day. But that's been what I've been spending the last several years on.

So, what's this about? It's really about being at the forefront of the shift to DevOps and cloud. At the time, it was mostly Netflix on the West Coast and Etsy on the East Coast of the US that were going through this. So there wasn't much to Google, and I mostly had to make a lot of mistakes along the way and learn about, hey, how does security really change during this shift?

And so I'm going to spoil the ending for you right now, which is that security shifts from being this kind of gatekeeper along the way to actually focusing on enabling teams to be secure by default. Now, if you paid me, I don't think I could make a more cliché-sounding sentence than that, but it doesn't make it untrue.

This is really the existential shift that is going on for security right now, is that it shifts from being this blocker to actually thinking about: how do we make the organization be able to move even quicker?

So what has changed? I don't need to spend a lot of time on this for this audience, I feel like, especially on day two of a DevOps conference. But I think from a security perspective, it really kind of boils down to a couple things here.

One is that change now happens orders of magnitude faster than it used to. And really, in our old models of security, there was an implicit assumption that change happened very rarely, and so we built our whole security programs and tooling around that.

For me, point number one was made super apparent to me, where I left my last day as a security consultant, wrapping up a project at a big US healthcare company. They would make production deployments and changes once every 18 months. So that was my final day on a Friday. I left there on a Friday. I took no time off because I'm an idiot, and started at Etsy on Monday morning, and they sat me down and said, "Right, so we deploy to production 20 times a day. Figure out security. Go."

And so I immediately headed to the closest whiskey bar, and then after that, started to actually iterate and figure out, okay, how are we going to change things here?

And then number two is that the ownership of deployment, and the journey that code goes on to reach deployment so we can make changes to our applications, really changes. It used to be-- I mean, how many folks have lived through the Waterfall era? I would assume-- Yeah. Right. We can all buy each other drinks for having survived that at some point.

But it used to be the dev teams would write code, you'd throw that over the wall to QA, it would come back to development, you'd throw it over to security, it would come back. It'd go to the SysOps group, go to staging, come back, and eventually on to production. And along that way, that journey took 12 months, 18 months, to the point where when some issue was identified, that was a year ago that someone had written that, and they don't remember any context of that, and so it was really a nightmare.

Whereas today, you see code being written, checked in, and potentially deployed within minutes of that. Maybe hours, maybe days, maybe weeks, whatever, but you're still talking about multiple orders of magnitude faster.

And so from a security side, that really changes two things. It changes the culture of how security has to interact with the organization, and that's what this quick talk is about, one key lesson on that. And then there's another side really around how do we change the tools and the individual techniques there. That's a different talk. I actually link to some of that there, but feel free to email me. I can send you the decks on those.

And so the real takeaway out of all of this is that security can no longer be something you outsource to. It can no longer be code thrown over the wall to security, where security tries to take a look at everything, find all the bugs, and ship it back. The existential shift in security right now is that it's going from that sort of model to focusing on how it can make other teams inside the business, the development teams and the DevOps teams, security self-sufficient. And ultimately, security is only successful if it can actually bake into the development and DevOps process moving forward.

This is something where I think security teams are kind of where IT procurement teams were about five years ago, at the real start of the rise of cloud. The SysOps group or the development teams would come to IT procurement and say, "Hey, great, we've got a new project for the business. We need a dozen new servers and four new co-los around the world to actually empower that project."

And the IT procurement team would say, "Great, it's going to take us 18 months to requisition those servers."

And the dev team would say, "Yeah, that's great. We just put down a credit card. We've got everything in Amazon that we need. We don't need to talk to you anymore."

That's where security is right now. You see so often the development teams and the technology side of the house saying, "Hey, we need to ship this new project by Q3."

And the security team will say, "Okay, great. We need six months to review it."

You're like, "Yeah, except it needs to be live in two weeks. So you can take your six months and do it, but we're going to be live the entire time and figure it out."

And so security really has to rethink its process and think about how does it enable those teams to be security self-sufficient along the way.

So with that, I'm going to share a picture of the first day where we tried to just bring classic security models into the DevOps shift. So this was us in our security train shouting about safety beginning with you, and then just plowing into the DevOps car.

And it didn't work. It was one of the very first light sockets that we stuck our finger into along the way of figuring out how does security actually adapt.

But what are the new things that security needs to focus on? Well, it's really visibility and feedback.

Except these aren't new concepts. Security loves to think that it's totally special and it needs to go invent entirely new things, and it really doesn't. All of these things are, whether it's performance monitoring, data analytics, A/B testing, all of these different techniques that helped make the whole DevOps movement really successful, these are all about the same core concepts. It's all about visibility and feedback into complex systems. And by getting that visibility and feedback, we enable ourselves to actually move faster.

And security, it's really in the same spot where these same hard lessons are starting to shift to security. I think that security doesn't need to reinvent the wheel on this.

I personally just really like the comparison to the performance space. So like the APM tools like AppDynamics and New Relic and Datadog, where really you took a previously highly specialized skill set that was run by a completely independent group, and the rise of those tools, one of the reasons that they've been so successful is they allowed you to bring previously highly specialized capabilities into your general technology teams. And by owning those capabilities themselves, they could move quicker as a result.

That's the same lesson that is slowly trickling into security, and it's where we really need to go with security, which is bring previously highly specialized skill sets into our core technology team so that they can own that and move faster as a result.

So let me give you a story about these from the bad old days. This and the next slide are both real, by the way. These are not Photoshops, or not Photoshops by me anyway. But this was a real thing from the '90s.

So this is how we used to have security visibility. This was an airline in the US in the late '90s. And how we had security visibility in the past is you'd have a marketing website, and it would say a bunch of great things about your company and your product and all that. And everything was fine from the security side. And then one day you'd come into work and suddenly your airline was on fire, and the logo of the page was, "So we killed a few people, big deal." This was a real defacement that happened in the '90s, by the way. Probably one of the most hilarious ones, but that's for another time.

And this was really how we had security visibility. It was kind of the way that we had outage visibility in the past, which was that, "Hey, everything's great." And then, "Wait, why are all the phones ringing and customers are really angry that the service is down?"

We only had that very binary set of visibility. Either everything was great or everything was on fire and totally down. And that's kind of where we've been at from the security side, is that it's really either, okay, we think everything's good, or The Times or The Wall Street Journal or a blogger is calling us for comment as to why all of our user credentials are up on the internet somewhere.

And we don't win with that. So how can we actually improve?

Well, I'll give you an example of this, which is: which of these things actually scales? This is something that we really learned from the operations side and have to apply to the security side. Take how we have, and I'll air quote, visibility right now, which is things like logs: mountains and mountains of data about what's going on with our systems.

And from the security side, that's kind of the starting point. We have some logs. Let's see what we can do with that. Does that give us any visibility into our system? Well, the answer is this doesn't scale at all. If you're trying to get visibility off of your logs, that's the right first step, but you run into a bunch of challenges.

And then what you start to get to is, hey, someone trying to look at the logs and alert me from there, that doesn't scale. But if I can start to think about how to surface some visibility for our groups, like actual visibility that's consumable by the rest of the organization, that is something that we can scale. We can bring this visibility to development teams, to DevOps teams, to security teams, and we can start to say, "Hey, when all those logs were flying by, was something actually happening?"

It's hard to look at logs and say that. It's much easier to say, "Wait, why is there a giant spike in the attacks graph and the anomalies graph and everything that's going on here? Let's actually take a look into what's going on."
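As a rough illustration of turning raw logs into something graphable, here is a minimal sketch. It assumes a made-up log format of `<ISO timestamp> <event type> <detail...>` with an `ATTACK` event type; neither the format nor the function names come from the talk. It buckets attack events per minute and flags any minute that is well above the median.

```python
from collections import Counter
from datetime import datetime
from statistics import median


def per_minute_counts(log_lines):
    """Count ATTACK events per minute (hypothetical log format, see above)."""
    counts = Counter()
    for line in log_lines:
        parts = line.split()
        if len(parts) < 2 or parts[1] != "ATTACK":
            continue  # ignore anything that is not a flagged attack event
        ts = datetime.strptime(parts[0], "%Y-%m-%dT%H:%M:%S")
        counts[ts.replace(second=0)] += 1
    return counts


def spikes(counts, factor=3.0):
    """Return the minutes whose count exceeds `factor` times the median.

    Minutes with no events are absent from `counts`, so the median is
    taken over active minutes only. That is a simplification.
    """
    if not counts:
        return []
    baseline = median(counts.values())
    return sorted(m for m, c in counts.items() if c > factor * baseline)
```

The per-minute counts are what you would plot on a shared dashboard; the `spikes` list is the "why is there a giant spike?" question asked in code. The threshold of three times the median is an arbitrary starting point, not a recommendation.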

So a big one that we had to learn out of this is surfacing security visibility for everyone, not just the security team. A mistake that we made early on is as we started to invest in getting visibility and things like this, we brought it just in for the security team. What we learned is that instead of facing this inward, you face this outward.

Sometimes that's actually in a very physical sense of having displays up on the walls in your engineering area and all that. Other times, it's really focusing on publishing this out to the organization, to the entire technology team, to make the point that, hey, this is data that is useful for everyone, not just for the security team.

And so I'll give you an actual practical example of this, which is-- This was a really fascinating one for us to learn, which was when we started to look at some of the visibility we had already. So take like a standard, just HTTP 500 errors graph, right? Just, hey, there are errors happening in the application or the API or anything like that. If you ask your different teams what this implies, it was really fascinating. You would get wildly different assumptions as to what this implied.

So when we asked our particular development team, they said, "Oh, we just put a whole bunch of new engineers through boot camp. Probably one of them did a bad code deploy, and we either rolled it back or we rolled it forward or something like that, and that's what those issues were."

If we asked our DevOps teams that, they're like, "Oh, yeah, it was probably this engineering team because they've shipped a bunch of bad code in the last three months and paged us every night for two weeks straight. I'm sure it was that."

If you ask your security team what that is, they're like, "Oh, hmm, that's interesting. I wonder if that's somebody actually discovering a vulnerability and trying to figure out a working exploit payload before it actually succeeded against our systems."

And if you're ever fortunate enough to actually get to ask your attackers or your security researchers, you'll often hear, "Oh, yeah. That was me discovering a real vulnerability and figuring out an exploit payload before it actually succeeded."

So the key out of all of these is really to start to bring together context that any one of those groups can look at this data and make an informed decision off of. That we can say, okay, great, there's not just the errors going on, but let's combine that with actual attacks that are happening against our given services so we can say, wait, why is the big error graph spiking at the same time that the attack graph is spiking? That's actually not what our assumption was. We should actually take a look at this, and we can page the right people. We can dig in. It gives us much more actionable context here to actually see what's going on.
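To make the idea of combining the two graphs concrete, here is a minimal sketch. It assumes both series are already available as dictionaries mapping a time bucket to a count; the function names, labels, and the threshold are illustrative assumptions, not anything described in the talk.

```python
from statistics import median


def spiking(series, factor=3.0):
    """Return the set of buckets whose count is well above the series median."""
    if not series:
        return set()
    # Use a floor of 1 so a mostly-zero series does not flag every bucket.
    baseline = max(median(series.values()), 1)
    return {t for t, c in series.items() if c > factor * baseline}


def triage(errors, attacks, factor=3.0):
    """Label each error spike by whether attacks spiked in the same bucket."""
    error_spikes = spiking(errors, factor)
    attack_spikes = spiking(attacks, factor)
    return {
        t: "errors+attacks" if t in attack_spikes else "errors only"
        for t in sorted(error_spikes)
    }
```

An "errors only" bucket fits the bad-deploy assumption; an "errors+attacks" bucket is the case worth paging someone about, because the errors may be a person working toward an exploit.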

So that's the visibility side. The challenge of a 30-minute talk is each one of these slides could have its own hour-long talk on the details.

But going on to the feedback loop side, first of all, I'm going to feel very old with this. Office Space is now 19 or 20 years old, which means we can actually start using it in slide decks again. It's hit that age of vintage memes that we can bring it back.

So the real thing about feedback from a security context is taking the whole game days and operational exercises that we've learned from the DevOps side and applying that to security and really saying, look, the first time that we deal with a security incident, we don't want it to be because a real incident is happening. We want to have run this exercise as many times as possible ahead of time. And so bringing those concepts to security is tremendously useful for all different parts of the organization, not just your security teams, but for your technology teams, for your legal teams, for your PR teams. Involving as many folks in that as possible is actually tremendously useful.

But the way in which most folks are doing feedback loops is through things called bug bounties. Does anybody in the room have a bug bounty program running with your organization?

Awesome. Few. Okay. So I'll just spend 30 seconds defining them.

Bug bounties are where you put out basically an SLA, and you say, "Hey, security researchers that we know are going to be out there attacking stuff anyway. If you look at these predefined services of ours and you follow these rules and you act in good faith, and you report any issues that you discover to us, we will, in good faith, reward you in some way."

And so that reward might be monetary, but it might just be putting your name up on a hall of fame. It might be sending you a T-shirt or just a thank you card or anything like that. But it's a way for what was previously a lose-lose situation for both sides to actually turn into a win-win.

Because up until these, if you were a security researcher, and this is kind of crazy, but going back a number of years, if you were a security researcher that found a vulnerability in some sort of company's website or service or app or anything, and you reported that, you had 50/50 odds of getting sued. And so what a lot of researchers would do that turned out to be massively lose-lose is they'd say, "Okay, great, I want to see this fixed, but I don't want to take legal risk by doing the right thing. So I'm just going to anonymously publish the details of it to the internet."

And then you get paged probably by your CEO, CTO, saying, "Why is there something publicly about how to exploit our systems?" And everyone has to scramble to fix it. And so both sides completely lost out of that.

And out of that movement really started the whole bug bounty approach, which said, look, if you play by these rules and play in good faith, and look at these sort of services that we define in scope, we promise that not only will we not sue you, but we'll reward you or thank you in some way.

And so these have really been gaining a lot of steam over the last number of years. You name it, from large-scale internet companies all the way up through various government agencies. They've really been seeing a lot of traction.

But from a feedback side, now pulling this back into how we use these things to really make feedback loops: in the past, all we've had for a feedback loop from security is pen testing. And I say this as someone who's been a pen tester, who has hired pen testers, who's spent a lot of my career on either side of that fence. The problem with pen testing is it's very much a point-in-time activity. The dirty secret is we all do it once a year, usually for two weeks, to satisfy different audit or compliance requirements or anything there.

And so you have your pen testers come in. I saw a lot of knowing laughs in the audience on that one. We all have lived through that.

With pen testers, we have them come in a couple of weeks, once a year, and the problem is we faced a choice that was also lose-lose, which was that we could either ask our pen testers to try to cover a bunch of things and then just be super shallow, or to focus on one particular area where we think we have a lot of risk, and then they go super deep on that, but they ignore everything else. And so either way, we're really in a bad spot on that.

So, the rise of bounties. You'll see some noise about bounties replacing pen tests and all of that. I really disagree with that. I think that the two augment each other because each plays to its own strengths and covers the other's weaknesses. By having the combination of both, you can use your pen test to go super focused on one particular area, and now you can use bounties as a much more real-time feedback loop. Because what bounties are really good at is broad coverage across a bunch of different services and much more real-time feedback.

So by combining them, it's not a replacement, but it augments pen test. And so it really gives this real-time feedback. It starts to become the data source to give you a feedback loop on the security side.

I think a lot of security talks are all kind of nihilism and doom and gloom, right? That, oh, everything's messed up, nothing can be secure, all of that. I want to start to close out the talk with an actual good news security story of really where I think this can go and why I think the shift to DevOps actually makes us more secure and not less secure. Which is something I didn't actually believe for my first several months as a CISO in an organization going through that shift.

To be totally honest, and it's embarrassing in retrospect, but I spent the first probably six months as a CISO in an org on that, thinking like, "This is insane." Right? This is going to make us so much less secure because change is the enemy of security, and it's going to introduce more risk. Completely wrong, but I had to learn it the hard way.

And so one of the key data points in that was what happened here. So we had started to invest a bunch in the visibility bit. We had really started to say, okay, if we can take those lessons of DevOps, what's been successful there, let's apply it to security, let's think about visibility, let's think about feedback, and we started investing in that.

And we had this really cool thing happen where, at one point, we were able to detect an attacker discovering a vulnerability, and we were able to ship a fix for that before they even reported the issue to us. So they were sitting there using one of our services, they discovered an issue in it, they started working on an exploit payload, and as they were confirming everything there, it suddenly stopped working out from under them.

And we got this really cool email from someone, and they ended up doing a whole write-up that they posted up to the Reddit NetSec thread here. It's a really cool thing to go read. But they ended up writing in an email saying, "Hey, I can't imagine that it is a coincidence that suddenly my vulnerability stopped working out from under me in this obscure part of the site. So, hey, I just wanted to let you know, I promise I was acting in good faith and here's the details. And oh, by the way, I was testing from my home IP, so please don't sue me."

And so we actually got this really fun interaction with that researcher and we said, "Hey, no problem. We actually detected you early on. You can check your Etsy account. We actually messaged you right in the beginning and just said, 'Hi, we see you. If you discover anything, here's a contact email address that you can send stuff direct to us.'"

And it ended up being this amazing back and forth with them. And the reason that I share this story is not because, like, oh, we were so cool. No, not that at all. It's that this was a very crystallizing moment for me in recognizing how the shift to DevOps, the shift to cloud, this increase in velocity of our systems, actually can make us more secure.

Because what I had to realize is every development methodology that we've ever had is going to have vulnerabilities, right? Bugs are functionally infinite. We're always going to have them. And so systems that rely on saying, "Let's eliminate all the bugs," that's not reality.

So if we recognize that bugs are always going to exist and they're functionally infinite in that sense, the system that allows us to react the fastest is the one that actually makes us safest.

So if we have-- Going back to the very beginning, I remember at that big healthcare company, where we found a bunch of critical vulnerabilities. It was like, on your front page, if an attacker types these four things, they're going to get all of your healthcare data for all of your customers.

And they're like, "Okay, great. Yeah, that's super serious. We'll have a fix live in 24 months."

We're like, "Oh, God. Okay. Got it."

Versus as we're starting to embrace DevOps from our side, if we need to make an emergency deploy, that's just another deploy. If we're deploying once a week or once a day or once an hour, and we need to push in a security change as part of that, it's just another deploy. We never have to say the phrase out-of-band patch ever again, which for those of us who have lived through it, is an absolute nightmare.
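The argument can be put as rough arithmetic. This is a sketch under stated assumptions, not a measured model: it treats the worst-case time a known bug stays live as time to detect, plus time to write the fix, plus the wait for the next deploy slot. The one-hour detection and two-hour fix times are invented for illustration; the 18-month and 20-deploys-a-day cadences are the ones mentioned earlier in the talk.

```python
from datetime import timedelta


def exposure_window(time_to_detect, time_to_fix, deploy_interval):
    """Worst-case time a known bug stays live (simplified, additive model)."""
    return time_to_detect + time_to_fix + deploy_interval


# Assumed for illustration: same detection and fix effort in both organizations.
detect = timedelta(hours=1)
fix = timedelta(hours=2)

# One deploy every 18 months, approximated as 18 * 30 days.
slow = exposure_window(detect, fix, timedelta(days=18 * 30))

# Twenty deploys a day means a deploy slot roughly every 72 minutes.
fast = exposure_window(detect, fix, timedelta(hours=24) / 20)

print("slow cadence:", slow)
print("fast cadence:", fast)
```

With identical detection and fix times, the deploy interval dominates the slow case and nearly disappears in the fast one, which is the sense in which the system that reacts fastest is the safest.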

So the thing is, it can only make us safer if we actually have something to react to. And so this is why the shift to DevOps makes us safer, because we can react quickly, but only if we've started to think about how do we get visibility from a security perspective, and then how do we actually test all of those systems with real feedback loops.

So thank you for coming and sticking around through snack time and amazing weather time. I deliberately held us a few minutes short so we've got time for questions, and thank you very much.

Q&A

Q: You didn't discuss the role of red teams.

A: Red teams. Absolutely. So the question was, I didn't discuss the role of red teams. Absolutely. Thirty-minute talks are super challenging for content on that reason. What I would roll up red teams as part of is the whole feedback loop.

And for a definition for anyone there, a red team would be kind of an attack simulation. You have a team, whether it's internal or external, trying to attack your services and then telling you how they did it. That used to be pen testing. It's really evolved into its own thing, which we call attack simulations. I've actually got a whole talk up on my SlideShare on exactly that topic. So email me, and I can send it over.

But really, it's part of the feedback loop. That's the short answer: you want to be improving the feedback loop. Right? From an operations perspective, the whole game day thing, right? We don't want our first time testing an outage to be when a power supply blows out. We want to actually try unplugging that system and seeing that our hot-swap failover systems actually work. In the same way, we need better feedback from the security side. Absolutely.

Other questions, please. It's also impossible to see with the lights, so I apologize. I'm missing it. Now's your chance to heckle. It's going to be fun, I assure you. Yes, please.

Q: So how do you embed those security skills into the teams?

A: Yeah. How do you embed the security skills into the teams? That is an excellent question and a topic I could talk about for four days straight.

I'd say the highlights of it are, from a cultural perspective-- So there's the culture and the tech, right? The tech's the easy bit. The tech is really around thinking about technology and tooling that can be directly usable by development teams and DevOps teams themselves.

This is the biggest change that's happening over the next 20 years in the security product industry, which is that security products have always been built to be used by security experts and just by them. That cannot stand, right? What defines modern security tooling is it's directly usable by development teams and DevOps teams themselves. We see that in our own product. Half of our users are the development teams and DevOps teams. It's incredible. It's awesome.

From a cultural side, there's a bunch of different techniques, and you find the right ones that fit for your organization. Number one is just organizationally, security teams have always been set off by themselves, their own department, their own thing there. What you start to see really successful organizations doing is embedding security people directly in the development teams and DevOps teams themselves.

That might be folks who've come from the security team. It might be independent security hires, where the engineering teams have their own headcount to say, "We want our own kind of security specialist inside our group as part of it."

But the commonality, I'd say, in every organization I see doing this well, is security no longer functions as an island. It's got to go into the teams there, both from a headcount and a reporting perspective, and from a tooling perspective, with tooling that's not just used by that island but used by the teams themselves. And it has to really bake in as part of that: instead of saying, "You must do this," it says, "Here's how we get this done together."

I realize that's a very high-level answer to a specific question, but I'm happy to chat more on that. I'm super passionate about that one.

Absolutely. Other questions, please. We got time for one more. Let's do it.

All right. Let's go have beers then instead. Thank you so much, everyone.