Security Differently
John has over 35 years of experience, focusing on IT infrastructure and operations. He has helped early startups such as Chef, Enstratius (now Dell), and Docker navigate the "DevOps" movement. He is one of the original core organizers of DevOpsDays and has been a prominent keynote speaker at various DevOps events throughout the years. He is also a co-author of The DevOps Handbook along with Gene Kim, Jez Humble, and “the Godfather” of DevOps, Patrick Debois.
Full transcript
The complete talk, organized by section.
John Willis
Hello everybody. I'm John Willis. I'm Senior Director of Global Transformation at Red Hat. This presentation is called "Security Differently."
I work for Red Hat, as I said. I've been there a year and a half. You may recognize some of these people. About 18 months ago Andrew Clay Shafer, on the left, invited us to come in to build this team at Red Hat, and it's been just a blast. To the right is Kevin Behr. You probably know him as co-author of The Phoenix Project. That's me, the short guy, if you don't know me. And Jabe Bloom has been working with Kevin Behr for years and is working on his PhD in design transition. He really anchors the team brilliantly.
One question I've been trying to ask comes from looking at DevSecOps. If we go back and look at DevOps, DevOps is probably 12 years old. I like to round it down to 10, but it's 12 years old by the name, by the definition of the name. And DevSecOps is probably seven years old. I like to credit Shannon Lietz, who coined the term DevSecOps and wrote the DevSecOps manifesto in 2015.
One of the things I've been thinking about, this meta-question, is: what would DevSecOps look like if DevOps never existed? Hold on for a second. I know that's a terrible way to express this, but I wonder whether, for all the good we've done in DevSecOps, we tried to solve a security problem using the lens of DevOps. In other words, did we take this square peg and just jam it into a round hole, where a lot of things look like they work and we've advanced, but maybe we really haven't changed the behavior that we wanted?
I'd love to talk more about this, whether or not it's confusing to you. As I visit companies, I look at people who have good hygiene, or what we would call high-performing organizations. I'll see a lot of companies that have high-performance traits in terms of how they deliver software, and they've been very serious about bolting on or abstracting a security overlay on their modern DevOps delivery model. They might be doing their SAST and DAST and all those things, and I say that's a great thing. The glass is definitely half full in terms of DevSecOps.
But then those same organizations, when I talk to them, they'll talk about these structures like the three lines of defense, and sometimes it's an updated two lines, but it's lines. It's this idea where you're creating these walls or firewalls by design. There's the team that implements the control and the team that owns the control: if the first one doesn't catch it, the second one does, and then there's internal audit.
I think about the original problem statement of DevOps, which was what Andrew Clay Shafer introduced at Velocity 2009. He had a presentation called "Agile Infrastructure," and he basically talked about this: development wanting change, operations wanting stability, and this wall of confusion between the two. We all know the story here. We want to bust down the wall of confusion.
When I think about the three lines of defense, I think about Conway's Law. A lot of times in modern presentations, we talk about Conway's Law as moving from Waterfall, or not just Waterfall, but monoliths to microservices. But it also has an effect on general organizational design. The adage states that organizations design systems that mirror their own communication structure.
I'll give you a great example. If you ever had the chance to look at the Equifax breach, Congress did a really good report and write-up on this. There were many things that were very interesting, but one of them was that the CISO actually reported to the chief legal officer. Under testimony, when the CISO was asked, "How come you didn't report the breach to the CIO?" the answer was, "I didn't think about it." That organizational design and communication structure was set up to fail because the CISO is now reporting back to the chief legal officer, and that's the lens. Again, there were many other things there.
So if I look at the three lines of defense, I'll take the liberty of saying there are possibly walls of confusion between them. I know in a lot of organizations this is the case. What do we want to do? In DevSecOps, we're striving to take some of the opportunities that we had and created in DevOps. But in the cases where we're falling back to not changing the security model, just putting the square peg in the round hole, we're not getting the full value of collaboration, where what we really want is internal audit and second-line-of-defense control owners all together in the initial story, requirements, and design.
Again, that's just one of many examples I see that made me raise that question: would we be doing DevSecOps differently if there wasn't a DevOps? I work in a couple of working groups, and this came up in one of them that's focused on cloud, but it's a general way to look at things. We call it the minimal viable security posture, or just minimal viable security. The question that needs to be asked is: how do you prove that you're safe? How do you demonstrate that you're secure? And how do you do both?
It seems easy, but if you think about how we do things mostly today, the way we prove we're safe is typically change records, or what I would call implicitly defined or subjectively defined information, like humans describing stuff and those humans basically verifying other humans. The way we demonstrate, really, is audits. We do audits, and those are pretty much subjective, high toil. So is there a way that we can actually get true, "I want to prove that we're safe and demonstrate it"? I'm going to propose that there are some models.
This idea I've been playing with for a while now is called modern governance. The idea in general: if we're going to do security differently, then we need to think about any place we see implicit security models, can we be moving them to explicit proof-based models? Along those same lines, in order to transition, we should be thinking about anything that we're doing from a subjective perspective and moving it to objective and even verifiable. I'll show you an example of that here.
One of the things I've worked up is, and this is going to be a terrible phrase, sorry, but I sort of think that there has to be a definition that includes something like post-cloud-native modernization. It's terrible, but it's a way for me to think about what are the problems related to this place we're in, where we're using all this new technology and we're changing our behaviors with DevOps and all those things.
I started thinking about what are really the three most important things if we're talking about this post-cloud-native modernization. It's risk, defense, and trust. Remember I told you I want to go from implicit to explicit or subjective to objective. What I really want to do is subjective to objective and verifiable.
If we look at each one of these, starting with risk: we want to move away from change management, where humans tell stories in a change record and auditors review those stories, ask for further clarification, point to log records, or look for screen prints, creating high levels of toil. We want to move toward more objective, digitally signed evidence where no human is really involved except the auditor looking at it. I like to say think blockchain, but don't actually use blockchain; use something like it that holds all the evidence that happens: your commits, the human things like the review on a commit, the review on a pull request, or pairing on a pull request, and all the other things, like the SAST log.
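As a rough illustration of what "digitally signed evidence" could look like, here is a minimal Python sketch. It is an assumption-laden stand-in, not the method from the talk or the reference architecture: it uses an HMAC (a symmetric message authentication code) from the standard library in place of a real asymmetric digital signature, and the key, field names, commit id, and reviewer are all hypothetical.

```python
import copy
import hashlib
import hmac
import json

# Hypothetical key for illustration only; a real pipeline would more likely
# use an asymmetric signing key held in a KMS or HSM.
SIGNING_KEY = b"demo-key-not-for-production"


def _canonical(attestation: dict) -> bytes:
    """Serialize deterministically so the same content always yields the same bytes."""
    return json.dumps(attestation, sort_keys=True, separators=(",", ":")).encode()


def sign_attestation(attestation: dict, key: bytes = SIGNING_KEY) -> dict:
    """Return an evidence record: the attestation plus a MAC over its canonical form."""
    signature = hmac.new(key, _canonical(attestation), hashlib.sha256).hexdigest()
    return {"attestation": attestation, "signature": signature}


def verify_attestation(record: dict, key: bytes = SIGNING_KEY) -> bool:
    """Recompute the MAC and compare it in constant time."""
    expected = hmac.new(key, _canonical(record["attestation"]), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, record["signature"])


record = sign_attestation({
    "stage": "source",
    "control": "pull-request-reviewed",
    "commit": "3f2a9c1",            # hypothetical commit id
    "reviewer": "second-engineer",  # hypothetical reviewer
    "result": "pass",
})
print(verify_attestation(record))    # True

tampered = copy.deepcopy(record)
tampered["attestation"]["result"] = "fail"
print(verify_attestation(tampered))  # False
```

The point of the sketch is only that an auditor can check the record mechanically instead of asking a human to retell what happened.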
Then moving even further: if we can create objective evidence, could we then validate through some chaos engineering or continuous verification? For example, if this port should have never been opened, let's just open the port. Or if there's no way this image should have been started with this vulnerability, let's start it. Start attacking from outside in.
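The "let's just open the port" idea can be sketched as a tiny experiment harness: inject the forbidden condition, ask whether the control noticed, and always roll back. This is a hedged Python sketch, not a real chaos engineering tool; the in-memory `open_ports` set, port 23, and the `port_policy_alerts` function are hypothetical stand-ins for a real environment and a real detection control.

```python
from typing import Callable, Dict


def run_experiment(name: str,
                   inject: Callable[[], None],
                   control_detects: Callable[[], bool],
                   rollback: Callable[[], None]) -> Dict[str, object]:
    """Inject a forbidden condition, check whether the control caught it, then roll back."""
    inject()
    try:
        detected = control_detects()
    finally:
        rollback()  # always restore the environment, even if the check raises
    return {"experiment": name, "control_held": detected}


# In-memory stand-in for a real environment (hypothetical).
open_ports = set()
FORBIDDEN_PORTS = {23}


def open_telnet() -> None:
    open_ports.add(23)


def close_telnet() -> None:
    open_ports.discard(23)


def port_policy_alerts() -> bool:
    """Stand-in for the real control: does it flag any forbidden open port?"""
    return bool(open_ports & FORBIDDEN_PORTS)


result = run_experiment("open-forbidden-port", open_telnet, port_policy_alerts, close_telnet)
print(result)  # {'experiment': 'open-forbidden-port', 'control_held': True}
```

If `control_held` ever comes back `False`, the evidence says the control exists on paper but not in practice, which is exactly the gap continuous verification is meant to expose.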
In defense, we move from detect and respond to building more intelligence, cyber data lakes. Then we move into something else. If you followed it, Shannon Lietz spoke back in 2019 here at DOES and had this amazing presentation. I'm fortunate enough to have her as a mentor. She talks about adversary analysis, and this is state-of-the-art, folks. Sooner or later, she's going to publish some of the stuff she does. It's heavy stuff, but it's really totally looking at defense from outside in: what are adversarial opportunities? What are metrics like adversarial retention time? Just brilliant stuff.
Then last, in trust, we move from perimeter-based to zero trust architecture. I think everybody knows that. But then we go ahead and start moving further into what I would call distributed trust models, and I'll show you some examples of that.
Basically, three novel ideas: the expansion of what we did, and I'll show you the DevOps Automated Governance Reference Architecture; automating, changing implicit to explicit; looking at how we can get cyber data lakes and intelligence built into our response, and honeypots and adversarial analysis; and then thinking a little bit differently about how we build trust models.
I'll leave you with this on the trust-model idea. Most of the trust models today, even through all this post-cloud-native modernization, I'm sorry, I'm terrible, are still north-south, and we really need to start thinking more about east-west. I'll show you some good examples. One of the things is server-side request forgery or account takeovers in the cloud world. There are a lot of shared accounts, and sometimes zero trust architectures are not enough because if I take over an account or I do server-side request forgery, zero trust is going to trust me.
Let's look at risk. What we want to do in risk is reduce audit toil. Like I said earlier, could the goal be moving subjective to objective? Could we actually turn 30-day audits into zero days where the evidence is digitally signed lists so they're immutable? The audit should be looking at immutable evidence that could not have been tampered with, which increases audit efficacy.
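One way to picture a tamper-evident list of evidence is a hash chain, where each entry's hash covers the previous entry's hash. The Python sketch below is an illustration under assumptions, not the design from the talk: the entry layout and the sample evidence fields are hypothetical, and entries are only hashed here, not signed.

```python
import hashlib
import json
from typing import Dict, List

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(prev_hash: str, evidence: Dict) -> str:
    """Hash the previous entry's hash together with this entry's canonical JSON."""
    body = json.dumps(evidence, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode()).hexdigest()


def append_evidence(chain: List[Dict], evidence: Dict) -> None:
    """Append an entry linked to the current tail of the chain."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    chain.append({"evidence": evidence, "prev": prev_hash,
                  "hash": _digest(prev_hash, evidence)})


def chain_is_intact(chain: List[Dict]) -> bool:
    """Walk the chain and recompute every link."""
    prev_hash = GENESIS
    for entry in chain:
        if entry["prev"] != prev_hash or entry["hash"] != _digest(prev_hash, entry["evidence"]):
            return False
        prev_hash = entry["hash"]
    return True


chain: List[Dict] = []
append_evidence(chain, {"control": "code-review", "result": "pass"})
append_evidence(chain, {"control": "sast-scan", "result": "pass"})
append_evidence(chain, {"control": "container-scan", "result": "pass"})
print(chain_is_intact(chain))  # True

chain[1]["evidence"]["result"] = "fail"  # edit an entry after the fact
print(chain_is_intact(chain))  # False
```

A chain like this only shows that entries were altered in place; someone who can rewrite the whole list could recompute every hash, which is why the latest hash would also need to be signed or published somewhere the writer does not control.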
Today, in this post-cloud-native modernization, I don't know any better way to say this, but it's very hard to get accurate audit data. Think about it. If you're running ephemeral containers in pods that are moving in clusters, and then you've got service mesh configuration definitions, and then even if you're getting further into functions and serverless, honestly, it's what I call security and compliance theater. People's audits are not even close. There's such a gap between the modern technologies that people are using right now and the tools or the behaviors that they use for audit.
So we move to automated, objective, immutable evidence, and like I said earlier, we start thinking about continuous audit and this idea of continuous verification. Think about security chaos engineering.
A lot of this has come out of my involvement with Gene. Everybody knows Gene, and you've probably seen the DevOps Forum papers. There are probably 75 or 80 of them now over the years, going back to 2014, I think. In 2015, one of the first publications was An Unlikely Union: DevOps and Audit. Then in 2018, Dear Auditor, which was an apology to auditors in about two pages, and then the rest of the pages were a promise for very detailed regulatory control. In 2019, I ran with a team of people where we actually tried to set down this first idea of objective, explicit evidence in the pipeline. It was called DevOps Automated Governance Reference Architecture.
A lot of this came from an original paper in 2017 from Capital One, where they were designing their pipelines, and they said these are the 16 gates that a development team needed to pass to bypass the CAB, basically. Some of the discussions that were had were, well, if you're going to do that, why can't we use this for immutable evidence? That really created the initial discussions to create that first DevOps Automated Governance Reference Architecture.
You can see source code, optimum branching strategy, all the things that you would expect in a healthy delivery supply chain. In that 2019 paper, we defined seven stages, and we created, I think, 75 attestation definitions. It's a Creative Commons book, and we really break it down. If you look at the authors on there, there's John Rzeszotarski, Topo Pal, Courtney Kissler, Dwayne Holmes, and so on. I'm probably leaving some people out.
I won't go through all these, but these are examples of things that were in the original attestations or ones that we've picked up in the community. These are all things that you could set to pass/fail or at least put in an audit store: change size, unit test execution, unit test coverage, cyclomatic complexity, optimum branching strategy.
Some of the things you see in the build stage, again, if you get the guide, you can see the longer version with detailed descriptions. On the package stage, we want to make sure that our artifacts are versioned, there's code signing, there's container scanning, there's the right package metadata. All this hygiene stuff. And in pre-prod, there's some more.
Ultimately, what we got to after the first paper in 2019 was a really good discussion: if you could have this kind of engine, could you then create an interface, like a human-readable, machine-interpretable, versionable control, to actually drive those automated systems? Risk as code, or a policy DSL, whatever you want to call it, that you can inject. We'll be having more discussions about this in this new paper.
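To make "human-readable, machine-interpretable, versionable control" concrete, here is a small Python sketch in which a versioned JSON policy drives pass/fail gates over build metrics. It is an assumption, not the policy format from the paper: the JSON shape, control ids, metric names, and thresholds are all hypothetical.

```python
import json
from typing import Dict, List

# Hypothetical policy document; control ids, metric names, and thresholds
# are illustrative and not taken from the reference architecture.
POLICY_JSON = """
{
  "version": "1.0.0",
  "controls": [
    {"id": "unit-test-coverage", "metric": "coverage_percent", "op": ">=", "value": 80},
    {"id": "cyclomatic-complexity", "metric": "max_complexity", "op": "<=", "value": 15},
    {"id": "change-size", "metric": "lines_changed", "op": "<=", "value": 400}
  ]
}
"""

OPERATORS = {
    ">=": lambda actual, limit: actual >= limit,
    "<=": lambda actual, limit: actual <= limit,
    "==": lambda actual, limit: actual == limit,
}


def evaluate(policy: Dict, metrics: Dict) -> List[Dict]:
    """Check each control against the reported metrics; a missing metric fails the control."""
    results = []
    for control in policy["controls"]:
        actual = metrics.get(control["metric"])
        passed = actual is not None and OPERATORS[control["op"]](actual, control["value"])
        results.append({"control": control["id"], "actual": actual, "passed": passed})
    return results


policy = json.loads(POLICY_JSON)
build_metrics = {"coverage_percent": 84, "max_complexity": 22, "lines_changed": 120}
for outcome in evaluate(policy, build_metrics):
    print(outcome["control"], "PASS" if outcome["passed"] else "FAIL")
# unit-test-coverage PASS
# cyclomatic-complexity FAIL
# change-size PASS
```

Because the policy is plain data, it can live in version control next to the pipeline, and each evaluation result can itself be stored as evidence.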
In 2020 we didn't do anything related to this, but a couple of weeks ago we started the second version. What's interesting about this is we're going to cover a lot of the stuff that we've learned. From some of the companies that have implemented this, what have we learned from the original reference architecture? When we wrote that 2019 paper, I thought everyone was going to be like, "Oh my God, this is the greatest thing ever." Not because we're brilliant, but just because I thought it was a brilliant idea. We've realized that we weren't fun. So we're going to try to do a little fun this time. We're going to take a page out of The Phoenix Project. We're going to create a narrative called Investments Unlimited. We're going to have some characters. It's going to start with a big failed audit, and then we're going to use a lot of lessons learned from that model that we used in 2019. It should be fun. It'll probably be out in September before Virtual Vegas.
One other thing that I think was interesting is I got fortunate enough that I was pulled in to talk to SolarWinds about how they could use automated governance. A friend of mine who was a big fan of automated governance brought me in to talk to some executives at SolarWinds. There's some really good data from CrowdStrike, and then MITRE has just brilliant documentation about how they think about it.
For those of you not aware, we're talking about the original breach in SolarWinds, not what happened to everybody else in the already breached software or the compromised software. Both CrowdStrike and MITRE did a really good job. This one actually was from CrowdStrike, and it will look familiar if you know the MITRE ATT&CK framework. What I just wanted to show is that automated governance doesn't solve everything. But when I looked at all the explanations of what happened in SolarWinds and how the attackers were able to find their way in and sit on top of MSBuild, there were a number of things where the general policy structure of the things we've been talking about in automated governance would have helped.
First off, just pipeline as code. Basically building the infrastructure with something like Ansible or Chef or Puppet. There were a lot of examples of masqueraded logs. Again, immutable stores for this stuff, which I'll talk about with a couple of products there. Code signing. It was really interesting to see that there were all these code signing and hash mismatches in the logs. That's what you get for trying to talk so fast. You saw those all over the place. Again, I wouldn't say it would have solved everything, but these things would have turned up as red flags in control gating.
In general with risk, we look at: can we use automated governance to create these digitally signed attestations? Red Hat has a fabulous product called Ploigos, and I'm not saying that because I'm a Red Hat employee. I can do a whole presentation on why I think this thing is cool. There are some interesting ways to do the evidence store. Grafeas is what we used in the first paper, but Sigstore is an interesting project, which is a collaboration between Red Hat and Google. It was originally built for certificate transparency, but it actually can be used for audit data. It's a Merkle tree. It's immutable.
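For readers who have not met a Merkle tree, the Python sketch below shows the core idea: every piece of evidence is hashed into a single root, so changing any leaf changes the root. It is a simplified illustration and an assumption on my part, not Sigstore's implementation: odd levels are padded by duplicating the last node, which differs from the tree layout used by certificate transparency logs, and the sample evidence strings are hypothetical.

```python
import hashlib
from typing import List


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(leaves: List[bytes]) -> str:
    """Compute a Merkle root; leaf and interior hashes use different prefixes."""
    if not leaves:
        raise ValueError("need at least one leaf")
    level = [_h(b"\x00" + leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # simplification: duplicate the last node on odd levels
        level = [_h(b"\x01" + level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0].hex()


evidence = [b"commit:3f2a9c1", b"review:approved", b"sast:pass"]  # hypothetical entries
root = merkle_root(evidence)
tampered_root = merkle_root([b"commit:3f2a9c1", b"review:approved", b"sast:fail"])
print(root != tampered_root)  # True
```

Publishing or signing only the root is enough to commit to the whole set of evidence, which is what makes the structure attractive for audit data.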
Whether it's OpenShift or Red Hat, there are some really good compliance operators out there if we're talking about Kubernetes. Software bill of materials, I'll talk about that in a minute. Continuous verification.
One thing about SBOM is, I want to say SBOM differently. I've done a lot of research recently about the software bill of materials. These are the different ones that are most prominent: OWASP CycloneDX, SPDX from Linux Foundation, you've got the NTIA, the National Telecommunications and Information Administration, and then you have MITRE, who's trying to take all these in one place. The issue here is that it's all package- and vulnerability-focused. There are so many other things that need to be in an SBOM. I can go on, but I think the idea of using checkpoints at each step of the way might be the wrong approach. It might just create digital evidence, and then a final SBOM that actually will have the linked list of the evidence along with license and all this. Anyway, I'll be talking more about it. I'm going to write a blog about this pretty soon.
Just quickly, I want to talk about defense differently. Here we want to do the same thing: reduce the toil related to our defensive posture and increase the efficacy. Are we doing things that are really creating high efficacy? One of the groups I spend a fair amount of time in is an ONUG group, where we're really focusing on SIEM and SOARs from multi-cloud providers and creating intelligent data lakes. Then you have deception technology and, like I said, Shannon Lietz's work with adversary analysis.
One of the papers that we did last year was automated cloud governance, part of ONUG, where we got some really large-cap companies together to look at these multi-cloud event problems. Again, you can see the list of really big companies, and you'll have the slide deck. We're working on the second version now, which is interesting. We've got a fair number of the cloud providers now. We have IBM, we have Google, we have Microsoft, and hopefully we'll have Amazon at some point, trying to create this unification of cloud control.
The problem statement is that you get all these different events that really are the same but have different context and different ordering, and it's very hard not only to unify them, but then to add layers of common metadata. It's a really interesting problem. We call it the Cloud Security Notification Framework. Go to ONUG, take a look at it. It's really cool. It's actually modeled after how SNMP solved the original networking problem. We're not using SNMP, but we're thinking this is the way to solve all this complexity from cloud.
Here's an example: we created Decorator. Decorator is interesting because the idea of Decorator is creating these common event structures. One of the cool things is we would anchor the Decorator with NIST definitions and then the MITRE ATT&CK framework. So you get this whole picture. It's a work in progress. Again, go look at ONUG to see where we're at. It's really interesting. Or ping me and I can fill you in.
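The idea of a common event structure anchored to NIST and MITRE ATT&CK can be sketched in a few lines of Python. Everything here is hypothetical and is not the actual CSNF or Decorator schema: the provider names, raw field names, the common fields, and the particular mapping to NIST control AC-3 and ATT&CK technique T1530 are illustrative assumptions.

```python
from typing import Dict, List

# Hypothetical raw events; field names are illustrative and not taken from
# any cloud provider's real event schema.
RAW_EVENTS: List[Dict[str, str]] = [
    {"provider": "cloud_a", "evt": "BucketMadePublic",
     "res": "bucket/reports", "ts": "2021-05-01T12:00:00Z"},
    {"provider": "cloud_b", "eventName": "storage.acl.public",
     "target": "reports-container", "time": "2021-05-01T12:00:05Z"},
]

# Hypothetical per-provider mapping from raw fields to one common shape.
FIELD_MAPS = {
    "cloud_a": {"event_type": "evt", "resource": "res", "timestamp": "ts"},
    "cloud_b": {"event_type": "eventName", "resource": "target", "timestamp": "time"},
}

# Hypothetical decoration table: common metadata layered on top of each event.
PUBLIC_STORAGE = {"category": "public-storage-exposure",
                  "nist_control": "AC-3", "attack_technique": "T1530"}
DECORATIONS = {
    "BucketMadePublic": PUBLIC_STORAGE,
    "storage.acl.public": PUBLIC_STORAGE,
}


def decorate(raw: Dict[str, str]) -> Dict[str, str]:
    """Rename provider-specific fields to common ones, then add shared metadata."""
    mapping = FIELD_MAPS[raw["provider"]]
    common = {field: raw[source] for field, source in mapping.items()}
    common["provider"] = raw["provider"]
    common.update(DECORATIONS.get(common["event_type"], {"category": "unclassified"}))
    return common


for raw in RAW_EVENTS:
    event = decorate(raw)
    print(event["provider"], event["category"], event["attack_technique"])
# cloud_a public-storage-exposure T1530
# cloud_b public-storage-exposure T1530
```

Once two differently shaped events land in the same category with the same metadata, downstream tooling can reason about them as one kind of event regardless of which cloud emitted them.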
Ultimately for defense, what we're looking at is intelligent data lakes, the work we're doing with automated cloud governance, CSNF. The MITRE ATT&CK framework just fits everywhere. I'm really a big fan of that. If you haven't seen SCAP, OpenSCAP, there are some interesting models for creating metadata around that. Then cyber ranging, if you haven't looked at building honeypots, and then here again, custom. But there are really no tools for adversary analysis. This is stuff Shannon has worked on.
Finally, I want to talk about trust differently, sort of the third anchor of the three things I talked about as post-cloud-native. Modern trust: certainly zero trust architectures, which I showed you in that transitioning slide. We want to move, but we also need automated control-based assessment. If you haven't looked at OSCAL, one of the things is it's very heavyweight, but for people who have that FedRAMP, different technologies, OSCAL is very helpful. It's a really good self-documenting system. It's still a little heavy. Distributed secrets management and distributed trust.
In the trust model, you have NIST 800-207, zero trust architecture. One thing I want to say about zero trust, I said this earlier. It's a trust architecture, but if you're getting compromised, you still have a problem. If somebody gets into a cluster or a structure and they're able to do a server-side request forgery or account takeover, or there are many shared accounts like the metadata server in the cloud and different structures, there are by-design shared structures for authentication and authorization.
Two things I'm really interested in. SPIFFE is an interesting project in that it was born out of the service mesh sidecar model for containers and node-to-node ephemeral authority, or short-lived certificates. Also, there are a lot of people looking at this for a possibility of using it for a new version of secrets manager. Vault is the hot, cool thing, and I think Vault's a great product, but there are discussions now about could we actually be using these models of distributed trust or distributed identity, distributed trust from a security and identity perspective.
Also, I told you about Sigstore. Again, not because I work for Red Hat, I think Sigstore is the, I've got to say, the shit. I think it has incredible potential, certainly for audit, for immutable audit. It's like everything that you'd want out of blockchain from the automated governance structure I talked about earlier, but much lighter weight, cleaner. Think of all the things in the Merkle tree: it's immutable, cannot be mutated.
It has this other thing that's really cool. It has an event-driven architecture on the large structure. Now you can really do some cool stuff, like looking for denial of service or any type of anomalous behavior. I think there's the possibility, again, this is all emergent, that you could also do secrets management in this model. It's just very similar to what we're doing with SPIFFE.
Anyway, that's my presentation. I know I went really fast. Either I give you a presentation where I cover one subject in gory detail, or I give you a survey where I can at least teach you some of the things I learned. In the chat during the presentation, I'll list all the links, and I'll be there in the chat as well. As always, I'm very reachable. Just tweet me @botchagalupe on Twitter, or jwillis@redhat.com, or for those of you who really want to know my personal, it is botchagalupe@gmail.com.
Thank you so much. Again, if you have any questions or you want to discuss this, I'm really interested in the idea of security. Gene always asks us to ask. I want to start more discussions about security differently. I think what I'm hearing from CISOs is we're doubling and tripling our cyber defense budgets, but we're not getting better. This just positions us for disaster. I'd love to have more detailed conversations. I'm already having conversations with some really interesting clients, or really friends. So be my friend. Anyway, thank you so much.