Audit Ready Pipelines
Auditors want proof that the right people did what they were supposed to do, when they were supposed to do it, and they want that information on demand. Developers tend to run for cover whenever release managers show up with an audit request in their hands. Audit-ready pipelines aim to address both of these problems.
The idea behind audit-ready pipelines stems from trying to merge software delivery automation, CI, and CD together into something where you can get increased visibility and understanding of your entire process, and what it means to different stakeholders.
This session is presented by CloudBees.
Full transcript
The complete talk, organized by section.
Anders Wallgren
Hi, I'm Anders Wallgren, VP of Technology Strategy at CloudBees, and today I'm going to talk about audit-ready pipelines: how we can use automation and orchestration to make our audit processes more predictable and easier to deal with.
I'm going to start with the desired state we want to get to: thinking about our pipelines, our end-to-end pipelines, our release pipelines, as being the core, the heart of our audit trail, and building from there. What we want to achieve, and what I'll talk about how to achieve today, is using that connected toolchain with immutable objects as the core of how we do this orchestration. Then we build in both automated and manual approval gates, collecting evidence along the way. The outputs of all of the commands that we run as part of our software release pipeline are going to be captured in a common data repository so that they're available later for closer inspection. All of the actions that we're taking, all of the automations that we're running, all of the artifacts that we're producing are going to be governed by centralized role-based access control or access control lists.
Part of our goal here is to provide end-to-end visibility over not only what is our process, but what were the results of running our process, and collect data, metadata, evidence links, all of that kind of stuff along the way, so that at the end, we have a one-click automated audit report with data that we trust.
The current state of audit for many of us is a little bit more complicated and manual and fraught with uncertainty than that. We typically will get an email at the end of the quarter saying, "Guess what? It's time to do audits. Yay." Then you have to leap into action.
If you have a software pipeline that's fragmented into islands of automation, as I like to talk about, then you've got to manually retrace all of your steps and manually go between all the different tools that you use in your toolchain in order to build, test, qualify, release, and deploy your software. That's a little bit of a challenge, right? You've got to go into your issue tracking tool to figure out what the change request is that's involved here. I've got to go look at the production systems and see what did we deploy into production. I've got to look at the deployment tools I'm using and make sure that everything worked fine there. Look at my CI tooling, figure out what went into the build, look at my SCM system to figure out what code changed, all of those kinds of things. It's a manual process.
That leaves us with a challenge, because with this disconnected process, these disconnected sets of tools, the data is scattered all over the place. We've done a pretty good job the last 10 or 15 years with Agile, with DevOps, with DevSecOps, and so on, at uniting the cultures a little bit more and tearing down the cultural silos, the organizational silos that we built up over the years in the old ways of doing things. But we're still challenged by the tool silo, the data silo, where we have to go hunt and peck around to find all of the relevant data that we need, whether it's for audit processes or process improvement or what have you. That makes it very difficult to get a really nice traceable set of data, proof that goes beyond the attestation methods that we use these days, which is basically just somebody saying, "Yeah, I did it." As long as you trust that person, that's great, but attestation is probably not good enough these days, particularly because we can do better. We have the ability to collect data from a specific tool saying, "Here's what we did, here's the configuration we used to do it, here's the output," and automatically prove and decide that we've passed an audit or control requirement and move on to the next stage in the pipeline, if you will.
By the way, the problem in this disconnected space is not that we're using a lot of different tools. The problem is that there isn't one way that we can collect and manage and govern all of the things that happen around that. This is not an argument for, "Oh, just put everything into one tool and you'll be happy." I absolutely believe that best-of-breed tooling for all of the spot places where we need tools and platforms to build and deliver our software is definitely the way to go. But we do need to have this overall end-to-end approach, an uber-orchestrator, if you will, to make sure that we have one pane of glass through which we can look at all of this data. When I show you what that looks like in real life, you'll see a little bit more of what I mean by that.
The risks that we have in our current processes: we're going to spend more time on this than we want. We spoke to a release manager, one of our customers, who said they collect audit data weekly. They analyze it every couple of weeks to produce a monthly audit report, and that process takes them about 18 hours to do. In other words, several days of time is spent doing this. Most of it is just the manual grunt work of collecting the data, collating the data, making sure we didn't miss anything, all of those sorts of things. Then that introduces errors, that introduces cost. If you have to go chase down data for things that happened one week ago, two weeks ago, three weeks ago, you're going to piss people off, right? You're going to have disgruntled developers who are being distracted from their task at hand or disgruntled ops people who you're pulling off the task at hand and asking them to answer questions about what happened two weeks ago. Nobody wants to do that. As a result of that, you may end up failing audits. You may end up failing to satisfy your control requirements because you might have incomplete data or suspect data, and that's not the place we want to be.
We really want to be using the pipeline tooling, the orchestration tools, the orchestration platforms that we're using to create and use audit-ready pipelines. I'm going to go through a little bit of a real-life, or more real-life, example here.
First, just a quick review of what it is that we're looking for here. We want to have a way to connect our toolchains together. We've got 50 or 100 or more tools that we use in our software delivery pipelines in order to build, test, qualify, release, and deploy our software. What we want to do is connect those together. We want to automate the overall orchestration of this, including all of the approval gates, whether they be manual or automatic. If they're automatic, we want to collect the right kind of metadata, the right kind of information so that we can prove that we passed a particular control or a governance requirement.
If we have a manual process, a manual step in there, that's okay. Well, it's not ideal, but it's okay, and the reality says we still have a lot of that going on. What we want to do is account for that in our process, collect that data, collect that information as well. If we are using the manual attestation process for doing governance controls for part of this, then at least collect the evidence link, collect as much information as we can as part of that, and be able to collect a little bit of metadata around that. Again, all of this stuff ends up in a common data repository where we can look at it in one place. We don't have to go hunt and peck through all the various systems, some of which we may not even know where they are, some of which we may not have access to and have to get it through other people, those kinds of things.
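To make the idea of recording both automated and manual gates in one place concrete, here is a minimal Python sketch. It is an illustration only, not how any particular product works: the `GateRecord` schema, the `EvidenceStore` class, the gate names, and the example URL are all hypothetical, and the in-memory list stands in for the common data repository described in the talk.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List, Optional


@dataclass(frozen=True)  # frozen: a record cannot be modified after it is written
class GateRecord:
    stage: str                     # e.g. "dev", "qa", "production"
    gate: str                      # name of the control or approval gate
    kind: str                      # "automated" or "manual"
    passed: bool
    actor: str                     # tool name, or the person attesting
    evidence_link: Optional[str]   # link into the system that holds the proof
    recorded_at: str               # UTC timestamp, ISO 8601


class EvidenceStore:
    """In-memory stand-in for the common data repository (hypothetical)."""

    def __init__(self) -> None:
        self._records: List[GateRecord] = []

    def record(self, stage: str, gate: str, kind: str, passed: bool,
               actor: str, evidence_link: Optional[str] = None) -> GateRecord:
        if kind not in ("automated", "manual"):
            raise ValueError(f"unknown gate kind: {kind!r}")
        entry = GateRecord(stage, gate, kind, passed, actor, evidence_link,
                           datetime.now(timezone.utc).isoformat())
        self._records.append(entry)
        return entry

    def missing_evidence(self) -> List[GateRecord]:
        """Gates that were recorded without an evidence link."""
        return [r for r in self._records if r.evidence_link is None]


# Usage: one automated gate with evidence, one manual attestation without.
store = EvidenceStore()
store.record("qa", "static-analysis", "automated", True, "sonarqube",
             "https://sonar.example.com/report/123")  # made-up URL
store.record("pre-prod", "change-approval", "manual", True, "release-manager")
for r in store.missing_evidence():
    print(f"{r.stage}/{r.gate}: no evidence link recorded")
```

The point of the `missing_evidence` query is that a manual attestation still lands in the same store as everything else, so a gap in evidence shows up immediately rather than at the end of the quarter.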
We're able to have access control across this pipeline, across the artifacts involved and the metadata involved in this. This starts to give us a much better picture of end-to-end visibility across our entire pipeline processes. Towards the end of that, we end up with one-click automated audit reports with data that we know we can trust. That's where we want to end up.
Conceptually, what we're talking about here is taking the release pipeline and orchestrating across our different toolchains that we use all the way from the development CI processes, source code management tools, Jenkins CI tools that are out there, QA processes, doing test automation, continuous testing, those types of things, across into our artifact management tools, our ticketing tools, those sorts of things. As we go into the higher-stage environments, into pre-prod, staging, and production, collect all the right information all the way throughout that process so that it's available towards the end.
I'm going to go away from the conceptual picture and show you more of an actual screenshot of this type of process happening. What you see here is we've got, from left to right, our development, QA, pre-prod, staging, production environments, and all of the various tooling that's being orchestrated: our CI tooling, our ticketing systems, our SCM tooling, our quality assurance-type tooling, SonarQube, Jenkins, Git, all of these kinds of things are being orchestrated here. We're collecting metadata along the way, both in terms of the evidence that we want to collect for our audit purposes, but also for things like duration reports and so on, which we'll see here in a minute. You can see already here in our dev stage, we've started collecting information which will be useful to us later on. We know what issues we've got related to the CI pipeline that we're running here. We've got our changelog report from our SCM repository and various other information from our CI systems and so on. We're still just in the dev stage here.
Now, when we get to where we're done and we're looking at all the audit information that we collected, you can see here in an example approval report that auditors want to know who did what in a release. The approval audit gives us visibility into all of the automated and manual approvals and gates that we passed through as we promoted the pipeline from stage to stage to stage, and then finally on into production. What's interesting here to me is we collect not just the information from the automated gates, but also from the manual steps, if we have any. Even though we're doing some things manually, we're still collecting all the right metadata from it, and we can use that moving forward to decide where to make improvements.
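As a rough illustration of what an approval audit might look like in code, the sketch below groups approval events by stage and lists who approved what. The event fields, gate names, and approvers are invented for the example; a real orchestrator would supply its own data and report format.

```python
# Hypothetical approval events, in the order the pipeline produced them.
APPROVALS = [
    {"stage": "dev", "gate": "unit-tests", "kind": "automated",
     "approver": "ci", "approved": True},
    {"stage": "qa", "gate": "quality-gate", "kind": "automated",
     "approver": "sonarqube", "approved": True},
    {"stage": "pre-prod", "gate": "change-approval", "kind": "manual",
     "approver": "release-manager", "approved": True},
    {"stage": "production", "gate": "go-live", "kind": "manual",
     "approver": "ops-lead", "approved": True},
]


def approval_report(events):
    """Render approvals grouped by stage, keeping first-seen stage order."""
    by_stage = {}
    for event in events:
        by_stage.setdefault(event["stage"], []).append(event)
    lines = []
    for stage, items in by_stage.items():
        lines.append(f"{stage}:")
        for e in items:
            status = "approved" if e["approved"] else "rejected"
            lines.append(f"  {e['gate']} ({e['kind']}) {status} by {e['approver']}")
    return "\n".join(lines)


def manual_gates(events):
    """Names of gates that still depend on a person, as candidates for automation."""
    return [e["gate"] for e in events if e["kind"] == "manual"]


print(approval_report(APPROVALS))
print("manual gates:", manual_gates(APPROVALS))
```

Listing the manual gates separately reflects the point made in the talk: once manual steps are recorded alongside automated ones, the same data can be used to decide where to improve.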
Along the way, we're collecting evidence links: links into other systems, links into other reports. But again, we're putting it all into one place where we have the one pane of glass where we can see all of the evidence links that we need to verify that we have a process, we followed it, we have the data that we need to collect, here it is, and we're able to collect all of that.
Another interesting thing that we get, which is maybe not always considered to be directly related to auditing, is that we can collect duration information on this as well, because you almost certainly have internal goals around performance and cycle times and those sorts of things. That's not completely unrelated to audits, because experience tells us that when we start to layer controls in on processes, they have a tendency to slow things down. And if we have manual processes or manual controls that need to happen, that's often where we start to introduce delays into our pipeline.
Being able to produce an audit report that contains duration and cycle time data from our pipelines, we now have visibility into how long the release took, breakdowns for each task, and whether that's an automated task where, okay, it took an hour, it would be better if that took 10 minutes, or, hey, it took two minutes, that's awesome. Or we have a manual step somewhere where they were out to lunch and it didn't get done until the afternoon, so we had a three-hour delay in our process, or somebody was out sick, or somebody was on vacation, or somebody was overworked and just couldn't get to it, those sorts of things. We now have data that we can collect on this, which is very important along with all of the audit and governance and evidence links that we've collected.
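The duration breakdown described here can be sketched in a few lines of Python. The task log below is made up for illustration (task names, timestamps, and the tuple layout are assumptions); it mirrors the example of an hour-long automated task and a three-hour wait on a manual step.

```python
from datetime import datetime, timedelta

# Hypothetical task log: (task, kind, start, end) with ISO 8601 timestamps.
TASKS = [
    ("build", "automated", "2024-01-15T09:00:00", "2024-01-15T09:10:00"),
    ("security-scan", "automated", "2024-01-15T09:10:00", "2024-01-15T10:10:00"),
    ("change-approval", "manual", "2024-01-15T10:10:00", "2024-01-15T13:10:00"),
    ("deploy", "automated", "2024-01-15T13:10:00", "2024-01-15T13:12:00"),
]


def elapsed(start, end):
    """Time between two ISO 8601 timestamps."""
    return datetime.fromisoformat(end) - datetime.fromisoformat(start)


def task_durations(tasks):
    """Map each task name to how long it took."""
    return {name: elapsed(start, end) for name, _kind, start, end in tasks}


def time_by_kind(tasks):
    """Total time spent in automated versus manual tasks."""
    totals = {}
    for _name, kind, start, end in tasks:
        totals[kind] = totals.get(kind, timedelta()) + elapsed(start, end)
    return totals


def slowest_task(tasks):
    """Name of the task that took the longest."""
    durations = task_durations(tasks)
    return max(durations, key=durations.get)


for name, duration in task_durations(TASKS).items():
    print(f"{name}: {duration}")
print("slowest:", slowest_task(TASKS))
```

Splitting the totals by automated and manual work makes it easy to see whether the release time is dominated by slow tooling or by waiting on people.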
If we look forward from here, auditing the new way: automation is auditing. That's a little bit of a simplification, but basically auditing at its very simplest is document what you do, prove that you do what you document. By putting an orchestration layer on top of all of our tooling silos, all of the individual tools that we're managing, we now have a single pane of glass where we can manage the security around this, because it's important to have a secure pipeline as well as making sure that our software itself is secure. We have a centralized mechanism for reaching all of the metadata that's been collected by this uber-orchestrator, if you will, along the way.
To summarize, here are some best practices for how to get here: get all of your key stakeholders involved in this. Do a value stream mapping so that you know what this overall process pipeline looks like, so that you know what all the stages and steps are. Take a holistic approach to how we're doing this. Software delivery is going across organizations. We know that. It's going across toolchains. We know that. Let's take that into account.
Obviously, watch closely for vulnerabilities in your software. Run all the right scans, all of those things. You need to understand what you have in production so that when new vulnerabilities are discovered post-deployment into production, you know what's there. That's very important. Secure your pipeline as well, because it really does you no good to do a security scan if any Jack or Jill can go in and change the way that you're doing security scans to maybe just not do much of that scan in the first place.
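One possible way to notice that a scan's configuration has been changed is to fingerprint the approved configuration and compare it at run time. The talk does not prescribe this technique; the sketch below is an assumption-laden illustration, and the configuration keys and values in it are invented.

```python
import hashlib
import json


def fingerprint(config: dict) -> str:
    """SHA-256 over a canonical JSON encoding, so key order does not matter."""
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def scan_config_unchanged(config: dict, approved_fingerprint: str) -> bool:
    """True only if the config about to be used matches the approved one."""
    return fingerprint(config) == approved_fingerprint


# Hypothetical scan settings that were reviewed and approved.
approved = {"scanner": "example-scanner", "fail_on": "high", "paths": ["src/"]}
approved_fp = fingerprint(approved)

# Someone quietly narrows the scan before a release.
tampered = dict(approved, paths=[])

print(scan_config_unchanged(approved, approved_fp))
print(scan_config_unchanged(tampered, approved_fp))
```

A check like this only helps if the approved fingerprint itself is stored somewhere the people editing the scan cannot change, which is where the centralized access control discussed earlier comes in.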
Prioritize culture and don't boil the ocean. Two pretty important things. Think about this as a cultural problem as much as a process problem and a tooling problem. Agile and DevOps have, to a great degree, shone a light on culture as something that's important to get right in order for these things to function well. Don't try to boil the ocean. Take small steps here. Look at where your biggest pain points are, solve those, and move on to the next one, the next one, the next one.
That's a little bit on audit-ready pipelines, and thank you very much for listening and watching my talk today.