The Myth of Productivity vs Compliance: How To Have It All
Which enterprise development practices are associated with excellent productivity outcomes? How do the top-performing companies approach compliance with security and legal requirements? Can an enterprise attain class-leading performance in both security and productivity?
We present the outcomes of the latest State of the Software Supply Chain Report, the result of a year-long research collaboration between Gene Kim (researcher and co-author of The Phoenix Project, The DevOps Handbook, and Accelerate), Dr. Stephen Magill (expert in software security and program analysis), and Sonatype (maintainers of the Maven Central Repository).
In this study, we surveyed over 500 enterprise developers and analyzed the practices that help high performers stand out. We will dive into the data and guidance that emerged from this analysis and explain the cultural and workflow practices common to high-performing organizations, which see 26x faster detection and remediation of vulnerabilities and 15x more frequent deployments than low performers, all while reporting class-leading security and quality outcomes.
Full transcript
Stephen Magill
Hi, I'm Stephen Magill. I've been doing academic research in software analysis, security, and programming languages for more than 15 years, first as part of my PhD work at Carnegie Mellon, and then at other universities and industry research labs. Over the last few years, I've gotten more and more interested in the practice of software: open source development practices, how enterprises approach software, how they use open source, and really how to best contribute to these communities by improving tools and practices. I'm going to be talking about some really cool research Gene and I have done in that space.
Gene Kim
Awesome. My name is Gene Kim, and I've been studying high-performing technology organizations for 21 years. One of the funnest projects I've gotten to work on is the State of DevOps Report, with Dr. Nicole Forsgren and Jez Humble. This was a cross-population study that spanned over 36,000 respondents over six years, and it gave us a great glimpse into what high performance looks like and what behaviors create high performance.
Stephen Magill
In this work, we're going to be looking at open source and usage of open source. It's probably not a surprise to hear that almost everyone is using open source. Nat Friedman, CEO of GitHub, claims that 99% of new software projects include open source components. That means they inherit any open source vulnerabilities. If you're using open source, you want to be remediating those vulnerabilities as they arise. Even better, you'd like to stay ahead of that vulnerability curve and avoid being in the position of having to remediate.
How do you do that? You have to consider that when you use open source, you're not just importing some code, you're adding developers to your team. Those open source project contributors become contributors to your software. This will help you, increase agility, and import great security practices. Or will it? Will it import great security practices, or is it going to hold you back? Is it going to be a source of vulnerability? What are these extra developers bringing to your team? That's what we dove into: what practices lead to good security outcomes, and what should you look for when choosing open source components?
Gene Kim
Stephen and I were sitting in the GitHub Universe session. We heard Nat Friedman say that, and we thought, when you invite these developers into your house, are they going to help you build your kitchen, or are they going to trash your kitchen? Are there ways to tell?
I mentioned the State of DevOps research. It was such a fun study because it linked cultural aspects with technical practices and architecture. Some friends reached out to me some years ago to talk about this potential research project and being able to look at data from the Maven ecosystem. I jumped at the chance because it was a way to look at what update behaviors look like in the wild.
For those who don't know what Maven is: Maven is to Java what npm is to JavaScript, what PyPI is to Python, what RubyGems is to Ruby. As someone who benefits so much from the Java ecosystem and Maven, because my favorite programming language is Clojure, it was an irresistible opportunity. When I looked at the data set that was being made available to us, I immediately reached out to my friend Dr. Stephen Magill and asked if he'd be willing to collaborate and jump into the data and see what we could learn.
Stephen Magill
While I spend most of my time in the Haskell ecosystem, it certainly gives me an appreciation for functional languages and the role that Scala and Clojure play as functional languages targeting the JVM. It was super exciting to dive into this space and see what we could find in this wealth of data.
The key questions we asked were twofold, some about the open source side and some about the enterprise side. On the open source side, we wanted to know how projects manage their own supply chains. The nice thing about open source is that all the dependencies and transitive dependencies are open source as well, so you have a great amount of data that you can aggregate, label, and analyze. That was our focus in the first part of the work. We wanted to look not just at summary statistics about how many projects and vulnerabilities are out there, but at deeper analysis that describes behavioral patterns and ultimately leads to guidance and insight for people who consume open source.
On the consumption side, we wanted to look at how enterprises manage their open source supply chains: what governance tools and practices they employ, and what impact that has on security and productivity. In the course of looking at both sides, we observed interesting facts about the value of working together, having enterprises deeply involved in open source, and welcoming community contributions back into corporate open source projects as well.
As Gene said, this started two years ago when we learned about what you could learn from Maven Central data and the Maven ecosystem. That led us to do a deep dive on update and security practices. That turned into chapter three of the 2019 State of the Software Supply Chain Report. We started the work in 2018 and published the report in 2019. We had such a great time with that research that we teamed up with Sonatype again to do further analysis for the 2020 report. This time, we focused on the consumer side of the equation, looking at how enterprises manage those supply chains. But first, we're going to summarize the 2019 results.
In that work, we looked at Maven Central. There is an amazing number of projects and components, and a wealth of data, there: 310,000 components and 4.2 million versions of those components, that is, individual JAR files. Each has its own vulnerability history, dependency graph, and API changes with each version, which can cause problems. Almost 7,000 of those also have associated GitHub repos, giving us another set of data to correlate against, including metadata on team size and commit frequency. We can look at individual commits and the history of code changes. Of these components, about 9% had a vulnerability associated with them. When you look at the dependency chain and transitive dependencies, that increases to 47% of components having some sort of vulnerability that impacts them during the period when that component is the current version.
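The jump from 9% to 47% comes from following dependencies transitively. A minimal sketch of that idea is shown here. The component names, the `DEPENDS_ON` graph, and the `DIRECTLY_VULNERABLE` set are all invented for illustration and are not data from the report.

```python
# Hypothetical dependency graph: component -> components it depends on.
DEPENDS_ON = {
    "app": ["web", "json"],
    "web": ["http", "logging"],
    "json": [],
    "http": ["tls"],
    "logging": [],
    "tls": [],
}

# Hypothetical set of components with a known vulnerability of their own.
DIRECTLY_VULNERABLE = {"tls"}


def transitive_dependencies(component, graph):
    """Return every component reachable from `component` through its dependencies."""
    seen = set()
    stack = list(graph.get(component, []))
    while stack:
        dep = stack.pop()
        if dep not in seen:
            seen.add(dep)
            stack.extend(graph.get(dep, []))
    return seen


def is_affected(component, graph, vulnerable):
    """A component is affected if it, or anything it pulls in, is vulnerable."""
    return component in vulnerable or bool(
        transitive_dependencies(component, graph) & vulnerable
    )


direct = sum(c in DIRECTLY_VULNERABLE for c in DEPENDS_ON)
affected = sum(is_affected(c, DEPENDS_ON, DIRECTLY_VULNERABLE) for c in DEPENDS_ON)
print(f"directly vulnerable: {direct} of {len(DEPENDS_ON)}")
print(f"affected via dependency chain: {affected} of {len(DEPENDS_ON)}")
```

In this toy graph only one component is vulnerable itself, but four of the six are affected once the dependency chain is taken into account.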
Gene Kim
This is how we wrote up the findings, and this went into chapter three in the State of the Software Supply Chain Report. I want to thank Dr. Magill and the amazing team at Sonatype who helped make this happen: Bruce Mayhew, Ghazi Mahmoud, Kevin Whitten, Derek Weeks, and Matt Howard. It was fun to take a look into the components that some of us use every day and that leverage the Maven ecosystem.
One of the things we looked at in the first year, and are pulling into the second year, is that we took the State of DevOps reports and the IT performance metrics and started to think about the analogs in the open source community. The IT performance metrics are deployment frequency, code deployment lead time, mean time to repair, and change success rate. We linked those to release frequency: when do open source projects release a new version? In terms of organizational performance, we linked that to popularity, potentially measured by GitHub stars, forks, or downloads per day within Maven Central. For mean time to repair, we looked at how long it takes to remediate vulnerabilities when they are disclosed through a CVE disclosure. This shaped our thinking about dependent variables: we lay out the independent variables and see which ones perform better.
Stephen Magill
We structured our research around a number of hypotheses. The first was about the "is faster better?" question. We find that's the case in the enterprise, in the State of DevOps and Accelerate research. Would we find the same thing in the open source community? We did. Projects that release more frequently were two and a half times more popular in general. If having people use your open source software is your goal, which I think it is for most projects, those projects are seeing better outcomes. They also had larger development teams, more contributors, and more activity, and were more likely to be foundation-supported. They were more secure. The fastest 20% of projects by release frequency also update dependencies 18 times faster than other projects, and that update cadence correlates strongly with security outcomes. A range of great outcomes results from releasing more frequently and moving faster with deployments on the open source side.
Gene Kim
Hypothesis two was the notion that projects that update dependencies more often are generally more secure, and we found that was the case. Those that were the most secure tended to update 1.5 times more frequently. They had 530 times faster remediation times, and they were 173 times less likely to have out-of-date dependencies. That means they were not only updating themselves, but making sure all of their dependencies were up to date.
A great example of why this matters is the PrimeFaces vulnerability that came out in 2017. That issue had actually been fixed years before, so if you had stayed current, you never would have had the vulnerability. Those who didn't were suddenly taken over by Bitcoin miners and had to do drastic things to update those dependencies in production. This validates the notion from Jeremy Long, founder of the OWASP Dependency-Check project, that one of the best ways to stay secure is to stay up to date on your dependencies, and that means updating in daily work.
Stephen Magill
As part of staying up to date, you want to make sure you're pulling in dependencies that are themselves good about staying up to date. What do you look for? How can you find those projects? One factor is general update frequency: how often do they release? Update behavior and remediation behavior tend to track each other, and release frequency is correlated with both. Good outcomes come from projects that release more frequently.
Projects with larger development teams and higher code commit rates are also generally better at keeping dependencies up to date. The top 20% of projects by size had 50% faster update times and released 2.6 times more frequently. They were also more likely to be foundation-supported, which suggests that foundation support is an important aspect and something you can look for that influences project quality.
Gene Kim
Hypothesis three seemed obvious. We thought projects that have fewer dependencies would have an easier time staying up to date. It makes sense that if you have a smaller surface area, it would be easier. That turned out not to be true. Components with more dependencies had better mean time to update; in other words, they updated dependencies faster.
Stephen pointed out a startling observation in the last finding: the most popular dependencies, the ones that are most secure, have more developers. That means there's a link between the number of dependencies and the number of active developers, as measured by the number of people making commits in a given month. That brings up the question: does increasing the number of dependencies cause you to have more developers, or is it the other way around, that when you increase the number of developers, they pull in more dependencies? It was surprising that components with more dependencies were the ones with better MTTU.
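Mean time to update (MTTU) can be read as the average delay between a dependency publishing a new version and a project adopting it. The sketch below assumes that definition; the report's exact calculation may differ, and the `UpdateEvent` records and library names are invented for illustration.

```python
from dataclasses import dataclass
from datetime import date
from statistics import mean


@dataclass
class UpdateEvent:
    """One dependency upgrade (hypothetical record layout)."""
    dependency: str
    released: date  # when the new dependency version was published
    adopted: date   # when the project shipped a release that uses it


def mean_time_to_update(events):
    """Average number of days between a dependency release and its adoption."""
    return mean((e.adopted - e.released).days for e in events)


# Invented example data, not taken from the report.
events = [
    UpdateEvent("lib-a", date(2019, 1, 10), date(2019, 1, 20)),  # 10 days
    UpdateEvent("lib-b", date(2019, 3, 1), date(2019, 4, 30)),   # 60 days
    UpdateEvent("lib-a", date(2019, 6, 1), date(2019, 6, 21)),   # 20 days
]

print(f"MTTU: {mean_time_to_update(events):.1f} days")
```

A lower MTTU means the project picks up new dependency versions sooner.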
Stephen Magill
Our most surprising finding from that year's research was hypothesis four. We expected popular projects to generally be better: better about staying up to date and more secure on average. We found that was not the case at all. Many of the factors we examined showed strong, statistically significant effects on security and update performance. Popularity was one of the few that did not.
If you take one thing away from this part of the talk, let it be that you can't just lean on popularity as a proxy for quality.
Gene Kim
One unsettling thing is that, in my own experience, whenever I want to solve a problem and look for a component to solve it, I generally use this heuristic: I look for the project with the most stars and forks. It turns out that's a very bad heuristic. The question becomes what heuristic we should use instead for which open source components to use.
Stephen Magill
Looking at release frequency is one of the key things you can easily access and evaluate when looking at projects. A popular project that's not releasing frequently is falling behind in some respect when it comes to its transitive dependencies.
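One simple way to put a number on release frequency is to count releases per year from a project's release dates. This is only a sketch: the `releases_per_year` helper and the example dates are hypothetical, and the report may measure cadence differently.

```python
from datetime import date


def releases_per_year(release_dates):
    """Average releases per year over the span from first to last release."""
    if len(release_dates) < 2:
        raise ValueError("need at least two releases to measure a cadence")
    ordered = sorted(release_dates)
    span_days = (ordered[-1] - ordered[0]).days
    if span_days == 0:
        raise ValueError("all releases fall on the same day")
    # Divide the number of intervals between releases by the span in years.
    return (len(ordered) - 1) / (span_days / 365.25)


# Invented release histories for two hypothetical projects.
frequent = [date(2019, m, 1) for m in range(1, 13)]
infrequent = [date(2017, 1, 1), date(2018, 6, 1), date(2019, 12, 1)]

print(f"frequent project:   {releases_per_year(frequent):.1f} releases/year")
print(f"infrequent project: {releases_per_year(infrequent):.1f} releases/year")
```

Comparing this figure across candidate components gives a quick signal that does not rely on stars or forks.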
Given the importance of updating and staying up to date, why isn't everyone completely up to date? Why aren't all dependencies brought to the current version on every release? We looked into this.
Gene Kim
One of the best papers describing how problematic staying up to date is came from a group of researchers in Brazil. They monitored 400 open source projects for 116 days, and during that period they detected 282 potentially breaking changes. We did the math: the breakage rate is sufficiently high that, given enough time, your probability of having some breaking change approaches 100%.
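The arithmetic behind that claim can be sketched as follows. It assumes, as a simplification not stated in the talk, that breaking changes are spread evenly across the 400 projects and 116 days and arrive independently, which gives a per-project daily rate of 282 / (400 * 116).

```python
# Figures quoted in the talk.
PROJECTS = 400
DAYS = 116
BREAKING_CHANGES = 282

# Assumption: breaking changes are spread evenly and independently,
# so each project has the same small chance of one on any given day.
daily_rate = BREAKING_CHANGES / (PROJECTS * DAYS)


def probability_of_breakage(dependencies, days):
    """Chance that at least one of `dependencies` ships a breaking change within `days`."""
    chance_none = (1 - daily_rate) ** (dependencies * days)
    return 1 - chance_none


for deps, days in [(1, 30), (10, 90), (20, 365)]:
    p = probability_of_breakage(deps, days)
    print(f"{deps:>2} dependencies over {days:>3} days: {p:.1%}")
```

Under these assumptions the probability climbs quickly toward 100% as the number of dependencies or the time window grows.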
This resonates with anyone who's been afraid to update dependencies. It suggests you're afraid for a reason: updating dependencies often breaks your code. This explains one reason people don't update dependencies in daily work: it is potentially problematic.
After seeing that data, we put together a survey to understand the psychographics of higher performers who update dependencies in daily work. What we found was astonishing. We clustered them into a high-pain cluster and a low-pain cluster. The organizations that associated updating with high levels of pain were three times as likely to strongly agree with that pain, but they were also two times less likely to consider patching to be painful. They were 10 times more likely to schedule dependency updating as part of daily work, six times more likely to strive to use the latest version, whether N or N minus one, 11 times more likely to have a process for adding dependencies, 10 times more likely to have a process for removing problematic dependencies, and 12 times more likely to use automated tools for enforcing policy around updates.
This was a startling finding. When you see multiples like this, you know there is something very different between the high, medium, and low clusters, much like in the State of DevOps research. This focused where we wanted to go in this year's study.
One other thing that guided us was looking at the data to see migration behaviors from version to version. For Hibernate, each arc shows a migration from a source version to a destination version. There are almost two distinct populations, one on the left and one on the right, and they almost never meet. The extent of the changes is so vast that you tend to stay on one island versus another. This is what you don't want to see. If you want to stay up to date, you have to take a very painful change and jump the chasm. Some people stuck on the island on the left will never make it to the right. Once they stop releasing security patches, those people will be forever left behind.
Stephen Magill
This made us realize and appreciate that there is a difference between components. Some update friction is due to component choice and the choices libraries make about when and how they migrate from one API to another.
In contrast to Hibernate, the Spring Framework shows a very different graph. There aren't two separate populations. Everyone is able, from whatever past version they're on, to successfully migrate to the latest version. There is a higher density of arcs landing in the latest or almost latest versions. Clearly the framework is structured such that it's easier to update, and the community as a whole tries to stay on the current version of the software. You do see some red arcs: people stuck in vulnerable versions of the library, making progress toward recent versions but not getting all the way there.
Gene Kim
The red arcs indicate where the target state ended up in a vulnerable component. They updated, but updated to one that was actually insecure.
Stephen Magill
Maybe they should have jumped a little farther.
Gene Kim
Here's a different archetype: Joda-Time.
Stephen Magill
This shows a homogeneous set where people can seemingly upgrade from any version to any other version. Our hypothesis is that these versions have very few breaking changes; you can go to any version and it's not going to break functionality. When I think about what it would look like for organizations to easily, quickly, dependably, and reliably switch from whatever version they're on to the latest version, this is the sort of distribution we would like to see. We didn't have the chance to explore this as fully as we wanted, but it will hopefully guide future research.
I think we see the role of security vulnerabilities in pushing update behavior here too. In Joda-Time, there are no vulnerable versions involved, so update behavior is driven more by wanting to update or needing new features. In Spring, there are vulnerabilities against older versions pushing everyone forward. In Joda-Time, there is a more uniform distribution of versions in use.
Stephen Magill
In the 2020 report, after looking a lot at the open source side of the equation, we wanted to focus on the consumer side: enterprise usage of open source. We wanted to see what practices are associated with better security, compliance, and productivity outcomes. We interviewed over 500 developers, actually 528 developers, from a range of companies across all industries, asking about their security and performance outcomes and their DevOps practices to see which contribute to high achievement. We asked questions like: Do you centralize your CI infrastructure? Do you automate software governance? Do you contribute to open source? Are you confident in the security of your deployed applications?
The hidden question was: can you have it all? Can you be more productive and more secure at the same time? Can you simultaneously advance the objectives of security and development? Can you have both?
Gene Kim
The answer is yes.
Stephen Magill
Yes. We found that not only can you achieve both good risk-management outcomes and high productivity gains simultaneously, but a remarkably large percentage of the companies we surveyed are managing to do just that.
To see that visually, I want to describe the diagram that is the centerpiece of this year's report. It shows all the companies we interviewed plotted on a two-dimensional grid: on the horizontal axis, their self-reported level of developer productivity, and on the vertical axis, their risk-management outcomes. In the upper right is a group of companies that performs extremely highly on both dimensions. Not surprisingly, these companies tend to adopt core DevOps principles of automation and consistency. In the lower left are companies at the opposite end of the spectrum: poor productivity outcomes and poor risk-management scores. These might best be described as DevOps Padawans, early in their journey with a lot to gain by adopting better practices. In the upper left are stereotypical risk-averse companies that focus solely on risk mitigation and achieve it via mostly manual and inefficient workflows. They attain good security outcomes, but at the cost of productivity. In the lower right are the move-fast-and-break-things companies, prioritizing productivity above all else, often at the cost of risk management.
Gene Kim
What are the different colors here? This is the coolest thing. They're colored not just based on the quadrant they're in, although it looks like that. We didn't directly measure productivity and risk management. We asked 11 questions about various aspects of risk management and productivity, clustered companies in that 11-dimensional space, and then projected down onto these two dimensions. It shows the importance of these two meta-dimensions: everything collapsed into a spectrum of productivity and a spectrum of security and compliance.
The question that always comes up is how we picked the clusters. Stephen touched on the fact that we didn't. You plot these in 11-dimensional space and create centroids to minimize the distance between each point and the centroid of its cluster. Then it's projected into 2D. When we saw this graph, it was one of those exciting aha moments because it beautifully explains the behaviors we see.
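The clustering and projection steps can be sketched roughly as below. The talk does not name the algorithm, so plain k-means is assumed here because of the mention of centroids. The survey answers are synthetic, and the split of the 11 questions into six productivity questions and five risk-management questions is a hypothetical choice made only so the projection has something to average.

```python
import random


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroid(points):
    """Coordinate-wise mean of a non-empty list of points."""
    return [sum(col) / len(points) for col in zip(*points)]


def k_means(points, k, iterations=50, seed=0):
    """Plain k-means: assign each point to its nearest centroid, then move
    each centroid to the mean of its assigned points, and repeat."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iterations):
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old centroid if a cluster empties out
                centroids[c] = centroid(members)
    return labels, centroids


def project(answers):
    """Assumption: answers 0-5 are productivity questions, 6-10 are risk questions."""
    productivity = sum(answers[:6]) / 6
    risk = sum(answers[6:]) / 5
    return productivity, risk


# Synthetic respondents: 11 answers each, drawn around four made-up profiles.
data_rng = random.Random(42)


def fake_respondent(productivity_level, risk_level):
    return ([data_rng.gauss(productivity_level, 0.4) for _ in range(6)]
            + [data_rng.gauss(risk_level, 0.4) for _ in range(5)])


respondents = [
    fake_respondent(p, r)
    for p, r in [(4.5, 4.5), (1.5, 1.5), (1.5, 4.5), (4.5, 1.5)]
    for _ in range(30)
]

labels, centroids = k_means(respondents, k=4)
for c, center in enumerate(centroids):
    x, y = project(center)
    print(f"cluster {c}: {labels.count(c)} respondents, "
          f"productivity={x:.1f}, risk management={y:.1f}")
```

The clusters are found in the full 11-dimensional space; the two-dimensional coordinates are computed only afterwards, for plotting. K-means depends on its starting centroids, so a real analysis would typically try several seeds.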
Stephen Magill
The other thing we can do now is dig deeper. Not only are these clusters well defined and segmented, but we can compare across clusters and ask what the differences in practices are from low performers to high performers. First, focusing on high performers versus security-first: they're both achieving great security outcomes, but why is one achieving substantially better productivity than the other?
Gene Kim
Much like we did with the State of DevOps Report, we measured the difference numerically. Compared with the security-first cluster, the high performers are 50 times more likely to use software composition analysis tools. They are 77% more likely to automate the approval, management, and analysis of dependencies. They are a third more likely to enforce governance policies within their continuous integration system rather than through a manual process. A footnote: it was not that they were all centralized or all distributed. That did not pan out. They're 51 times more likely to maintain some sort of centralized bill of materials, and 96% more likely to centrally scan all deployed artifacts for security and license compliance.
The goal of science, they say, is to explain the most observable phenomena with the fewest principles, confirm deeply held intuitions, and reveal surprising insights. I think this absolutely does that. Stephen, does that resonate with you?
Stephen Magill
Yes. It speaks to the importance of automation and uniformity and establishing consistent workflows in achieving these great outcomes productively.
Gene Kim
One more color commentary: when I see these, what it says is that security objectives are being integrated into developers' daily work and into tooling and automation. There is also some centralization in terms of consolidating the best-known knowledge of how to do that.
Now I want to compare the high performers and low performers, the two extremal points in the clustering. This is where the differences get stark. The stats are 15 times more frequent deployments, 26 times faster detection that vulnerabilities exist, 26 times faster remediation of those vulnerabilities, and six times more likely to have developers be productive when switching teams.
Stephen Magill
To have developers be productive when switching teams: they actually get up to speed faster. There's more uniformity in the software development process.
Gene Kim
This is our way of exploring whether teams are standardizing and have a high degree of portability, or whether they are all creating bespoke ways of doing things that make it difficult for developers to switch teams. They also have 26 times faster approvals to use a new open source dependency. You can create processes to add new dependencies without being burdensome and slow.
Stephen Magill
The groups clearly have different focus areas, but if we zoom back out and look at the full data set, the graph shows the centroids of each group. Looking at just the centroids, the security-first cluster is on average not only more secure than the low performers but also slightly faster, slightly more productive. The productivity-first group is on average still less productive than the high performers, but it is achieving a slightly better security outcome. Each group is a stepping stone toward the goal of high productivity and good risk management. You can get there by starting with security or starting with productivity, but everyone wants to trend into the upper-right quadrant.
Gene Kim
This is a cool treatment of the clustering data. I've never seen contour maps used to see where we reside in the clustering spaces.
Stephen Magill
As a final bonus, we looked at something we called open source enlightenment. This was a subset of questions that touched on aspects not just of open source usage, but of support and involvement in the open source community: executive support for open source, contributions back to open source. We combined those into one factor called open source enlightenment and looked at companies with high levels of it.
One finding is that support for open source and involvement in open source lead to substantially higher job satisfaction as well as better security outcomes. The job satisfaction result makes sense: organizations that support the community tend to have great principles and care about employees' engagement in the broader community. The security outcomes were surprising, although on reflection they make sense. When you're deeply involved in an open source project that you're using, you'll be aware more quickly when vulnerabilities are discovered there. When you need to move to a new version, it is probably easier because you're more familiar with the code base. It pays not just to make use of open source, but to get involved in open source.
Gene Kim
You probably have a better sensibility of what's coming because you're more aware of the roadmap. One thing we found surprising was that we thought one counter-marker of performance would be the extent to which organizations maintain an internal fork of an open source version. That did not pan out, and I think we misworded the question. The intent was to find people stuck on an island and having to backport security patches. But in order to contribute, you have to at some point maintain an internal fork, even if only for a couple of days.
Stephen Magill
Let us know. Come tell us how you engage with open source and how you manage that in your internal workflows, because that would help inform next year's survey.
Gene Kim
The summary finding is that you can be more productive and more secure at the same time. So much of DevOps is people who believe that what's good for development is good for operations and vice versa. What we're finding here is evidence that what is good for security can be good for development and vice versa. It leads to faster, more productive developers, more secure components being used in production, and happier developers as well. We know happiness is strongly linked with organizational performance.
Stephen Magill
If you're interested in reading more, please go download the report. You can send an email to sscr@muse.dev. I've got an autoresponse set up that will email you a link to the report. If you have thoughts or questions, please email there or email me or Gene. Thanks again to the Sonatype team, including Bruce Mayhew, Ghazi Mahmoud, Derek Weeks, and Matt Howard. It was great working with them and they were a ton of help in pulling all this data together.
Gene Kim
What help are we looking for, Stephen?
Stephen Magill
Any additional hypotheses to test. Those can come in the form of anecdotes. Often we learn a lot from anecdotes, and that informs the questions we ask in the survey to get a broad sense of whether patterns hold more generally.
And stories about how you choose components. What components are easy? What components are hard to update? How do you take that into account when you choose new components? We want to dive deeper into what makes some libraries much easier to stay up to date with versus others.
Gene Kim
Absolutely. Describe the intuition and heuristics you use given a broad choice of components: which ones do you choose and why? We would love to know.
By the way, if you're interested in these topics, you will love the closing keynote of the conference, which is Eileen Uchitelle. She's a principal software engineer at GitHub, and she describes the heroic journey of upgrading Rails on GitHub, the seven-year journey to go from Rails 2 to Rails 5. It's an incredible story, and she has phenomenal lessons for leadership. It's phenomenal for so many reasons, Stephen. Thank you.
Stephen Magill
Thank you so much.