Presentation by Dr. Stephen Magill & Gene Kim
Full transcript
The complete talk, organized by section.
Dr. Stephen Magill
Hi, I'm Stephen Magill. I've been doing academic research in software analysis, security, and programming languages for more than 15 years, first as part of my PhD work at Carnegie Mellon, and then at other universities and industry research labs.
Over the last few years, I've gotten more and more interested in the practice of software: open source development practices, how enterprises approach software, how they use open source, and really how to best contribute to these communities by improving tools and practices. I'm going to be talking about some really cool research Gene and I have done in that space.
Gene Kim
Awesome. My name is Gene Kim, and I've been studying high-performing technology organizations for 21 years. One of the funnest projects I've gotten to work on is the State of DevOps Report, with Dr. Nicole Forsgren and Jez Humble. This was a cross-population study that spanned over 36,000 respondents over six years, and it gave us a great glimpse into what high performance looks like, and what behaviors create high performance.
In this work, we're going to be looking at open source. We're going to be looking at usage of open source, and it's probably not a surprise to hear that almost everyone is using open source. Nat Friedman, CEO of GitHub, claims that 99% of new software projects include open source components. This means they inherit any open source vulnerabilities. So if you're using open source, you want to be remediating those vulnerabilities as they arise. Even better, you'd like to stay ahead of that vulnerability curve and avoid being in this position of having to remediate. So how do you do that?
You have to consider that when you use open source, you're not just importing some code; you're adding developers to your team. Those open source project contributors become contributors to your software. This will help you. This increases agility, imports great security practices -- or will it? Will it import great security practices, or is it going to be holding you back? Is it going to be a source of vulnerability? What are these extra developers bringing to your team?
That's what we dove into: what practices lead to good security outcomes, and what should you look for when you're choosing these open source components? By the way, Stephen and I were sitting in the GitHub Universe session. We heard Nat Friedman say that, and we were like, "When you invite these developers into your house, are they going to actually help you build your kitchen or are they going to trash your kitchen?" There are other ways to tell.
I mentioned the State of DevOps research, and it was such a fun study because it linked cultural aspects with technical practices and architecture. In terms of setup, when some friends reached out to me some years ago to talk about this potential research project and being able to look at the data from the Maven ecosystem, I jumped at the chance because it was a way to look at what update behaviors looked like in the wild. For those of you who don't know what Maven is, Maven is to Java what npm is to JavaScript, what PyPI is to Python, RubyGems is to Ruby. As someone who benefits so much from the Java ecosystem and Maven, because my favorite programming language is Clojure, it was an opportunity that was irresistible. When this came up and I looked at the data set that was being made available to us, I immediately reached out to my friend Dr. Stephen Magill and asked if he'd be willing to collaborate and jump into the data and see what we could learn.
Dr. Stephen Magill
While I spend most of my time in the Haskell ecosystem, it certainly gives me an appreciation for functional languages and the role that Scala and Clojure play as these functional languages targeting the JVM. It was super exciting to dive into this space and see what we could find in this wealth of data.
The key questions that we asked were really twofold: some about the open source side and some about the enterprise side. On the open source side, we wanted to know: how do these projects manage their own supply chains? The nice thing about open source is all the dependencies, all those transitive dependencies, are open source as well. So you have a great amount of data that you can aggregate, label, and do analysis on. That was our focus in the first part of this work. We wanted to look not just at summary statistics on how many projects are out there and how many vulnerabilities, but at deeper analysis that describes the behavioral patterns we see, and ultimately leads to guidance and insight for people who consume open source.
On the consumption side, we wanted to look at how enterprises manage their open source supply chains, what governance tools and practices they employ, and what impact that has on security and productivity. In the course of looking at both sides, we observed some really interesting facts about the value of working together, having enterprises deeply involved in open source, and welcoming community contributions back into corporate open source projects as well.
As Gene said, this all started two years ago when we learned about this data and how much you could learn from looking at Maven Central and the Maven ecosystem. That led us to do this deep dive on update and security practices, which turned into chapter three of the 2019 State of the Software Supply Chain Report. We started that work in 2018 and published the report in 2019. We had such a great time with that research that we teamed up with Sonatype again to do further analysis for the 2020 report. This time, in 2020, we focused on the consumer side of the equation, looking at how enterprises manage those supply chains.
First, we're going to summarize the 2019 results. In that work, we looked at Maven Central. There's really an amazing number of projects, components, and data out there: 310,000 components and 4.2 million versions of those components, that is, individual JAR files. Each has its own vulnerability history, its own dependency graph, and its own API changes with each version, which can cause problems. Almost 7,000 also have associated GitHub repos, which gives us a whole other set of data that we can correlate against, including metadata on team size and commit frequency. We can look at individual commits and the history of code changes.
Of these components, about 9% had a vulnerability associated with them. But when you look at the dependency chain and those transitive dependencies, that increases to 47% of components having some sort of vulnerability that impacts them during the period when that component is the current version.
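To make the jump from directly vulnerable components to transitively affected ones concrete, here is a minimal Python sketch. The component names, the graph, and the vulnerable set are all hypothetical; none of this comes from the Maven Central data set, and it only illustrates how a vulnerability deep in a dependency chain reaches everything built on top of it.

```python
# Hypothetical dependency graph: component -> direct dependencies.
# These names are illustrative only; they are not from the Maven Central data.
DEPENDS_ON = {
    "app-core": ["http-client", "json-lib"],
    "http-client": ["tls-lib"],
    "json-lib": [],
    "tls-lib": [],
    "report-tool": ["json-lib"],
}

# Components with a vulnerability reported directly against them (assumed).
DIRECTLY_VULNERABLE = {"tls-lib"}


def transitive_dependencies(component, graph):
    """Return every component reachable from `component` via dependency edges."""
    seen = set()
    stack = list(graph.get(component, []))
    while stack:
        dep = stack.pop()
        if dep not in seen:
            seen.add(dep)
            stack.extend(graph.get(dep, []))
    return seen


def affected_components(graph, vulnerable):
    """Components that are vulnerable directly or through any transitive dependency."""
    return {
        name
        for name in graph
        if name in vulnerable or transitive_dependencies(name, graph) & vulnerable
    }


direct_share = len(DIRECTLY_VULNERABLE) / len(DEPENDS_ON)
affected = affected_components(DEPENDS_ON, DIRECTLY_VULNERABLE)
affected_share = len(affected) / len(DEPENDS_ON)
print(f"directly vulnerable: {direct_share:.0%}, affected overall: {affected_share:.0%}")
```

In this toy graph a single vulnerable library at the bottom of the chain affects every component that depends on it, directly or indirectly, which is the same effect that raises the share in the real data.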
Gene Kim
This is how we wrote up the findings, and this went into chapter three in the State of the Software Supply Chain Report. I really want to thank Dr. Magill and the amazing team at Sonatype who helped make this happen: Bruce Mayhew, Ghazi Mahmoud, Kevin Whitten, Derek Weeks, and Matt Howard. It was so fun to take a look into the components that some of us use every day that are all leveraging the Maven ecosystem.
One of the things that we looked at for the first year, and pulled into the second year, is that we took the State of DevOps reports and the IT performance metrics and started to think about what the analogs might be in the open source community. The IT performance metrics are deployment frequency, code deployment lead time, mean time to repair, and change success rate. We linked them to release frequency -- when open source projects release a new version; organizational performance, potentially as measured by number of GitHub stars, forks, or downloads per day within Maven Central; and mean time to restore, looking at how long it took to remediate vulnerabilities when they're disclosed through something like a CVE. This shaped our thinking about what dependent variables to look at: we would lay out all the independent variables and see which ones perform better.
Dr. Stephen Magill
We structured our research around a number of hypotheses. The first was really about the "is faster better?" question. We find that to be the case in the enterprise, in the State of DevOps research, in the Accelerate research. Would we find the same thing in the open source community? We did. We found that projects that released more frequently were two and a half times more popular in general. If people using your open source software is your goal, which I think it is for most projects, they're having better outcomes in that space. They also had larger development teams, more contributors, more active projects. They were also more likely to be foundation supported. And they were more secure: the fastest 20% of projects by release frequency also update dependencies 18 times faster than other projects. That update cadence correlates strongly with security outcomes. A range of great outcomes result from releasing more frequently, moving faster with your deployments on the open source side.
Gene Kim
Hypothesis number two was the notion that projects that update dependencies more often are generally more secure, and we found that this was indeed the case. Those that were most secure tended to update 1.5 times more frequently. They had 530 times faster remediation times, and they were 173 times less likely to have out-of-date dependencies. That means they were not only updating themselves, but making sure all their dependencies were up to date.
A great example of why this is so important is the PrimeFaces vulnerability that came out in 2017. It turns out that the issue was actually fixed years before. If you had just stayed current, you never would have had this vulnerability. But those who didn't suddenly were taken over by Bitcoin miners and had to do drastic things to update those dependencies in production. This validates the notion from Jeremy Long, founder of the OWASP Dependency-Check project, that one of the best ways to stay secure is just to stay up to date on your dependencies, and that means updating them in daily work.
Dr. Stephen Magill
As part of that staying-up-to-date process, you want to make sure you're pulling in dependencies that are themselves good about staying up to date. What do you look for? One factor is general update frequency. How often do they release? Update behavior and remediation behavior tend to track each other, and release frequency is correlated with both of those. Really good outcomes come from projects that release more frequently.
Further, projects with larger development teams and higher code commit rates are also generally better at keeping their dependencies up to date. The top 20% of projects by size had 50% faster update times, and they released 2.6 times more frequently. They were also more likely to be foundation supported, which reveals that foundation support is also an important aspect and something you can look for that influences project quality.
Gene Kim
Hypothesis number three seems pretty obvious: we thought that projects with fewer dependencies would have an easier time staying up to date. It makes sense that if you have a smaller surface area, it will be easier. That turned out not to be true. Components with more dependencies have better mean time to update; they updated their dependencies faster.
Stephen pointed out a startling observation in the last finding: the most secure dependencies were more popular and had more developers. That means there's a link between the number of dependencies and the number of active developers as measured by people making commits in a given month. It brings up the question: does increasing the number of dependencies cause you to have more developers, or when you increase the number of developers, do they tend to pull in more dependencies? But it was surprising. It turned out not to be the case that fewer dependencies were better; those that have more dependencies are the ones with better MTTU.
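The talk does not spell out how mean time to update (MTTU) was computed. One plausible reading, sketched below in Python, is the average lag between a dependency publishing a new version and the project adopting it. The records and field names are hypothetical.

```python
from datetime import date

# Hypothetical update history for one project. Each record pairs the date a
# dependency published a new version with the date the project adopted it.
UPDATES = [
    {"dependency": "json-lib", "released": date(2019, 1, 10), "adopted": date(2019, 1, 24)},
    {"dependency": "http-client", "released": date(2019, 3, 1), "adopted": date(2019, 4, 30)},
    {"dependency": "tls-lib", "released": date(2019, 6, 15), "adopted": date(2019, 6, 22)},
]


def mean_time_to_update(updates):
    """Average number of days between a dependency release and its adoption."""
    lags = [(u["adopted"] - u["released"]).days for u in updates]
    return sum(lags) / len(lags)


mttu_days = mean_time_to_update(UPDATES)
print(f"MTTU: {mttu_days:.1f} days")
```

Under this definition, a lower MTTU means a project picks up new dependency versions sooner, which is the sense in which components with more dependencies did better.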
Dr. Stephen Magill
Our most surprising finding from that year's research was hypothesis four. We expected popular projects to generally be better about staying up to date and more secure on average. We found that was not the case at all. For most of the factors we examined, we found strong, statistically significant differences in security and update performance. Popularity was one of the few factors that showed no such difference. If you take one thing away from this part of the talk, let it be that you can't just lean on popularity as a proxy for quality.
Gene Kim
One unsettling thing is that in my own personal experience, whenever I want to solve a problem and look for a component to solve it, generally I use this heuristic: I look for the project with the most stars and forks. It turns out that's a very bad heuristic. The question becomes what heuristic should we use instead for what open source components to use?
Dr. Stephen Magill
Looking at release frequency is one of the key things you can easily access and evaluate when you're looking at projects. A popular project that's not releasing frequently is falling behind in some respect when it comes to its transitive dependencies.
Given the importance of updating and staying up to date, why is everyone not completely up to date? Why are all dependencies not brought to the current version on every release? We looked into this.
Gene Kim
One of the best descriptions of just how problematic staying up to date can be comes from an amazing paper by a group of researchers in Brazil. They monitored 400 open source projects for 116 days, and during that period they detected 282 potentially breaking changes. We did the math: the breakage rate is sufficiently high that, given enough time, your probability of hitting some breaking change approaches 100%. This resonates with anyone who's had the experience of being afraid to update dependencies. It suggests you're afraid for a reason: updating dependencies so often breaks your code. This explains one reason people don't update dependencies in daily work: it's potentially problematic.
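The arithmetic behind "approaching 100%" can be sketched as follows. This is a back-of-the-envelope model, not the calculation from the paper or the report. It assumes the 282 changes are spread evenly across the 400 projects and arrive independently at a constant rate (a Poisson model), which real libraries will not satisfy exactly.

```python
import math

# Figures quoted in the talk.
PROJECTS = 400
DAYS = 116
BREAKING_CHANGES = 282

# Assumption: changes are spread evenly across projects and arrive independently
# at a constant rate (a Poisson model). The study itself may not support this.
rate_per_project_day = BREAKING_CHANGES / (PROJECTS * DAYS)


def prob_at_least_one_break(dependencies, days):
    """P(at least one breaking change) across `dependencies` over `days`."""
    expected = rate_per_project_day * dependencies * days
    return 1.0 - math.exp(-expected)


for deps, days in [(1, 30), (1, 365), (10, 365)]:
    print(f"{deps} dependencies over {days} days: {prob_at_least_one_break(deps, days):.1%}")
```

Under these assumptions the probability climbs quickly with both the time window and the number of dependencies, which matches the intuition that a project with many dependencies will almost certainly meet a breaking change within a year.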
After we saw this startling data, we put together a survey to understand the psychographics of higher performers that update dependencies in daily work. We found that we could cluster them into a high-pain cluster and a low-pain cluster: organizations that associated updating with high levels of pain and organizations that did not. The low-pain cluster was two times less likely to consider patching painful, 10 times more likely to schedule updating dependencies as part of daily work, six times more likely to strive to use the latest version, whether N or N-minus-one, 11 times more likely to have some process for adding dependencies, 10 times more likely to have a process for removing problematic dependencies, and 12 times more likely to use automated tools to enforce policy around updates. When you see multiples like this, you know there's something very different between the clusters, much like we saw in the State of DevOps research. This focused where we wanted to go in this year's study.
Another thing that guided and excited us was looking at the data to see what migration behaviors look like from version to version. For Hibernate, the arcs show every migration from a source version to a destination version. What you see is almost two distinct populations: one on the left and one on the right, and they almost never meet. The extent of the changes is so vast that you tend to stay on one island versus another. This is what you don't want to see, because to stay up to date you have to take a painful change. You have to jump the chasm. This suggests that some people stuck on the island on the left will never make it to the right, and once security patches stop, they'll be forever left behind.
Dr. Stephen Magill
This made us realize and appreciate that there's a difference between components. Some update friction is due to component choice and the choices certain libraries make about when and how they migrate from one API to another. In contrast to the Hibernate example, the Spring Framework graph looks very different. There are not two separate populations. You see everyone is able, from whatever past version they're on, to successfully migrate to the latest version. There's a higher density of arcs landing in latest or almost-latest versions. Clearly the framework is structured such that it's easier to update, and the community as a whole tries to stay on the current version of the software. You do see some red arcs: people stuck in vulnerable versions of the library, making progress toward recent versions but not getting all the way there.
Gene Kim
The red arcs indicate where the target state ended up in a vulnerable component. They updated, but they updated to one that was actually insecure.
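A rough Python sketch of how such arcs could be tallied from migration records is shown here. The version numbers, the migration list, and the vulnerable set are hypothetical, and this is not necessarily how the graphs in the report were produced.

```python
from collections import Counter

# Hypothetical observed migrations (from_version, to_version) for one library.
MIGRATIONS = [
    ("3.6", "5.2"), ("3.6", "5.2"), ("4.1", "5.2"),
    ("3.6", "4.3"), ("4.1", "4.3"), ("4.3", "5.2"),
]

# Hypothetical set of versions with a known vulnerability.
VULNERABLE_VERSIONS = {"4.3"}

# Each distinct (source, destination) pair is one arc; the count is its weight.
arc_counts = Counter(MIGRATIONS)

# "Red arcs": migrations whose destination is itself a vulnerable version.
red_arcs = {arc: n for arc, n in arc_counts.items() if arc[1] in VULNERABLE_VERSIONS}
red_share = sum(red_arcs.values()) / len(MIGRATIONS)

for (src, dst), n in sorted(arc_counts.items()):
    flag = "vulnerable target" if dst in VULNERABLE_VERSIONS else "ok"
    print(f"{src} -> {dst}: {n} migration(s), {flag}")
print(f"share landing on a vulnerable version: {red_share:.0%}")
```

Counting arcs this way makes it easy to see both where a community lands after upgrading and how many of those landings still leave people exposed.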
Dr. Stephen Magill
Maybe they should have jumped a little farther. Another archetype is Joda-Time. It shows this homogeneous set where people can upgrade seemingly from any version to any other version. Our hypothesis is that these versions have very few breaking changes; you can go to any version and it's not going to break functionality. When I think about what it would look like for organizations to easily, quickly, dependably, and reliably switch from whatever version they're on to the latest version, this is the sort of distribution we would like to see. We didn't have the chance to explore that as fully as we wanted to, but it will hopefully guide future research.
We see the role of security vulnerabilities in pushing update behavior too. In Joda-Time, there are no vulnerable versions involved, so update behavior is driven more by wanting to update or needing new features. In Spring Framework, there are vulnerabilities against older versions that are pushing everyone forward. In Joda-Time, there's a more uniform distribution of versions in use.
Next, in the 2020 report, we wanted to focus on the consumer side of the equation: enterprise usage of open source, and what practices are associated with better security, compliance, and productivity outcomes. We interviewed over 500 developers -- actually 528 developers -- from a range of companies across all industries, asking about their security and performance outcomes as well as their DevOps practices, to see which practices contribute to high achievement in these areas. We asked questions like: do you centralize your CI infrastructure? Do you automate software governance? Do you contribute to open source? Are you confident in the security of your deployed applications?
The hidden question we were trying to answer was: can you have it all? Can you be more productive and more secure at the same time? Can you simultaneously advance the objectives of security and development? Can you have both? The answer is yes.
We found not only that you can achieve both good risk management outcomes and high productivity gains simultaneously, but that a remarkably large percentage of the companies we surveyed are managing to do just that. To see that visually, I want to describe this diagram, the centerpiece of this year's report. It shows all the companies that we interviewed plotted on a 2D grid: the X axis is self-reported developer productivity, and the vertical axis is risk management outcomes. In the upper right is a group of companies that performs extremely highly on both dimensions. Not surprisingly, these companies tend to adopt core DevOps principles of automation and consistency.
In the lower left are companies at the opposite end of the spectrum, with poor productivity outcomes and poor risk management scores: DevOps Padawans, early in their journey. In the upper left are the stereotypical risk-averse companies that focus solely on risk mitigation and achieve it via mostly manual and inefficient workflows. They attain good security outcomes, but at the cost of productivity. In the lower right are the move-fast-and-break-things companies that prioritize productivity above all else, often at the cost of risk management.
The coolest thing is the colors. They are colored not just based on the quadrant, although it looks like that. We didn't directly measure productivity and risk management. We asked 11 questions about various aspects of risk management and productivity, clustered companies in that 11-dimensional space, and then projected down onto these two dimensions. It shows the importance of these two meta-dimensions: everything collapsed into a spectrum of productivity and a spectrum of security and compliance.
Gene Kim
The question that always comes up is how did we pick the clusters? As Stephen touched on, we didn't. You plot these in 11-dimensional space and create centroids to minimize the distance from each point to the centroid of its cluster, then project them into 2D. When we saw this graph, it was one of those exciting aha moments. It beautifully explains the behaviors we see.
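For readers who want to see the idea in code, here is a small Python sketch of centroid-based clustering in 11 dimensions followed by a projection to two. Several things are assumptions: the talk mentions centroids but does not name the algorithm, so plain k-means is used here; the survey data is synthetic; the split into six productivity and five risk questions is invented; and averaging each group of answers is a simple stand-in for whatever projection the report used.

```python
import math
import random

N_PRODUCTIVITY = 6  # assumed split: first 6 answers concern productivity
N_RISK = 5          # and the remaining 5 concern risk management

# Hypothetical archetypes: (typical productivity answer, typical risk answer).
ARCHETYPES = [(4.5, 4.5), (1.5, 1.5), (1.5, 4.5), (4.5, 1.5)]
PER_GROUP = 20


def make_survey(seed=0):
    """Synthetic 11-answer responses, PER_GROUP per archetype, in archetype order."""
    rng = random.Random(seed)
    rows = []
    for prod, risk in ARCHETYPES:
        for _ in range(PER_GROUP):
            base = [prod] * N_PRODUCTIVITY + [risk] * N_RISK
            rows.append([v + rng.uniform(-0.5, 0.5) for v in base])
    return rows


def farthest_point_init(points, k):
    """Pick k spread-out starting centroids deterministically."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(points, key=lambda p: min(math.dist(p, c) for c in centroids)))
    return centroids


def kmeans(points, k, iterations=20):
    """Plain k-means: assign each point to its nearest centroid, then re-average."""
    centroids = farthest_point_init(points, k)
    labels = []
    for _ in range(iterations):
        labels = [min(range(k), key=lambda j: math.dist(p, centroids[j])) for p in points]
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


def project_2d(point):
    """Collapse 11 answers into (productivity, risk management) by averaging."""
    return (sum(point[:N_PRODUCTIVITY]) / N_PRODUCTIVITY, sum(point[N_PRODUCTIVITY:]) / N_RISK)


responses = make_survey()
labels, centroids = kmeans(responses, k=4)
for j, c in enumerate(centroids):
    x, y = project_2d(c)
    print(f"cluster {j}: productivity={x:.2f}, risk management={y:.2f}, size={labels.count(j)}")
```

The point of the sketch is that the quadrants are never given to the algorithm. The groups emerge from the 11 answers alone, and only afterwards are they drawn on the two axes.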
Dr. Stephen Magill
The other thing we can do now is dig deeper. Not only are these clusters very well defined and segmented, but we can compare across clusters and ask what the differences are in practices from low performers to high performers within these groups. First, focusing on the high performers versus security-first: they're both achieving great security outcomes, but why is one achieving substantially better productivity than the other?
Gene Kim
Very much like we did with the State of DevOps Report, we started to measure the difference numerically. When you look at the high performers against security-first, they're 50 times more likely to be using some sort of software composition analysis tools. They're 77% more likely to automate the approval, management, and analysis of dependencies. They're a third more likely to enforce governance policies within their continuous integration system as opposed to some sort of manual process. It wasn't that they were all centralized or all distributed; that was something that didn't pan out. They're 51 times more likely to maintain some sort of centralized bill of materials, and 96% more likely to centrally scan all deployed artifacts for security and license compliance.
The goal of science is to explain the most observable phenomena with the fewest number of principles, confirm deeply held intuitions, and reveal surprising insights. I think this absolutely does that. Stephen, does that resonate with you?
Dr. Stephen Magill
Yeah, that's right. It really speaks to the importance of automation and uniformity and establishing these consistent workflows in achieving these great outcomes productively.
Gene Kim
One more bit of color commentary: when I see these numbers, what they say is that security objectives are being integrated into developers' daily work. They're integrated into tooling and automation, and there's obviously some centralization in terms of consolidating the best known knowledge of how to do that.
Dr. Stephen Magill
Now I want to compare the high performers and low performers. These are the two extreme points in terms of the clustering, and this is where the differences get stark.
Gene Kim
The stats here: 15 times more frequent deployments, 26 times faster detection that vulnerabilities exist, 26 times faster to remediate those vulnerabilities, six times more likely to have developers be productive when switching teams -- actually get up to speed faster because there's more uniformity in the software development process. This is our way of exploring whether teams are standardizing, where teams have a high degree of portability, or whether they're all creating bespoke ways of doing things that make it very difficult for developers to switch between teams. And there are 26 times faster approvals to use a new open source dependency. This is the notion that you can create processes to add new dependencies without being burdensome and slow.
Dr. Stephen Magill
The groups clearly have different focus areas, but I want to zoom back out and look again at the full data set. This graph shows the centroids of each group. When you look at just the centroids, you see that on average, the security-first cluster is not only more secure than the low performers, but also slightly faster, slightly more productive. The productivity-first group is on average still less productive than the high performers, but it is achieving a slightly better security outcome. Each group is a stepping stone toward that goal of high productivity and good risk management. You can get there by starting with security or starting with productivity, but everyone wants to trend into that upper-right quadrant.
Gene Kim
This is such a cool treatment of the clustering data. I've never actually seen contour maps used to see where we reside in the clustering spaces. That's awesome.
Dr. Stephen Magill
As a final bonus, we looked at something that we called open source enlightenment. This was a subset of the questions that touched on aspects of not just usage of open source, but support and involvement in the open source community: things like executive support for open source and contributions back to open source. We combined those into one factor called open source enlightenment.
One thing we found is that support for open source and involvement in open source leads to substantially higher job satisfaction as well as better security outcomes. The job satisfaction makes sense. People get community support. Those are positive organizations. They clearly have great principles and care about their employees' engagement in the broader community. The security outcomes were surprising, but on reflection, it makes sense. When you're deeply involved in an open source project that you're using, you'll be more quickly aware when vulnerabilities are discovered there. When you need to move to a new version of that project, it's probably easier because you're more familiar with that code base. It really does pay not just to make use of open source, but to get involved in open source.
Gene Kim
You probably have a better sensibility of what's coming because you're more aware of the roadmap. One surprising thing was that we thought a counter-marker of performance would be the extent to which organizations have to maintain an internal fork of an open source version. That didn't pan out, and I think we misworded the question. The intent was to find people who are stuck on an island and having to backport security patches. But of course, in order to contribute, you have to be maintaining an internal fork at some point, even if only for a couple of days.
Dr. Stephen Magill
Let us know. Come tell us how you engage with open source and how you manage that in your internal workflows, because I think that would help us inform next year's survey for sure.
Gene Kim
The summary finding is that you can actually be more productive and more secure at the same time. So much of DevOps is those people who believe that what's good for dev is also good for operations and vice versa. What we're finding here is evidence that what is good for security can be good for development and vice versa. It leads to faster, more productive developers, more secure components being used in production, and happier developers as well. We know that happiness is also strongly linked with organizational performance.
Dr. Stephen Magill
If you're interested in reading more, please go download the report. You can send an email to sscr@muse.dev. I've got an autoresponse set up there that will email you a link to the report. If you have thoughts or questions, please email there as well, or email me or Gene. Thanks again to the Sonatype team, including Bruce Mayhew, Ghazi Mahmoud, Derek Weeks, and Matt Howard. It was great working with them and they were a ton of help in pulling all this data together.
Gene Kim
What help are we looking for, Stephen?
Dr. Stephen Magill
Any additional hypotheses to test. Those can come in the form of anecdotes. Often we learn a lot from anecdotes, and that informs questions that we can then ask in the survey and really get a broad sense of whether these patterns hold more generally.
And stories about how you choose components. What components are easy? What are hard to update? How do you take that into account when you choose new components? We really want to dive deeper into what makes some libraries much easier to stay up to date with versus others.
Gene Kim
Absolutely. Describing what intuition and heuristics you use, given a broad array of components to choose from: which ones do you choose and why? We would love to know that. By the way, if you're interested in any of these topics, you will love the closing keynote of the conference, which is by Eileen Uchitelle. She's a principal software engineer at GitHub, and she describes the amazing heroic journey of upgrading Rails at GitHub, the seven-year journey to go from Rails 2 to Rails 5. It's this incredible story, and she has some phenomenal lessons for leadership. It's phenomenal for so many reasons, Stephen.
Dr. Stephen Magill
Thank you.
Gene Kim
Thank you so much.