Perception vs Reality: A Data-Driven Look at Open Source Risk Management

Las Vegas 2022

Vice President, Product Innovation · Sonatype

In this session, we’ll present the findings of Sonatype’s new 8th annual State of the Software Supply Chain Report. Over the past year, we empirically studied dependency update patterns for thousands of open source projects, analyzed hundreds of survey responses, and took a critical look at commonly-held beliefs about effectively managing security risk.

Our research has uncovered a vast chasm between perceived security and reality, a number of new trends in open source consumption, and surprising benefits to certain development team structures. Come see which practices are backed up by data and learn how to efficiently manage your open source software supply chain.

Full transcript

The complete talk, organized by section.

Host Intro (Gene Kim)

All right. The next speaker is Dr. Stephen Magill, who I've known for many years. We have a common love of functional programming languages, and we got to work on such cool projects, including the early work on the Software Supply Chain Report. I'm going to let Dr. Stephen Magill tell us all about that.

[Walk-on music plays.]

Stephen Magill

Great. Thank you, Gene. It's so great to be here. It's great being back in person at DOES. As Gene said, I'm Stephen Magill. I'm going to talk to you about perception versus reality in open source risk management.

This is really the latest in a progression of research that started back in 2019, when Gene and I first started working with Sonatype on their Supply Chain Report. Sonatype had been publishing this report for a few years. They were looking for a fresh perspective, looking to do some deeper analysis. Gene and I got to dive in, look at all the data they had available, think about what questions we wanted to ask, what hypotheses we might be able to validate or invalidate with this data, go find new data sources when we had gaps. It was just a really fun experience.

It also established this expectation that with each year's report there would be some sort of deep dive into some new area of risk management, trying to learn something about practices, what's effective, what's not, and how people are managing their supply chains. That's continued with this year's report. I'm really excited to announce that the report is out now, today. It's launching alongside the conference and this talk, and I'm going to share with you some of the findings, some of the highlights that I think will be most relevant to everyone here. But I'd encourage you to check out the report to see the rest of the details.

All right, so I'm going to start with a couple of definitions. I'm a scientist, so I love definitions, but there's only two we're going to cover here. There's production. This is the process of producing open source. This is the open source maintainers and developers, the individual contributors that are producing those libraries that then feed into the consumption side, which is the development teams at your organizations. It's everyone who's pulling in this open source and using it as the basis for their business applications.

On the production side, the first perception that I hear a lot about is that open source is risky. You see this in a lot of different ways. In government, in some areas of government, you couldn't even really use open source until recently. A lot of organizations have onerous approval processes and security reviews that kick in when you want to bring new open source components into your development pipeline.

I think the origin of that is statistics like this: 35% of releases are vulnerable. This is on Maven Central, and this is true. Over a third of releases sitting out there on Maven Central have a known vulnerability in them. That's 3.5 million vulnerable releases, just to give you an idea of scale. It's not like those are just sitting on the shelf not being used. There are 1.2 billion downloads of those vulnerable components each month.

When you see things like that, it's easy to understand why people might get worried about building software on open source. But the reality is that open source software can almost always be used securely. What do I mean by that? Ninety-six percent of projects have a safe version available. If you go look at those vulnerable releases, for the vast majority of projects, those are old versions that have been supplanted by newer versions that patch those vulnerabilities. If you're using those projects, there's something you can move to to get rid of that vulnerability issue.

Most of the vulnerabilities in those projects are patched before the vulnerability becomes public. That even happened for Log4j. The Log4Shell vulnerability was privately communicated to the Log4j dev team. In about two weeks, they had a patch available. That patch launched before there was broad public awareness of the vulnerability. There was less time between the release of that version and publicity around the CVE than would have been ideal, but there was a fix available from the beginning. The whole remediation scramble largely took place on the consumption side.

What do we see on the consumption side in terms of perceptions and reality? It's basically the flip of what we saw on production. There's a perception on the consumption side that we're pretty good at this. We've got a handle on this. We know what to do. We're doing okay.

Why do I say that's the perception? It's because we do a survey every year as part of the Supply Chain Report, where we poll developers, managers, various people involved in software development at enterprises. We ask them a number of questions about their practices and the results that they're seeing. This year we got 662 responses, and the responses came back overwhelmingly as, "we got this."

Remediation in particular: we ask about various categories of software development practices. This idea of remediating vulnerabilities and pushing out a fix across your application portfolio, people consistently report that that is the area where they're strongest. The belief that we're good at this has just been growing year over year. It's an even stronger, more mature view of remediation than we even had last year.

You might be thinking, well, I bet Log4j opened their eyes. However, this survey was actually conducted after Log4j happened. This is the view that people are reporting now, even after that incident. So if nothing else, we're a very optimistic bunch.

What is the reality? What we see in reality is mixed results. It's not as good as you would think from the self-reported data. What we see when we look at what's happening in reality on an application basis is 68% of applications use some component that has some vulnerability. Some of these, hopefully lots of these, might be low-severity vulnerabilities. I'm not saying 68% of applications can just be owned right now. But that number is very much at odds with what we see in the survey data, which is 68% of individuals report that they are confident they're not using any vulnerable versions across any of their applications. Those two numbers just don't line up.

What's really interesting is the sort of rose-colored glasses that the survey respondents are wearing when they answer these questions. They're even rosier for management. If you look at managers' responses, they were 3.5 times more likely to say that their organization could quickly remediate vulnerabilities. That's a big problem because these are the people who are making decisions about funding, what to focus on, what areas need improvement, maybe new tools and processes to put in place. If they don't have accurate data about the state of the organization when it comes to open source risk, they can't make good decisions.

I said it's mixed, and I've said a lot of negative things. What's the positive side of consumption? It is that we can do a good job when we need to. Log4j got a lot of press, and we saw very good remediation behavior in response to that. This chart is showing remediation curves for three different vulnerabilities. One's the Log4j vulnerability, that's the blue line. There's SpringShell, that's the green line. And then another vulnerability that didn't get a lot of press is the orange one. You can see a remediation response that goes along with the amount of attention these vulnerabilities are getting. So the ability to remediate quickly is there; it's just not being applied broadly across the board.

What that results in is this picture. This is the one-slide summary of why open source risk currently is primarily a consumption problem. If you're struggling to communicate to people at your organization that open source maintainers do a good job and that the problem is how we're consuming open source, take a picture, bring this back, and show it to them. At the left here we have just a small sliver of projects being vulnerable. These are projects that have no current patch for a known vulnerability, a very small percentage. But on the consumption side, again, the majority of applications have some sort of vulnerability that they're pulling in.

One reason for that is the chart on the left is showing individual projects, and the chart on the right is applications that use, in general, a lot of open source. We see an average of 150 dependencies being pulled in and built up as part of an average business application. If you think about the fact that each of those dependencies releases on average 10 versions per year, that's 1,500 updates that you have to consider applying for one application. That's a huge workload, and it's not surprising that this allows vulnerabilities to creep in. If some percentage of those 1,500 updates are security relevant, meaning they're patching a CVE, you don't have to miss many of those before you have a real security problem.
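The workload arithmetic above can be written out as a short sketch. The 150 dependencies and 10 releases per year are the figures quoted in the talk; the share of updates that are security relevant and the share a team actually applies are hypothetical inputs chosen only to illustrate the point, not numbers from the report.

```python
# Figures quoted in the talk.
DEPENDENCIES_PER_APP = 150
RELEASES_PER_DEPENDENCY_PER_YEAR = 10


def yearly_update_candidates(dependencies: int, releases_per_year: int) -> int:
    """Number of dependency updates one application could apply in a year."""
    return dependencies * releases_per_year


def expected_missed_security_updates(
    updates: int, security_share: float, apply_rate: float
) -> float:
    """Expected count of security-relevant updates left unapplied.

    security_share and apply_rate are hypothetical inputs, not report data.
    """
    return updates * security_share * (1.0 - apply_rate)


updates = yearly_update_candidates(
    DEPENDENCIES_PER_APP, RELEASES_PER_DEPENDENCY_PER_YEAR
)
# Hypothetical: 2% of updates patch a CVE and the team applies 90% of updates.
missed = expected_missed_security_updates(
    updates, security_share=0.02, apply_rate=0.90
)
print(f"{updates} candidate updates per year")
print(f"about {missed:.1f} security-relevant updates missed per year")
```

Under those made-up rates, even a team that applies nine out of ten updates still leaves a few security fixes unapplied per application each year.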

What we wondered is, could you help with this by picking better components? In particular, if you could pick components that are less likely to have a vulnerability, then maybe you don't have as much to keep track of. If you fall behind a little bit, you're in a better place from a security perspective.

To do this, we went to the OpenSSF Security Scorecard data. I really like this system, and Brian from the Linux Foundation gave a great talk on it yesterday, where he talked about the origins and what they're trying to do with Scorecard. I definitely recommend checking that out if you're interested. What this is doing is measuring for open source projects the extent to which they implement development best practices. Things like: are they using code review? Are they applying fuzz testing? Are they releasing signed versions of their software?

We looked at this Scorecard, paired it up with vulnerability data, and applied some machine learning to basically determine: if you look at projects that have a clean vulnerability history, they haven't had security issues, and you look at their Scorecard results, what patterns are there? What Scorecard results are associated with good security outcomes? And how well can we do that labeling? How well can we predict from Scorecard data whether a project is likely to have a vulnerability?

We were able to do that with 78% accuracy, which is not perfect. But there's a lot of randomness in the security review process. Different projects get different amounts of scrutiny. Whether a security researcher manages to bang on a project in just the right way to discover some vulnerability, that's a fairly random process. So that 78%, that's encouraging to me. That tells me that there is some signal in this Scorecard data.

Then we looked at the model to say which of these Scorecard attributes, which of these checks, are most important for security. Code review came out as the most important practice, which I think is just amazing and great because code review has for a long time been recognized by the development community as probably the best thing you can do to improve your code quality. To see that validated in this study is really great. This might be the largest-scale validation of the security relevance of code review. We analyzed over 30,000 projects in putting this data together, so there's a lot of data feeding into this.

Then we wanted to think about, this is really interesting, the Scorecard data is interesting, evaluating projects based on that seems to be useful. How can we get this out to the world in a way that everyone can look at it and participate in making this better?

What we did is we built a slightly larger model. We took Scorecard and threw in mean time to update, which is a metric that Gene and I developed in 2019 when we worked together on that year's report. From that model we extracted a score. That score is now available in Maven Central and on OSS Index for projects where we have the data to generate it. You can go and see how a project is scoring. There's a form for feedback. If you see something weird or you have ideas for how to improve this, let me know. Fill out that form.
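The talk does not define mean time to update, so the sketch below assumes a simple reading: the average number of days between a dependency publishing a new version and the project adopting it. The function name, the data layout, and the dates are all hypothetical, and the report's actual formula may differ.

```python
from datetime import date
from statistics import mean


def mean_time_to_update(updates: list[tuple[date, date]]) -> float:
    """Average days between a dependency release and the project adopting it.

    Each pair is (dependency_release_date, adoption_date). This definition is
    an assumption for illustration; the report's exact formula may differ.
    """
    if not updates:
        raise ValueError("need at least one update to compute MTTU")
    return float(mean((adopted - released).days for released, adopted in updates))


# Hypothetical update history for one project.
history = [
    (date(2022, 1, 10), date(2022, 1, 20)),  # adopted after 10 days
    (date(2022, 3, 1), date(2022, 3, 31)),   # adopted after 30 days
    (date(2022, 6, 15), date(2022, 7, 5)),   # adopted after 20 days
]
mttu_days = mean_time_to_update(history)
print(f"MTTU: {mttu_days:.0f} days")
```

A lower value means the project picks up new dependency versions sooner, which is the behavior the metric is meant to reward.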

It's not perfect. We get 86% accuracy when we add MTTU to the mix, but it definitely tracks vulnerability in a way that I haven't seen other quality metrics track it. This is a graph of the results. Over at the left is the low end by this safety rating. If you look at the bottom 20%, they're mostly vulnerable. There are a few blue dots in there, those are non-vulnerable projects, but by and large, it's red. On the right-hand side of the graph, the top 20%, it's overwhelmingly blue. We're working on improving the accuracy of this. We have some ideas for that. We're working on pulling in other measures of project quality. Like I said, I view this as a community effort. That's why we're putting it out there. If you have thoughts, please engage on that.
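The bottom-20% versus top-20% comparison can be reproduced on any list of scored projects. This is a minimal sketch: the scores and vulnerability flags are invented, and `vulnerable_share` is a hypothetical helper, not part of any Sonatype or OpenSSF tool.

```python
def vulnerable_share(
    projects: list[tuple[float, bool]], lo: float, hi: float
) -> float:
    """Share of vulnerable projects in the [lo, hi) slice when ranked by score.

    lo and hi are fractions of the ranked list, e.g. 0.0 and 0.2 for the
    bottom 20%. Each project is (score, has_known_vulnerability).
    """
    ranked = sorted(projects, key=lambda p: p[0])
    start, end = round(len(ranked) * lo), round(len(ranked) * hi)
    bucket = ranked[start:end]
    if not bucket:
        raise ValueError("slice contains no projects")
    return sum(1 for _, vulnerable in bucket if vulnerable) / len(bucket)


# Hypothetical (score, vulnerable) pairs; not data from the report.
scored = [
    (1.2, True), (2.0, True), (3.1, True), (4.5, False), (5.0, True),
    (6.3, False), (7.7, False), (8.1, False), (9.0, False), (9.6, False),
]
bottom = vulnerable_share(scored, 0.0, 0.2)
top = vulnerable_share(scored, 0.8, 1.0)
print(f"bottom 20% vulnerable: {bottom:.0%}, top 20% vulnerable: {top:.0%}")
```

A large gap between the two shares is what indicates that the score carries signal about vulnerability.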

There's all this information I just presented, things we found about the state of the world and the consumption-side problems in particular. So what can people do? On the production side, implement the Scorecard best practices. Those are clearly useful, clearly important development practices. And keep your dependencies up to date. Those are two practices, and whether you're an open source producer or a team in an enterprise producing software, these are just good software development practices.

On the consumption side, choose projects with a high safety rating, and let us know how that works out for you. Let me know if you find that valuable. I love that I can give that as a piece of advice because in previous reports we've developed things like MTTU and said these are really useful metrics to consider when you're choosing and deciding what open source to pull in. But we haven't had a way to get that out there. These were hard to compute, and we didn't have a central store of these results. It's great that that's out there now and I can point to it.

Use tools to flag and fix vulnerable libraries. Another finding from this year's report was we did an analysis of the effectiveness of dependency-management tooling and really did find a strong effect there on remediation behavior.

Then get a realistic view of your organization's performance. There's a lot of managers in the audience here. Go back to your organization and think: do I have an accurate view of where we are from the perspective of open source risk? What questions could I ask to determine if my view of the state of the world at our organization is accurate? You really need that accurate view to make the right decisions, and it sort of worries me that there's this disconnect that we see in the survey data.

Thank you.