Are We Forever Doomed To Software Supply Chain Security?
The adoption of open-source software continues to grow and creates significant security concerns for everything from software supply chain attacks in language ecosystem registries to cloud-native application security concerns. In this session, we will explore how developers are targeted as a vehicle for malware distribution, how immensely we depend on open-source maintainers to release timely security fixes, and how the race to the cloud creates new security concerns for developers to cope with, as computing resources turn into infrastructure as code.
This session is presented by Snyk.
Liran Tal
Hi everyone, and thank you for joining my talk, "Are We Forever Doomed by Software Supply Chain Risks?"
If you joined this talk, then it means you care about software and you care about software security. Most of all, you are curious like I am: how does open source software supply chain security impact all of us -- you, me, and everyone else?
Let us take a moment to reflect on this picture. When you look at this photograph, what comes to your mind? Is it a futuristic outlook of the world? Perhaps the uprising of our robot overlords?
For me, I ask myself: how much does the robot learn about my child, and where is this information stored? Is it stored safely? I think about where the robot gets its software updates from, and can this upstream source be compromised? What can happen when it gets compromised, not just if? What is the probability that someone hacks in and can watch the entire video feed and interact with my child?
These are some of the things that keep me up at night. Today, I would like to share with you real-world stories of how developers play a fundamental role in recent and growing security incidents, why you should care about software supply chain security, and leave you to think about where you put your trust.
In case you had a doubt, we are seeing more and more open source software being developed. Year after year, open source software repositories are growing with more software footprint. The applications we build -- you and me, everyone else -- are ever-growing in their dependency on open source software. More of us software engineers are accustomed to this habit, to the ways of open source software communication and contribution, and more of us are becoming maintainers of open source software.
The growth of open source software does not come without risks. We are continuously witnessing the growth of security vulnerabilities in open source software ecosystems, like npm, Java, and others. These range from official CVE-based reports of security vulnerabilities, which need to be addressed by those of us whose applications use the affected package versions, to incidents like malicious packages hitting the software supply chain and targeting us as developers who rely on open source software packages.
These are my stories of open source. Let us rewind back in time to get an early glimpse of how one developer perceived the risks of open source software. In 1984, Turing Award-winning Ken Thompson wrote a short essay titled "Reflections on Trusting Trust," in which he describes how he added a backdoor to the Unix login program. Then he continued and added a backdoor to a C compiler. Then he further continued this chain of attack by backdooring the compiler that compiles the compiler.
In his revelation of how software can be taught to learn specific traits and pass them on to its spawn, the other programs that it creates, he explains how a Trojan horse can hide in software without leaving a trace in the source code, and why it is dangerous to trust code that we did not entirely create ourselves. Anything from the application source code all the way down to the compiler, the assembler, the CPU, the hardware, everything else: if you cannot trust it, what do you do at this point?
As we learned from Thompson's Trojan horse story dating back to 1984, developers have been targeted as a vehicle to distribute malware and backdoors for a very long time now. Let us explore some more recent events.
In 2018, the JavaScript ecosystem witnessed its first high-impact, surgical attack. It targeted maintainers and developers working in open source and used them as a vehicle to distribute malicious JavaScript code that was designed to decrypt itself and run only in a specific environment, targeting developers of Bitcoin wallet applications. This was the well-known EventStream incident.
EventStream existed on the npm registry since 2011. Very long time. It practically did not receive any new releases in the last two years, which we will see in a second is important, but it gained millions of downloads per week. Out of the blue, a person shows up on the GitHub repository, opening an issue and wanting to help, as is customary in open source. They further contribute code with several pull requests, and then they create a pull request that adds one of the pieces of code into a new dependency, a new module that gets added to EventStream's own package dependency tree with supposedly genuine intent to improve the code base for EventStream and use this in a modular way in the npm ecosystem.
Yet a few weeks later, they add a code payload inside that dependency that they own and created and had now added into EventStream. That extra piece of code, added in a new version, injects malware into a specific Bitcoin wallet application called Copay. The Copay wallet application used EventStream as part of its build process and was now tainted by this malware. This incident had gone unnoticed for almost three months, resulting in two versions released of the Copay Bitcoin application that included the malware.
Why was it important that EventStream itself had not had any release in the last two years? An academic research paper published in 2019 investigated the properties of language-based software ecosystems. It compared the npm ecosystem with PyPI, the Python ecosystem, and found that 61% of open source packages on npm could be considered abandoned because they did not receive any release in the prior 12 months. The EventStream package would have been classified as an abandoned package in such research just as much.
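The "abandoned" classification described above can be sketched in a few lines. This is a minimal illustration that assumes "abandoned" means no release in the 12 months before a reference date; the function name, threshold constant, and dates are hypothetical and are not taken from the research paper.

```python
from datetime import date, timedelta

# Assumption: a package counts as abandoned when its newest release is
# more than 12 months (approximated here as 365 days) old.
ABANDONED_AFTER = timedelta(days=365)


def is_abandoned(last_release: date, today: date) -> bool:
    """Return True if the newest release is older than the threshold."""
    return today - last_release > ABANDONED_AFTER


# Hypothetical dates, for illustration only.
print(is_abandoned(date(2016, 9, 1), today=date(2018, 9, 1)))  # True
print(is_abandoned(date(2018, 6, 1), today=date(2018, 9, 1)))  # False
```

A check like this says nothing about whether a package is malicious. It only flags packages where a new, unfamiliar maintainer showing up deserves a closer look.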
A year later, as if we did not learn anything from the EventStream incident, we had the electron-native-notify incident. What happened? At one point in time, a user publishes the electron-native-notify package, and it has no malware in it. Then the user adds it as a dependency to a popular package, EasyDEX. Then the user releases a new version of electron-native-notify, which now includes the malware, and the result is that the Agama crypto wallet is built with the most recent version of the EasyDEX package, which now pulls in the malware.
Sound familiar? Exactly. It is the same thing that happened with EventStream, as if we did not learn anything from it. These software supply chain security concerns are all around us.
The thing is, those supply chain security concerns are not just open source related. We have witnessed them extend to the mobile application store ecosystem. A Snyk-led security research effort in the CocoaPods mobile application ecosystem identified malware in the Mintegral ad SDK. The Mintegral mobile SDK is used for advertising attribution for mobile applications and was downloaded more than 1.2 billion times a month. It is integrated into thousands of applications. These are downloads from all of us. This is you and me, all of us using mobile applications from the App Store, downloading various applications from an official application store, not even open source.
What Snyk found interesting is that this SDK had a proprietary, obfuscated piece of code that hooked into sensitive system APIs on mobile, which an ad SDK should not commonly do. This raised suspicion about what was going on inside the Mintegral ad SDK. What did it actually do? One of the things we observed is that it intercepted HTTP requests made by the application. To avoid detection, the SDK would also detect when debuggers or jailbroken devices were in use and then possibly hide its behavior. Lastly, Snyk observed that the SDK installs a backdoor used as a command-and-control channel by remote actors.
How much thought are we giving to the security of our own development infrastructure? Moving on from the open source software supply chain and from the mobile ad SDKs inside the applications we all use, consider the development side of things: the tooling, the resources like cloud instances, your staging environments, your build and continuous integration tooling. How much security mindfulness are we putting into the services that build and ship our application code to our customers, to our end users?
In January 2021, not so long ago, a security researcher broke into the Microsoft Visual Studio Code GitHub repository, which essentially gave him the ability to modify the code of the popular and well-loved IDE that many developers use. The root cause was a command injection flaw made possible by a flawed regular expression. By opening a new pull request, the researcher was able to execute code inside the CI scripts that VS Code runs, without any authentication or authorization checks. All of this happened publicly in that open source repository, where anyone can see what is going on.
This led to remote reverse shells on the CI servers and, from there, the ability to basically gain push and write access to the repository's source code. Fortunately for us, the researcher responsibly reported this flaw to Microsoft before advanced threat actors could actually exploit it. Yet unfortunately for us, not all software supply chain attacks end up in a responsible disclosure like this. Some of them hit us pretty hard and painfully.
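To make the class of bug concrete, here is a generic sketch of how a flawed regular expression check can let a command injection payload through. The talk does not describe the actual expression involved, so the pattern, function names, and payload below are hypothetical and only illustrate the general idea of an unanchored match.

```python
import re

# Hypothetical allowlist pattern for a branch name taken from a pull request.
BRANCH_PATTERN = re.compile(r"[A-Za-z0-9._/-]+")


def flawed_is_safe(value: str) -> bool:
    # Bug: match() only requires the value to *start* with allowed
    # characters, so anything may follow the valid prefix.
    return BRANCH_PATTERN.match(value) is not None


def stricter_is_safe(value: str) -> bool:
    # fullmatch() requires the entire value to consist of allowed characters.
    return BRANCH_PATTERN.fullmatch(value) is not None


payload = "main; echo injected"
print(flawed_is_safe(payload))    # True
print(stricter_is_safe(payload))  # False
```

Even with strict validation, passing untrusted values to a process as separate arguments rather than interpolating them into a shell command string removes a whole category of injection risk.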
Such was the case in April 2021, just a few weeks ago, with the disclosure of Codecov's supply chain security incident. On April 15, 2021, it was reported that unauthorized access to a Google Cloud Storage key allowed a malicious actor to alter Codecov's Bash Uploader script. Codecov is a code quality tool used by many developers to assess code coverage and other code quality metrics.
What we can learn about the extent of this incident from the opening note of the SecurityWeek article is that security response professionals were scrambling to measure the fallout from a software supply chain compromise of Codecov's Bash Uploader that went undetected since January. It exposed sensitive secrets like tokens, API keys, and other credentials from organizations around the world that use the script in their CI environments.
How did Codecov learn about the security incident involving the malicious Bash Uploader version? On April 1, a customer reported that the SHA-1 checksum published for the Bash Uploader did not match that of the file they downloaded, months after the tampering had begun.
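The kind of check that customer performed can be sketched as follows: hash the downloaded script and compare it with the digest the vendor publishes. This is a minimal illustration, not Codecov's own tooling. It uses SHA-256 instead of the SHA-1 mentioned in the talk, and the function names and script contents are hypothetical.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the hex-encoded SHA-256 digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


def matches_published_digest(data: bytes, published_hex: str) -> bool:
    """Compare the digest of downloaded bytes with a published hex digest."""
    return sha256_hex(data) == published_hex.strip().lower()


# Hypothetical script contents, for illustration only.
original = b"#!/bin/bash\necho 'upload coverage report'\n"
published = sha256_hex(original)
tampered = original + b"echo 'one extra line'\n"

print(matches_published_digest(original, published))  # True
print(matches_published_digest(tampered, published))  # False
```

The check only helps if the published digest comes from a channel the attacker did not also control, and if a CI job fails hard on a mismatch before executing the script.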
How much do we know about the current state of open source security and what it entails? In an effort to explore the security awareness of the Python and JavaScript open source communities, a group of researchers set out to investigate how maintainers work in the open source community with regard to their ability to mitigate security vulnerabilities. One of the research questions was: how quickly do open source maintainers mitigate a newly published security vulnerability, one that has a CVE and now needs to be fixed?
The research found that it takes about 100 days on average for both JavaScript and Python maintainers to start mitigating a publicly disclosed vulnerability. Is that fast enough for us as consumers of a library? Can we really ask more of open source maintainers who are voluntarily doing their best anyway?
If we examine the number of commits that mitigate a vulnerability out of the total number of commits, we can see that Python has a more consistent track record of security mitigation, whereas the JavaScript community was largely inactive in this regard until 2018. This demonstrates the low levels of AppSec awareness around the JavaScript open source community in those early years before 2018.
As a case study, we can refer to Marked's own security vulnerability from several years back. Marked is a Markdown parser for the web. It is downloaded millions of times a week and is one of the most popular libraries for parsing Markdown in the server-side or back-end JavaScript and Node.js ecosystems. One day, a security researcher opened a pull request that reports and fixes an XSS vulnerability, a cross-site scripting vulnerability that impacts JavaScript applications using Marked. This pull request included tests for future code regressions and proof-of-concept examples, so the maintainer would have the ability to reproduce the problem.
But as is the case with open source software, maintainers are really just trying to do their best. They cannot be there all the time, and there are no legal or contractual obligations for them to support you or me or any of us. So this vulnerability was left out in the open with no fix for a year. No fix, but everyone knows about this vulnerability. This security issue and proposed fix was opened in 2015, but was only merged and made available as an official release in 2016 for a package that gets millions of downloads a week and powers many Node applications.
When we are all so very much dependent on open source software, we cannot ignore the question of where we put our trust and what our mitigations and security controls are to cope with the risks involved.
In 2017, a security researcher working with the Node.js Foundation conducted research in which he wanted to assess the state of weak npm credentials used by maintainers and developers pushing code into the npm registry. I am going to take a long and deep breath here because his work revealed the devastating truth of developers' lack of security hygiene.
This security research showed how he was able to gain publish access to 14% of the modules on the npm registry. Some of these modules were downloaded tens of millions of times a week. They all power an essential part of this thriving JavaScript ecosystem.
The problem was rooted in insecure passwords. They were chosen by the maintainers of these accounts, by the contributors who have access to publish them. For example, one account used the word "password," literally the word "password," for the account password of the maintainer of a package that has millions of downloads. This is insane. This is unheard of. This is unthinkable. What could have happened if this person was not a security researcher working with the Node Foundation but was a malicious actor? I will let you ponder upon this and hope it does not happen.
But if our code packages can reach thousands or millions of developers, should we not have more protections in place? We are all citizens of this open source community. Some of us use open source software. There are developers who build open source or contribute to open source software. We are all in this world together. Can we do better for our own accounts' security hygiene?
It would be one thing if we did not have support for better account hygiene, but we do. npm, the largest registry of open source software packages, spanning 1.5 million packages to date, has supported two-factor authentication since the end of 2017. Long time ago already. Despite all of those security incidents and compromised-account stories happening throughout the years, in 2019 only 7.1% of npm package maintainers had enabled two-factor authentication. Only 7.1%. Did it get better? Unfortunately, not enough. The software supply chain security story is not resonating enough with developers, because a year later in 2020, 2FA-enabled accounts had grown by only about 2 percentage points, to approximately 9% of developer accounts on npm.
People are not enabling two-factor authentication for critical infrastructure, for where the code lies, for the software supply chain whose compromise affects all of us if something goes bad. Should we really be surprised by the fact that users are not enabling two-factor authentication? Should we be surprised that they pick bad passwords too? I do not know. Let us see.
In 2012, LinkedIn suffered a massive data breach in which one analysis of the data showed that 35% of 65 million accounts in the leaked data set were accounts that reused the password from a previous leak. There was a previous leak, and they reused the same password over and over again. How bad is this problem of choosing passwords and managing them properly? More than 750,000 accounts in that data breach simply had their password set to the string "123456." Almost a million of the accounts on LinkedIn used that as a password. Almost 2,000 of the accounts set their password to literally the word "LinkedIn." That is it. That is all you had to know to log in on their behalf.
What more can we say about this? As cybersecurity expert Bruce Schneier said in his book "Secrets and Lies," humans often represent the weakest link in the chain.
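The passwords mentioned in these stories would be rejected by even a very basic check at sign-up time. The sketch below is only an illustration: the tiny password set and the function name are hypothetical, and a real system would consult a much larger list of known-breached passwords.

```python
# Tiny illustrative sample; a real deny list would be far larger.
COMMON_PASSWORDS = {"password", "123456"}


def is_weak_password(password: str, service_name: str) -> bool:
    """Flag passwords that are well known or simply the service's own name."""
    lowered = password.lower()
    return lowered in COMMON_PASSWORDS or lowered == service_name.lower()


print(is_weak_password("123456", "LinkedIn"))    # True
print(is_weak_password("linkedin", "LinkedIn"))  # True
```

A deny list does not replace two-factor authentication. It only removes the most trivially guessable choices.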
A term coined as Linus's law back in 1999 by Eric Raymond in his work "The Cathedral and the Bazaar" shifts us toward a different perspective on open source security and the open source ecosystem. In this essay, he explored the differences between software development as practiced within the loosely organized open source movement and as practiced within enterprises, formally organized companies. Eric stated that given a large enough number of developers and users of a piece of software, as is common in open source communities, software flaws will be detected quickly. And I ask you: is this always the case?
In January 2021, it was discovered that sudo, a common utility installed on many Linux distributions, had a security vulnerability that had existed for years. Specifically, any unprivileged user could gain root access just based on the default sudo configuration. What is so daunting about this vulnerability is that it was hiding in plain sight for a decade. For 10 whole years, it was just hiding there until someone found it.
We have reached a point where we take open source for granted. Open source registries are open in their nature and allow developers to openly push their packages to them. We have become accustomed to opening an issue in a project source code repository, asking for help, asking for a feature. But what happens when maintainers pull the rug out from under our feet and stop maintaining a library, or worse, completely remove a library altogether from the registry?
This is exactly what happened in 2016 when a maintainer pulled tens of their open source packages from npm. One of them was a pivotal package in the ecosystem, and failing to download it resulted in widespread breakage of CI and install processes all over. At the very least, this incident showed us two things: first, the weakness in how businesses failed to manage their open source software in a responsible way, and second, how registries did not even foresee this as a problem. They were not designed to handle this kind of circumstance of a maintainer pulling their libraries from a registry. Why would they do that?
Can we really deny being part of an open source ecosystem as consumers, as contributors, maybe even as maintainers? We all play a part in a world where 90% of our application's code is made up of open source components. What kind of malicious activities and assets can we track back to open source ecosystems? Time after time, we find more and more malicious packages hitting the npm ecosystem. You may have been a victim of one of these packages if you misspelled a package name when trying to install it, which is known as a typosquatting attack, or if someone planted a malicious package in a dependency tree that your application depends on.
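One way to guard against the misspelling scenario is to compare a requested package name against the names you actually intend to use before installing it. The sketch below uses Python's standard `difflib` module. The allowlist, the function name, and the similarity cutoff are assumptions chosen for illustration.

```python
import difflib
from typing import Optional, Sequence

# Hypothetical allowlist; a real one would come from your own manifest.
KNOWN_PACKAGES = ["express", "lodash", "react", "event-stream"]


def possible_typosquat(
    name: str,
    known: Sequence[str] = KNOWN_PACKAGES,
    cutoff: float = 0.8,
) -> Optional[str]:
    """Return the known package that `name` closely resembles, or None."""
    if name in known:
        return None  # Exact match: this is the package we meant.
    matches = difflib.get_close_matches(name, known, n=1, cutoff=cutoff)
    return matches[0] if matches else None


print(possible_typosquat("expres"))   # express
print(possible_typosquat("express"))  # None
```

A near match is a reason to stop and double-check the name, not proof of malice. Plenty of legitimate packages have similar names.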
Malicious packages are not just a thing in the JavaScript ecosystem. In March 2021, more than 3,000 malicious packages were published in bulk on the PyPI registry. To further show us how attackers can harness this open source ecosystem and registries and package managers to their own advantage, Alex Birsan published his research in February 2021 about how he exploited design flaws in package managers, registries, and human error to infiltrate corporations such as Apple, Microsoft, and others.
I would like to leave you with the following questions to ponder. Are we going to have less or more software in the future? Are we going to use less or more open source software specifically? And who do you trust?
I am Liran Tal, a developer advocate at Snyk, where we build a security platform to help developers build securely with open source software. Thank you all for joining my talk.