Staying Out of the (Bad) Headlines: Keeping Attackers Out of your DevOps Toolchain

Europe 2021

Daniel Nurmi

CTO & Co-Founder · Anchore

Paul Novarese

Senior Solutions Architect · Anchore

DevOps lets developers innovate faster. But some normal DevOps processes can create the opportunity for bad actors or dangerous code to enter your DevOps toolchains and your software applications. Where are the security risks and how can DevOps teams prevent attacks without slowing down delivery? We’ll provide some easy tips and best practices to secure your toolchain while keeping your development moving.

This session is presented by Anchore.

Full transcript

Daniel Nurmi

Hello. Thank you all very much for joining the session today. We're going to be talking about supply chain security. My name is Daniel Nurmi. I'm the CTO and co-founder of Anchore, and I'm joined here today by Paul Novarese, who's our senior solutions architect, also here at Anchore. Anchore is a company that provides users open source technology products and services that really are targeted at enabling users to bring continuous security and compliance enforcement directly into the DevOps toolchain. For the presentation today, we're going to be focusing in on a topic that's been very prominent recently in the security space, which is really centered around supply chain security.

First, we're going to be discussing what supply chain in the context of software really looks like, and why malicious actors are starting to target supply chain systems in order to carry out some of their successful attacks. We'll start off with that discussion, and then once we get through that piece of the presentation, we'll move over to have Paul jump in with some practical recommendations and examples of how we can bring security and compliance enforcement into your DevOps toolchains today to try to prevent some of these attacks. So, starting with a lot of what we've been seeing these days, in terms of the bad headlines, right?

We see a lot of really prominent and serious security incidents over the last several months. Namely, some of the more prominent ones, SolarWinds was definitely something that caught the attention of the security world. And in addition, more recently, there's another one from another company called Codecov. And we see a lot of these incidents using a phrase associated with what's going on, and that's supply chain risks, supply chain security, and a number of customers that are impacted by these events. And we wanted to have a discussion about really exploring what that means and what's happening in these incidents.

And we can start with this kind of typical iceberg view, in that what we're seeing above the surface is a lot of discussion about what happened, the sort of top-level hack and the exploits and things that are compromising customers and pretty large enterprise organizations connected somehow to some other element in the so-called supply chain. And really, though, if we look into what actually started the whole incident, a lot of those attacks, especially in the supply chain security world, are targeting software suppliers and/or open source dependencies, something very much earlier in the story than where the attack actually happened, where the damage was caused.

And we can see this by evaluating what is a supply chain and look at it from a couple of different perspectives. The first perspective is from the consumer view. That is any organization that is actually deploying production software. We might have something internal that we're running in production, or it might even be internet facing, might be a website, an application, whatever that may be. At the end of the day, that's the software that's executing, and we call those the consumers. And from a consumer's perspective, they know, "This is my application that's actually running in production.

And in order to get that application to run, I depend on a number of different software elements that are all composed together to provide that production environment." Those can be other software suppliers. They can be open source software that comes in, that all in aggregate comes together to form the actual application. Interestingly, though, from a supplier view, if we as a software supplier look at what we have visibility into, it looks a little different and starts to expand out. And this is where we start to see a chain sort of forming.

We know as the software supplier who our consumer is. We also know that as a software supplier, we ourselves are in a sense the consumer, because we're bringing in other software from other software suppliers and open source projects to build our own software and then deliver that to the consumer. If we zoom out another level and say we're just an observer, not taking any particular perspective inside the chain, but just looking in general, the chain or the graph starts to look something like this, in which every software supplier is themselves also a consumer, and there's a lot of open source elements in here, a lot of different pieces of software.

And this is the kind of thing where even this view is pretty oversimplified. Any sufficiently complex or sophisticated application running today is going to have even more independent elements than what we can fit onto a slide. And so this is interesting from a functional perspective, and we might even look at this as a consumer or a supplier and try to derive some information from it or understand what's going on. But from an attacker's perspective, if we look at this from a malicious actor's perspective, any time we see a view like this, where there are dependencies between various elements and they all kind of funnel into something that we want to attack, i.e. the consumer application, we realize, as a malicious actor, all we need to do is compromise any of these elements.

And if we are able to actually get our malicious code or our attack successfully put into one of these environments, then that attack can actually flow all the way through to the consumer. And that would look something like this, right? An attacker might say, "If I compromise this open source element or this software supplier's environment and get my malicious code in, that malicious code's going to make its way into the consumer's application, and that's where I'm going to cause the damage." And an interesting subtlety and characteristic of attacks like this is that while the software supplier or the open source project, one of these elements in the graph, is the actual initial place where an attack takes place, oftentimes those organizations won't even see it because the attacker isn't actually trying to cause damage at that point.

They want to do that in a way that's hidden or secret because the actual attack is taking place on the consumer. And for that reason, sometimes these attacks can be around and very hard to notice for a long period of time. And we've seen that with some of these recent incidents. So what we're going to do is zoom in on, as a software supplier, we don't want to be the one to actually have a compromise that causes damage to our consumers, even if they might be several steps away. So in order to do that, let's zoom in on what a typical software supplier in a modern environment, what their infrastructure and what their mechanisms typically look like.

And it looks something like this. So as a software supplier, we're creating software. We have our own application source code. That stuff lives on the left-hand side of a typical process like this. And on the right-hand side, we have the software is ready, it's built, it's executable. Our customers or consumers have access to it, and they can pull it down and run it. But in between, we've got a number of steps, especially as we're starting to see modern systems add more and more automation, where every time an application developer might make a new feature or a bug fix, security update, that kind of thing, this whole mechanism kicks in and takes that source code, moves on to a build phase where an executable is created, and then typically there's a testing phase, some staging, where artifacts are signed and configurations are made, and finally, that element gets published.

And again, from a malicious actor's point of view, these are all elements, and if there's a weakness in any one of these, they can identify a weakness and put some malicious code into this environment, ultimately compromising that published software. And so from the malicious actor's perspective, it looks something like this. If any of these elements are weak and can be compromised, there's a number of different actual mechanisms or methods that can be used to form a successful attack, which ultimately results in that supplier software being compromised, thus the consumer being compromised.

Now, when it comes to containers, specifically when we're looking at software suppliers and other organizations who are starting to really leverage container technology in order to facilitate this automation of taking new application code and building something deliverable and executable by the customer, containers are actually a very compelling way to do this. Functionally, you get a lot of power by being able to take your application code, bundle all of its elements, its dependencies, its data, maybe oftentimes even a fairly full-featured operating system, all inside of a container image which encapsulates your application and gives it everything it needs to execute.

Very, very convenient and powerful mechanism to do that. But because it's not just the application code that's being shipped anymore in a container environment, this is also a potential source of security risks. And again, if we look at this from the malicious actor's perspective, we see several elements, and if there's a weakness anywhere, that's a potential avenue for getting malicious code inserted in. So when it comes to what those actual elements are, there's a couple of categories that we're going to walk through to show some practical examples of how we can protect ourselves against these types of compromises and protect ourselves and our customers from being victims of supply chain attacks.

The first category generally is something that we see a lot, and that's protecting our software from known software vulnerabilities. This is something that we should all be doing: making sure that as we're building containers and as we're bringing in new operating system packages, new language ecosystem dependencies, et cetera, all of that software is checked for known software vulnerabilities. Because if it isn't, an attacker might notice that maybe it's been a while since you've updated your base operating system packages, and there could be known critical vulnerabilities. An attacker might be able to figure out, well, the consumer is going to have those same vulnerabilities.

That's an avenue. The second category is really around the concept of injecting malicious code into existing software. And typically, that looks like malware or Trojan horses, where either an attacker's malicious code is inserted into an existing executable inside of a container image, for example, where every time it executes, that malicious code can be called, or a Trojan horse approach where an executable can be replaced by something that works the same way but includes malicious code as well. The third general category is a little bit more subtle, but we see this as a vector of attack, which is around software overrides.

And these things can happen at that interface in your DevOps toolchain between having source code and that source code becoming an executable. That compilation or that build step can be hijacked in funny ways. And one of the examples we see attackers using is called typosquatting, and that's taking advantage of a human error, where as a developer or a person in a DevOps role, you might accidentally misspell the name of some dependency, like PostgreSQL. Maybe I switch the Q and the L when spelling it out in my dependencies, and an attacker might actually publish a misspelled version of that dependency, such that when I go to build, the misspelled version is brought in from the attacker rather than what was intended.
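One way to catch the kind of misspelling described above is to compare each declared dependency against a list of names the team actually intends to use. The following is a minimal Python sketch under stated assumptions: the `KNOWN_PACKAGES` allow-list, the `likely_typosquats` helper, and the 0.8 similarity threshold are all hypothetical choices for illustration, not something shown in the talk.

```python
from difflib import SequenceMatcher

# Hypothetical allow-list of dependency names the team intends to use.
KNOWN_PACKAGES = {"postgresql", "requests", "numpy", "nginx"}


def likely_typosquats(declared, known=KNOWN_PACKAGES, threshold=0.8):
    """Return (declared_name, intended_name) pairs for near-miss names.

    A declared name is suspicious when it is NOT on the allow-list but is
    very similar to a name that is, e.g. two letters swapped.
    """
    suspects = []
    for name in declared:
        lowered = name.lower()
        if lowered in known:
            continue  # exact match: nothing to flag
        for good in sorted(known):
            if SequenceMatcher(None, lowered, good).ratio() >= threshold:
                suspects.append((name, good))
                break  # one likely intended name is enough
    return suspects
```

Calling `likely_typosquats(["postgreslq", "requests", "flask"])` flags only the swapped-letter name and pairs it with `postgresql`. A similarity threshold is a blunt instrument, so in practice the result would be a prompt for human review rather than an automatic verdict.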

There's a number of other types of attacks that all kind of fall into a similar category. And finally, there's this notion of credentials. This is where oftentimes during a development phase or a testing phase, a container image or the surrounding metadata might contain internal organizational credentials in order to do some testing or some internal work. They might be hard-coded in the code or in configuration files. Sometimes they can accidentally be left inside of a container image, and attackers are watching all of the public artifacts that are being published by an organization.

They're constantly looking to see if credentials are left in these artifacts, and if they are, they can take those credentials and then perform some other type of attack, and maybe in one of these other categories using those credentials. So I think with that, we're going to switch over to Paul, who's going to give us a demonstration and some practical advice about how to protect ourselves as an organization that's building software from some of these types of attacks.

Paul Novarese

Okay. Thanks, Dan.

Right. Now that we've seen the kind of conceptual ideas behind these types of attacks, we'll try to turn it into some practical measures that we can take to not only detect these types of attacks, but prevent them in some cases. Right. So we'll start right here. Keep this in mind. I'll refer back to this slide, these different categories of attacks. Before we go too far, though, I want to start by showing how Anchore Enterprise actually does scan these container images and keep track of the components inside. So we'll talk a lot about the software bill of materials.

When we scan an image with Anchore Enterprise, the first thing that happens is our analyzer opens up the image and builds a list of facts about the image, right? Everything we can measure or record about the image will be recorded in a software bill of materials, and that includes things like metadata, contents of the image, not only packages, in this case, Alpine packages. We're just looking at a pretty standard Nginx image here. But individual files in the image, the metadata about those files, such as permissions, the sizes, et cetera, all the way down to things like layer by layer, how the image is constructed.

Right? So this is just a real quick layer history. Once we have all that, that is recorded in our catalog, and then we will evaluate the image. And when we evaluate the image, it's a very lightweight operation we can do very rapidly, so we can do it continuously over the image's lifespan. Two things happen in that evaluation. One, we take our current view of vulnerabilities of what we know based on the vulnerability feeds we're consuming from different sources and show a list of vulnerabilities that this particular image is affected by.

Right? This is completely objective. There's no judgment here. This is just what we know, what vulnerabilities we know of that this image is affected by. More importantly, the policy compliance is where we take not only the software bill of materials, but also that list of vulnerabilities and convert this into a judgmental view of policy rules and what violations we've seen. So in this case, things like, oh, there's some Docker file instructions here that we don't like, or there's some licenses that we're not permitting. Right. So once we have those, we can give a pass or a fail score to an image, right?

And then decide at that point whether we want it to proceed in our pipeline or whether we'll let it deploy into production or whatnot. There are several different decision points we can make there. What do those policy rules actually look like? So going back to Dan's list of different categories, we can build rules for each of those categories and other things, too. And then we can measure each image against those rules. So in this case, we'll start with this first category of vulnerabilities. Right. Vulnerabilities are pretty much table stakes at this point.

A lot of tools do this kind of thing. It could be a very simple thing like looking at vulnerabilities and saying, "Hey, if this image has a vulnerability with a severity greater than or equal to critical, what kind of action would we take when that rule is triggered?" In this case, I've got a rule that says we will stop at that point, right? We will basically fail the image. Right? There's other things we could do here, though, right? If we want to look at CVSS scores instead of severities, or if we want to look at whether a fix has been published for this particular vulnerability, or how many days has it been since this advisory was published?

How many days has it been since the fix was made available? And so on. There are a lot of knobs to turn on vulnerabilities, so we're not limited to just the severity of a particular vulnerability. We could build pretty sophisticated policies here measuring a couple of different things. For the second category, malware, Trojan horses, et cetera, the easiest way to scan for these is to use an existing malware scanner; we've integrated with ClamAV, for example. If we see that ClamAV finds something, we'll stop the image. Right? More specifically, though, we can do really detailed checks for particular known malicious code.
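The vulnerability knobs mentioned above (severity, CVSS score, and how long a fix has been available) can be pictured as a small gating function. This is a rough Python sketch, not Anchore's policy engine: the `Vulnerability` record, the `gate_action` and `evaluate` helpers, the severity ordering, and every threshold are assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

# Assumed ordering, lowest to highest.
SEVERITY_ORDER = ["negligible", "low", "medium", "high", "critical"]


@dataclass
class Vulnerability:
    vuln_id: str
    severity: str
    cvss: float
    fix_published: Optional[date]  # None when no fix is available yet


def gate_action(vuln, today, min_severity="critical", min_cvss=9.0,
                max_fix_age_days=30):
    """Return "stop" or "go" for a single vulnerability."""
    if SEVERITY_ORDER.index(vuln.severity) >= SEVERITY_ORDER.index(min_severity):
        return "stop"  # severity at or above the configured floor
    if vuln.cvss >= min_cvss:
        return "stop"  # score-based rule, independent of the severity label
    if vuln.fix_published is not None:
        if (today - vuln.fix_published).days > max_fix_age_days:
            return "stop"  # a fix has been out too long without being applied
    return "go"


def evaluate(vulns, today):
    """Fail the image if any vulnerability triggers a stop action."""
    return "fail" if any(gate_action(v, today) == "stop" for v in vulns) else "pass"
```

Keeping each rule as a separate condition mirrors the idea that a policy is a set of independent triggers, any one of which can fail the image.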

So in this case, crypto miners are pretty hot right now. Crypto miners have a tendency to have extremely reliable fingerprints. So we can look for those traces and say, "Hey, we see something here." It could be something as obvious as a checksum of a particular binary that we could zero in on. Could be things like what directory structure they use. They tend to be, again, very reliable and consistent in these things. So there are a lot of fingerprints we can track. Some sophisticated attackers might change some of these, but the more things we know about it, the more likely we are to catch them there.
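The fingerprint idea above, a checksum of a known binary plus tell-tale directory names, can be sketched over an unpacked image filesystem. This is a minimal Python illustration with made-up data: `KNOWN_BAD_SHA256`, `SUSPICIOUS_PATH_PARTS`, and the `find_fingerprints` helper are hypothetical placeholders, and real fingerprints would have to come from actual threat intelligence.

```python
import hashlib
from pathlib import Path

# Hypothetical fingerprints for illustration only.
KNOWN_BAD_SHA256 = {hashlib.sha256(b"pretend-miner-binary").hexdigest()}
SUSPICIOUS_PATH_PARTS = {".miner-cache"}


def sha256_of(path):
    """Hash a file in chunks so large binaries do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_fingerprints(root):
    """Return sorted (relative_path, reason) pairs for every hit under root."""
    root = Path(root)
    hits = []
    for path in root.rglob("*"):
        rel = path.relative_to(root)
        # Fingerprint 1: a directory or file name attackers tend to reuse.
        if any(part in SUSPICIOUS_PATH_PARTS for part in rel.parts):
            hits.append((rel.as_posix(), "suspicious path"))
        # Fingerprint 2: an exact checksum match on file contents.
        if path.is_file() and sha256_of(path) in KNOWN_BAD_SHA256:
            hits.append((rel.as_posix(), "known-bad checksum"))
    return sorted(hits)
```

Because each fingerprint is checked independently, an attacker who renames a binary still trips the checksum rule, and one who recompiles it may still trip the path rule, which is the layering effect described in the talk.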

Right? So we can layer more and more fingerprints as we find them. Moving into the category of software confusion, there's a couple of things there. Image typosquatting, right? This would be a case where you say, "I want NGINX" and you type NGINZ, and you accidentally get a compromised image. Couple of things you can do to prevent that. The easiest thing would be to prevent people from using public repositories, like in this case, Docker Hub, but other public repositories as well, and funnel them into private repositories. In this case, I've got internal.harbor.example.com.

If someone tries to pull an image from Docker Hub, I will stop the image. If someone does not pull from this internal repository, a registry, we will also fail the image. A couple of things to note here. Attackers obviously will have a hard time pushing images into an internal registry. A lot of times this will be behind a firewall and they won't have access to it anyway, or they may not know where this repository is. So there's a couple of other things to consider, though. How do you move images once you've vetted them and checked them out into this internal registry?
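The registry rule described above amounts to checking every base image a Dockerfile pulls. The Python sketch below is an assumption-laden illustration rather than how Anchore implements its check: the `untrusted_base_images` helper and its regular expression are hypothetical, and only the host name `internal.harbor.example.com` comes from the talk.

```python
import re

TRUSTED_REGISTRY = "internal.harbor.example.com"  # host named in the talk

# Matches "FROM [--platform=...] image [AS name]" at the start of a line.
FROM_LINE = re.compile(
    r"^[ \t]*FROM[ \t]+(?:--platform=\S+[ \t]+)?(\S+)(?:[ \t]+AS[ \t]+(\S+))?",
    re.IGNORECASE | re.MULTILINE,
)


def untrusted_base_images(dockerfile_text):
    """Return base images that are not pulled from the trusted registry."""
    stage_names = set()
    offenders = []
    for image, alias in FROM_LINE.findall(dockerfile_text):
        is_local = image.lower() == "scratch" or image in stage_names
        if not is_local and not image.startswith(TRUSTED_REGISTRY + "/"):
            offenders.append(image)
        if alias:
            stage_names.add(alias)  # later stages may build FROM this name
    return offenders
```

A multi-stage build produces one finding per offending stage, so a Dockerfile with two `FROM` lines that both reference public images yields two violations.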

There's a lot of different methods there. You can mirror things, or you can just individually have particular images approved and put in that repository. Likewise, package typosquatting is a little more focused. Instead of entire images, individual language packages will be targeted. So something, let's say, in Python as an example. A lot of methods to prevent this revolve around hardening the package managers. So in the Python case, it's pip, right? There's a thing called pipsec, which aims to help people stamp down on this. Right? In this case, I'm going to require any image that uses pip to also install the pipsec hardening alongside it.

Right? And if they don't, I can fail the image. Another common attack, not just typosquatting, but what we call dependency confusion. So this is instead of putting typo-named packages into a repository, this is a method where attackers try to get you to use a different repository than you think you're using, and then they'll put in a package that has the exact name of what you want in that repository, and you'll get the compromised package, instead of the legitimate package from the trusted repository. So again, this varies from language to language.

I'm using Python as an example here, but you'll see things like with pip, there's an index URL option that tells pip which repository to pull from. So we can do things like forbid installation from a public repo and force it to use an internal repo. And if we see somebody using this option, this index URL option to pull from a different repo, we can fail the image. Another thing we can do is look in the configuration files. Anchore Enterprise has a secret scan facility, which is really just an arbitrary regular expression search engine.

So if one of those regular expressions matches in a file, we can generate a rule violation and block an image. In this case, I've put in a regular expression for the extra index URL option in pip configuration files. The secret scan facility will come in handy in a minute when we talk about credentials, but here we're looking not for credentials, but for configuration flags. If we see these, we can fail the image. Couple other things here. We can look for things like if someone is piling multiple repositories in a configuration file, and we can stop them.
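A regular-expression check for pip's `--index-url` and `--extra-index-url` options might look like the following. This is a hedged Python sketch: the pattern, the `index_violations` helper, and the internal host `pypi.internal.example.com` are illustrative assumptions, not the expression used in the demo.

```python
import re
from urllib.parse import urlparse

# Matches both command-line form (--extra-index-url URL) and
# pip.conf form (extra-index-url = URL).
INDEX_OPTION = re.compile(
    r"(?<![\w-])(?:--)?(extra-index-url|index-url)[ \t=]+(\S+)",
    re.IGNORECASE,
)


def index_violations(text, allowed_host="pypi.internal.example.com"):
    """Return (option, url, reason) for each disallowed index setting."""
    violations = []
    for option, url in INDEX_OPTION.findall(text):
        if option.lower() == "extra-index-url":
            # Any extra index widens the search path, so flag it outright.
            violations.append((option, url, "extra index configured"))
        elif urlparse(url).hostname != allowed_host:
            violations.append((option, url, "index is not the internal repository"))
    return violations
```

The same function can be run over `pip.conf`, `requirements.txt`, or Dockerfile `RUN` lines, since the option name is matched wherever it appears in the text.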

A few other things here. The secret scans, as we talked about, credentials. Anchore looks for arbitrary regular expressions. Out of the box, we supply a bunch of these. Things like AWS access keys and secret keys. Also things like SSH private keys. Right? So any of those that we see in a file, we can flag and take action on. So what does that look like in practice? Let me just go to another image that we've got here, and we go to the policy compliance view, and you can see I've got a bunch of violations, right?
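The credential checks mentioned above reduce to running a set of named patterns over file contents. The Python sketch below uses two widely used patterns, one for AWS access key IDs and one for PEM private key headers; these are assumptions for illustration and are not claimed to be the expressions Anchore ships. The `scan_files` helper is hypothetical.

```python
import re

# Commonly used patterns, assumed here for illustration.
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key_block": re.compile(
        r"-----BEGIN (?:RSA |DSA |EC |OPENSSH )?PRIVATE KEY-----"
    ),
}


def scan_files(files):
    """Scan a mapping of path -> text; return sorted (path, rule_name) pairs."""
    findings = []
    for path, text in files.items():
        for name, pattern in SECRET_PATTERNS.items():
            if pattern.search(text):
                findings.append((path, name))
    return sorted(findings)
```

Keeping the patterns in a dictionary means new rules, such as the pip configuration flags discussed earlier, can be added without touching the scanning loop.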

There are all of these things: the malware scanner found crypto miner traces, the secret scan found the extra index URL for pip, and the secret scan found AWS access keys in files. All of these are causing the image to fail. There are SSH private keys, and things like the FROM directive in a Dockerfile using a base image from Docker Hub instead of from the internal repository. And in this case, I actually found two of them because it was a multi-stage build. There are other violations here, such as the vulnerabilities. I found a critical vulnerability in a Ruby gem. Again, this is all in our web UI, but you might want to integrate this into your CI/CD pipeline.

And what would that look like? So in this case, I'm using Jenkins. I've got a build. I built this image. I pushed it to a repository, had Anchore analyze it, and we had a failure right here. If we look in the logs, we can see, oh, look, it successfully queued it for analysis and got a fail result back. What does that mean in particular? So maybe my developers need to get some feedback. Our plugin will put right here in their workspace an Anchore report that enumerates all of those violations. The same stuff we saw in the web UI.

The developer doesn't need to change tools and log into our web UI just to get this feedback. So he can see, hey, the secret scanner found this, the malware scanner found this, and that developer has a roadmap to remediate. There's also tools for pushing things into things like Jira tickets or GitHub issues, et cetera. Anything that basically has an API, generic webhooks, we can give them that feedback. Okay. So from a practical point of view, what else can we do for takeaways here? What are some quick wins to get our supply chain security a little more up to snuff?

First thing I would say is make sure everything is centralized, all your CI/CD processes are in one place, and that all of your software goes through it. Not only the software you're building, but software you're consuming from the outside world. Two, you want to do things like build images from trusted sources. That means things like using as small a base image as possible, Alpine or Red Hat's UBI Minimal, and only adding the things to it that are absolutely necessary. A couple of other things here. Make sure your Dockerfiles are tight and well-written and that they comply with best practices as well.

And also, when you're building, inventory everything you're doing. So build those software bills of materials using something like Anchore Enterprise to construct those and store them so you can refer back to them later. Next tier, number three, automate your security testing and enforcement. So make sure that you're incorporating security checks at every stage of your pipeline. Scan the image first, and then once you have that software bill of materials, you can evaluate it very frequently to see if things have changed. You can push that feedback to the developer sooner rather than later.

Couple of other things. Just make sure that you're looking at the differences of these artifacts over time. So if you build up that repository of software bills of materials, you can go back if you need to do forensics and see when something bad was introduced. And then finally here, number four, deploy only trusted images into production. So don't just rely on the first scan you do. Right as you go into production, we can do a last-second evaluation and use something like a Kubernetes admission controller to make sure only images that are passing and that have been through the entire process are deployed into production.
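The point above about comparing software bills of materials over time can be sketched as a simple diff between two stored inventories. In this Python illustration, each SBOM is reduced to a mapping of package name to version; that simplified shape and the `diff_sboms` helper are assumptions, since real SBOM formats carry far more detail.

```python
def diff_sboms(old, new):
    """Compare two {package_name: version} inventories.

    Returns a dict with packages that were added, removed, or whose
    version changed between the older and newer build.
    """
    added = {name: ver for name, ver in new.items() if name not in old}
    removed = {name: ver for name, ver in old.items() if name not in new}
    changed = {
        name: (old[name], new[name])
        for name in old.keys() & new.keys()
        if old[name] != new[name]
    }
    return {"added": added, "removed": removed, "changed": changed}
```

Running this across consecutive builds gives a timeline of when each component first appeared, which is the forensic question raised in the talk: when was something bad introduced?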

Okay. So that's everything really quickly. I want to thank everybody for their time. We are going to take questions in the Slack channel here, Track Four, and also we have our booth at the expo-anchore channel in Slack.