Controlling your DevSecOps Journey through Open Source

Las Vegas 2023

Software Engineering Principal · Fannie Mae

For highly regulated companies, it can be a challenge contributing to open source communities. There are a number of regulations and other challenges with security and data loss prevention to overcome. Furthermore, there are few products or technology solutions that enable easier open source contributions in highly regulated environments. We also noticed that all companies rely on critical open source software that is associated with some security risks, in some cases with no way to upgrade to "clean" versions. To help with this security problem, as well as to facilitate open source contributions, Fannie Mae created the Clean Dependency Project. The primary goal is to clean up critical dependencies with intractable security issues from scanning software and then contribute them to open source communities. We will discuss how Fannie Mae engineers were able to construct a reliable process for cleaning up dependencies, satisfying Information Security requirements, and launching Fannie Mae's first open source community.

Full transcript

Raghavendra Vema

Good morning. Hope everyone is doing good.

I'm here to talk about how to control your DevSecOps journey through learnings from the open source community. My name is Raghavendra Vema. I work as a principal engineer at Fannie Mae, and I work primarily under the enterprise open source program office, supporting the activities of the company.

That being said, since I come from a highly regulated company, I'm sure many of you have at some point had the same issues we have. Everything is locked. Code contribution is blocked. Consumption is blocked. Access is blocked. Which means there's no influence on the external community.

So that doesn't mean open source is actually scary. There are supply chain security attacks, data breaches, and other vulnerabilities that come from scanning tools. But there is also a paper that has been sponsored by World Bank, which I can share, which talks about the transparency of open source, which we'll talk about later.

In an ideal situation, developers get their open source libraries from open source, and then build platforms and use it to do their development. So that would be ideal.

Fannie Mae has the open source program office that was founded in 2021. The objective of the open source program office is to be like a north star for open source contributions, as well as consumption; also build a culture of InnerSource; and ultimately enhance the working experience.

To begin with, the way we started was to look at the open source library vulnerabilities. I can't imagine any development team that does not use open source libraries. That being said, I remember the recent report of 2023: 90% of applications are built with open source libraries. And there are significant [inaudible]. It's kind of debatable, but if time permits, we can have a chat on that.

So our main motto was: how can we learn the principles of open source interactions, implement them, and bring them into our company? The three pillars of open source, transparency, openness, and governance: what would they bring?

We also anticipated the concept of open source as InnerSource: taking the principles that are used in the outside community and bringing them within the company to accelerate software development.

To begin with, we wanted to start with vulnerability management. Being a company, we're obligated to make the software that we build free of vulnerabilities. The top vulnerabilities have been affecting applications. And all of these said data, and it's all common [inaudible], wherein there is a debate between the finding, where it's vulnerable, but then the library, a certain portion is vulnerable, not [inaudible]. Go use the non one.

But the way it works is, if you don't have control on those, it's difficult to have a conversation with the information security team and the scanning tool findings.

So what we did was, we picked up these vulnerabilities and started looking for a way forward. The first one is the vulnerability which, again, the scanning tool shows as 9.8 critical. And the way it stands is, depending upon the usage, it's vulnerable or it's not.

For us, fixing this vulnerability, we started looking at what is the blast radius of this vulnerability. And then we also started working on establishing an exception process in case the teams are not able to fix it, and then also a global exception. Instead of scaling team-level exceptions, we went with the approach of creating a global exception. Any team who is using this library can go with that.
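As a rough sketch of the difference between team-level and global exceptions, the snippet below models scanner findings as plain records, computes the blast radius as the set of teams using a library, and applies one library-wide exception to all of them. The record fields, the `GlobalException` class, and every identifier are hypothetical; nothing here reflects the actual tooling described in the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    team: str
    library: str
    finding_id: str  # placeholder identifier, not a real advisory
    score: float     # severity score reported by the scanner


@dataclass(frozen=True)
class GlobalException:
    library: str
    finding_id: str


def blast_radius(findings, library):
    """Return the sorted list of teams that have a finding against `library`."""
    return sorted({f.team for f in findings if f.library == library})


def unresolved(findings, exceptions):
    """Drop findings covered by a global exception; keep everything else."""
    covered = {(e.library, e.finding_id) for e in exceptions}
    return [f for f in findings if (f.library, f.finding_id) not in covered]


# Made-up data: two teams share one flagged library.
findings = [
    Finding("payments", "example-web-lib", "FINDING-1", 9.8),
    Finding("reporting", "example-web-lib", "FINDING-1", 9.8),
    Finding("reporting", "other-lib", "FINDING-2", 7.5),
]
exceptions = [GlobalException("example-web-lib", "FINDING-1")]

print(blast_radius(findings, "example-web-lib"))
print([f.finding_id for f in unresolved(findings, exceptions)])
```

A single `GlobalException` entry covers every team in the blast radius, which is the scaling argument made above: one decision instead of one exception request per team.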

Also, we started building a community to find out whether it is actually vulnerable, by gathering information [inaudible] process for this vulnerability.

Also, we started looking into the discussions, and the upgrade to JDK 17 and Spring 6, which is a huge lift and probably not a viable path. And then, for the applications that were leveraging Spring Boot, we had a way to patch it. But a significant number of applications are still non-Spring Boot, but use Spring Web and are affected by this vulnerability.

So with this Spring vulnerability, we started building a community within the company to bring together the leadership, the key development teams, InfoSec, and legal IP, as well as the actual users of the software, to come to a consensus and roll out some kind of platform for the InnerSourcing effort that we did.

And as this was done, the next one is around the Pandas library, where the function is again a debatable thing. They're saying, "It actually is vulnerable, but don't use it." But what's the point of having that function within the library?

The response from the maintainers of that library is, "Upgrade to Python 4," which is a big deal, or, "Don't use the function," which is kind of a problem for us, because, "How do we restrict it?"

So what we started doing is, we wanted the Pandas library internally treated like a project that's built within a company. And what we did was, we implemented the safe [inaudible] and distributed a patch, which is like a golden patch that can be used internally for testing. And then once it's approved, we can probably productionalize it.
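One way to picture "restricting" a function inside a library you ship internally is to replace it with a stub that refuses to run. The sketch below does that against a stand-in object rather than the real Pandas package; `fake_lib`, `risky_load`, `safe_load`, and `block_function` are all made-up names for illustration, and this is not a description of the actual golden patch.

```python
import functools
import types


class RestrictedCallError(RuntimeError):
    """Raised when a blocked library function is called."""


def block_function(module, name, reason):
    """Replace `module.<name>` with a stub that always raises."""
    original = getattr(module, name)

    @functools.wraps(original)
    def blocked(*args, **kwargs):
        raise RestrictedCallError(f"{name} is disabled: {reason}")

    setattr(module, name, blocked)
    return original  # handed back so the caller can restore it if needed


# Stand-in for a third-party library; both function names are hypothetical.
fake_lib = types.SimpleNamespace(
    risky_load=lambda path: f"loaded {path}",
    safe_load=lambda path: f"safely loaded {path}",
)

block_function(fake_lib, "risky_load", "flagged by the scanner")

print(fake_lib.safe_load("data.bin"))
try:
    fake_lib.risky_load("data.bin")
except RestrictedCallError as exc:
    print(exc)
```

Failing loudly, instead of silently removing the function, keeps the rest of the library's surface intact and tells developers why the call is unavailable.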

So in this model, what we found out is, we [inaudible] the open source project and then treat it as an internal project that's being done. So for the internal projects, we've been very safeguarded. It has to go through [inaudible] and put in place.

So based on this, what if we kind of move this golden patch to be open source, so people who are in a similar situation can use it? And that's when we started creating this internally. We started looking at how we can open source it and also build the patch.

The internal patch was pretty much in line with the software development process that is in place. So it was not a big thing. But to enable the developers to contribute this patch externally, we had to build a platform, build a contribution model so that developers are empowered to create the patch, as well as contribute to upstream.

But the challenge that we had is we opened a PR against the open source. And again, we were met with the same comments saying this patch cannot be accepted. The only solution is, you upgrade.

And that's when we thought, why don't we have our own place in GitHub and provide this patch to be used?

In order to do that, we started establishing the access. We had to work on opening up the access, which was closed to all external influences. So we had to open it up to at least a certain group of developers who were identified [inaudible] and can contribute to be part of this small pilot program, and have access to GitHub as well as open source communities, so they [inaudible] to their open source method.

That way, in addition to the previous teams that we built the platform on, we had to build [inaudible].

Finally we started laying out what would be our project methodology and rollout into GitHub open source. And what we found is, the main goal of this project is we want to provide a framework to projects that are not well maintained, as well as provide a golden patch for teams who are in a similar situation as us.

The lifecycle is pretty simple. We start with a GitHub issue. If you have a project that needs to be patched, the issue will be reviewed by a group of experts within the organization. And then we'll create the project under this called Dependency [inaudible], and then we provide a framework to patch.

And as part of the patch, we'll also create [inaudible] dependency project organization. We'll also create that, and then we'll have releases internally within the project card, as well as once the PR [inaudible]. And until then, provide a temporary patch for developers to use the clean version of libraries.
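The lifecycle described above can be sketched as a small linear state machine. The stage names below are a paraphrase of the talk (issue, expert review, project creation, temporary patch release), and the final "upstream fix available" stage is an assumption about what ends the temporary patch; none of these names come from the project itself.

```python
from enum import Enum


class Stage(Enum):
    ISSUE_OPENED = "issue opened"
    EXPERT_REVIEW = "expert review"
    PROJECT_CREATED = "project created"
    PATCH_RELEASED = "temporary patch released"
    UPSTREAM_FIXED = "upstream fix available"  # assumed end state


# Each stage may only move to the next one in the sequence.
NEXT_STAGE = {
    Stage.ISSUE_OPENED: Stage.EXPERT_REVIEW,
    Stage.EXPERT_REVIEW: Stage.PROJECT_CREATED,
    Stage.PROJECT_CREATED: Stage.PATCH_RELEASED,
    Stage.PATCH_RELEASED: Stage.UPSTREAM_FIXED,
}


def advance(stage):
    """Return the stage that follows `stage`, or raise if it is final."""
    if stage not in NEXT_STAGE:
        raise ValueError(f"{stage.value!r} is the final stage")
    return NEXT_STAGE[stage]


def walk(start=Stage.ISSUE_OPENED):
    """List every stage from `start` to the end of the lifecycle."""
    stages = [start]
    while stages[-1] in NEXT_STAGE:
        stages.append(advance(stages[-1]))
    return stages


for stage in walk():
    print(stage.value)
```

Keeping the transitions in one table makes it easy to add a stage later, for example a retirement step, without touching the traversal logic.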

This is how we have implemented it. And as part of the governance, which is an important thing with open projects, we make sure that everything is peer reviewed, we have protection, and then we also have CI validation to make sure that all pull requests are validated.

We're leveraging the code capabilities of GitHub to make sure that the code is of good quality and the library improves. We also enable the OpenSSF Scorecard and best practices badge. The idea is, we put all these checks and balances in place to make sure that the quality we are providing can be trusted, and also eventually we'll [inaudible] as of the application that's coming up in future.

Coming to the third vulnerability, wherein the challenge was the 1.x version of this library suddenly jumped into the 2.x with [inaudible], and people who are stuck with 1.x have no [inaudible] unless they do upgrade to the major version.

So what we decided was, we got rid of all the unsafe constructors as part of the 1.x version of this library, and we followed the same steps that were mentioned previously, wherein now, directly under the dependency project, we make changes against GitHub, and then we publish it to Central and GitHub Packages, so that developers can get the snapshot version, and then we'll push to Maven Central.

So here, the project on GitHub and all the projects that are in flight are made public, so that this project follows the principles of open source methodology. You can see the tasks that are part of this project that are in flight, and the badges showing the credibility of this project: what kind of checks and balances we put in and how good this project is.

The next steps: we're not done yet. This was the process that was done in the past few months. We had a lot of learnings which we need to implement going forward.

So we see that we also own the quality of the patches. That means many of these projects had very low code coverage. So we're trying to improve the code coverage as part of it. Help would definitely be appreciated there.

And also we're working on to [inaudible], as well as Maven Central. And we're working on what learnings we would like to learn from you all. What would be the [inaudible]?

We're also working on increasing adoption once it's out in the public repositories, and on implementing the learnings.

So one of the learnings which I would like to mention, which is going on there, is we [inaudible] on a newer version of [inaudible]. So we're working on how we expand this patch. Do we maintain the 1.5 patch as well as the 1.5.3 patch? That is being worked on.

We need help in what would be a good [inaudible]. And also we're looking to establish feedback on how this model works. The lifecycle at this point is two years from the inception of the project, is what we have decided. But how standard that would be, and the activity on the project, would decide.

The next thing also, we're looking for additional projects that comply with this kind of methodology. As I said, this is not a replacement for the [inaudible]. It's a placeholder.