The Modern Open Source Programs Office: Microsoft's Journey
Open source is everywhere, in every product, every business, and every organization -- it’s a core part of the modern DevOps landscape. Understanding your open source engagement enables you to drive the outcomes you want, from creating great ecosystems around your tech, to contributing to and supporting the projects you depend on, to using open source in a secure and compliant way. This is the remit of the modern Open Source Programs Office (OSPO).
Not only is open source in 80-90% of today’s systems, the scale of that open source is staggering -- it’s trivial to type docker build and end up with 100s or 1000s of components in your container.
In this talk we cover the policies, mechanisms, and culture changes needed to be effective as you attempt to navigate the scope and scale of open source engagement, enable your organizations, and ensure important ecosystems flourish. We tell the story through the lens of Microsoft’s journey from actively opposing open source to being one of its largest proponents with over 20,000 developers on GitHub. You will hear how Microsoft evolved, and continues to evolve, on the engagement spectrum and learn how you can drive similar changes in your world.
Full transcript
Jeff McAffer
Hey, I'm Jeff McAffer and I'm coming to you from sunny Seattle, Washington, and I'm in my garage because everybody's working at home and, literally, in my house everybody's working from home, so this is my quiet spot.
I'm a product manager at GitHub, and I work on a bunch of different things, but one of my key focus points is helping folks do open source at scale: producing open source, consuming open source, and just having a great open source program. And so that's what I'm going to talk to you about today.
Until recently, I was actually working at Microsoft running their Open Source Programs Office, and that's the set of people who help the company get good at doing open source, do it right, and engage in a meaningful way. My move to GitHub was really about bringing a lot of that experience and understanding that we gained at Microsoft to GitHub, and then enabling all of you who are doing open source to really have an open source program that rocks. Today, I'm going to share some of those ideas and approaches with you.
You might be thinking, this doesn't really apply to me. Microsoft is such a big company, and we're just a small company or organization. That's not really true. In essence, a lot of what we learned and what we did applies to you even at a much smaller scale, because once you get past the number of things you can manage on a spreadsheet, you need some sort of automation. Once you have more than a few people, you need to drive some culture change. All of those things are things I'm going to talk about here.
This all starts with a basic premise. This premise that I like is: aspiring to be world-class isn't enough when everybody else starts there. If you start out, like we often did at Microsoft in the past, saying, "Hey, I want to build some new cool application, and it's going to need a database, so let's write a database. Oh, it's going to need a UI, so let's write a UI framework." If you start off that way, you're really not going to progress very fast. You're going to spend all your time building that infrastructure instead of building your real value.
Everybody else is going out and getting MongoDB or MySQL or React, or pick your favorite technologies, and putting them together. In fact, in some ways that's a hallmark of DevOps: taking components and assembling them into solutions quickly and with confidence. Essentially, that's the way I see an open source program: helping you do that, enabling that fast assembly of components from open source and also the contribution and creation of those components.
You have to start somewhere, and everybody starts somewhere. Maybe you've got a small number of people, and this is certainly the way Microsoft was a few years ago: a small number of people using a small number of components. Then we quickly advanced into having many. In fact, 32,000 people from Microsoft are active on GitHub today, and there are many different components that we produce and projects that we use across the company. As that evolves quickly, going from that very small number to a very large number of engagements, you really have to have some level of organization. This stuff doesn't just happen.
Especially in a DevOps world where you're running really fast, you don't really want to have to stop and figure things out every time a new component shows up. For example: are there any vulnerabilities? What's the license? All that sort of stuff. You don't want to do that manually. You need some level of automation. You need an understanding that that's what you should be doing. There's a whole set of infrastructure and approaches that come behind the scenes to make that all friction-free, so that you can do continuous integration, continuous delivery, and so forth.
All this is needed, of course, because once in a while, there's a component that is a bit of a surprise, so my little tip of the hat to Gene and The Unicorn Project. But in all seriousness, there might be a component that comes up that's got a vulnerability, or is malicious, or has a license that's not compatible with the way you're using it. You need to know all that, and you need to know that in an automated way that's integrated into your workflows, because otherwise you won't find it. If it's manual, you won't figure it out.
That's been a lot of talk about tools and everything, and as devs, I think a lot of us are devs, we'll imagine, well, there's an app for that, or I can write a tool that'll solve that problem. To a certain degree, you're right. There are many great tools and apps that can help here. But fundamentally, if you don't have that culture change in your organization that says, "Hey, we're going to use open source, we're going to engage with open source, we're going to release open source," if you don't have that culture that drives that from a business-need level, then all those tools are not going to be of any use to you because nobody's going to be doing it. They're not going to be engaging.
Again, here, you've got to start somewhere. There's one or two people off in the corner figuring it out for themselves, putting it together manually, following the instructions from the book step by step, and that's cool. That's a great place to start. But from a culture-change point of view, the real magic starts happening when you get a lot of people together, a lot of helping hands coming in, collaborating, sharing, engaging with one another to see: how can we do this better? How are we doing it now? There's this other piece of technology we could use. That's where it really comes together, the value of open source.
But that also doesn't happen for free. You have to build a culture, build an understanding, a shared set of values, a shared set of expectations, an understanding of what's desired, what's safe, what's not desired, how to behave. Otherwise, just like you might have ended up with tool and mechanical friction, you'll end up with societal or cultural friction where people have different expectations and understandings, and it just won't go together. So you need to build that understanding and that culture as well.
You can imagine at a company the size of Microsoft, both of those challenges, the tooling and process on one side and the culture on the other, are quite significant. Fortunately, we had a lot of help from on high: Satya here loving Linux. This is really great. A lot of the senior leaders in the company understood at a deep level the value of the company, but also the value of open source in both producing and consuming. So we had this guidance coming from the top, and a lot of people coming into the company who'd been doing open source for a long time, and so we needed to look at how we could evolve that.
As you imagine, going from that zero or small number to a lot of engagement, it doesn't just happen for free. There needs to be some sort of organization and coordination to make this happen. That's where an open source programs office comes in. That's their job.
Looking back, around 2015 is when we created an explicit Microsoft Open Source Programs Office. That was a pretty pivotal time. Before that, we were doing open source here and there, had some really serious efforts, but it wasn't widespread. What happened in about that timeframe was that we realized, okay, this is a good thing. This is here to stay. This is the way we should go. In Microsoft speak, we're all in, and so we've got to make this real. We've got to systematize it. We've got to make it sustainable. We've got to coordinate it and really enable our teams to engage. So we created the programs office, and that was our mission.
Really, if you look at where an open source programs office lives, it lives between the engineering need to engage with open source, the business drive to do so and move further, and those cultural shifts that need to happen. The role or job of the programs office is to drive initiatives around culture change, understanding, tooling, et cetera, and iterate on policies and processes and tools that make that all happen for free: remove the friction, essentially.
Organizationally, a programs office can live anywhere in the org. I've seen them in HR, I've seen them in marketing, I've seen them in the CTO's office. But I think the best place, honestly, is nearest to the engineers. Often you'll have a central engineering organization that runs the builds and the other infrastructure. That's a great place for it. Or some engineering leadership, perhaps even the CTO's office, might be good. Either way, this is how we looked at it at Microsoft.
We had the OSPO in the middle, and we were taking executive guidance from a committee or a council that would help us understand the business needs and the goals of the business, and then consulting with legal, HR, security, accessibility, all the subject matter experts in the company who have an opinion and a stake in how we do open source at the company. Then our job was to bring those groups together, to actually relate those needs and desires and goals to the engineering teams and put it in terms that the engineering team can understand and can utilize fully. Then, of course, drive that to value for our customers and communities that we're engaging with, and also collaborate with our industry peers so we can evolve the norms, et cetera, around open source.
Just in terms of looking at scale and back in time a little bit, Jeff Wilcox was one of the folks on the team, and he wrote a couple of blog posts. This one's from 2015, and it really talks about the transition from 2011 to 2015, from 20 people doing open source on GitHub to 2,000 Azure folks doing open source on GitHub. Then he wrote another post about three and a half years later, where he talked about the scale of going from 2,000 to 25,000 as we made open source happen across GitHub -- sorry, across Microsoft. Even now, we're at 32,000 folks engaged on GitHub. I encourage you to go and check out the blog posts. There's a lot of great stuff in there, a lot of great detail about individual tools we built and individual initiatives we did. I strongly recommend checking those out if you're in the process of building an open source program.
To put a bit of a bigger picture on the scale of what we were trying to accomplish: on the GitHub side, where Microsoft is producing and contributing to open source, we currently have around 100 orgs, 32,000 people, 7,000 teams, and 22,000 repos. The numbers don't matter so much as the point that it's a big scale. This is serious. You need a serious amount of effort and understanding of what's going on, with tooling and process around it, and the culture changes really have to happen because we've got all these things going on and all these people working on GitHub.
Then looking at the Microsoft side of things, of where we actually use open source in the products, today we're tracking about 10 million uses of open source across 10,000 different products. It's 200,000 different versions of 50,000 different components, and we discover half a million new uses every month. That's the kind of thing that is going on. Again, back to that scale question, you might say, "Well, I don't have that many zeros in my system," and that's true. But if you look at how we got there and how we're doing this, there's a ton of automation and integration that's there that you can use to solve your problem, even if you're at the hundreds or thousands scale on some of these things. So we'll take a look at that now and really talk about how did we do that and how did it unfold.
The open source programs office was roughly about 10 people, and there's another 20 or so subject matter experts, lawyers, security folks, et cetera, and partner teams helping us build tools. I would say that from 2015, probably for the next three years or so, our main focus was really building tools and helping to get the process to proficient, is what we would call it.
I mentioned earlier that culture eats process, and while it's true, when we started out the programs office, we thought, "Well, hey, what we've got to do, we've got to go around and evangelize. We've got to go and tell everybody that open source is good. You should get engaged. You should get involved. You should use and contribute more open source in your team." It turned out that people didn't need convincing. They were right on it. They were, "Okay, let's go do that." But it turned out that our policies and processes were so hard that they didn't actually believe that's what they should be doing, because there was just so much friction that even if they wanted to, it was really hard.
Just as an example, every time you wanted to use a new piece of open source, you had to manually identify that piece of open source, and then you had to go to a tool and manually enter and answer about 20 questions. Really, 20 questions per open source piece that you wanted to use. Some of these questions, you didn't know how to answer. You didn't know what they meant. You didn't know where to get the information. Oftentimes, we'd give that information to somebody, and they would have to go and revalidate it or whatever.
What we discovered is that, essentially, we had to evolve the process and tools to enable the culture change. Then as the culture caught up, we had to advance the process and tools more and just keep walking up that ladder: a sequenced evolution where you evolve the process and policies, then evolve the culture, and do those in a coordinated fashion.
Looking at some of the activities that we did in that space from a culture-change point of view, we did a bunch of consulting. Teams would show up every couple of weeks and say, "Hey, I want to open source this key piece of infrastructure or this key part of our product or this key technology. Can you tell us how to do it? Can you help us figure that out?" Of course, we would. We love that, but I also love getting paid, so I want to make sure we're actually doing it in the right way and for the right reasons. We would ask the five whys: why are you doing this, et cetera? How do you expect that to unfold? How are you going to create a community? What are you trying to achieve, et cetera? Then come out with essentially: yeah, this is a really great idea, or maybe you need to tweak it a little bit, or yeah, no, don't do that, or this isn't the right time. We'd also devise playbooks that would help people to do that discussion in a self-serve fashion. If you're doing a cloud service and you have these goals, then do it this way, that kind of stuff.
We also ran social events like meetups, et cetera, to try to get people together and talking with one another and collaborating. On the other side of things, it turns out that people really want to understand what's going on. They don't want to just pitch stuff over the wall or take things in from other places. They need confidence. They need data and insights to understand what's going on in their engineering system, what's happening with their product, et cetera. So we put an enormous amount of effort into gathering data, generating insights, and helping give people that confidence about what's going on. We were also engaging with the open source community as a whole: joining foundations, sponsoring projects and conferences, et cetera, just trying to get more engaged with the communities and understand what it is they're doing and why and what they need.
We also then, sort of as a culture part, as we were looking towards the policies and processes, evolved this mantra of identify, eliminate, automate, delegate. We'd try to identify a source of friction, and then we would try to eliminate that friction. I'll give you an example of going back to the open source use case, where you're needing to register the use of open source. It turns out that it was pretty obvious there was a ton of friction there, so we were identifying that friction. That was easy. Eliminating it was going through those 20 questions and saying, "Okay, do we still need this question? Who's the right person to answer that question? Can we get that data from someplace automatically? Do we already have that information in the engineering system?" It turns out that a ton of questions could be eliminated or automated in that fashion.
Things that couldn't be eliminated or automated, we would then seek to delegate those to the right people. For licensing questions, for example, devs often don't understand licenses. They don't know what they're about, what they're for, where to find them, et cetera. It turns out that if you ask a developer about the license and they give an answer, oftentimes lawyers don't actually trust those answers, so they go and figure it out for themselves anyway. Well, okay, let the legal team do that instead of asking the developers to figure it out. Get the answers or the task closer to the people who have the expertise and the skin in that game.
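As an illustration of the identify, eliminate, automate, delegate loop described here, the following is a minimal Python sketch, not Microsoft's actual tooling. The `Question` fields, the `triage` function, and the sample questions are all hypothetical; the only thing taken from the talk is the order of the checks.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Question:
    text: str
    still_needed: bool            # does anyone act on the answer today?
    available_in_system: bool     # can the engineering system already supply it?
    expert_owner: Optional[str]   # team better placed to answer than the developer


def triage(question: Question) -> str:
    """Try to eliminate first, then automate, then delegate."""
    if not question.still_needed:
        return "eliminate"
    if question.available_in_system:
        return "automate"
    if question.expert_owner is not None:
        return "delegate:" + question.expert_owner
    return "ask developer"


# Hypothetical registration-form questions, for illustration only.
QUESTIONS = [
    Question("Which review board approved this?", False, False, None),
    Question("Which version are you using?", True, True, None),
    Question("What license applies?", True, False, "legal"),
    Question("How does your product use the component?", True, False, None),
]

for q in QUESTIONS:
    print(f"{triage(q):16} {q.text}")
```

Only the questions that fall through all three checks stay with the developer, which is the point of moving each task closer to whoever has the expertise.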
This is a continuous iteration where you have to keep on cycling over. Every time you find and eliminate something, you'll get closer to a solution, and ultimately, you just keep on cycling on this. Every month or so, we would find another way to eliminate some friction or automate something, where we would just roll out a new policy or a new tool and keep iterating on that. Eventually, we got to the point where, for example, you see those 500,000 new uses of open source being discovered every month. Imagine if you had to do manual review, anything manual for any sizable chunk of those. That would be horrific.
Through this iterative process, we got to the point where 99.5-ish percent of those new uses were handled through some automated process that did it all, following encoded business rules using high-quality data, so that a human never touched it. We automatically discovered it, automatically processed it, automatically validated it, and away we go.
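To make "encoded business rules" concrete, here is a small Python sketch of how new uses might be routed automatically. It is an assumption-laden illustration rather than the real system: the `Use` record, the `ALLOWED_LICENSES` allow-list, the component names, and the two rules are all invented for the example.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed allow-list; a real policy would depend on the usage scenario.
ALLOWED_LICENSES = {"MIT", "Apache-2.0", "BSD-3-Clause"}


@dataclass(frozen=True)
class Use:
    component: str
    version: str
    license: Optional[str]        # None when the license could not be determined
    known_vulnerabilities: int


def decide(use: Use) -> str:
    """Route a newly discovered use: approve automatically or send to a human."""
    if use.license is None or use.license not in ALLOWED_LICENSES:
        return "needs_review"
    if use.known_vulnerabilities > 0:
        return "needs_review"
    return "auto_approved"


def automation_rate(uses: list) -> float:
    """Fraction of uses that no human has to touch."""
    if not uses:
        return 0.0
    decisions = [decide(u) for u in uses]
    return decisions.count("auto_approved") / len(decisions)


# Hypothetical discoveries from one scan.
NEW_USES = [
    Use("json-parser", "1.2.0", "MIT", 0),
    Use("http-client", "3.0.1", "Apache-2.0", 0),
    Use("image-codec", "0.9.0", "GPL-3.0-only", 0),
    Use("yaml-reader", "2.4.0", "MIT", 2),
]

print(automation_rate(NEW_USES))
```

At the scale quoted in the talk, the remaining half a percent is still real work: 0.5% of 500,000 new uses is roughly 2,500 a month that need a person.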
To dive a little deeper on that, this is the sort of lifecycle we put together around using open source. I'm not going to go through every piece of this, but you can see that if you look all the way left, when a dev is discovering open source, thinking, "I need a new JSON parser for my application," they're going to go to npmjs.com or NuGet or Maven or wherever they're living in their ecosystem, and they're going to start looking for things.
We have a lot of understanding about what the rest of the company is doing because we know everything that's going on. We can help guide them to solutions or choices that are already being used in the company, that are known to not have vulnerabilities, that have licenses that are known to be compatible with their usage scenario. We can do that shifting all the way left into their browser. When they're on npmjs.com, we have a browser extension that will tell them right there, give them a Microsoft viewpoint on the components that they're looking at and help them decide, guide them to a good choice.
As you scan to the right, we're doing detection, automated detection of the open source you're using, so the engineering system knows that. The engineering system knows where the components are coming from. If it doesn't, you shouldn't be using them, like seriously. But so the engineering system knows that. It can also ingest that. It can go and get the source code and bring that in and keep that for historical purposes and to enable future servicing, et cetera. Keeping track of the registration of who's using what and running policies like I was describing earlier, giving people notifications when something's out of policy or breaking the build, blocking a deployment if something's out of policy: all of these things allow you to run really quickly like we're used to in a DevOps world, but also run with confidence that you know what's going on with respect to open source.
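The escalation described here (notify, break the build, block a deployment) could be sketched as a policy gate along the following lines. This is a hedged Python illustration, not the actual engineering system: the stage names, the `Component` record, the `gate` function, and the placeholder license identifier are assumptions made for the example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Action(Enum):
    ALLOW = "allow"
    NOTIFY = "notify"
    BREAK_BUILD = "break_build"
    BLOCK_DEPLOYMENT = "block_deployment"


# Assumed mapping from pipeline stage to the response used at that stage.
STAGE_ACTIONS = {
    "pull_request": Action.NOTIFY,
    "build": Action.BREAK_BUILD,
    "deploy": Action.BLOCK_DEPLOYMENT,
}


@dataclass(frozen=True)
class Component:
    name: str
    version: str
    source: Optional[str]     # registry it came from; None if unknown
    license: Optional[str]


def violations(component: Component, denied_licenses: set) -> list:
    found = []
    if component.source is None:
        # The talk's rule of thumb: unknown origin means it should not be used.
        found.append("unknown origin")
    if component.license in denied_licenses:
        found.append("license " + component.license + " is out of policy")
    return found


def gate(components: list, stage: str, denied_licenses: set):
    """Return the action for this stage plus the violations per component."""
    problems = {}
    for c in components:
        found = violations(c, denied_licenses)
        if found:
            problems[c.name + "@" + c.version] = found
    action = STAGE_ACTIONS[stage] if problems else Action.ALLOW
    return action, problems


# Hypothetical detection results for one build.
DETECTED = [
    Component("web-framework", "2.1.0", "npmjs", "MIT"),
    Component("mystery-lib", "0.3.0", None, "MIT"),
]

action, problems = gate(DETECTED, "build", {"Example-Denied-1.0"})
print(action.value, problems)
```

The same check runs at every stage; only the consequence changes, so a problem found early is a notification and the same problem found late stops the release.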
When it gets to the point of actually shipping your system, you might have open source compliance requirements like shipping a notices file. Most licenses require attribution. Or you might have to disclose source code if you're using a copyleft license. Those things can all be automated because, again, we know the information about the components you're using.
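Because the component inventory is already known, a notices file can be generated rather than written by hand. The Python sketch below shows one possible shape for that step; the `Attribution` record, the layout of the file, and the sample components are assumptions, and the license text is a placeholder rather than real license wording.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Attribution:
    name: str
    version: str
    license: str
    copyright: str
    license_text: str


def build_notices(components: list) -> str:
    """Render one attribution section per component, in a stable order."""
    ordered = sorted(components, key=lambda c: (c.name.lower(), c.version))
    sections = []
    for c in ordered:
        header = c.name + " " + c.version + " (" + c.license + ")"
        sections.append("\n".join([header, c.copyright, "", c.license_text.strip()]))
    separator = "\n\n" + "-" * 40 + "\n\n"
    return separator.join(sections) + "\n"


# Hypothetical components with placeholder license text.
SHIPPED = [
    Attribution("zeta-lib", "1.0.0", "MIT", "Copyright (c) Zeta authors",
                "<full license text goes here>"),
    Attribution("alpha-lib", "2.3.1", "Apache-2.0", "Copyright (c) Alpha authors",
                "<full license text goes here>"),
]

notices = build_notices(SHIPPED)
print(notices)
```

Sorting the sections keeps the output stable between builds, so a change in the notices file reflects a change in what shipped.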
Once you've shipped, two months later, a vulnerability might be discovered in a key component that you're using, and you need to know where you're using that. If you're using a component in several thousand places and it suddenly has a vulnerability, you really want to be able to go quickly and find out which data centers it's deployed to, so that you can figure out what build it was in, rebuild it, fix it, and deploy it. Ultimately, once you've shipped, being able to understand the nature of the open source interactions and integrations in your code through dashboards and reporting is super useful.
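Answering "where is this component deployed?" quickly is essentially a reverse lookup from component to deployments. A minimal Python sketch of that idea follows; the record shape, the product and data center names, and both function names are hypothetical, chosen only to show the lookup.

```python
from collections import defaultdict


def index_deployments(deployments: list) -> dict:
    """Map (component, version) to the set of (product, build, datacenter)."""
    index = defaultdict(set)
    for d in deployments:
        for name, version in d["components"]:
            index[(name, version)].add((d["product"], d["build"], d["datacenter"]))
    return index


def affected(index: dict, name: str, bad_versions: set) -> list:
    """List every deployment that contains a vulnerable version of a component."""
    hits = set()
    for version in bad_versions:
        hits |= index.get((name, version), set())
    return sorted(hits)


# Hypothetical deployment records gathered at build and release time.
DEPLOYMENTS = [
    {"product": "storefront", "build": "build-101", "datacenter": "dc-west",
     "components": [("json-parser", "1.2.0"), ("http-client", "3.0.1")]},
    {"product": "storefront", "build": "build-101", "datacenter": "dc-east",
     "components": [("json-parser", "1.2.0"), ("http-client", "3.0.1")]},
    {"product": "billing", "build": "build-77", "datacenter": "dc-west",
     "components": [("json-parser", "1.3.0")]},
]

index = index_deployments(DEPLOYMENTS)
print(affected(index, "json-parser", {"1.2.0"}))
```

Building the index as deployments happen means the expensive work is done before the vulnerability is announced, and the response starts from a list of builds to redo.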
That's just an idea from a use point of view. There's similar challenges around releasing open source and contributing to open source. But from a DevOps point of view, I think use is most relevant here. Of course, in an open source way, as we went through building all of those things, we built a bunch of open source tools and we contributed to a bunch of existing open source tools. I won't go through all of these, but some key ones: the GitHub Portal, for example, was a tool that we built and open sourced. It's a great way of managing your engagement on open source, all your teams and your organizations and your repos. When people want to make a repo public, that's an event. Oftentimes it needs a review. You can do all that kind of thing through the GitHub Portal.
GHCrawler is another interesting one where, again, you're monitoring your open source presence on GitHub and gathering all the data: what's happening, who's committing, who's doing pull requests, how long are they taking, what's going on with my teams. All of that information can be pulled in through GHCrawler, and then you can build your insights on top of that data.
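One example of an insight built on top of crawled data is how long pull requests take to merge. The Python sketch below assumes a simplified record with `opened_at` and `merged_at` timestamps; those field names and the sample data are made up for illustration and are not the crawler's actual output format.

```python
from datetime import datetime
from statistics import median
from typing import Optional


def hours_to_merge(pr: dict) -> Optional[float]:
    """Hours between opening and merging, or None if the PR was never merged."""
    if pr["merged_at"] is None:
        return None
    opened = datetime.fromisoformat(pr["opened_at"])
    merged = datetime.fromisoformat(pr["merged_at"])
    return (merged - opened).total_seconds() / 3600


def median_hours_to_merge(pull_requests: list) -> Optional[float]:
    durations = [h for h in map(hours_to_merge, pull_requests) if h is not None]
    return median(durations) if durations else None


# Hypothetical pull request records.
PULL_REQUESTS = [
    {"opened_at": "2020-05-01T09:00:00", "merged_at": "2020-05-01T13:00:00"},
    {"opened_at": "2020-05-01T09:00:00", "merged_at": "2020-05-02T09:00:00"},
    {"opened_at": "2020-05-03T08:00:00", "merged_at": None},
    {"opened_at": "2020-05-04T10:00:00", "merged_at": "2020-05-04T20:00:00"},
]

print(median_hours_to_merge(PULL_REQUESTS))
```

The median is used here instead of the mean so that one long-running pull request does not dominate the picture for a team.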
One that's near and dear to my heart is ClearlyDefined. That's a means by which we crowdsource the gathering of all of the compliance data that we need. It turns out that compliance data around open source is really hard to get. Different teams use different approaches for identifying their licenses or their copyrights, and the commercial products in this space are often incomplete. So we actually started an open source community to gather and curate that information in a systematic way and make it available. Now there's, I think, 10 and a half million different components characterized in ClearlyDefined, so you can look to integrate that into your process.
Speaking of that, that's been a lot about how we did it. Now I want to cover briefly how you can create your own open source project or program. An important point here is that there's a lot of variations. You have different tech stacks, different communities or cultures, different business goals. You're at a different point in adoption or engagement, so you have to tailor your own thing. But this next discussion is a lot of context that I wish I had when we started the open source program. We'll pass that on to you, and hopefully it's helpful.
I think the first thing to look at here is: what's your value? From a business point of view, what is it that makes people excited about your products? What makes them spend the big dollars to buy the things that you sell? What makes them line up to get the next release? That's your secret sauce. That's the thing that differentiates you from everybody else. The rest of the stuff, the databases and stuff like that under the covers, typically are not your secret sauce. They're the same thing that everybody else could use functionally. Those things you can get for free. You can engage with open source and create them. You can use them and spend more of your time focusing on your value and delivering that to your customers.
Understanding that value is great. The next thing you want to do is focus on what your engagement level is today. What are you doing? How are you engaging with open source? Then you can look to say, okay, I want to get these other values. I want to evolve to this next level of engagement. What is it that I should do? Everybody's notions here are different, so don't take these all too seriously. It's more of a set of key points to think about, and that act of thinking about it is actually the real value. You're not going to score any sort of level here; the thinking about it is the real point. I'm going to skim over a bunch of these. I wrote this all up in a blog post that's listed here. I encourage you to check that out and get the details of it. We'll go over these, and in part this is a good opportunity to have a lot of fun with Lego pictures as well.
Denial: obviously, Microsoft was in denial about open source for quite a while. You saw us attacking and countering and preventing open source quite actively over the years. Hopefully, that's all gone now. You all might be in there, and there might be pockets of your organization that are in there, and it's legit. We just need to help people evolve past that if they want to, if that's one of their goals that they've identified.
You can evolve from there into a stage of hype. This is where open source is the next big thing. The CTO has read about it on the airplane in Wired Magazine or something, and so this is going to solve all of our problems. We're going to hire a bunch of high-profile open source people, and we're going to drive a whole new culture and change everything. Oftentimes, those efforts fall a little bit flat, because there's not enough of a business model, not enough of an impetus under there to make it a sustainable effort.
Of course, Microsoft has fallen into that trap as well. We've also fallen into the tolerant trap of saying, okay, this is a natural evolution stage, but you've got a few people off in the corner -- here we've got a couple of folks hiding under the space bar -- doing some great stuff, but it's not at all widespread. It's pretty ad hoc. They're certainly not rewarded. They might even be getting penalized for doing it and not producing product that they can sell or not writing more code or whatever. This is, like I say, a legit place to be, but I think you want to evolve past this pretty quickly because this too is not super sustainable. People who aren't rewarded are just going to leave. It's not going to take hold.
Where I think a lot of people end up and where we can aspire to in many ways is being proficient. This is where it's systematic. Everybody understands how to do open source. It's well-tooled, it's efficient, and people are engaged both internally and externally across the company, and it's part of a lot of different products, and people understand the value of what's going on here. I would say today, the vast majority of Microsoft is at this level. They're proficient in that. That's in part because of the tooling that we created, and we've gotten to that point in the culture change.
The next level is fluent. As you think of this in language terms, you've moved from just being able to read the menu to actually being able to have an argument or a discussion with people on a regular basis. You can manipulate the concepts, you can understand the value of open source and weave it into a fundamental part of your business in an open and engaged way. You've recognized that value, so you're rewarding it. People who do open source, engage in open source, are rewarded for doing so, and that means more people want to do it, and it just evolves from there. I think if you look at Microsoft, there are quite a few teams at that level: VS Code, .NET, TypeScript, a bunch of the Azure folks, a few other teams around the company. They're really at this fluent level of engagement.
Then you get to mastery. There's relatively few of these, but these folks have got a fundamental core understanding of where open source can help them, what they can do, et cetera, and they're using it to disrupt markets, disrupt incumbents. They're sharing control of the open source projects that they engage with. They're really getting proactive and integrating open source into the fundamental nature of their business model.
The question for you really is: where do you want to be, and how do you want to get there? These ideas that I've just outlined can maybe help you understand both of those questions and make some progressions. Then there's also a bunch of material out there from the TODO Group, which is a collection of open source programs offices that have come together, produced a bunch of case studies, how-to guides, even things like job descriptions for hiring an OSPO staff member. I encourage you to go and check out todogroup.org. Lots of great information. That's about it for me. I hope this helped. I hope that you are excited about creating an open source program, and I look forward to seeing what you do in this great world called open source. Thanks a lot.