San Francisco 2015

Beyond the Culture Deck: What You Don’t Already Know About Netflix

A significant chunk of DevOps rhetoric centers around “unicorn” companies like Netflix, Etsy, Facebook and many more. They are held up as the models enterprises should emulate. But what makes a place like Netflix so special? What does life inside a unicorn company look like? Is the famous Netflix culture deck true to life or just hype? Most importantly, what lessons can I take back to my employer?


In this talk, I’ll share with you the reality of working at Netflix. I’ll share details about how people work at Netflix, how we communicate, how we are organized, how work is prioritized, how we manage risk, how we build teams and how our culture plays a central role in everything we do. Lastly, I’ll share the important lessons that every manager and executive should learn from Netflix’s culture.

Full transcript

The complete talk, organized by section.

Mike McGarr

Good afternoon, everybody. I'm lucky enough to have the after-lunch slot, so my job is not to inform you, but to keep you awake this afternoon. Yeah. It's going to be hard.

I'm going to start with a story. A year and a half ago, I was in D.C. working as the director of DevOps for Blackboard, for their Learn product. Actually, last year... Show of hands, who was here last year? A colleague of mine, David Ashman, spoke last year about some of the journey we had at Blackboard.

I was in charge of the tools team. We led a continuous delivery effort, a lot of introducing people to Lean and Kanban and talking about test automation. A lot of the challenges everyone here is talking about and is going through.

Then I got an opportunity to come to Netflix, and I couldn't pass up that opportunity. My wife and I collectively decided that we'd move across the country from D.C. to come to California. I was excited to join Netflix, and I still am excited to be at Netflix.

Part of what I was really looking for was not only to contribute back to Netflix and be a part of this great company culture, but really learn what I could about Netflix. The question I would constantly ask myself is, and I'm sure a lot of you are here to find out, what makes Netflix so special?

You hear a lot about Netflix, and part of my goal was to also understand what is it that makes a company like Netflix, that creates a great product and creates great open source software and has a fabled culture. What can I learn from this culture that I can carry with me going forward?

My name's Mike McGarr, and I'm going to be talking about what I think enterprises in general can learn from a company like Netflix. I'm not going to go as far as to say you should be like Netflix. I think that's impossible. But there's a lot of useful things that I know I will carry with me going forward. That's what I'm going to really focus on today: what enterprises can learn from Netflix.

Let's start at the most obvious point, which is the culture deck. It's not a Netflix presentation if you don't show this slide at least once. How many people have read the culture deck, all 127 slides? Okay. All right. A lot of hands went down.

It's an awesome document. It's actually the primary reason I'm at Netflix right now, because when I read it, I was excited, but I showed it to my wife, and she was like, "You have to be a part of the company. Just find out if this is really true." I can verify that this is really how we operate. If you haven't read it, I recommend it.

Sheryl Sandberg has gone so far as to say that this document is arguably the most important document to come out of Silicon Valley.

The culture deck serves as kind of our constitution as a company culture. Oftentimes I'll find myself, I'm about to make a decision, I will refer to the culture deck and almost be like, "Does what I'm going to do violate something within the culture deck?" What was fascinating to me is that there's no internal version of this. The version that's publicly available, if you Google Netflix culture, that's the first thing that comes up. It's on SlideShare. That's the version we all use as well.

The biggest, most notable part of the culture deck is freedom and responsibility. That's right there in the title, and it's exciting. When I first read it, the words freedom and responsibility really excited me, and I wanted to see what it was like to work at Netflix because of this.

Freedom. I think a lot of engineers will hear that word freedom and say, "Man, I really want some of that." But they forget that it's freedom and responsibility. This is really the key message here about freedom and responsibility, is that you can't take the two apart. If you take your freedom, you have to take your responsibility with you.

You'll hear people talk about Amazon, and you build it, you own it, and that's very true at Netflix as well. Whenever you create something, you're the one responsible. You're the one wearing the pager for that particular service.

When I joined Netflix, I joined the engineering tools team, and we're no exception to this rule.

That's not supposed to be there, but that's okay.

This is a picture of our delivery pipeline. We have a suite of tools that allows engineers to take code and deploy it to production. Our team is responsible for this whole pipeline, from source code to production. We run the production in AWS in the cloud. We make the choices as far as what tools go in there, and we have the responsibility. So we have that freedom as well.

If an engineer comes to me, and I happen to be the manager of the build tools team, and says, "Mike, I don't want to use Gradle. I want to use SBT for my builds," the answer I'm going to have is not going to be, "You can't do that." The answer I'm going to have is, "You can do that. You have the freedom to do that. But with that freedom, you take that responsibility." So I can't provide any support for that tool or that implementation of the build.

It's really a value proposition that we offer the engineers, that if they use our pipeline, our suite of tools for deploying to production, that's less things they have to worry about as far as taking responsibility.

Another way to think about freedom and responsibility is if you want more freedom, you have to take on more responsibility.

How this might relate to you and your organization: look at the people in your organization who have responsibility for something, and ask yourself, do they also have the freedom to choose what goes into that?

This is really a common DevOps problem. Ops is responsible for maintaining the reliability of the production environment, but they don't always have the freedom to choose what goes into that environment, right? I think this is a good question you can take back and ask yourself, looking for where teams don't have the freedom, but they have the responsibility for something.

You might ask yourself, in a culture of freedom and responsibility, how do you get alignment across the whole organization, right? If it were total freedom (it's not), how would you manage that?

Another part of the culture deck is this idea of context, not control. This is a big part of our Netflix culture: I as a manager, or my director, we believe in sharing business context with individuals on the team and giving them the freedom to make the right decisions and achieve those business goals in the way they see fit, rather than controlling them.

Another way to look at this is managers at Netflix really focus on the what. The what generally looks like strategy, priorities, what are the problems that we have? Then engineers really get the freedom to focus on the how.

When you look back at freedom and responsibility, you can see that this is not total freedom. Engineers don't necessarily have the freedom to work on anything that they want to. My team is responsible for solving a business problem, and I'm the one setting the strategies and the priorities for our team, and then the engineers have the freedom to choose what tools they can use, what languages, and how to solve that problem. Still very compelling, but the freedom's not total.

That's how I would manage a team with context, not control.

We also manage problems across the company using context, not control. As an example, over time, as we went to the cloud (I don't know if Adrian is here; he could tell you the journey of how we got here), we developed a bunch of patterns or rules or best practices. These aren't all of them. These are some of the ones that we find every single service running in production should adhere to.

The immutable server pattern. You hear this a lot with Docker. We believe every service should register with Eureka, which is our service discovery system. Red/black deployments, which basically means when you have a version of your service running, you deploy another version, and you route traffic between the two, and so you can do fast rollbacks. Then the rule of three, which means we have at minimum three instances of every single service running in production at any given time for reliability.
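As an editorial illustration (not code from the talk), here is a minimal Python sketch of the red/black idea and the rule of three as described above. The names `ServerGroup`, `Service`, `deploy`, and `rollback` are hypothetical and do not come from Asgard, Spinnaker, or any Netflix tool. The sketch assumes that the old server group stays running after a deploy so traffic can be pointed back at it, and that three instances is a default rather than an enforced limit.

```python
from dataclasses import dataclass, field

DEFAULT_INSTANCES = 3  # "rule of three" as a default, not a hard limit (assumption)


@dataclass
class ServerGroup:
    """A set of identical instances running one version (hypothetical model)."""
    version: str
    instance_count: int


@dataclass
class Service:
    name: str
    groups: list = field(default_factory=list)
    active_index: int = -1  # -1 means nothing has been deployed yet

    def active(self):
        """Return the server group currently receiving traffic, or None."""
        if self.active_index < 0:
            return None
        return self.groups[self.active_index]

    def deploy(self, version, instance_count=DEFAULT_INSTANCES):
        """Start a new group beside the old one, then route traffic to it.

        The old group is left in place (immutable, untouched) for rollback.
        """
        self.groups.append(ServerGroup(version, instance_count))
        self.active_index = len(self.groups) - 1
        return self.active()

    def rollback(self):
        """Fast rollback: route traffic back to the previous group."""
        if self.active_index <= 0:
            raise RuntimeError("no previous server group to roll back to")
        self.active_index -= 1
        return self.active()


if __name__ == "__main__":
    svc = Service("example-service")
    svc.deploy("v1")
    svc.deploy("v2")
    print(svc.active().version)  # v2
    svc.rollback()
    print(svc.active().version)  # v1
```

The design choice worth noticing is that rollback only changes where traffic goes. Nothing is rebuilt or redeployed, which is what makes it fast.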

The question you might ask is, "That's great. You have these best practices. How do you ensure compliance? How do you ensure that every single service at Netflix is running with these best practices?"

The answer is we don't. We don't really have a way to force people to do this. What we do is we make it super easy for somebody to do the right thing.

A lot of the way we do this is through tooling. You're getting a little sneak peek. AWS re:Invent happened recently, and this is a screenshot in the background of Spinnaker, which is our successor to Asgard. You'll be hearing Netflix talk about this a lot more going forward.

Spinnaker is an example of a tool that we've built to make cloud deployments, and applying the immutable server pattern, the rule of three, et cetera, very, very easy for engineers.

Inevitably, sometimes someone will deploy something that doesn't necessarily comply with our best practices or standards. How we approach that is we give them feedback.

Raise your hand if you've heard of Chaos Monkey. So Chaos Monkey has a little brother named Conformity Monkey. This is Conformity Monkey. So no one knows about Conformity Monkey.

Conformity Monkey's job is to scan through all the instances in production and to look for instances where the servers are not complying with what we consider best practices or what a service should do. Its job is not to remediate the problem. The job is to inform the owners of the service, basically saying, "Did you know that this was happening? Did you know your service only had two instances running?" Or whatever rules we put into Conformity Monkey. Then it's the team's responsibility, because they have the responsibility for that service, to take action. It might've been intentional, it might not have been.
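To make the "inform, don't remediate" idea concrete, here is a small Python sketch written for this transcript. It is not the real Conformity Monkey or its API. The `ServiceState` fields, the two rule functions, and `scan` are all hypothetical names, and the rules are assumptions based only on the practices mentioned in the talk (at least three instances, registration with service discovery).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceState:
    """Snapshot of one service in production (hypothetical fields)."""
    name: str
    owner_email: str
    instance_count: int
    registered_in_discovery: bool


def check_rule_of_three(svc):
    """Return a finding if fewer than three instances are running."""
    if svc.instance_count < 3:
        return f"{svc.name} has only {svc.instance_count} instance(s) running"
    return None


def check_discovery(svc):
    """Return a finding if the service is not registered with discovery."""
    if not svc.registered_in_discovery:
        return f"{svc.name} is not registered with service discovery"
    return None


RULES = [check_rule_of_three, check_discovery]


def scan(services, rules=RULES):
    """Collect findings per owner. Read-only: it never changes a service."""
    findings = {}
    for svc in services:
        for rule in rules:
            message = rule(svc)
            if message is not None:
                findings.setdefault(svc.owner_email, []).append(message)
    return findings


if __name__ == "__main__":
    fleet = [
        ServiceState("billing", "billing-team@example.com", 2, True),
        ServiceState("search", "search-team@example.com", 6, True),
    ]
    for owner, messages in scan(fleet).items():
        # A real system would notify the owner; this sketch just prints.
        for message in messages:
            print(f"To {owner}: did you know that {message}?")
```

The scan returns findings grouped by owner and stops there. Deciding whether two instances was intentional stays with the team that owns the service.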

This is how we use context, not control in our cloud deployments.

I would challenge you in looking at your organization, if you ask the question, if you give your employees the right business context, do you trust that they can make the right decisions? I think the key word here is trust. Where Netflix does a great job of this is we have a high-trust culture. We hire people that we know will be forced to make these types of decisions in their design, knowing that we're going to give them business context. This tends to work for our company.

I think this is a good question to look at your company and ask, do you have that trust with your engineers to let go of the control?

I showed an example of engineering tools and an example of a centralized team, so let's talk a little bit about centralized teams and how they differ at Netflix.

You can think of a deployment as like crossing a river. You have a long journey. At the end of this journey, you have this river that you have to cross. There's a lot of ways to get across the river.

One way is you could pay like a silver farthing, I think that's the name, is it? To this ferryman, and he can take you across the river. The advantage is you're going to get across the river. The disadvantage is you and this ferryman are very much coupled together, that you're reliant on this person to take you across the river, and that the two of you have to go in concert. Then you take very low responsibility, really, for that journey across the river.

I view this as kind of an analogy for centralized teams, QA teams or centralized ops teams, really. You find a lot of organizations that a development team will say, "Well, my job is not quality. That's the QA team's job. And so I will push the code into the QA team, and then now we are coupled on this journey to get out the door to production." The same for ops.

At Netflix, centralized teams take a very different approach. We think of ourselves as building a bridge across this river. What we mean by building a bridge is that we're going to spend a lot of engineering effort to make this bridge stable, wide enough, guardrails to make it safe. Then this gives teams the opportunity to cross the river at their own velocity whenever they want to. They can choose to take that journey on their own.

I really like this picture here, too, because it also kind of highlights an aspect of Netflix's culture, which is we're not afraid to rebuild things and do things as we see fit. I mentioned Spinnaker earlier. Spinnaker is the successor to Asgard, which is arguably a very successful deployment tool for AWS. You can think of Asgard as kind of this lower bridge here, but as the business has evolved and we've assessed the future business needs, we're building a new bridge. Then we'll eventually tear down the old bridge.

This is how Netflix generally views centralized teams. They're going to be focused on tools, but what they're really focused on is enablement. Centralized teams really focus on enabling product teams to do the job for themselves. We will support these tools, and we'll make it very easy for you to get your code out the door. But at the end of the day, you're the one responsible for that.

This applies to a whole host of teams at Netflix. I'm part of the engineering tools team, but this applies to security. I was just reading the security team's charter, and it very much states that teams are responsible for security, and our job is to give them the tools and help coach them and give them the ability to secure their services. Our performance team, our traffic and chaos engineering team, which is arguably the coolest name for a team at Netflix. Insight engineering, platform engineering, all of them have this very same responsibility or view of their responsibility.

To punctuate this point, we don't have a centralized ops team, and we don't have a centralized QA team.

Mark Schwartz and I were talking a little bit about QA on teams and how to organize, and at Netflix, if a team who's responsible for a service feels they need QA, that manager will hire QA engineers onto the team, and now they're part of this product team, which is responsible for delivering this service, which is solving a business problem.

Another question you can ask yourself is how coupled are your centralized teams to your product teams? How many opportunities do you see for actually decoupling these teams and changing the roles and moving people around so that you actually have these product teams and centralized teams can focus on enablement?

Netflix is made out of people. Tools are great, but at the end of the day, people will lean on process. Process is something that is inevitable whenever you have an organization.

The Netflix culture deck has an answer for this. In the culture deck, if you've read it, you'll hear a lot said in the culture deck about process. In this slide I chose, but there's a few others, you can see right here that we're saying that instead of a culture of process adherence, we really want a culture that focuses on creativity, self-discipline, and freedom and responsibility.

If read incorrectly, you might interpret the culture deck saying that Netflix has no process. This is a very common interpretation, not only for people outside of Netflix, but for people inside of Netflix, too.

The reality is that our cultural immune system has been developed to weed out unnecessary processes. This is what we actually are doing. We're trying to figure out where process is necessary, but if it's not necessary, how can we eliminate that process?

This is a well-honed feedback mechanism we have in our culture. The challenge we have is that we want to avoid this process immune system becoming an allergic reaction to all forms of process. This is something that we have to guard against ourselves.

You see this if you've ever worked on an Agile team and someone has said, "I'm Agile. I don't do documentation." That's a misinterpretation of working software over documentation. So we ourselves have to be careful about this.

On my team, when I took over the build tools team, I was very cautious not to just throw out, we're going to be doing Kanban, or we're going to be doing Scrum or anything like that. What I did is I worked with the team, and as problems started emerging and we talked about it, I would say, "Let's use this tool or this technique."

This is an example of our current process right now. I know that there's other opportunities for us to improve, but this is a culturally compliant way for me to introduce process. Essentially, the only problems we're solving right now are these. The team didn't know what everyone else on the team was working on. Visualize work. I see Dominica sitting there. She just said that yesterday. Then we realized that every single person on the team is single-threaded, and we weren't getting anything done. Let's limit our work in progress. That was it.

The way I view process at Netflix is really use process to solve problems, and then aggressively eliminate it or abandon it whenever possible. Look for opportunities to eliminate process in your organization. Not all process is bad, but I think if you focus on the real problem and trying to eliminate it, I think that's probably a good approach.

Let's talk about communication. This is an aspect of Netflix culture that I think we do some unique things.

Taking some ice was a bad idea.

One bit of our culture that's really interesting is that we value feedback. When we hire people into the company, one of the things we focus on is, can this person thrive in an environment where they're going to get constant, immediate feedback from everyone in the organization?

This can be jarring for some people who are not used to this, and this sounds somewhat intimidating, but the reality is it's not malicious at all. It's definitely more constructive, and people drive to get to understanding when there's ambiguity in conversations very quickly.

What people are shocked by when they come to Netflix is how collaborative the culture is. If you read the culture deck, it kind of seems very Machiavellian, and you feel like people are going to come in and it's going to be a very cutthroat culture. But the reality is, because we're weeding out brilliant jerks aggressively and we're focusing on people who can give feedback and collaboration, you have this environment where you have really smart people that you love working with, and everybody wants to talk and share and help you out. It's striking.

I just had a guy I hired two weeks ago, and he was like, "That was the most shocking thing about coming to Netflix, was how collaborative everyone is." But this idea of weeding out brilliant jerks is key. I don't think I've seen another organization that so aggressively can take the most brilliant person on the team and say, "You're bringing the rest of the team down with your bullying, and we're going to let you go." It's an interesting place to look at in your organization, of instances where people are actually bringing down the whole team with their bullying or being just basically a jerk.

One-on-one conversations are a big part of Netflix's communication mechanism. I will have one-on-one meetings every week with the individuals on my team. Every week, I'll have a one-on-one meeting with my manager. I'll have one-on-one meetings every week with all the peers in my organization, at least my immediate peers. Then I'll have one-on-one meetings once a week, or at some frequency, with my customers, which end up being other engineering teams at Netflix.

Our culture of having one-on-ones is so prominent that our architecture for our new buildings, and I'm showing a picture here of one of our rooms, has been built around the idea that we have lots of one-on-ones. We have a lot of one-on-one rooms that just have two chairs and a table and maybe a whiteboard.

You'll also notice that we have pretty pictures on there. Every room has a theme. This is District 9, the conference room. It makes it easy to remember where the conference rooms are. Every floor has a theme as well.

I just described a whole bunch of one-on-one meetings that I have every single week. For engineers at Netflix, I spend a lot of time making sure that, as a manager, I'm the one spending more time in meetings, and I try to protect their time so they don't have to meet. I think this is pretty important for Netflix engineers to have. Anyone who's an engineer knows they need that ramp-up time.

The last thing I'm going to talk about is waste.

When we think about waste, especially in the DevOps and Lean communities, people tend to think about waste like this, which is you have a value stream, and you identify the gaps and the waste, and then you eliminate that waste and try to optimize your whole value stream. This is great. This is right. I'm not talking necessarily about this type of waste.

I'm talking about a different type of waste, which is what a lot of people see as duplicate efforts. Two teams working on the same thing. Someone wants to do a learning spike, and that code is throwaway. Or you take over a new system, you identify some architectural problems, and you end up rewriting the system. You could categorize this as waste. I have a different term for every single one of these things.

I'll give an example of where this waste sometimes materializes.

Engineering Tools is responsible for tooling for the whole organization. There's also another team at Netflix called Edge, Edge Engineering. They're responsible for the API: every single device on Netflix, when it communicates with our web services, goes through Edge. They have a huge engineering problem and different needs than the rest of the services at Netflix.

As a result, they actually have a team called the Edge Developer Experience Team, and their job is basically to be a small engineering tools team for the Edge team.

Some organizations might see this as duplicate teams doing the same thing, but we've found it very useful to have a good relationship with this team, and they're supporting the small, unique needs of the Edge Engineering team. What we do is we communicate, and sometimes a good idea comes out of that team, and we take that idea, and we scale it up to the rest of the organization.

This duplicate effort might be weeded out in other organizations, but we tend to embrace it at Netflix. Left unchecked, it can get out of hand, for sure. You don't want every team to have their own engineering tools team, but there are opportunities for innovation.

That's one of the observations I have, is that waste can be a necessary byproduct of innovation. I think when you think about your organization, a way to think about how innovative your organization is, is really, I would look at it as how tolerant is your organization or your culture of duplicate or throwaway efforts. This is another way of looking at how effective your organization is at being innovative.

Gene asked us to give five takeaways. I've run through a lot of things. My five takeaways are: think about those who are responsible in your organization and see if they have the freedom as well as the responsibility. High performers will do the right thing given the right context. Centralized teams should enable product teams, and I like that model a lot better, and I've seen it be very successful. Use process to solve problems, but then abandon it when you can. Then innovation generates waste, and knowing that, I think, is useful for helping understand how to become more innovative in a culture.

That's it. I don't know if I have any time for questions. I think we have time for one lucky winner. One question. I will be around for a while, too. Anyone? Okay.

Q&A

Q: Thank you for the presentation.

A: Yeah.

Q: When you mention your one-on-ones, are these 30 minutes? Are you spending your whole week in one-on-ones? It sounds a little bit...

A: I meant to get numbers on how much time I'm spending on one-on-ones, but generally, I have 30-minute one-on-ones. The way I approach one-on-ones for my team is, really, it's an opportunity for them to share and communicate information up to me. I will maybe share some business context, but in general, I'm going to be focusing on what problems they have.

With my peers, we'll be focusing a lot on context and sharing context. In general, that's 30 minutes. Then even with my customers, I'm definitely going to be asking them what's going on, what are they seeing, what problems are they experiencing, and then we'll have one-off one-on-ones. I would say a good chunk of my week, at least 30% to 50% of my time, is spent in one-on-ones. But it's actually a lot of fun.

What I didn't say is that managers at Netflix, if you're writing code, you're doing it wrong. I will write code. I'll find opportunities to write code, but never in my team's core product. It'll be something on the side for fun for Netflix.

I think we have time for one more. Oh, there's one right there. Yeah.

Q: I've been a dev manager, and I also had the thought about trying to take the meeting so the engineers don't have to, but how do you also avoid the point where they say ABC, we got to always be coding?

A: Right.

Q: And they lose sight of the fact that sometimes talking to other people is the most valuable thing to do.

A: Yeah. That's a great question.

It's interesting enough. I thought about that aspect, and it's not a problem I've ever seen where... I've seen some engineers who definitely have a preference for always coding. ABC. I like that. They'll bring their laptop to the meetings, and they'll be coding in the meetings, which is never something I like, but I give them the option.

If you want to go to this meeting, we're going to talk about this design principle with this other team. You have the option to or not. If I do have to schedule meetings for them, I tend to focus on scheduling meetings in the late afternoon because they get in, they get their work done, and by that time, they can go to the meeting, and then they'll just go home.

That's generally how I solve that problem. It's like our weekly planning meeting is actually just Mondays at 4:00, because I leave at 4:30.

That's all the time we have for official questions, but Mike's going to hang out.

Thank you, guys.

Thank you. And thank you, Mike. It was great.