SRE at The Walt Disney Company
Jason Cox and Amy McCain of Disney's Global SRE team reveal how a centralized shared service can break the mold by embedding curiosity, empathy, and genuine delivery into every engagement. Rather than imposing standards from above, their team listens first, earns trust by sharing in the struggle, and then actually solves the problem, a model that led senior leaders across Disney to consistently single them out as different from every other shared service. When the final mastering system for Guardians of the Galaxy Vol. 3 failed days before release, the team bypassed bureaucracy entirely and built a fresh Kubernetes stack from scratch to get the movie to theaters on time.
In this talk, you'll learn how Disney's Global SRE team structures three distinct engagement models — community building, a flex bench, and long-term embedding — and how they fund reserved capacity through a transparent overcharge model so they can respond to critical crises without a waitlist.
Full transcript
The complete talk, organized by section.
Host Intro (Gene Kim)
Gene Kim: It has been so fun to introduce Jason Cox from Disney over the years. He has been instrumental in my own DevOps learning journey. He is currently Director of Platforms and SRE at The Walt Disney Company. Amy McCain is Senior Manager of SRE.
I just need to retell this one story. I am hoping that all of you have seen his presentation that he gave in Las Vegas. My favorite way to introduce him is the way I got to shadow him in 2014 at the Glendale campus. At 8:00 AM, someone saw him from across the way and said, Oh, Jason, Jason, Jason, thank you so much for saving our butts for a big movie launch. It turns out that she was CTO of one of the large business units at The Walt Disney Company. This is something that happened four times throughout the day.
I was so amazed that he was able to share yet another one of those stories, where he was able to save the bacon of a team who needed something done and needed the amazing skills on Jason's team. Before we start, can I just confirm that Jason and Amy can hear me and make sure your audio is working?
Amy McCain: Hi, Gene. Can you hear us?
Jason Cox: Hey, good morning.
Gene Kim: Yes. Good morning. Fantastic. Before we start, can you briefly introduce yourself and what you talked about at the last Enterprise Technology Leadership Summit, formerly the DevOps Enterprise Summit, two months ago now?
Jason Cox and Amy McCain
Jason Cox: Yes, sure will. Thanks, Gene. I am Jason Cox, and I lead our global SRE team. Amy, do you want to introduce yourself?
Amy McCain: Sure. Amy McCain, Senior Manager, SRE under Jason, and I lead a couple teams for global SRE.
Shared Services That Work
Jason Cox: One of the things we talked about was related to the fact that we are a shared service. We are a centralized corporate shared service at Disney. A lot of times, shared service does not necessarily work well. Those of you who have to use shared services know this can be a challenge at times to get what you need.
Gene, you mentioned in the intro running into one of those leaders. I have heard a similar comment from leaders all across Disney. A lot of times they will start talking in a derogatory way about other shared services, and then they say, Oh wait, Jason, just so you know, not your team. Your team's different. I am like, What is it about our team that is so different?
So I started asking that question. What is it? I asked these leaders what they see as the difference in our team. As we talked about in Vegas, we began to assemble some of those learnings. What does it look like? The answer I got, regardless of which groups we asked, was that the nuanced difference is that we are actually part of the business. We understand their business.
As Nick Cannon, who is our CTO of Animation Studios, actually said, you do not come in with this super-prescriptive, self-focused point of view of, Come and learn and do this, the company standard. You actually come in with curiosity to understand and learn about our business.
The model that we picked is exactly that. Let's embed into the different teams across Disney. The learnings, having talked to those leaders, really distilled down to three things. Number one was that first part: just listen, have some curiosity, listen to the business. The second was: have empathy, have a frame of reference, understand the business's frame of reference by actually being in it and having that shared experience of the struggle, so that the actual thing we are delivering to them is relevant. Then of course the third thing was actually help. So many times a shared service, even though it covers the first two, ultimately just fails to launch because it does not deliver.
Guardians of the Galaxy Mastering Rescue
Gene Kim: Can we go into some specifics, and then we will talk about the mechanics of how? Everyone who has been following Jason's work and your team's work knows I shared a story about how you just seem to suspiciously be there to save the day for some important project. One cannot help but be blown away by the fact that you shared the story about how your team helped with the release of Guardians of the Galaxy, the movie. You even had the tweet from James Gunn acknowledging the amazing technology teams.
Amy, can you tell us a little more detail about what happened here and how your group got involved?
Amy McCain: Sure. We were dealing with some legacy systems. Well, the original team was dealing with some legacy systems. Because, like Jason says, we work across the entire company, a lot of people know us and know what we can do. We were able to come in as friends to work alongside this team. We just had some extra bodies, extra sets of eyes, extra pairs of hands to help figure out what was going on. The original task was to get the legacy system back up and running, and that was just failure after failure, unfortunately.
Gene Kim: Amy, can you talk a little bit about what needed to get done, to whatever level of detail you can? What was broken? Why was it important, and why did it need to get fixed?
Amy McCain: It was part of the mastering system that does the final mastering to send the movies out to the actual theater.
Gene Kim: Yeah. That is kind of an important step in getting a movie in front of an audience. So no mastering, no movie, correct?
Amy McCain: Correct. The legacy system just was not functioning, and we pivoted to just building a new Kubernetes stack for it. It needed to migrate off the legacy system anyway, but instead of coming up with a lengthy project plan and putting in tickets and going through all the red tape, we just built it to get it up and running. Give them a new platform, help the team get their technology up and running and out there and working again.
Jason Cox: That is listen, understand, and actually help.
Gene Kim: In my head, I am imagining that at some point someone said, Help, help, help. We have this really important date-driven, cannot-slip deadline. I am guessing you were not intimately familiar with the technology, so you came in, looked around, and said, We might be able to help. It is not like you were embedded in that team previously for years or decades, right?
Amy McCain: No, but we were able to bring in some people who had worked with those teams before, just because of our long experience working across the company. When I say bring in, we literally sat in the same room as these people and on the same conference bridges for a few days, just working through all the options and talking together and seeing what needed to be done. That is one of the special touches that we bring: the ability to actually work alongside people.
Gene Kim: I love it. No Jira tickets, no ServiceNow tickets. I actually got to see this in 2014, seeing Jason and team in the room. Amy, I have to imagine the studio is pretty happy. Can you talk about how this was a good thing for the SRE platforms team? What happened as a result? Was it just a thank you, pat on the back, high five, or is there now more work that you are doing with them?
Amy McCain: We do partner with them pretty extensively in a lot of different ways now, and we are very much working alongside them in a lot of different capacities. It is expanding the scope of what is possible for that team and for our teams. It broadens the exposure to the different technologies across the company and makes sure everyone knows what is available, what is out there, and what kind of knowledge we have. It is a win-win for everybody.
Gene Kim: Before I talk about the engagement model with Jason, I have to ask: in that project, in that crisis, what was the most rewarding moment for you?
Amy McCain: The movie was pretty good. I am a problem solver. My team are problem solvers. At the end of the day, we got it done. We fixed the issues. We came up with creative solutions, and at the end of the day everybody was happy. That is the best part for me: seeing a solution implemented and working, and everyone is happy.
Disney SRE Engagement Model
Gene Kim: Thank you, Amy. I am so grateful to you and Jason for being able to share that gem. Jason, this fits into some of the questions coming in. How does this fit in your typical engagement model? If I understand correctly, they were not an existing SRE client of yours. How do you jump into a crisis like this? Typically, what are the ways that you liaise with the other groups so that you are actually embedded in them?
Jason Cox: That is a good question. The way I look at it, there are three different ways that we engage.
The first at the outset is that we are helping build a community around SRE and elevating SRE practices across Disney. We do that with everyone. Part of that is our outreach. You are familiar with our Jedi Engineering Training Academy, Gene. It is one way that we connect our technologists. We have some 3,700 different technologists from all across our businesses, from Cruise Line to Marvel, Pixar, studios, consumer products; everyone is part of that community that we build. We bring experts in to come speak to them, including ourselves. Topics come out of that: infrastructure-related, cloud-related, maybe even AI-related.
Those are part of the engagement that we operate on, being in this unique spot in the company to bridge all the different business barriers, to pull people together. We believe all these serendipitous, virtuous things happen whenever you can pull these technologists together. So we help do some of that.
The second, and sometimes the community work will lead to this second kind of engagement, is that we will flex in. We have this flex model, like what Amy is talking about, where we have a bench of highly talented, SRE-minded technologists and engineers that can drop into different engagements. It could be different business units, different projects, things like this. We have got to get something out. We have got to get this shipped. Do we have some help? We need to get this automated. It has got to get to the cloud. All sorts of engagements come in through that, where we flex in. In fact, we call it our SRE flex team that can flex into these different groups.
To be fair, as you talk to any SRE team, you know that SRE is a scarce resource. Being able to have that bench time has been difficult for us. It also means that we have to be judicious about those engagements. We look at those that come in and say, Is this a company priority? If it is, we have got to figure out a way to resource it.
In some cases, like this one that Amy was on, we were able to pull engineers and deprioritize some of the other work that they were doing, or delay it, to shift over to help get this done. Why? Because, as the executive who reached out to me in Slack when this all kicked off said, We cannot not get this, Jason. This must go out, would you please? Part of it is that we know very clearly some of the priorities for the business. That is why we have that team. Let's get this done. We do that all the time. It could be a short-term engagement, several weeks, or it could be a year at times.
Sometimes they extend beyond that, which gets to the last model: we have teams that fully embed into groups that did not have that expertise and were looking for a longer-run engagement model with us. We have some teams, for example, that are always supporting our Imagineering team. Some are always supporting different parts of the studio team. Some are supporting consumer products, our retail tech, and some of our parks. Those are the three different models. That is how we engage.
Gene Kim: That is interesting. As an aside, I have been studying you for going on a decade now, decade-plus, and I have never actually heard that. How are Jason's teams able to jump in and help people who are in need, having genuine crises and technical emergencies? You are actually reserving capacity to some degree, so you do not have to say, Sorry, I have some capacity freeing up in 2025, right?
Jason Cox: That is it.
Jason's Leadership
Gene Kim: By the way, just to share a little joke, I was telling Jason after the Las Vegas conference that I was having the sneaking suspicion that maybe Jason is actually causing all these problems, maybe walking through random areas of the business, throwing a wrench in things just so that his teams can jump in and save the day. Of course, that is obviously a joke.
Amy, what is it like working with Jason? What do you think makes him effective as a leader? For people who maybe do not have Jason or aspire to be a Jason, can you say what Jason does and models that makes him effective?
Amy McCain: One of the things that makes him an effective leader is the way he makes it easier for me to do my job. He has always got an open-door policy. He is always willing to listen and do some solutioning with you.
But one of the things that I appreciate most about Jason is the way he makes my job harder, which is that he is constantly challenging us. You will figure out pretty quickly when talking to him that he might be saying something that he does not entirely agree with just to see if he can get you to challenge him on it. It is an interesting dynamic. He likes to make sure that we are free to challenge his ideas and present our own as equal.
Gene Kim: Can you give me one example of how Jason made your life harder and why it was actually worth it for you?
Amy McCain: That is an ongoing thing in our one-on-ones. He will throw a question out there, and it will be a tough question to answer. He really is interested to see where I will take it. That is part of being a trusting leader and a trusted leader. Jason has developed that kind of relationship with all his direct reports and with everybody he talks to, as far as I can tell.
Jason Cox: I can tell one about the superpower that Amy has. This was probably a hard thing. I said, Hey, Amy, you do great with this engagement with these teams. How about two more? Could you please run after these two more? Am I right, Amy?
Amy McCain: Yeah, that too. He never stops challenging us to grow in our capacity.
Gene Kim: This is great. I love that. Dr. Westrum introduced me to the notion of the socio-technical maestro: high energy, high standards, great in the large, great in the small, and loves walking the floor. That sounds like this example.
Choosing and Funding Engagements
Gene Kim: Here is a question from Nick. How do you decide which engagements to take? Can you talk about how you judge which ones are the ones that you are all in on, that represent the commitments that you and teams have to make?
Jason Cox: That is a great question. There are sort of two dimensions. One is that it is pretty easy to align to super company priorities that are coming down from our CEO, for example. If it is aligning to one of those major efforts that we know are priorities on the grid, we are going to figure out how we align that, because that is going to meet the business outcomes coming from the top office all the way down.
But so much of the problem is more nuanced. There is this one app that helps support cast members. There is this one app that does one little niche thing. Or it is a platform that many people use. Could you please come and help? Those become more difficult to suss out: should we engage on that?
I hate to say it, but it is like anything: what allows you to be able to staff engineers on that, at the end of the day, is funding. A big part of how we know which groups we can engage with is whether they are actually able to show back, to show money that they have, as a signal that this is a priority for the company. Our engagement model includes that. We look to see where teams have the funding for it through project funding. It could be a capital project. It could be an operating expense project, whatever it is. It is a signal to us that this is a priority for the business, so much so that they have the reserve to be able to do that.
One of the ways we have been able to build a bench is that we operate in an agency model. I have talked to a lot of SRE teams, not just at Disney but at larger outside organizations, about how you have a bench team. Predominantly you are going to be deeply buried in supporting these products, and your team will not have any sort of bench.
The one thing we did was that we decided to overcharge. We overcharge per hour. We overcharge for the resource. What I mean by that is I worked with our finance team, because at the end of the day we are a cost center like any other shared service. If you can, work with the finance team to negotiate a way to have extra buffer. I am transparent with all the business teams. I say, Hey, listen, we charge you at this rate, and that does not just cover our costs. It covers the ability for us to flex in to support you.
It allowed us to build enough of a reserve. You can talk to any of our SREs across the team, and they do not think I have enough of that. I agree. But we do have that, and it allowed us to create enough buffer to drop team members into different parts.
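The overcharge arithmetic Jason describes can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the rates, the hours, and the names `RateModel` and `reserve_hours` are hypothetical and do not come from the talk or from Disney's actual finance model.

```python
from dataclasses import dataclass


@dataclass
class RateModel:
    """Hypothetical inputs; none of these figures come from the talk."""
    cost_per_hour: float  # fully loaded cost of one engineer hour
    billed_rate: float    # hourly rate charged to business units
    billed_hours: float   # hours billed to funded engagements in a period


def reserve_hours(model: RateModel) -> float:
    """Return the hours of unbilled flex capacity the markup can fund at cost."""
    if model.cost_per_hour <= 0:
        raise ValueError("cost_per_hour must be positive")
    # Surplus is whatever the billed rate collects above actual cost.
    surplus = (model.billed_rate - model.cost_per_hour) * model.billed_hours
    # A rate at or below cost funds no reserve at all.
    return max(surplus, 0.0) / model.cost_per_hour


# Assumed example: a 15 percent markup on 10,000 billed hours.
example = RateModel(cost_per_hour=100.0, billed_rate=115.0, billed_hours=10_000.0)
print(reserve_hours(example))  # 1500.0
```

Under these assumed numbers, the markup funds 1,500 hours of bench time that no single business unit has to request or approve in advance, which is the buffer that lets a flex team drop into a crisis without a waitlist.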
Close
Gene Kim: This is so great. Amy, Jason, there are a whole bunch of questions in the Slack channel. I am hoping you will get a chance to address those, maybe even in networking connection time. This is the first session, and I am already loving this format because there are all these wonderful gems I have heard for the first time. Amy, Jason, thank you for all your amazing work. Keep up the amazing work, and we will catch you soon.
Amy McCain: Thank you, team.
Jason Cox: Thank you.