San Francisco 2015
DevOps in the Enterprise: A Transformation Journey

As the largest live event ticketing company in the world, Ticketmaster is in the rare position of having to balance the needs of a multi-billion dollar global business with the need to defend against competitors that are hungry and very agile. Because it is powered by legacy systems, Ticketmaster clearly can’t rest on its laurels and must innovate to maintain its market leadership.


But how do you take such a large, mission-critical enterprise service and make it nimble enough to compete with competitors and delight our customers?


When Live Nation and Ticketmaster merged five years ago, the executive leadership team had the vision to commit to re-architecting Ticketmaster’s 39-year-old ticketing platform. While this is an incredibly exciting prospect and will lead to industry-changing innovation, it does create huge challenges for Ticketmaster’s Global Technical Operations team. As with any long-lived enterprise, the “muscle memory” around managing, developing and delivering our service is very well formed. Changing habits is hard, and in the high-stakes game of selling tickets it is fraught with risk, both perceived and real.


Jody Mulkey, Chief Technology Officer, Ticketmaster

Full transcript

Jody Mulkey

Thank you. Wow, it's pretty humbling to be called a leading thinker from someone like Gene Kim. So thank you very much for that.

First of all, I really want to say I'm super excited to be here. Seeing the community grow like it has is just amazing. 100% growth year over year in attendance at this conference is just one of the metrics that tells you that we're onto something. At the speaker dinner last night, I was saying, "Has DevOps jumped the shark?" Well, I certainly hope not.

I want to talk to you a little bit about Ticketmaster. And one of the things that I realized when I joined a couple of years ago was that we do a lot more than just sell tickets, right? We power these life's experiences.

And I wanted to give you a specific example. About two years ago, actually last weekend, I was up in Seattle to attend a Seattle Seahawks versus Tennessee Titans game. And I bought tickets on ticketmaster.com. I flew up there with my best friend, and we were at the stadium. We bought front-row seats, and we're sitting there, and there's only four seats in our little space. And the two seats next to us were empty.

It was about 10 minutes before the game, and in walks this family, young family: father, mother, and their young child, in head-to-toe Seahawks gear. They must have spent $1,000 at the team store. And they were just incredibly fired up.

So the gentleman sits down next to me, and he starts saying, "Seahawks, Seahawks, Seahawks," in this Australian accent, which I will not try to do.

So I turn around and I ask him, "Wow, are the Seahawks big in Australia?"

And he was like, "No. Actually, I'm the biggest Seahawks fan in Australia, and I'm probably the only one. I'm an orthopedic surgeon from Adelaide, and I did my residency in northern Oregon, where I became this huge Seahawks fan. And it was on my bucket list to bring my son, Oliver, to a Seahawks game."

And so why don't we just let him tell you a little bit about it?

All right. We're live at the Seahawks-Titans game. We got some Ticketmaster Plus customers here.

"G'day, mate. My name's Matt from Australia. This is my son, Oliver. I've only flown 25 hours to be here. Ticketmaster.com, beautiful experience. Front row atmosphere. Doesn't get better than this. No other place in the world could it happen. Seahawks, Seahawks, Seahawks. Whoo."

So it was at that moment, when he told me that this was on his bucket list and he flew 25 hours to be here, that the responsibility of what we do really hit me. And like a ton of bricks, it gave me a ton of anxiety, to be honest. We're responsible for this experience. And it's an experience for him that he'll never forget, and his son Oliver will never forget.

And I think that it's really symbolic of the type of culture that we're trying to create at Ticketmaster and the pride that the technology team really takes in what they get to do.

So I'm going to jump in a little bit. The longest journey must begin where you stand, and we've been on quite a journey.

So some background. Ticketmaster is a proud division of Live Nation. We're the largest live entertainment company in the world. We have about 450 million fans in 40 countries around the world. We operate over 120 venues, and we put on about 24,000 concerts each year. So that means that every 22 minutes, somewhere in the world, we're connecting people with unforgettable moments of joy. It's an awesome responsibility.

Let me give you a little background about Ticketmaster. When I joined Ticketmaster, if I would've thought about it for about 30 seconds, I would've realized that we're one of the oldest software-as-a-service companies in the world. We've been selling our software as a service for 39 years.

We started in 1976 at ASU. In 1996, 20 years later, we launched ticketmaster.com. In 2010, Live Nation and Ticketmaster joined forces. In 2011, the transformation journey begins.

So a little bit of background about Ticketmaster. In the technology org globally, we have over 1,200 folks in 16 different offices. For scale around applications, we have 127 major applications and thousands of components or services.

And so it's quite a large-scale enterprise. And I say enterprise because we are burdened with our legacy that also powers billions of dollars of revenue. And so it's tough to see that as a burden, but as many of you know, it's quite challenging.

So let me set the stage here. During this transformation, the Live Nation leadership had the vision to invest heavily in the technology team and to build and refresh the platform. So from 2011 to 2014, we had a 230% increase in developers and only a 12% increase in operations.

That's a problem. Especially a problem where operations are the team who does deployments for you, who wakes up in the middle of the night for you, right? When there's so many more developers creating change and creating products than there are operations, well, DevOps to the rescue. At least that was what we thought.

So in trying to bring DevOps to a large organization like Ticketmaster, I rely heavily on analogies. I'm just cursed with analogies, because it's a way to relate to folks, to help them understand what's going on.

And so when I try to explain DevOps to our team, who have been... Our tenure at Ticketmaster, it's really incredible. When I joined in technical operations, there wasn't a director or above who had been at the company for less than 10 years. The average tenure in technical operations is 12 years at Ticketmaster. So it's an incredibly well-tenured team, really focused on delivering.

And so I try to explain it: it's kind of like football. And I've been using this football analogy in my career for probably 25 years. I used to explain it like this: ops is like defense. Prevent the other team from scoring. And then I said, well, dev's kind of like offense, right? They're the ones trying to score the ball, trying to score.

And then I thought about it, and I'm like, wow, I am doing this wrong. I'm doing this wrong because that means they're not on the same team, or they're not on the field at the same time.

And so I've refreshed my analogy, and I feel much better about it, where it is just like football. And you think about operations as the offensive line. They're trying to create time in the pocket for the skill positions to move the ball down the field. And I say skill in quotes, because if you've seen some of the code that we have... well, that's quite a skill.

But this was very effective. And I even had a little screening of The Blind Side for my ops team. Because they have this huge protector instinct, and they're trying to protect our customers, whether they be fans or clients, trying to protect their experience.

So all of their resistance to change, it's really coming from an incredible place. And these are the most dedicated people that I've ever worked with in my entire career.

So the strategies that we are finding to be helpful for us are really around these three domains. One is around empathy, around empowerment, and around metrics. And so I'm going to talk you through these a little bit here to give you a sense for that. And remember, we're definitely still just on a journey.

Empathy. Joining a large organization, I had been at a smaller company, about 300 people, and we were David sort of fighting Goliath. Joining Goliath, one, it's quite a challenge, and it's a lot harder to be Goliath, to be quite honest.

But joining this organization, I wanted to get context and really understand what's it like to be in the shoes of my team. And so I did 75 in 75. So I interviewed 75 people on the team in 75 days. And these were folks all up and down the value chain, from the folks that were doing sales to our clients, to product management, to the finance team, to operations, to development, to QA, all across the org.

And it was incredibly powerful because I really got to understand, what's it like to be them? What makes it hard for them to do their job? What gets them excited to come to work? And I pulled all of those things together in the first 75 days to really start to build a strategy, and one that resonated with the team.

I believe that you cannot create material change without engaging the hearts and minds of your team. And when the suggestions about the improvements we need to make come from them, they're a lot more likely to buy in.

The other thing I noticed, coming to a big company: at a startup, where I was before, you're always connected to your customers, right? Sometimes you're the support engineer, you're the engineer, you're the ops engineer sometimes. But you're always connected to these customers.

And in our business, everyone can be a fan, and at Ticketmaster, everyone is a fan of live events. But really connecting with our clients, it's really tough. We have 12,000 clients around the world who put on these live events, and it's not that easy to connect. Like, what's it like to be a box office manager? What's it like to be the guy at the door scanning?

So we started this program called Breathing Customer Oxygen. And to date, over 150 folks in the product development and operations teams have gone out to night of show, sat in a box office, worked a window, sat in a scanner, and really understood what's it like to be these folks.

When there are 5,000 people waiting to get in to the Rose Bowl for One Direction and that scanner breaks, you have to know that the developer will have a new sense of importance of what they're doing when you have 5,000 screaming 13-year-old girls trying to trample you.

So it's really a great way to connect with our customers, to connect people with what they're doing, and to get that feedback loop that you just don't get in the B2B software-as-a-service world.

The other thing we did is, everyone is a fan. And so we started a program at Ticketmaster where everyone can actually go through the customer experience. Just a quick show of hands. How many of you have been customers of your company in the last 30 days?

Okay, so my rough math with the blinding lights is maybe 15%. But it really gives you a newfound perspective when you can walk in the shoes of your customers.

So we give everyone a gift card to go and buy tickets, and then go and attend the show and give us feedback on the experience. Not the experience of just buying tickets, but the end-to-end experience.

And the last thing, it's really about breaking bread together. So we started a DevOps meeting every week where we have lunch on Thursdays. And some of these leaders in the development teams and the operations teams, they had worked at the same company, but they had never sat in a meeting together, which is just wild.

So we put them together. It's really tough to be angry at those guys when it's just Bob who you have lunch with on Thursdays.

What also came out of this was a really interesting program that we've built called DevOps Training. One of our incredible system engineers had this idea to put together training for our team so that they can use our tools.

While they may not be the best tools, they're the tools that we have, and it was really a great way to scale up the needs of the operations team by empowering developers with how to provision a server, how to connect into our monitoring system, how to connect into metrics, how to do all of these basic tasks that they need to take ownership of their product.

And again, we've had over 100, I think, 120 people go through that program. It's a three-day training. People from all over North America fly in, and the team has been training those folks up. They get a great little sticker for their laptop saying, "DevOps certified," and they get access to production to be able to manage their own products.

So it's really been an incredible way to bring people together to have empathy for what we're doing. The operations teams, we don't have the best tools, and so when a development team puts in a ticket for something to get done, they actually now understand what it takes to do that, and it's been really helpful.

So empowerment. This has been another really, really strong strategy. I think it's a common theme around the DevOps community, but for us, this has fundamentally changed the game for us.

So more specifically, teams crave responsibilities. And so in the last presentation, they were talking about how great people want to work with great people, and Gene was saying how the best want to be surrounded by the best. Well, I absolutely believe that game attracts game, and people want to work with the best.

And the folks on our teams, our development teams, we have over 110 development teams globally. They all want responsibility to run their product. They didn't all want it in the beginning, to be quite honest. But once they saw how quickly teams were able to deliver and how they were able to minimize these business disruptions based on the ability to quickly make changes to their software, it was pretty incredible.

So we have over 110 teams globally. The 73 teams that we have in North America, 100% of them push their own code. So 100% of these 73 teams push their own code. When we started in 2013 in the summer, we were at 2%. We also had about 60 teams. We were at 2%, and then we have grown that, and now have 100% adoption where each of our teams pushes their own code through the development pipeline, all the way to production.

Each of those teams is on call, and they manage their own service delivery.

The other is one team, one mission, one goal. This is where I would put up the DevOps silo slides right here, so imagine that in your brain, but we had massive functional silos across the organization. Our company is so functionally oriented, it's been incredibly challenging to get folks to truly work together.

So we have this concept of four in the box on any one team. So we have a product leader, engineering (which also means operations), UX, and a process person. So those four really work together to deliver value to customers, whether they be internal or external.

And so giving these teams one mission, one goal, is really this binding concept. And this is an area where we're continuing down the journey, where the performance reviews for the folks on our team really come from the team in which they deliver value. So 80% of someone's performance review comes from the team in which they deliver value, not their manager.

And so this has many of our middle managers sort of scratching their head, like, "Well, what do I do?" And the job of a functional manager is to make someone great at their function. To help them grow their career, to coach them, to know what good looks like, and to bring out the best in their team. This has been a big change.

Development teams on call, I mentioned that. First thing joining the company was, okay, now the development teams are the first line of defense, and magically, the code got better. I think that's just a truism. Magic. It's absolute magic.

And self-service dot star. So everyone, the mission in the operations team is: why are you the one doing this work? Is it because of skills or access? And then how do we remove each of those as the constraint so that everyone can manage their own systems?

We're on a big, heavy push to go to the public cloud. And it wasn't too long ago that our operations team was not the friendliest of teams. Phrases like, "Get off my floor," might have been uttered. Phrases like, "Come back to me when you have a smarter question," might have also been uttered.

And so, as it turns out, public cloud APIs have never said no. They've never made someone feel silly. And we're really trying to move to the self-service model so that you can be empowered and not be blocked.

At Ticketmaster, blocked is an unacceptable state. You can't be blocked. You have to keep moving. You have to keep finding a way.

So I'm going to jump in a little bit about metrics. Business metrics are better than system metrics. Historically, at Ticketmaster, we're all about CPU and disk and memory, et cetera. Those don't matter. Are we selling tickets? Are our fans happy, or are our clients making money?

And so business metrics are the primary metric. I found this to be a big challenge when I first joined the company: none of the product managers talked about money, and they didn't talk about transactions. They just didn't really focus on that.

And so now we have a revenue graph on the operations dashboard that basically shows predicted revenue over actual revenue. And so you can really understand, are we having an issue or not having an issue? That's really how we're trying to think about uptime: by focusing on these business metrics.
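The predicted-versus-actual revenue comparison can be sketched as a simple ratio check. This is a minimal illustration, not Ticketmaster's dashboard: the function name `revenue_health`, the 20% tolerance, and the sample numbers are all assumptions made up for the example.

```python
def revenue_health(predicted: float, actual: float, tolerance: float = 0.20) -> str:
    """Classify one time window by comparing actual revenue with the forecast.

    `tolerance` is a hypothetical threshold: how far below the forecast
    actual revenue may fall before the window is flagged.
    """
    if predicted <= 0:
        # No meaningful forecast for this window, so nothing to compare against.
        return "unknown"
    shortfall = (predicted - actual) / predicted
    return "issue" if shortfall > tolerance else "ok"


# Made-up sample windows: (predicted revenue, actual revenue)
windows = [(1000.0, 980.0), (1200.0, 700.0), (0.0, 50.0)]
statuses = [revenue_health(p, a) for p, a in windows]
print(statuses)
```

The point of a check like this is that it answers "are we selling tickets?" directly, rather than inferring it from CPU, disk, or memory.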

The other is that outcomes are way more important than outputs. Historically, we were a ship-the-feature company, not a solve-the-problem company. And so we're really trying to transform how we do product development, how we do operations, to really focus on these outcomes.

So I think that everyone in the company is tired of me asking them, "What problem are we solving? What metric is going to tell you that we're making progress?" But it's really religiously kind of repeating that ad nauseam. And I have a great partner in my chief product officer who basically says the same thing, but a little bit more elegant. So now that they are desensitized to me asking, we have a new voice in the mix.

Instrument everything. So we made it very easy with some open source technology to instrument everything. Our service template, that is our basic default Java service, has annotations, so very quickly, with a couple of keystrokes, you can instrument your entire service and have it tie into the metrics and dashboard system. So that was super easy.
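The talk describes Java annotations in a service template, but does not show them. As a rough analog only, here is how the same "one line to instrument a function" idea might look as a Python decorator. Everything here is hypothetical: the `instrumented` decorator, the in-memory `METRICS` store (standing in for a real metrics backend), and the `reserve_seat` function.

```python
import functools
import time
from collections import defaultdict

# Hypothetical in-memory store standing in for a real metrics backend.
METRICS = defaultdict(lambda: {"calls": 0, "errors": 0, "total_seconds": 0.0})


def instrumented(name):
    """Decorator that records call count, error count, and elapsed time."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            except Exception:
                # Count the failure, then let the caller see the error.
                METRICS[name]["errors"] += 1
                raise
            finally:
                # Runs on both success and failure.
                METRICS[name]["calls"] += 1
                METRICS[name]["total_seconds"] += time.perf_counter() - start
        return wrapper
    return decorator


@instrumented("reserve_seat")
def reserve_seat(section: str, row: int) -> str:
    # Placeholder business logic for the example.
    return f"{section}-{row}"
```

Calling `reserve_seat("A", 1)` then updates `METRICS["reserve_seat"]` without the function body knowing anything about metrics, which is the property that makes instrumentation cheap enough to apply everywhere.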

Democratize the data. So all data in the company, other than PII and credit cards, et cetera, is default open. It wasn't the case and it wasn't the culture before a couple of years ago, but it's default open. Right? So GitLab is default open. Right? There are a couple pieces of code that are sort of hidden off that have really proprietary information, but default open.

Democratizing the data is a huge fundamental principle for us, and we have found that it has really helped us find more problems. When more eyes are on the data, we find more problems.

So talk about some lessons learned real quick. Change is hard. It is really hard, especially in a team and in a company that's been doing things the same way for a very long time.

Historically, Ticketmaster did not invest heavily in technology. So the team that built the systems was very clever, and they put a lot of hard work in, and it's very clever how we've done things. And now that we're investing a ton and growing the team, it's been challenging for those folks to change.

Again, focusing on empathy has been a big key, but it's just really hard when folks have been doing the same thing for a very long time.

The other is that empowered expertise equals lower mean time to repair. Right? So this is fundamental for us and it really changed how we operate, and I'll show you.

We have this concept of support at the edge. So it's really a formalized support model. We move support as close to the customers as possible. It focuses these teams on projects that can prevent problems.

So we've really structured our technical operation organization in this fashion. We have product support, who are really the frontline. So they're taking calls from clients. We have 12,000 clients around the globe. Well, those clients sometimes have issues with the system. They take the first ring.

Then we have our technical operations center. So this team is 24 by 7, sits in Hollywood, California, and handles incidents and alerts, and they manage incidents as they come in.

Then we have our site reliability engineers. So this is kind of tier two. These are folks that have deep application knowledge who are on call.

And then we have the production engineering team. So this is basically engineers that are aligned to our scrum teams, that have domain expertise about these systems, and they are sort of the third tier of support.

So let me show you a little bit. Again, I'm the analogy guy, right? So really thinking of support at the edge, our technical operations center, that's like the EMT. "Hey, I got in an accident. I'm going to try and fix you up before we get to the hospital. If I can't, I'm going to take you to the emergency room." Right?

So think of our site reliability engineers as emergency room doctors. They have a broad, really wide set of knowledge, and have deep expertise in their applications, and they know how to deal with incidents. The picture there is one of our SREs, Johnny, who, through his leadership here, has now gone on to do even better things.

And then we have production engineers. Think of the production engineers as like surgeons. So they have deep expertise in an application, and they can help find these issues.

And if you really think about it, this actually helps provide a great career path in operations, which I hear has been a challenge. So you think someone out of school or first job, I don't care if they went to school or not, they start in the technical operations center. They get a lay of the land. They understand the whole ecosystem by seeing how it chirps and alerts and how it has issues.

And then they can move into a specific area of application expertise and join the SRE team. And that way they can get deeper knowledge of an application, really start to work more hand-in-hand with the development teams.

And then over time, they can then graduate into becoming production engineers with a lot more expertise, and maybe they have expertise in storage or networking or databases. Right? So it not only provides us this tiered level of support, but also provides a career path.

So before support at the edge, for Ticketmaster Online, our mean time to repair for web operations incidents was 47 minutes. Which is horrible, right? That means for 47 minutes, fans are having a hard time connecting to these unforgettable moments of joy.

Through some work with the teams and encapsulating a bunch of these responses, we used Rundeck to basically encapsulate a bunch of remediations and hand that over to the technical operations team. We were able to drive that down to 3.8 minutes. So basically a 90% reduction in mean time to repair. Right? Phenomenal change, right? The business went crazy over this. These are measurable outcomes that have really helped demonstrate that we are making progress.
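The arithmetic behind that figure is easy to check. A minimal sketch: the helper name `percent_reduction` is an assumption for illustration, while the two MTTR values are the ones quoted in the talk.

```python
def percent_reduction(before: float, after: float) -> float:
    """Return the percentage drop from `before` to `after`."""
    return (before - after) / before * 100


# Figures quoted in the talk: 47 minutes before, 3.8 minutes after.
mttr_reduction = percent_reduction(47, 3.8)
print(f"{mttr_reduction:.1f}% reduction")  # prints "91.9% reduction"
```

So "basically a 90% reduction" slightly understates the change; the quoted numbers work out to just under 92%.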

Other lessons learned. DevOps has nothing to do with technology. Nothing to do with technology. And here's why I believe that.

Our ticketing engine, AKA the host, right, it powers $25 billion in commerce annually. Performance is measured in microseconds. The code was first committed in 1976. It's VMS running on emulated VAX. So you think you've got legacy? We've got legacy.

But here's what I want to share with you, is that legacy doesn't mean that you can't operate with the best practices. So our team, they've named themselves the Bedrock team because they are the foundation of this. They deploy their own code, right? They have PagerDuty for on-call alerts, and they leverage Cucumber and BDD for functional tests.

Here's a view of their dashboard for Cucumber, right? So we are running BDD tests for the host VMS on emulated VAX. DevOps has nothing to do with technology. It's about a mindset, it's about being able to take responsibility, and it's about owning your system end to end.

The other lessons learned: there are only two states in my mind. The system owns you, or you own the system. These are the only two states that matter in technology.

It's quite often the case we find ourselves in the first state. We actually found ourselves in that first state with our ticketmaster.com property. It drives 40% of the revenue of our company. The majority of the system was built in 2000, mod_perl, and extended and extended for the last 15 years. And most of the tribal knowledge is no longer at the company. It's kind of a scary proposition.

And so to attack that, let me introduce you to the Boom team. So you'll see up here that this is an Amazon Dash button that our team has put the label Boom on.

So this team has been tasked with metal to money with no hands. So if you can build your system, right, then you can actually own the system. So the tribal knowledge is not encapsulated in people's heads; it's actually encapsulated in code.

So this is the development team taking over 100% service delivery. Push-button deployment of the Ticketmaster Online stack. So when they press that Boom button, it takes the most recent artifacts, it builds the most recent release, takes those artifacts, and pushes them to AWS. Metal to money with no hands.

Now all that tribal knowledge is encapsulated in the code. You should see this team. So earlier, the Target folks were talking about putting them in a space so they could work together. We actually put this team, a cross-functional team, across the street in another building, and we cleared out space for 20 of them to work together, whether it be development, operations, QA, program management, product management. Everyone sitting together making this happen. And you should see the camaraderie. It's incredible. When you put great people together, they will do great things.

So closing thoughts I just wanted to share is that what we've also learned is that ego is a force field for learning. If you think you have all the answers, why would you try to learn something new?

And at Ticketmaster, I think for a long time, we thought we had all the answers. But it's pretty clear that we don't. And to see our team really figure out how to operate at scale, how to operate in a no-fail environment, and how to really push excellence across the org, encourage excellence, I should say, across the org, has been really incredible.

And it's thanks to folks like you in the community who have helped us learn. We're learning from the best, and it's been a really great opportunity to help see this come together at scale and to see the culture really align around what great looks like.

So with that, thank you.