San Francisco 2016

DevOps in the Midst of an Airline Merger

DevOps as a cultural change agent to bring enterprise/federated, infrastructure/development, employees/vendors together, while merging two major airlines.

Full transcript

The complete talk, organized by section.

Susanna Brown

What I love about the video is that it speaks to our rich history, the many firsts for American Airlines, and a new first today: an airline speaking at the DevOps Enterprise Summit.

Good morning, San Francisco. Ben and I are so excited to be here and share with you a little bit about our DevOps journey.

At the end of 2013, US Airways and American Airlines merged. The financial transaction itself was very quick. The IT integration, not so much so. In fact, we estimate it's going to take between five and seven years to complete all of the work we need to do to bring together 1,400 systems.

But we had an agreement early on with our business groups, and that was that we were going to focus on integration before we did innovation, in other words, the new ideas coming forth from the business. And that was very intentional, because we wanted to better serve our customers, better serve our employees, and ultimately realize the value of the merger.

However, drawing a line in the sand is easier said than done in terms of getting the work started. Each legacy carrier brought with it, basically, different headquarters, different standards, technology stacks, very different cultures. So moving forward with work was as much an organizational change management effort as anything else.

And mergers are the perfect vehicle to really foment a lack of trust. And trust was just what we needed if we were going to get all of that work done.

Ben Chan

That's true. We needed trust.

There were thousands of front-end devices that had to be integrated: computers, kiosks, mobile devices. The task at hand was huge, very, very complicated. And planes have to keep flying, so it was crazy times.

Susanna Brown

Yes. So at the beginning of integration, we determined we had about 2,000 projects on the books that we had to go and tackle. We've completed about 1,200 of those. In fact, I think this slide is a little bit dated. We're about 70% completed with all of our integration work.

2015 was a major year for us. We started the year, first quarter, with the integration of our loyalty program. That would be Dividend Miles and AAdvantage, followed very quickly in second quarter with getting the approval from the FAA to fly as a single carrier.

While all of that was happening, we had parallel IT teams working with corporate real estate in order to build a whole new operating center into which we were going to bring our folks from Pittsburgh and Dallas, our crew schedulers and dispatchers, to manage the entire operation. That completed successfully in third quarter.

Fourth quarter was basically the merger of our passenger reservation system, and probably the change that was most visible to our customers because it entailed changes to our airports, our reservation system, our website, and our mobile applications. But it went off seamlessly. And as you can see from some of these newspaper headlines from last year, it was definitely a success.

Ben Chan

Yes, you can see there's no shortage of work with all that goes on in a merger. The success of these projects can be attributed to our laser focus on the most critical projects, or what we call the big rocks.

And from a resource perspective, we added any and all necessary people. However, our CIO, Maya, says we can't keep throwing bodies at every problem. And the largest integration project was on the horizon.

Susanna Brown

Absolutely. This project is called Single Flight Operating System. And for my organization, the epiphany that we really just couldn't add more people to do the work came early in 2014 as we were planning out this project.

This was putting together our crew and flight management systems. In fact, it was so complex that US Airways had not completed the integration of their crew systems. So what we were looking at was integrating three crew management systems and two flight systems. Basically, three airline mergers in one.

So we realized we really had to take a different approach.

From an infrastructure perspective, it would normally-- You need to go back one slide.

It would normally take us about six months to build up the infrastructure for a new application. That would be all the way from development to production. And here we had on hand the work that we needed to do to bring online 26 new applications.

The good news is that we had built quite a bit of trust equity in the previous year with a lot of our cultural change, and we were able to introduce a lot of automation to be able to accelerate the infrastructure delivery, risk mitigated, in fact, and also hopefully set the stage for a much steadier operation in the future.

Ben Chan

Yes, we had a mountain of work to cover, and our systems were critical for AA to fly. As we look back over the last two years, we realize that the merger, and specifically Single FOS, created the momentum for us to work differently.

Connecting teams was foundational, and we set that stage in 2014. Intense collaboration was introduced with Single FOS in 2015, and creation has been the culmination in 2016.

Now, connecting is obviously that foundational cultural piece that everyone talks about. In a merger, connecting is even more challenging due to FUD, or fear, uncertainty, and doubt. People were afraid that the systems that they know and love may not exist in the future. They were afraid of losing their jobs. And then pile on the fact that our IT development teams were distributed across locations, which made connecting and even creating trust a challenge.

Susanna Brown

So, in order to create trust after merger, in my organization, we rallied around a shared vision. We established that in 2014, and it was namely that any team member could add value to any team from anywhere.

And being able to realize that vision really required our leaders and our teams to sit down and establish some common norms of how we were going to operate, develop, and function. As we say in the South, we had to sing from the same hymnbook. So we took a very intentional approach to get to know each other, build trust, and learn to work together across distances.

So this is a screenshot of our internal website. It's based on Jive. And so when new team members joined the group, they obviously had access to this. And this is where we posted our shared architectures and team structures and miscellaneous tips.

Information from staff meetings, those critical pieces that we wanted to propagate quickly to the teams, would be posted for transparency and really for that information flow to all of our team members.

Ben Chan

Yeah. And you can see that Susanna is informing us of some very important news. Chick-fil-A is coming to our dining venue called The Spot.

Susanna Brown

Yeah, it's all about food everywhere.

Ben Chan

It's all about food.

Susanna Brown

If you know me, there's no better way than to connect over food. And for Ben, it's cupcakes.

Ben Chan

That's right.

Of course, as technologists, there are other ways to connect, and we want to use our tools. So communicating in real time is very important.

Here you see my team working on a spike in compute for our line of flight display system, and that's the system that tracks our planes. The team was able to quickly add capacity in our virtual environment, and thank goodness, we had no disruptions.

Here you can see we are targeting lean teams of software development with a focus on items to complete. And finally, we use video conferencing and virtual whiteboards to facilitate exchange of ideas and collaboration without losing focus and to foster those personal connections.

Susanna Brown

Right. And at the end of the day, we're really all social creatures, right? So having shared experiences really helps build bonds. Volunteering, happy hours, team building, and outside activities really have made a huge difference to bring our teams together.

Ben Chan

Yeah, those events are planned, but I think it's just as important to just have shared experiences. And that could be running through the airport to catch a plane, or like I did last Friday, I was learning Bollywood moves with my team for a Diwali celebration. So that was really fun.

But at American, we also love our Halloween events. And so dressing up and sharing trick-or-treating with our employees is really important. But this also allows us to extend that relationship building with our employees' families.

So 2014 was connecting. Moving on to 2015, it's been about collaboration. That was what we did a lot last year. And we focused on automation, synergizing, and really partnering across the enterprise for some of our work.

Susanna Brown

Yeah. So speaking of automation, the DevOps toolchain is where all of us, as technologists, gravitate. Again, with our merger, we had different technology stacks, and we wanted everyone to come together around a single set of tools. But in the end, it's best not to slow down work or impede progress in pursuit of this utopia.

So it's all about delivering business capabilities to the business and our customers. And we decided to actually have two DevOps toolchains so that the developers could work in familiar stacks.

So here we have the Java toolchain and, similarly, the .NET toolchain. Now, they do share several components: Slack for collaboration, Puppet for configuration state management, vRealize Automation for provisioning, and testing tools, to name some.

And we know you just love your configuration automation, don't you?

Ben Chan

Yeah. So configuration automation is near and dear to my heart, right? How many times do we spend half a day or multiple days troubleshooting to find out that one out of four nodes was configured incorrectly? Or that application code works fine for a developer, but not in the higher environments, because a configuration change was made that we didn't capture?

So here we have an example of Red Hat patching from my team. Say you have an engineer. They need to patch Red Hat, and what they need to do is find the host, log in, download the software, apply the software, verify that it's okay, and then they get to move on to the next node.

Well, if they're really fast on a keyboard, that's five to 10 minutes, which doesn't sound like much. But in our dev environment, going across 200 nodes, that's now almost 17 hours. Right? So with Puppet, we can do that in five minutes.

And it's not just about speed. It's also about consistency, because I don't have to worry about my fast-typing engineer potentially mistyping something. And it's quality of life, right? We can set this off to go off at night, and we get a report in the morning to see that all of the nodes have been patched successfully, or maybe there's one or two that are straggling.

Also, working with our security team, this was a wonderful example that we had where we were informed and notified of an OpenSSH bug, where we were able to identify the systems impacted and correct those files all within five minutes of receiving that notification.
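
The Red Hat patching estimate above can be checked with a few lines of arithmetic. This is a minimal sketch, assuming one engineer patches the 200 dev nodes strictly one after another; the function name `manual_patch_hours` is illustrative and not from the talk.

```python
def manual_patch_hours(nodes: int, minutes_per_node: float) -> float:
    """Total hours when one engineer patches nodes one after another."""
    return nodes * minutes_per_node / 60


NODES = 200  # dev-environment node count quoted in the talk

# The talk gives five to 10 minutes per node for a fast engineer.
low = manual_patch_hours(NODES, 5)
high = manual_patch_hours(NODES, 10)
print(f"{low:.1f} to {high:.1f} hours")  # prints "16.7 to 33.3 hours"
```

At five minutes per node the manual pass takes about 16.7 hours, which matches the "almost 17 hours" quoted. At 10 minutes per node it would be about 33.3 hours, against roughly five minutes for the Puppet run.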

Susanna Brown

So at American, we like to say, really, that technology is the engine that runs the airline. But we don't just live in a vacuum.

At American, we have about 5,000 IT professionals, and they're distributed across four major organizations. We have Customer Technology. That's where the website is, as well as reservations. And Cargo actually is in that organization as well.

Corporate Technology includes everything related to finance and employee technology support. Enterprise Technology is the group that has security, infrastructure, and networks. And then we have Airline Technology.

So our organization lives within the Airline Technology group, and we support maintenance, crew, flight, safety, security and environmental, catering, and onboard technologies. So we kind of are the group that makes the airline fly.

But the reality, though, is that each area is critical for American to move forward and strive to be the best airline out there. So what we've been doing most recently is a lot of partnership across the groups within Airline Technology. And we know that we can't succeed in DevOps without learning from them and them learning from us as well.

And a lot of what we've been doing as well is working with our enterprise groups, network, infrastructure, and security, to try to automate more and more of the processes to ensure that we can move forward on DevOps.

Ben Chan

Yes. In fact, last month, we partnered with our enterprise friends and kicked off a series of automation challenges focused on solving specific problems.

The concept is one day a month, we all get together to tackle stories to further our strategy, cut through bottlenecks, and clean up technical debt. In this case, we wanted to tackle Slack and email notifications for our .NET and Java builds. We wanted to automate Active Directory, SiteMinder, and firewall access requests.
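
As an illustration of the kind of build notification described here, the following is a minimal sketch of posting a build result to a Slack incoming webhook. It is not the team's actual implementation: the function names, the job name `crew-service`, and the webhook URL are hypothetical, and it assumes an incoming webhook has already been created in the Slack workspace.

```python
import json
import urllib.request


def build_message(job: str, build_number: int, succeeded: bool) -> dict:
    """Build the JSON payload for a Slack incoming webhook (a 'text' field)."""
    status = "succeeded" if succeeded else "FAILED"
    return {"text": f"Build {job} #{build_number} {status}"}


def notify_slack(webhook_url: str, message: dict, timeout: float = 10.0) -> int:
    """POST the payload to the webhook and return the HTTP status code."""
    request = urllib.request.Request(
        webhook_url,
        data=json.dumps(message).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


# Hypothetical usage from a CI job step; the URL below is a placeholder.
# notify_slack("https://hooks.slack.com/services/<your-webhook-path>",
#              build_message("crew-service", 42, succeeded=False))
```

Keeping the message construction separate from the network call means the same payload builder could serve both the Java and .NET pipelines, with only the final build step differing.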

Susanna Brown

So let's just pause right there, because, thinking about what Heather said earlier, what we've been doing with the automation challenge sounds very similar to what Target is doing with Dojo. And so we're really excited to learn more from them on what they're doing there, and maybe how we can incorporate that into the automation challenge and make it even better.

So going back to the cupcakes I mentioned earlier, obviously any activity like this succeeds only if you have a lot of food around. So breakfast, lunch--

Ben Chan

Specifically dessert and sugar.

Susanna Brown

--and cupcakes. And this was around Halloween, so we had some awesome decorations there. Oliver loved it.

Ben Chan

That's right. Now, sometimes there's this discrepancy between what you design and what you implement. So while we had these very ambitious plans, we hit some rough waves.

Susanna Brown

That's really somebody from our team.

Ben Chan

That's so--

Susanna Brown

Yes, it is.

Ben Chan

That's great.

But let's remember, we need to experiment, fail fast, and keep improving.

So at the end of the day, we had 11 initiatives provide readouts. Here you can see that Lucas is giving a readout on the work that he did with the enterprise security team around automating SiteMinder configurations. And they have an approach now, and I'm really looking forward to seeing that in the lab.

Susanna Brown

Yeah. Same here.

So moving on, this is really what I consider to be the Achilles' heel for DevOps, at least it is for our organization, and that is test automation. So we recognize that this is an area where we're lagging, so we're trying to create lighthouse projects that we can use to showcase to the rest of the organization.

And the application that is furthest ahead in this area is our flight attendant customer experience tool, as you'll see what Ben is showing you here. You might have seen one of these on one of your recent travels on American.

And at the end of last year, we had about 60% automation on the applications on this device. At the end of this summer, we reached 86% automation. The 14% that remains is really the tests associated with the peripherals and the card swipes. So I don't think that we're going to be able to get those closed out.

So we're really excited, and Ben's going to tell you a little bit about the numbers that we've achieved with automation.

Ben Chan

Yeah. So test automation is great, right? We were able to reduce our cost by 85% and our testing time by 33%. And by the end of the year, we're targeting a testing time reduction of 66%. We're really excited about that.

Susanna Brown

Really excited about that, and trying to figure out really how to push that out to more of the groups.

Ben Chan

Right.

Susanna Brown

So a critical part of collaboration at American and within my group has all been about learning and then automating and documenting what it is that we've learned so that others can follow.

And so one of the things that we've done that has paid a lot of dividends is pulling in members from other teams into the core product teams so that they can be more agile in their work. So we're matrixing in architects, tech leads, the program office, and even operations into those groups.

So really, this helps gain consistency across groups, and that's going to be really critical if we're going to accelerate DevOps. So all of this, though, is still taking place within the goodness of our agile methodology. So we're not wanting to throw the baby out with the bathwater or anything like that. Instead, we're wanting to embrace it and figure out how do we make DevOps a part of it.

So we talked about connecting and collaborating, which are really essential for us to be able to create more value for our business. We need to make sure that we're providing systems that make our employees successful at the work that they're doing and make them happy.

And we definitely, for the sake of our mental health, want to make sure that we're simplifying the footprint for IT so it's simpler for us to support the environment. And if we do all of that right, we believe that we'll definitely be the trusted advisor for our business, so that as we're moving into that innovation stage, we're going to be well-positioned to deliver.

Ben Chan

Yes. And to be a trusted advisor, we have to deliver faster, starting with environments that developers use to build solutions.

Now, traditionally, we allowed two months to build out dev and test. But now we can do that in a matter of minutes. And in fact, we put into place a self-service portal to allow our developers to spin up their own infrastructure through blueprints.

Now this automation not only speeds up delivery, but it provides consistency and empowers the development team to try new things, run experiments, and start and stop what they need.

Now, you would think that the developers would just run to this tool and just be so excited. However, there's always those cultural changes, right? So we've released it, we've got some people on board, and as they are learning and/or needing environments to try new things or to experiment with new software or to try algorithms, they're coming to us, and now we're starting to teach people how to fish and actually use the provisioning portal for their own needs.

Susanna Brown

Right. So we're excited to see that.

So here's an example of an application that we developed in a very short timeframe, and I really, truly think that a lot of the automation that Ben was just talking about made a difference.

About three months before the Single FOS cutover, we got a request for a new application, a mobile application. And it was very important because it had to do with how our flight attendants were going to be able to change position as they were working in aircraft. So we had to deliver this application very quickly.

We spun up the infrastructure within a matter of minutes, and the application team also knocked it out of the park by completing the application and delivering it to the flight attendants 30 days before they were even expecting to do so. So a great success story.

Another success story out of Single FOS was the delivery of the training management system. So you'll remember what I was talking about in terms of simplification, and I think this is a great example of that, so that as we're striving to do DevOps and we're looking at our infrastructures, we're not just creating new applications, but also looking at how we can simplify the footprint.

This app removed five end-of-service-life applications out of the environment and also is going to be leveraged for both the pilot and flight attendant training. So, two in one. It was fantastic.

Ben Chan

Another example of simplification was the creation of services that could be leveraged across applications. So here are actually two that were most recently delivered by Single FOS, the new image service and the new employee notification service. And by having services like these, IT can deliver products faster.

Susanna Brown

Mm-hmm.

In the end, though, it's about what customers get. And we believe that not only delivering quality, but delivering quickly on customer requests positions us well to be the trusted advisors to the business.

So what happened with this massive project called Single FOS?

On October 1st, we actually successfully did the cutover. And one of the examples that Ben showed earlier about the Slack work, that was from the night of Single FOS as we were doing the cutover and adding capacity to our systems.

So it was a success story and really a testament to the work of hundreds of employees to get this out.

So we talked about a different approach that was needed to meet the merger integration milestones. In Operations Technology, my organization, we just couldn't keep on throwing bodies at it. DevOps ended up being our answer. It's just that we didn't know that was what it was called back in 2013.

But the beauty of the merger was, even though we had this lack of trust initially, the merger became a catalyst for cultural change at American, which really allowed for DevOps to take hold within our organization. And we're really excited about what we're going to be able to do as we move towards innovation.

Ben Chan

Yeah. So I'm glad we could share our story with you. We're still very much in the infancy of our DevOps journey, and I look forward to sharing and learning from all of you.

A few things we'd like to know are: How do you measure success? How do you handle DevOps in a managed service provider world? How has DevOps simplified your environment? And finally, how do you market your IT capabilities?

And with that, thank you all so much for your time.

Susanna Brown

Thanks, Gene and IT Revolution, for having us out. And we hope to see you flying on American soon.

Ben Chan

Yes. Thank you.

Susanna Brown

Thank you.