Creating Safe Environments that Inspire Innovation


San Francisco 2014


Sr. Manager, Software Engineering and QA · BlackBerry

There is this notion that in order for DevOps to be successful it has to be supported by top-level management, or the CIO. We could walk through what really matters from both sides, top-level management and grass roots. Ultimately, we all want to inspire people at all levels of our teams to take more risks, and for our peers and other executives to create safe environments that inspire innovation.

Creating a safe environment that enables people to fail fast, and how this translates into and spurs innovation, is a common theme in our community, but we think we can highlight some new and interesting points around how to effectively influence culture, manage risk, make decisions at the appropriate place in a business and drive even better results.

BlackBerry is a company that the market has forced to change. As new leaders have come aboard, we have had to focus on changing the culture: to drive innovation, to drive action, and to integrate teams to focus on delivering solutions. Time continues to be our enemy, and only we can address that.

We have a plan to win, and DevOps is a key component of that strategy.


Full transcript

The complete talk — auto-generated from the talk's captions.

Hi, everybody. My name is Cynthia Taylor. In July of last year, we had a problem. We had a very large system.

It had grown organically over a number of years into a state that could politely be called spaghetti code. And we had to make a fundamental change to this system. Once we started to make the change to the system, once we started to work on it, it would become completely incompatible with our current production system. The issue with all of this is that we had one QA environment, and this QA environment was very difficult to make changes to.

So to reduce the risk, we naively said, "Well, can we just clone that QA environment? Create another one and do all of our work on there to reduce the risk?" The answer I got back was, to do that would cost us a half a million dollars and take six months to do. It was completely out of the question. We did not have six months for the entire timeline of the project.

So to try and just spin up this new environment was out of the question. So we weighed the risks and we accepted them. We moved on, thankfully. Thankfully, we were able to move forward with just the one environment.

But it was at that point that we realized we could not continue like this. We could not have an environment that was that difficult to change, that difficult to spin up new systems. We couldn't tackle it right away. But again, we decided to move on that journey.

A month ago, we started a new project. This time, we didn't need one environment, we needed four environments in order to meet our timelines. This time, it took us minutes to spin them up, and the cost was negligible. So I wanted to talk to you a bit about our journey to get from there to here.

That we decided to start on this journey was really not particularly miraculous. I think everybody here has decided or is in the process of deciding that DevOps is the way to go. It really wasn't even all of that noteworthy that we got to where we did. What was astounding about what we were able to accomplish was that we did it at the worst period of our company's history.

The organization that we have now is over 50% smaller than it was when we started. The organization does not resemble anything like what it did a while ago when we started. Despite all of that, we are stronger, we are faster, and we are much more agile. And I have some thoughts about how we got there.

So, sorry. It is often said that innovation is born of necessity, but there are lots of companies that need to change and don't. Technology is also an enabler, but there are lots of technical choices that someone can make. There's no silver bullet.

What's right for one company is not necessarily going to be right for another. What really creates an environment where people can make these types of fundamental changes is ensuring that you have a safe environment to do it in. What makes an environment safe? We need to trust each other.

We need to trust that you have my back, and I've got your back. We need to trust that our leaders are acting in our best interests and not just in the best interest for the company. And when you are a leader in tough times, you have to be especially vigilant because you are feeling everything that your people are going through, but you've got to be really careful about how you show it. And you have to be really careful about how you support and roll out change in a time when people are afraid for their jobs.

Building on that trust, you need to have a team environment, and a team is more than just a bunch of boxes that sit next to each other on an org chart. A team is a group of people who are working together towards the same goal. You have a collection of people who have strengths and weaknesses and understand each other. And when someone stumbles, there's somebody else there to help them pick up the pieces.

There needs to be support for new ideas and new ways of doing things. People need to feel that they can challenge the status quo, that they can challenge the ideas that have been put forward, that's the way that we have always done things. You need to be open to having your ideas challenged. You need to be open to someone coming up and saying, "The way that you've said that we need to do this, I don't necessarily agree." So you need to be open to hearing those things.

And we need to be able to fail without fear, because trying new things does not always mean that the new things are going to succeed. We need to recognize that failure is part of the process, and we need to plan for that failure instead of being derailed by it. There's a great story in the book "How Google Works" about the launch and failure of Google Wave, and how the entire company was watching how they reacted to the failure of that product, and what was going to happen to the people who were involved in it afterwards. And what was great about it is once that product failed, those people that are on that team, they were sought by the other teams because they had tried something.

They had tried something new, they had tried something innovative, and now they were free to try that again in another space. So I work for BlackBerry. You may have heard of us. We make these things called smartphones.

They're kind of neat. I think it's going to be a real growth market someday. In all seriousness, when everybody thinks of BlackBerry, I know we think of the devices, but BlackBerry is actually a whole lot more than just BlackBerry smartphones. There's the BlackBerry enterprise servers, which manage not just BlackBerrys, but also iOS and Android devices across enterprise systems.

It's QNX that is embedded in cars, nuclear plants, and all sorts of embedded systems. It's BBM, the first, and in my completely impartial opinion, still the best mobile instant messaging system out there. It's our BlackBerry network that ensures global message delivery. And on top of that, we also have our corporate technology services, our ERP, our manufacturing and supply chains, our BlackBerry customer support systems, et cetera.

So the part of BlackBerry that I want to talk about is that corporate technology services. That's where I come from. A year ago, we were a collection of a dozen different development and QA teams. We had myriads of different tools and processes.

These graphs are the results of a survey that we did a year ago. The top one is the number of source control systems that we used, the middle one is our CI servers, and the bottom one's the number of build tools that we had. On top of that, we had a collection of about 500 different application systems and technologies. Again, we grew very fast.

And we had lots of different organizations, and they all built their own little apps, and they all came together into this suite of systems that we had to support. Some of these systems had one active user. Far too many of them had no active users. But again, they were part of the portfolio that this area was actually maintaining.

Our QA deployments took days at times, routinely. It would derail all of our development efforts as the developers needed to stop what they were doing on new stuff to get the QA environments working for the testers. Our production deployments took hours and routinely went over our windows that we had defined. Our people were run ragged.

They might be allocated 100% to one project, but they kept getting pulled into four or five other applications that they had worked on in the past because there was a production problem or a new feature needed to be implemented. And we had all kinds of critical application knowledge about how to build our systems, how to deploy them, the business logic that existed in people's heads. And in far too many situations, it existed in just one person's head. So we had to do an awful lot in a very short amount of time.

How did we tackle it? Well, we started with a new vision and a new organization. We continued with new roles for our leaders and support for new thoughts. But it only succeeded because we had a talented, passionate group of people who drove it and made it happen.

The catalyst for us is we had a new VP in our organization. He brought with him a whole new breath of fresh air to the organization. We were going to focus not on all of those 500 things, distracting things, but only on the things that were going to move us forward. Radical thought.

For all of us who didn't already have a copy, he bought us "The Phoenix Project," and we went through the same exercise that I'm sure everybody went through. How many went through and identified all the Brents in their organization? Yeah. Everybody did that.

We formally adopted Agile. Prior to that, we'd had little pockets of Agile here and there, real grassroots efforts. But it didn't encompass the entire organization. It was just these little pieces of places where people thought that they could work.

But it took an executive to come and say, "Agile is important for all of us," for it to roll out across the entire organization. We moved from a strict hierarchical organization to a matrix organization. That was also pretty key to us. So we had our verticals, which were our services. They were responsible for end-to-end supply to our customer units.

So we reengaged with the customer units, made sure that we were listening to them and delivering all the way through to operations. We gave people dedicated resource managers, people who were focused just on them and what their career path is and what they needed to get out of their career, not just on what we needed in order to get a project done. And we also formed what we called the practices, and we had practice managers. And practice managers were not if you practice hard enough, you get to be a real manager.

Practice managers were responsible for the specific roles that were from start to finish. So our business analysts, our architects, project and change management, software engineering and quality assurance, and our operations team. Also key to all of this was bringing our operations teams together with the development groups that they were paired with. So organizationally, we brought dev and ops together.

But one of the most critical things that we did through all of this was treating development, testing operations as crafts. And this idea that what you did was more than just a commodity that we could replace at any point. The craft of what you did was important. And as a practice manager, it was our responsibility to make sure that we were growing your craft.

That no matter what the project was or what you were doing, that there was consistency, and that we were making sure that you were able to be the best that you could be at development, at testing, at whatever it was that you were passionate about.

So our journey to DevOps officially began when we got the software engineering and QA practice together and we said, "We need to change the world because what we are doing is not going to work. What we are doing today is not going to get us forward." We knew there were problems. And in fact, we had our first brainstorming sessions and came up with a whole lot more problems than what I thought we had. But one of the big things was that we realized that we weren't going to get better unless the people who were directly affected were given the power to change that.

So we gave them the mandate, we gave them the control to figure out what their pain points were and what they had to do in order to change it. This was not something that we brought project managers and process analysts and a capital P project into place in order to fix. This was only going to change when the people whose lives were directly affected by what was going on defined what success looked like. So we asked for volunteers.

We asked for people to volunteer 20% of their time in order to get this done. So again, not a formal project. If you want to be involved with this, if you're passionate, if you believe that this can be done, 20% of your time, get together, you have all the support you need in order to get this done. So they got together, they self-organized.

They divided into streams. These were our first streams that we had. Common tools, policies and practices, knowledge sharing, and building technical SMEs and automation. As we went along, some of those dropped off.

Some people didn't actually drive the streams. Some of them combined together. But ultimately, they all merged into what we called, we really couldn't come up with a better name than our Change the World framework. So this is our basic stack of what it looks like.

And based on what I've been hearing, it's pretty standard from what everyone is doing. We use Git, we use Jenkins, we use Sonar, Chef, Selenium. We are adding in Fortify for security testing. We use SoapUI in order to get our API based testing in there.

We use Selenium to do automated smoke testing and regression testing in there, but it's all push button deploy. We can go from a code check-in to a complete tear down and rebuild of our cloud environment, deploying the code, run an automated sanity test, and then we publish out to our wiki the success or failure of that particular deploy. Our critical success factors to getting this done. The first one was finding the people who believed it could be done, finding our change agents.
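The push-button flow described above (code check-in, environment teardown and rebuild, deploy, automated sanity test, then a success or failure note on the wiki) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the team's actual tooling: `DeployReport`, `run_pipeline`, the stage names, and the commit id are hypothetical, and the stage functions are stand-ins for whatever Jenkins, Chef, or Selenium jobs would really run.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class DeployReport:
    """Outcome of one push-button deploy, in a form that could be posted to a wiki."""
    commit: str
    steps: List[Tuple[str, bool]] = field(default_factory=list)

    @property
    def succeeded(self) -> bool:
        # A deploy succeeds only if at least one stage ran and none failed.
        return bool(self.steps) and all(ok for _, ok in self.steps)

    def to_wiki_text(self) -> str:
        status = "SUCCESS" if self.succeeded else "FAILURE"
        lines = [f"Deploy of {self.commit}: {status}"]
        lines += [f"* {name}: {'ok' if ok else 'failed'}" for name, ok in self.steps]
        return "\n".join(lines)


def run_pipeline(commit: str,
                 stages: List[Tuple[str, Callable[[], bool]]]) -> DeployReport:
    """Run the stages in order and stop at the first one that fails."""
    report = DeployReport(commit)
    for name, stage in stages:
        ok = stage()
        report.steps.append((name, ok))
        if not ok:
            break
    return report


# Hypothetical stand-ins: real stages would call out to the actual tools.
stages = [
    ("tear down environment", lambda: True),
    ("rebuild environment", lambda: True),
    ("deploy code", lambda: True),
    ("automated sanity test", lambda: True),
]

report = run_pipeline("abc123", stages)
print(report.to_wiki_text())
```

Stopping at the first failed stage keeps the published report honest: later stages are simply absent rather than reported against a broken environment.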

Though this started in our software engineering and QA practice, it would not have succeeded if we didn't have two critical members from our operations team who sat side by side with the developers to make it happen. And let me tell you, their 20% time came after they'd already given 150% of their time. They did this because they knew it needed to be done. We started with a non-production test application.

We didn't have a whole lot of wiggle room in any of our applications right now, and we knew that we were going to be changing some pretty fundamental things about the way we were building software, deploying software. So we started with basically a glorified "Hello, world" application. It had all the basic building blocks of what we needed. It had separate front end and back end.

It was talking to external APIs. It had two separate databases. All of the basic stuff that we needed to test stuff out. We used that as our guinea pig to try things out.

And then once we were reasonably confident that we had something that was going to work, then we moved on to a brand new application that was just starting out so that we could influence it from the beginning, not try to retrofit some of these new processes into an application that was already live. So the advantage there was that we'd had our little test bed to try stuff out, to fail safely, and then we could apply it to something new, which again, influenced from the beginning and not derail something that was already there. Management was really interested in what these guys were doing, particularly people who understood DevOps, which sadly was not as many managers as I would have liked. But we didn't interfere.

We didn't roll it out before it was ready, and we didn't try to redirect them into a direction that we thought they should be going in. It was totally what they wanted to do. What our role was in all of this was to remove roadblocks and be cheerleaders for what they were doing. Coincidentally, at the same time, another one of our groups was building out a private cloud infrastructure, and we took advantage of that like crazy.

It wasn't part of the original plan, but the fact that it was happening at the same time was serendipitous. So that was critical to what we were doing. And we celebrated every win that we had. When people started to feel confident that this was going to work, we published it out a little bit wider.

We published it to our VP's group, our leadership team, to the group itself, to other VP groups. And as we got more and more success, we published it even wider and let people know about it. The key thing there was that we made sure that everyone knew that it was the team that had done this and the team that got to take credit for it. It was their-- They did this, they drove it out, and they were the ones who were responsible for the success.

Challenges. We had a lot of challenges. The first one was buy-in. Lots and lots of people who said, "That's great.

That sounds fantastic. But we can't do it here, not now. Not at this point in our company." How we dealt with that, luckily, we didn't need them. We had our core group of change agents. They were the people who believed in it.

We just let them go at it. That group sort of changed over time. Some people dropped out. We had new people who came in to kind of pick up the ball and get it over the finish line.

But we really enabled and empowered those people who believed in it. We had some passive resistance. We had a few people who signed up for it and did all the head nodding, and then either didn't do anything or passively undermined the activity. The great thing about having this being driven by the team themselves was that the team figured that out pretty quick.

So if I needed John to do something for me and John didn't deliver it, well, then I ask Greg, and Greg did it for me instead. And that way, we had lots of other people who were learning the new tools, who were learning the technologies. We were spreading it all out. And those people who were kind of passively resisting, they either sat there and listened, or they dropped out of the initiative.

Time. Time was killer. We certainly did not have a whole lot of time. I've watched a lot of the presentations, and I get really jealous when I see timelines that start with 200.

We didn't have that. We literally started doing this about nine months ago. And the people that we had were being pulled in so many different directions. One of the things that we did finally at the end after we'd been doing this for several months is we went around to all of the stakeholders of everybody who was involved in the project, and we said, "I want you to pretend that these guys are on vacation for a week." And we locked them in a room, and they just pounded away.

And a bunch of little nagging issues, because really, it's just the nagging issues, one after another, that makes you feel like you're not making progress. We had to rely on other departments. We didn't have everything. The cloud infrastructure, a database team, they were different departments.

And they had their own critical timelines that they had to deal with. What we found was that what we were doing was really cool, and just the passion with which we would go into, "This is what we're doing, and if you could do this for us, it would allow us to move forward." And that passion, a lot of time, got over the hump of, yeah, they were able to find a little bit of time to help us. But morale. Morale was really the killer in all of this because our layoffs did not happen all at once.

It was waves. It was sustained over the course of multiple months. At times, it felt like every week you were watching your friends and colleagues being walked out the door. You were working on this project, and you didn't know if you were going to be around to see it come to fruition.

And that was really tough to work through. But we did. And it was really so critical to just remind everybody that what you're doing is revolutionary, and it's really cool, and not a whole lot of people are going to be able to accomplish this. Signs of success.

So last month, we started doing this from end to end on a brand-new project. Everything. We'd done it in little pockets, mostly in our production environments. But last month, month and a half ago, we started the full end-to-end automation, all of our dev environments, all of our QA environments.

I mentioned before that our QA deployment sometimes took days. So we're now moving to a system where every day we would be deploying what was in master into our QA environment automatically. People were a little nervous about that. So we actually injected a manual step into it just to give people a little bit of comfort.

"Don't worry, we're not going to just completely overwrite your environment every day. Caleb is going to merge the code and hit a button. And if it doesn't go okay, then we'll roll it back." So after a week and a half of doing this, I get an email from Caleb that says, "Can you replace me with a small shell script? Because what I'm doing is providing no value whatsoever." And that was it.

Shackles were off, complete automation. Every day, we completely deploy everything at 9:00 in the morning. We used to do it at 3:15, but we have an offshore team in India, and if something failed at 3:45, then they would be out of luck for overnight. So we do it in the morning now, and then we have the ability to manually trigger it any time that we want.
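The "small shell script" idea (deploy every morning at a fixed time, roll back if the deploy does not go okay, and allow a manual trigger at any time) might look something like this minimal sketch. The function names are hypothetical, the 9:00 default is taken from the talk, and the times are naive local datetimes for simplicity; a real setup would more likely lean on the CI server's own scheduler.

```python
from datetime import datetime, time, timedelta
from typing import Callable


def next_daily_run(now: datetime, at: time = time(9, 0)) -> datetime:
    """Return the next scheduled deploy time strictly after `now`."""
    candidate = datetime.combine(now.date(), at)
    if candidate <= now:
        # Today's slot has already passed, so schedule for tomorrow.
        candidate += timedelta(days=1)
    return candidate


def deploy_with_rollback(deploy: Callable[[], bool],
                         rollback: Callable[[], None]) -> bool:
    """Run the deploy; if it reports failure, roll back. Returns True on success."""
    if deploy():
        return True
    rollback()
    return False
```

A manual trigger is then just a direct call to `deploy_with_rollback` outside the schedule, which is the same code path the daily run uses.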

One of the things about this when it works is there's no fire. And when you're fighting all kinds of fires all around you, because there's still fires around, we haven't solved everything, you sometimes don't notice the fires that don't happen. So even though you're used to dealing with painful QA deployments, when it doesn't happen, you're so focused on everything else that is happening that we kind of went, "Hey, remember all that pain that we used to have? We don't have that anymore." Which is great, because the days that it used to take to do deployments, we don't have that luxury anymore.

The way we used to do things would have meant failure now because we removed all of the slack from the system. And I mentioned I had a problem with buy-in. I really like it when people come to me and say, "Yeah, I was wrong, and I can't wait to see what you guys are going to do next, and I really want to help you out with it." That's particularly satisfying. So some takeaways.

For us, it was supported at all levels. So our VP set the tone. He came in, and he gave us permission. He gave us the empowerment.

The leaders and the managers in our organization agreed that this needed to happen. But it was the team members who actually drove this forward and were critical to the success. So find your believers. Give them time, space, and support. Celebrate the wins.

And it doesn't have to be huge. Evolutionary, not revolutionary. There was a talk yesterday where it said, "Think big, plan big, but start small." Little pieces at a time. Tackle your biggest pain point.

One of the things that I love about DevOps is that I believe that it frees people to do what they do best. So if you're a developer, you can't develop the next great project when you're troubleshooting the last one in production. If you're a systems engineer, you can't figure out the best way to optimize your network for HA or DR if you're up all night because something is on fire. If you're a tester, you can't get feedback to everyone else in time if you're constantly trying to figure out how to keep your QA environment up and running.

You can't innovate if you can't breathe. So DevOps gives people time to breathe, and it gives them the focus that they need in order to get things done. And as leaders, that's our responsibility, is to make sure that we are creating environments where people get to do what they do best. So, here's what I need help with.

How do you combat the ebbs and flows of engagement to inspire continuous excitement? Even when you have your change agents, your change agents get tired. Someone else said yesterday, "You get four nos before you get to a yes." Those nos are exhausting. How do you keep people's enthusiasm up when you're going through this?

Because change is hard, and fundamental change is really hard. So I would really love suggestions for people, how they keep their teams motivated, how they keep the people around them going. Thank you. Any questions?

So you mentioned that their goal was to keep it to 20%, but you hinted that that probably really didn't happen. Mm-hmm. So, I guess there's been various discussions about have a dedicated team, don't have a dedicated team. What would be your views based on what you've done, what your experience was?

Somewhere in between, truthfully. What we have is we had groups of developers who-- Well, I have one developer who's phenomenal. He learned a lot of the op stuff, and then we have our ops guys who learned a little bit of the dev stuff. So we have this core group of people who never quit their day job to do this, but now they act as evangelists to other organizations.

So we don't have a dedicated team to do this, but we do have people that we consider subject matter experts to do it, and we do free up their time and their allocation to make sure that they're able to do that. We're starting to spread this to other areas of the organization now, too. Yeah, completely different.

If you could go back in time, would you have a dedicated team?

No, I wouldn't. I think that if you dedicate to it, you can sometimes lose sight of what you're actually trying to do. And when you're actually doing development or deploying actual applications, that's fresh in your mind when you're thinking about the best way to automate the processes or the best way to make your life easier. All right.

Thank you, everybody.