War and Peace and DevOps
Mark Schwartz is an iconoclast and CIO and a playful crafter of ideas, an inveterate purveyor of lucubratory prose. He has been an IT leader in organizations small and large, public, private, and nonprofit.
As an Enterprise Strategist for Amazon Web Services, he uses his extensive CIO wisdom to advise the world's largest companies on the obvious: time to move to the cloud, guys. As the CIO of US Citizenship and Immigration Services, he provoked the federal government into adopting Agile and DevOps practices. He is pretty sure that when he was the CIO of Intrax Cultural Exchange he was the first person ever to use business intelligence and supply chain analytics to place au pairs with the right host families. Mark speaks frequently on innovation, change leadership, bureaucratic implications of DevOps, and using Agile practices in low-trust environments. With a BS in computer science from Yale, a master's in philosophy from Yale, and an MBA from Wharton, Mark is either an expert on the business value of IT or just confused and much poorer.
Mark is the author of The Art of Business Value, A Seat at the Table, and War and Peace and IT and the winner of a Computerworld Premier 100 award, an Amazon Elite 100 award, a Federal Computer Week Fed 100 award, and a CIO Magazine CIO 100 award. He lives in Boston, Massachusetts.
Full transcript
The complete talk, organized by section.
Mark Schwartz
My name is Mark Schwartz. I am an enterprise strategist with Amazon Web Services, AWS. What that means is that I'm part of a small team of ex-CIOs, CTOs, IT leaders who pulled off a digital transformation of some sort, moved to the cloud in our roles before Amazon. And what we do now is we speak at conferences, we write books and articles. Each of us meets with about 120 AWS customers a year, large enterprises at senior levels. And we try to help them overcome the impediments to transformation, which almost always are non-technical impediments, things like cultural change, organizational structure, models for managing investments, people skills, and things like that. So we get a good feel for what all these large enterprises are thinking about and what problems they're running into.
And along the way, I wrote three books, which you might have seen: "The Art of Business Value," "A Seat at the Table," and my latest, "War and Peace and IT," which, as you might guess, is a sequel to Tolstoy's famous novel. It turns out Tolstoy ran out of pages before he got around to talking about digital transformation, so I figured I'd better finish the job for him. And I will note that if you haven't read "War and Peace," there is no need to. This book is a lot thinner, and it covers everything that you really need to know. And just to give you a feeling for that, I'm going to tell you a little story from the book that I think will help put digital transformation in perspective.
So, as you might be aware, if you've read "War and Peace," the backdrop to the novel is Napoleon's invasion of Russia in 1812. So as the action is going on, Napoleon is invading Russia, and it's not really clear why he's doing it. It was a very complicated situation. He wanted Russia to boycott Britain and not trade, and Russia wasn't quite doing what Napoleon wanted. So he invades Russia, and it turns out that he's not really sure what his objectives are in doing so, but he assembles this huge, huge army, and they enter Russia, and the Russian army lined up against them retreats. And then Napoleon's army moves forward again, and the Russians retreat again, and Napoleon moves forward again, and the Russians retreat again. And finally, Napoleon's spoiling for a big battle. Finally, outside of Moscow, the Russians take a stand at a place called Borodino, and the battle is clearly going to happen there.
So the day before the battle, Napoleon walks the battlefield with his generals, and he issues dispositions. "You guys go over there, and you guys go over there, and you do this, and you do that." And once the battle actually starts, the generals wind up ignoring what Napoleon told them to do anyway. They just use their own judgment. But he plans it all out, and then the next day, the battle actually takes place, and you have to sort of imagine the scene. You've got Napoleon on a hilltop about a mile away, commanding the army. And the army and the Russians are engaging. And Napoleon is issuing orders on what they should do. But of course, the problem is that he's a mile away and on top of a hill. And so how he does it is messengers will ride up to him from the battle, and they'll tell him what's going on, and then he'll think about it and make a decision and tell the messenger what he wants people to do, and the messenger will ride back to the battle, and in theory, then the troops do what Napoleon said.
So in one great scene in "War and Peace," the messenger comes riding up to Napoleon and says, "We've taken the bridge over the river. Do you order our troops to cross it?" And Napoleon says, "Oh, yes. Have them cross the bridge and form up on the other side, and I'll be there." What he doesn't realize is that actually, since the messenger left the battle, the Russians have recaptured the bridge. Not only have they recaptured the bridge, but they burned it. There is no bridge. So when Napoleon's order gets back to the battlefield, nobody can take action on it. And that is pretty much what happens through the entire battle. Napoleon's up there issuing orders, and they have nothing to do with what's actually going on.
There's a little postscript to this battle. First of all, it's not really clear who wins the battle. A lot of both armies wind up getting killed, and in a way, the French win because Napoleon's able to proceed, and he occupies Moscow. And at this point, he's stumped, this brilliant general, Napoleon. He is in Moscow, and in every previous war, when he's conquered the capital of his opponent, that means he's won the war, and they sign a peace treaty or something. But the Russians have evacuated Moscow, and Napoleon is there with his army, and he's like, "What do I do next?" Essentially, the Russians have disrupted warfare. Napoleon can't figure out what to do next, so he hesitates in Moscow, and he takes weeks and weeks and weeks to decide. And finally, he orders his troops to retreat back to France. Doesn't really make sense, but that's the best he can come up with. And because he's hesitated so long, it's now the middle of the winter, and as the troops start to make their way back to France, they're decimated by the Russian winter. That is essentially the outcome.
So, this is one of the stories that's going on in the background in "War and Peace," and this is how you should do your digital transformation. Thank you all for your attention and see you later.
I want to draw sort of two morals from this story. There are two points that I think are important. The first is Napoleon gets to Moscow and he can't figure out what to do next. Why? Because he had no clear outcomes for what he was trying to do. Once he got to Moscow, had he accomplished his outcome? Had he not accomplished his outcome? It just wasn't clear. And I think this is a really important point to keep in mind, not only with transformation, but with execution, even in a transformed state, is that without clear outcomes, it's very hard to make decisions as you experience disruption. If you don't know what you're trying to do, what do you do?
So you need that sense of a clear outcome, or a business goal. And a business goal or an outcome is not something like "we're going to build this IT system and finish it." That's not a business outcome. With a goal like that, when things change on you, you don't know what it means for your project or for the execution of it. I'm talking about having clear outcomes that are measurable business results: increased revenues, decreased costs, faster processing, happier customers, whatever it is. But I'm talking about an actual business result rather than the completion of a project. So I think that's the first thing that I learned from this example of Napoleon.
But more to the point on DevOps is the realization that Napoleon is actually not able to command his troops because his lead time for implementing a decision is really long, while the action is happening very quickly. So if you think about it, for Napoleon to make a decision and have it implemented, somebody has to sense what the issue is, ride all the way to Napoleon and his hill. Napoleon has to think about it, make a decision, send the messenger back. The messenger has to go back, and then implementation can happen. That is a pretty long lead time when you're in the middle of a battle and bridges are being taken and destroyed and units are moving in different directions, and so many things are happening so quickly that really need to change what you're intending to do. So in a way, Napoleon's problem is that his lead time for a decision and implementation is out of sync with the lead time for change on the battlefield. Rapid change, long cycle.
So putting this in the context of DevOps, DevOps gives us a very fast way to create capabilities, deploy capabilities, see what their impact is. We have this very fast cycle time once we've decided to implement something. But that's embedded in a value stream that typically is really long. So among other things, before you actually make that or introduce that capability, you have to plan out what you're going to do. You have to make an investment decision. You need to set up status meetings and all kinds of other things. There's this long upfront lead time, and then there's a change management lead time after implementation, typically. And when you put it all together, you wind up with this long lead time from when you sense something you need to do in the market to when you actually roll out that capability. And it doesn't matter how many deploys a day you can do. It doesn't matter how fast you can go from code commit to deployment, because there's this long business cycle that it's embedded in.
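To make the point about the embedded cycle concrete, here is a minimal sketch of a value stream as a list of stages. The stage names and all durations are hypothetical, invented only for illustration; they are not figures from the talk. It compares the effect of speeding up the deployment pipeline with the effect of trimming one upfront stage.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    name: str
    days: float


PIPELINE = "build and deploy (DevOps pipeline)"
GOVERNANCE = "business case and governance review"

# Hypothetical durations in days, chosen only to illustrate the argument.
value_stream = [
    Stage("requirements and project definition", 60),
    Stage(GOVERNANCE, 90),
    Stage("procurement and onboarding", 45),
    Stage(PIPELINE, 10),
    Stage("change management and rollout", 30),
]


def total_lead_time(stages):
    """Days from sensing a need to having the capability in use."""
    return sum(stage.days for stage in stages)


def with_duration(stages, name, days):
    """Return a copy of the value stream with one stage's duration changed."""
    return [Stage(s.name, days) if s.name == name else s for s in stages]


baseline = total_lead_time(value_stream)
faster_pipeline = total_lead_time(with_duration(value_stream, PIPELINE, 1))
leaner_governance = total_lead_time(with_duration(value_stream, GOVERNANCE, 45))

print(f"baseline: {baseline} days")
print(f"10x faster pipeline: {faster_pipeline} days")
print(f"governance halved: {leaner_governance} days")
```

With these made-up numbers, making the pipeline ten times faster saves 9 days out of 235, while halving a single upfront stage saves 45. The exact figures do not matter; the shape of the result is what the argument above relies on.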
Some of you know that before I joined AWS, I was the CIO of US Citizenship and Immigration Services in the US federal government. It's part of the Department of Homeland Security. And I've talked a few times about what our process was for that value stream. We had to follow something called MD 102, or Management Directive 102, which had essentially the SDLC for developing IT capabilities, and it involved 104 documents that had to be prepared and 13 stage gate reviews. And the result was that by the time anything got to production, typically it was somewhere between 18 months and five years or so. So yes, we could do 100 deployments a day, but it was still going to take us five years to release a capability because of that long upfront life cycle.
Now, of course, that's an extreme example. But I find in talking to companies, and it's no surprise, there is a big upfront process that involves usually assembling a bunch of requirements and turning it into something you can call a project. And then writing a plan for the project and a business case, and having the business case reviewed by some sort of governance process, making a decision to proceed. In some cases, bringing on board contractors to execute or procuring hardware or whatever else needs to be done. You wind up with this inability to respond very quickly to changing circumstances, and we know circumstances are changing very quickly now, even though we have the digital capabilities.
So this is sort of a problem I throw out there: how can we actually shrink the entire lead time, from when we sense something in the market to when we deploy a capability (concept to cash), or from when we think of a new innovation to when we can get it in the market, or even from when we try an experiment to when we get feedback on it? How can we address the entire process?
So I was thinking about that, and my first train of thought was why do we do this? Why is there this long process to get from one end to the other? And I would submit that there are two reasons, really. The first is risk. We want to mitigate the risk of the initiative because IT investments are expensive and we don't want to waste the money. We want to be good stewards, so we put in place a process for controlling the risk, let's say. The second reason, I think, is that we want to make sure that we're applying capital to the right initiatives. So we've got some money. We're going to invest it. How can we make sure that we're investing it in the right thing? And so we set up this process.
When we did our 104 documents and 13 gate reviews, a bunch of them were concerned with making a business case for what we were going to do and vetting that business case. We had to establish the mission need first, and then we had to make a capability development plan, and we had to make a project plan and a lifecycle cost estimate, things like that, to figure out: are we investing in the right thing? Are we investing the right amount of money? And a bunch of the other steps in the process had to do with managing risk by checking to make sure that the executors were going to do the right things and that they had a good plan and they'd thought about everything, and that the overseers could validate these things. So again, I think government is a little bit crazy, but most organizations have similar goals and generally set up a process that takes time in order to accomplish those goals. So think about those two goals; I'll call them risk and return.
Let's take risk first. I think there's a lot of misconception about risk, and the result, I think, is a long process upfront. DevOps and the cloud are intended to reduce risk. Let me say that one again. DevOps and the cloud are intended to reduce risk. I'm saying this very carefully and with emphasis because I think unfortunately, when we communicate the goals of a transformation to the leadership of our organizations, we tell them the opposite. We might not say it specifically, but we're often communicating that the opposite is true. We say things like, "The company has to be less risk-averse. We need to transform. We need to take some risks here, and we need to be willing to fail fast." Well, these are messages that say transformation is risky. If you're a CEO or a CFO, you're thinking that this is something we've got to think about really carefully.
It's weird because, look, with DevOps, we can take what used to be a big, risky investment where you don't see the result until the very end, and we can start deploying pieces of it 100 times a day, seeing the results and altering course depending on what results we're seeing, or even stop the project if it turns out that it's not going in the direction you want. How much more risk-mitigating could you be? Instead of risking a huge investment and not finding out until the end whether it was successful, we can break it into pieces and check to make sure each piece is moving us towards our goals. We can set up automated guardrails in the DevOps world. And take the cloud. Making a big investment in fixed infrastructure, where it might turn out that that infrastructure isn't what you need two years later, that's a big risk. In the cloud, you can provision your infrastructure as you need it. You can change your infrastructure as you need it. Can you think of a better way to mitigate risk?
So while we're implying that this transformation is risky, it's a transformation to an approach where you need to take more risks, the truth is we're introducing practices that are intended to reduce risk, which is a much better story, I think, for the C-suite.
We also have this way of talking where we say it's important to fail fast. If you want to innovate, you need to fail fast. You need to be willing to try things and have them not work, and this is an essential way to innovate, and DevOps lets you do it, and the cloud lets you do it. Well, I promise you that your CEO and your board of directors do not want to fail, let alone fail fast. Failure is a bad word. It's something you don't want to do. And our challenge in the DevOps world is not to get them to be comfortable with failing. Really, when we use those words, when we talk about failing fast, what we mean is reducing risk by trying out new ideas before we commit to them.
So old school, you decide what you're going to do based on a hypothesis about what's going to work, and you commit yourself to doing it, and you go all the way through it based on your plan. In the DevOps world, we can take that big plan and say, "Before we commit all the money to that, let's try it out." And after we try it, we can either decide that we're not going to continue with the investment, or that we are going to, or that we're going to change what we're doing because there's something else that would give us an even better result.
That's not failing. If instead of doing this big project that's going to cost us a lot of money, we arrange it so that we can have a checkpoint very quickly and decide whether we're going to commit the rest of that money, that actually is success if we decide not to commit the rest of the money. That's not failure. If we decide that we're going to try three different experiments and choose the one that is going to give us the best result, and we choose that one, were the other two failures? They're not failures. What we've done is we have succeeded in finding an evidence-based way to make the decision and thereby reduce risk. So why talk about failure? These are different ways of succeeding.
The more we talk about risk and failure, or the more we make it seem like failure is a likely outcome, or that we're going to do something that's risky, the more that leadership wants to put in place processes to reduce risk. That results in these long value streams. So risk, I think, is an important topic to treat well in order to figure out how to shrink that value stream and set ourselves up so that we're reducing risk by actually executing and executing quickly and using that as a technique for reducing risk instead of this upfront planning.
The other purpose, as I said, of a long upfront planning cycle is to make sure that we're investing in the right things. We've got our capital, now we've got to consider the business cases and make good investment decisions. In the past, the way we made those investment decisions was by preparing a business case and evaluating the business case. A business case typically had projections. That's how you make your case for a return on investment. You project your future revenues or your future costs, and you project what your investment is going to be, and then you sort of compare those. That's the thinking behind this upfront process.
The problem is that when you make a projection, it's uncertain. If we were honest about our projections, we would never have a single point estimate. We couldn't really say, "This investment is going to return $10 million a year in increased revenues," because $10 million, that's a pretty exact number. You don't know how many people are going to buy your product. The right way to frame a projection like that would be to say $10 million plus or minus $1 million, with 80% confidence. That's really what an estimate is, although we tend not to say that explicitly when we're doing projections. Now, in a very predictable world, maybe you really could say it's going to return $10 million plus or minus $1 million with 80% confidence.
But in a world of rapid change and uncertainty and complexity and all the characteristics we know the digital age has, that isn't actually the confidence interval that we have for most of our projections. It would probably be more accurate to project something like, "This will return $10 million a year plus or minus $20 million, with 40% confidence," something like that. It depends on how much uncertainty there is in the environment, and there is a lot. So if you're comparing business cases to make a good investment decision upfront, you're comparing a business case that says $10 million plus or minus $20 million, and I still don't have confidence in it, to $6 million plus or minus $50 million, and I don't have confidence in that one, and you're making an investment decision based on those. It's no longer a really good way to make sure you're directing capital to the right places.
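A rough way to see why such a comparison tells you so little is to ask how likely it is that the "better" business case actually returns more. The sketch below is only an illustration under loud assumptions: each projection is treated as a normal distribution, "plus or minus X" is read as a symmetric interval at the stated confidence level, the two cases are independent, the $6 million case is given the same 40% confidence as the $10 million case (the talk does not state one), and the "predictable world" comparison case of $6 million plus or minus $1 million is invented for contrast.

```python
from statistics import NormalDist


def projection(mean, half_width, confidence):
    """Model 'mean plus or minus half_width at this confidence' as a normal distribution."""
    # z-score whose symmetric interval covers `confidence` of a standard normal
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return NormalDist(mu=mean, sigma=half_width / z)


def prob_first_beats_second(a, b):
    """Probability that a draw from `a` exceeds a draw from `b` (independence assumed)."""
    # The difference of two independent normals is itself normal.
    difference = NormalDist(a.mean - b.mean, (a.variance + b.variance) ** 0.5)
    return 1 - difference.cdf(0)


# Predictable world (second case is hypothetical): tight intervals, 80% confidence.
predictable = prob_first_beats_second(
    projection(10, 1, 0.80),
    projection(6, 1, 0.80),
)

# Uncertain world, using the figures from the talk (in $ millions per year).
uncertain = prob_first_beats_second(
    projection(10, 20, 0.40),
    projection(6, 50, 0.40),
)

print(f"predictable world: {predictable:.3f}")
print(f"uncertain world:   {uncertain:.3f}")
```

Under these assumptions, the $10 million case is almost certain to beat the $6 million case in the predictable world, while in the uncertain world the probability is only slightly above one half. The upfront comparison is then not much better than a coin flip.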
On the other hand, if you're using DevOps, and you're using the cloud, and you're using the other techniques that let you quickly try experiments, you can say, "Well, we have a few possible investments, and we're going to figure out what we can do that will help us learn which of those investments is the best and how much it's going to return." And then you can proceed in incremental stages and make good decisions. So, this long upfront process of business planning, business case building, and business case evaluation doesn't actually accomplish its goal of making sure that investments are directed to the right place.
In fact, putting all of this together, this long cycle time with all this upfront stuff, it does not reduce risk because the environment is constantly changing, and this is not the best way to manage the risk down, this upfront plan. It also doesn't successfully make sure that we're putting our investment in the right place. So it's a long cycle time for no really good reason, in a sense. In fact, you could say it makes those things worse. It makes our risk worse because we're committing to something upfront based on an upfront plan in a changing world. It makes our investment decisions worse because we could be constantly learning and adjusting and only investing as long as we're getting returns. In the Agile world, in the DevOps world, it's entirely possible to start deploying capabilities to reach an objective, and then as you start to get diminishing returns, you can start to direct your money elsewhere.
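The idea of investing only as long as you are getting returns can be sketched as a simple stopping rule. Everything here is hypothetical: the per-increment returns, the cost per increment, and the threshold are invented for illustration, and a real decision would rest on measured business outcomes rather than a fixed list of numbers.

```python
def staged_investment(increment_returns, increment_cost, min_return_ratio=1.0):
    """Fund increments one at a time and stop once returns diminish.

    Each increment is funded, deployed, and observed. Funding stops after the
    first increment whose observed return is below min_return_ratio * cost.
    Returns (increments funded, total spent, total earned).
    """
    funded, spent, earned = 0, 0.0, 0.0
    for observed_return in increment_returns:
        funded += 1
        spent += increment_cost
        earned += observed_return
        if observed_return < min_return_ratio * increment_cost:
            break  # diminishing returns: redirect the remaining money elsewhere
    return funded, spent, earned


# Hypothetical observed returns per increment, in $ millions.
observed = [5.0, 3.0, 1.5, 0.6, 0.3, 0.1]
cost_per_increment = 1.0

# Upfront commitment: every increment is funded regardless of results.
upfront_spent = cost_per_increment * len(observed)
upfront_net = sum(observed) - upfront_spent

# Staged commitment: stop when an increment no longer pays for itself.
funded, staged_spent, staged_earned = staged_investment(observed, cost_per_increment)
staged_net = staged_earned - staged_spent

print(f"upfront: spent {upfront_spent:.1f}, net {upfront_net:.1f}")
print(f"staged:  funded {funded} increments, spent {staged_spent:.1f}, net {staged_net:.1f}")
```

In this made-up scenario the staged approach stops after four increments, spends less, and ends with a higher net return, and the unspent money is free to go to another objective.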
So by moving into execution quickly with the right controls and the right process, we can do a much better job of fine-tuning our investment so that we know it's going to the right place. We can constantly manage our risk, constantly reassess, adapt to change as we learn about it. We can do a much better job of meeting those big objectives that are intended by that big upfront process.
So the challenge of transformation and the challenge of getting an enterprise to adopt DevOps and some of the other practices that we want to be using is the challenge of Napoleon standing on his hill at Borodino, where a long cycle time or a long lead time for making decisions and getting them into execution makes him entirely ineffective at accomplishing what he wants to accomplish. And instead, if we look at that upfront process and find ways to reduce that long lead time, we can do a much better job of managing risk, of investing wisely and getting the best returns from the investment, and of satisfying our customers and sustaining our organizations and making them future-proof and disruption-proof.
So I will leave you with that today. Thank you.