Wiring the Winning Organization
Wiring the Winning Organization: Creating Conditions for the Enterprise’s Distributed Intelligence to Achieve Unparalleled Results
Human beings, individually and collectively, can be phenomenal problem solvers if the conditions in which they work are shaped properly. However, if events are happening at high speed, in complex and tightly coupled systems, with high costs of failure and few iterations from which to learn, then people are put at a disadvantage, having to figure out, on the fly, what to do, how to do it, and for what reason. That’s a real “danger zone” in which to operate. The “triumph zone” is the stark contrast. If conditions are changing slowly, systems are simple (linear) or at least controllable and otherwise loosely coupled, failure is cheap, and learning iterations are many, then people’s individual and distributed intelligence can be well expressed.
In this presentation, we explore how to move people’s experience from the danger zone to the triumph zone by simplifying the environment so sensemaking is easier, and by slowing it down: bringing problem-solving back into feedback-rich planning and practice (preparation) and out of fast-moving, high-stakes execution and performance. This is illustrated with a simple example, but one that extrapolates to others of significantly greater complexity and urgency.
There’s empirical reason to believe simplification and ‘slowification’ create significant advantage, and there is theoretical backing for the claim, based on the fast and slow thinking distinctions of Kahneman and Tversky, “normal accidents” as explained by Perrow, and theories of modularization by Baldwin, Clark, Wheelwright, and others.
Full transcript
The complete talk, organized by section.
Gene Kim
So one of the things that I have been, actually the only thing I have been working on for the last three years, is working with someone who has influenced my thinking more than anyone else, which is Dr. Steven Spear from the MIT Sloan School of Management.
On the next slide, I will share with you where I met him, which is in 2014, when I took an executive education course because I had read his book, The High-Velocity Edge. It made me rethink and see things I had just never thought of before.
Although Anna, our editor, would probably dispute this, I personally blame Steve for The DevOps Handbook being over a year late, because I knew it had to be rewritten.
What we have been working on for the last three years is really trying to understand what is in common between Agile, DevOps, the Toyota Production System, safety engineering, resilience engineering, psychological safety, and so forth.
The reason why this has been so rewarding is that Steve is famous for many things, but among the things he is most famous for is writing the most widely downloaded Harvard Business Review article of all time, called "Decoding the DNA of the Toyota Production System." He wrote this in 1999, based on his doctoral dissertation at Harvard Business School. As part of that work, he actually worked on the factory floor of a Tier 1 Toyota supplier for six months.
Over the years, he extended the work beyond just the high-repetition work of manufacturing to engine design at Pratt & Whitney, to helping build the safety culture at Alcoa, to studying the design and operations of the nuclear reactors for the U.S. Navy seagoing fleet. It has been so rewarding to see what is in common, and see if we can explain all of these things with a very parsimonious theory.
We finally have a title of the book. For those of you who were here for the workshop the day before the conference, you were able to see some of our early thinking on this. The book finally has a name. It is called Wiring the Winning Organization: Unleashing Our Collective Greatness Through Slowification, Simplification, and Amplification. This will be coming out in November of this year.
There was an excerpt of the book that was made available. The excerpt is a little bit smaller than I thought it would be, because we are a little bit behind, but not for lack of trying.
What I would like to do in this session is share with you some of our goals, share with you two vignettes that we think have tremendous explanatory power and help explain why some organizations have this magic of winning. I will share the first vignette, then I will turn it over to Steve to share the second vignette and show how it fits into this bigger picture.
The observation I will make is this: winning organizations have this magic. They do extraordinary things, more than any single individual could do alone, and are able to fully unleash people's creativity and problem-solving capabilities.
This is in contrast to organizations that have the exact opposite characteristics, where they constrain or even extinguish entirely the creativity and full problem-solving capabilities of the people within them.
It is not just in DevOps and technology. Consider two hospital emergency departments. In the first one, families are crowded into the emergency department waiting rooms. Worse, there is no movement. When people finally get into the emergency department, there are patients waiting everywhere: in exam rooms, on chairs, on gurneys, in hallways, in those embarrassing hospital gowns. There are frustrated clinicians describing how difficult it is to do their work. If you are a patient sitting there, you would wonder, quite rightly, whether this is a great place to be treated. Maybe there is another hospital somewhere else where you should go. That is not unfounded.
But consider another hospital emergency department where the waiting area is nearly empty. It is not because they are not treating patients; it is the exact opposite. There are more patients. Every point is easy and seamless. They get registered immediately. They get signed in immediately. Eighty percent of the patients are seen and get to go home because they are triaged. The ones who have to get treated have clinicians focused on the patient and delivering care.
What is so remarkable about this is that it is the same hospital, same patients, same clinicians, same equipment, same floor space. They are almost identical, except for how they are wired. What I have learned so much from Steve is that it is the management system that differentiates the two.
In our world, and in the State of DevOps research, we have seen the same dynamics, where you have elite performers versus low performers. We have seen in the DevOps Enterprise community organizations go from not great to great: same people, same equipment, but radically different performance.
I love this notion that the goal of science is to explain the greatest number of observable phenomena using the fewest principles, confirm deeply held intuitions, and reveal surprising insights. Our goal is to put together a very simple theory of performance.
To show how simple, let us go through one of the most critical things we have discovered. It will come in the form of a story of two people moving a couch. Let us call them Steve and Gene.
You would think that this is all brawn work, but it is actually an incredible amount of brain work. There are a bunch of tough problems that need to be solved: where is the center of gravity in order to get through a narrow doorway? Around which axis should they rotate to get down a narrow set of stairs? Who should go first, and should they be standing backwards or forwards?
What is astonishing is that they can do this without bringing in a lot of consultants or study groups. The instant they pick up the couch, they are instantly communicating, coordinating, getting feedback, and they will be able to sense-make and figure out how to move the couch together.
But there are all the things that we can do as managers to make it very difficult for Steve and Gene to get their work done. We can turn the lights off, and suddenly this is not so easy anymore, especially going down the stairs. Or we can turn up background noise, like a siren or loud music, where they cannot hear each other anymore. This is actually difficult in a way that the first one is not. The problems are no more challenging, but they can no longer communicate or coordinate.
Or we can put an intermediary between them, where Steve and Gene are not allowed to talk to each other directly. Instead, they have to go through this intermediary. All of these things make it much more difficult to solve problems. It will take longer, the quality will be worse, because they can no longer act as a unified whole.
The big idea here is that there are two concepts. One is about coupling. Steve and Gene moving a couch is a joint problem-solving cognitive activity, and there are so many things we can do to destroy their ability to work together. This is different from Steve and Gene moving two chairs, which can be done independently.
One of the big aha moments for me is that when possible, we want to decouple them. A great example of decoupling is air traffic controllers and a commercial airline pilot. They are sufficiently decoupled that the air traffic controllers can change shifts without permission from the pilots, and vice versa, because the information is so encoded and the protocols are so well known that we can hide information from each other. We can make changes on one side of the interface without notifying the other.
There are a whole bunch of domains where that is not possible. I would like to leave you with these two exercises about how powerful the couch is.
Think about interactions in your organizations where you might be over-couched. This is when there is not enough partitioning between teams, where in order to get anything done, as was mentioned earlier, you have to deal with 35 people because we are all stuck to the same couch. We also have to communicate, coordinate, prioritize, schedule together, deconflict, and escalate, because we are all coupled to the couch.
Or think about places where you are under-couched, where there are situations where you want to get something done, but you do not have direct access to the people involved. You spend an enormous amount of time trying to figure out who needs to do something, who owes you something. You cannot create connections on your own, and all you would like to do is sit next to the person so you can solve a problem together, just like we see with ops and dev and so forth.
That is the role of the couch. With that, Steve, over to you.
Dr. Steven J. Spear
Thank you. Good morning, everybody. Thank you, Gene.
As Gene was saying, my research started about 30 years ago, deeply embedded inside Toyota. The paradox we were trying to explain is that the auto industry is pretty much a level playing field, where people have access to similar resources and IT systems and capital equipment and so on. They are trying to find and meet needs in the same competitive space. And yet, despite all that level playing field quality, the differences in performance are extraordinary.
We were trying to understand the paradox with Toyota: twice as productive, defect rates one one-thousandth, time to market one half, relative to everybody else. As Gene said, I had this chance to embed, sort of Karate Kid, inside the Toyota system for an extended period.
When we went in, we may have thought that this was a technology issue, access to better equipment. But that was not true. Access to better programming, production-control algorithms? Turned out that was not true.
What we discovered is that day in, day out, when someone went into one of these Toyota plants or supplier plants, the conditions were much, much better for individuals to solve the hard technical problems in front of them. It was much easier for groups of people to collaborate on solving the hard technical problems that were in front of them.
This was certainly true in upstream styling and design, both of the product and the process. But even day to day, the number of problems that occur in a manufacturing environment is somewhere close to infinite. In the Toyota environment, they just get digested and processed in a regular, routine kind of fashion, almost as easy as actually turning a bolt with a wrench. In other places, every problem seemed to generate chaos and confusion.
To Gene's point about this idea of people being coupled, attached to the same problem, having to coordinate their collaborative efforts versus being decoupled, we started to realize that within that system there were a lot of tactics and strategies used to break the big problems of designing things and running things into smaller and smaller problems, so that people's intellectual horsepower can be engaged not only more productively individually, but engaged simultaneously.
What we want to do is take you through a very small vignette, which, while very simple, captures some of the ideas of breaking very big problems down into much more manageable pieces. Not only are the pieces themselves more manageable, they can be processed simultaneously.
The vignette we will set up is hypothetical. Let us say Marguerite, Gene's wife, and my wife Miriam are asked by a friend to help in the refurbishment of a hotel in rural Maine. The purpose is that this hotel will be refurbished and used for nonprofit charitable activities, where people can come for a therapeutic session and so forth. They say, "Gene and Steve, while we are trying to help design this refurbishment, we are in rural Maine and we do not have access to skilled trades. If you guys could line up folks to do the painting, the interior work on the hotel, and to clear out furniture and restage furniture when we are done, that would be very helpful."
At first, Steve and Gene are overly optimistic or unappreciative of the difficulty of this. They hire movers and painters who can relocate to rural Maine for some period, and then they just let people have at it.
Certainly within the first few hours, it is chaos, because the painters are in rooms that the movers have not cleared out. The movers are trying to go into rooms where the painters are not showing up. The movers have cleared out rooms and think it is time to put the furniture back in place, but the painters are not done. It is just chaos.
So then Gene and Steve pull out their laptops and whatever software they have, and they say, "Oh, what we need to do is schedule this thing." They start giving direction: you go to this room, you go to that room, da da da. What they find out within about an hour, for all the precision of their schedule, is chaos again.
The reason it is chaos again, and I am sure you all encounter this in your normal work, is that no amount of scheduling, and certainly not scheduling done by these two guys on a laptop in rural Maine in a rush, is going to be precise enough to account for every contingency. That is just in a static fashion, let alone the dynamic stuff that happens.
They still stick to the strategy of creating a schedule. They say, "Well, we have a more precise schedule, but what we are going to do now is expedite." If there is difficulty clearing the furniture out or refurbishing one room over here, we will grab other people and move them around, and get the paint and the plaster and the brushes and the drop cloths and this and that.
It turns out that fails too, because whatever expediting and patching they are trying to do, in terms of where people are and in terms of actually patching the walls in the hotel, is an inadequate solution to the fact that the schedule itself is a very bad way to coordinate people in a complex, dynamic environment.
At that point, there is a fortunate, I would not call it a rebellion, but constructive feedback from the painters and the movers. They say something to the effect, very gently and politely, "You guys are idiots. Why are you trying to schedule the entire hotel when the hotel is broken up into rooms and each room is idiosyncratic? It is an old building and whatnot. Just put us into teams: some movers, some painters in this room; some movers and some painters in that room; some movers and some painters."
That immediately makes things easier, because no longer are they trying to coordinate across all the movers and all the painters. They have taken this approach of partitioning or modularizing this hotel down into smaller pieces, where there are fewer people who have to communicate with each other at the same time about the same problem.
That helps things move along a little bit. As they continue, now that the movers and painters can be in the same place at the same time, no longer dependent on the schedule that Gene and Steve tried to generate, they can say, "Well, guys, before we all pile in there, trying to move a couch while you are trying to set up a ladder, or trying to move a chair out of the way while you are trying to put down a drop cloth, why do we not partition this even further into phases?"
There is a clear-out-the-room phase. There is a prep, paint, plaster, and all of that phase. Then, when all of that stuff is ready for the movers to come back in, there is a put-the-furniture-back-in-place phase.
What they figured out how to do is take the problem of the room, which is much smaller than the problem of the hotel, and partition it into even smaller pieces: phases. Now you have even fewer people who have to collaborate and coordinate their problem-solving efforts with each other, because they have broken this hotel down to room, down to phase. It is already going more smoothly.
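One rough way to see why this partitioning helps is to count the pairs of people who might have to coordinate with each other. The sketch below is an illustration added alongside the transcript, not something presented in the talk. The crew of 24, the 6 rooms, the even split between movers and painters, and the function names are all assumed for the example, and treating coordination cost as a count of pairs, n(n-1)/2, is a deliberate simplification.

```python
from math import comb


def pairwise_channels(people: int) -> int:
    """Distinct pairs among `people` who may need to coordinate."""
    return comb(people, 2)


def channels_whole_hotel(workers: int) -> int:
    # Everyone is coupled to everyone else: one big coordination problem.
    return pairwise_channels(workers)


def channels_by_room(workers: int, rooms: int) -> int:
    # Workers are split evenly into room teams (assumed);
    # coordination happens only inside each team.
    team_size = workers // rooms
    return rooms * pairwise_channels(team_size)


def channels_by_phase(workers: int, rooms: int, trades: int = 2) -> int:
    # Within a room, only one trade (movers or painters) works per phase,
    # so only same-trade pairs inside a room still need to coordinate.
    per_trade = workers // rooms // trades
    return rooms * trades * pairwise_channels(per_trade)


if __name__ == "__main__":
    workers, rooms = 24, 6  # hypothetical crew size and room count
    print(channels_whole_hotel(workers))      # 276
    print(channels_by_room(workers, rooms))   # 36
    print(channels_by_phase(workers, rooms))  # 12
```

Under these assumed numbers, going from hotel to room to phase shrinks the potential coordination pairs from 276 to 36 to 12. Real work is messier than a pair count, but the direction of the effect matches the story.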
Now something else is going on. With this partitioning room by room, you have basically separated, decoupled, like Gene was talking about with the couch. You have broken this massive problem, where everyone's hands are attached all at once, all at the same time, into much smaller pieces of work.
There is still this issue of having people solve problems in conditions that are more conducive to solving problems than not. The painters especially have this problem because they go in and this is an old building, made out of wood, about 100 or 150 years old. There is new plaster; there is old plaster. The old plaster has to be treated differently than the new plaster. There is wood that has to be stained, and staining is a pain in the butt because you have to come up with the right mixes and formulations. Trying to get all of that problem-solving done while they are still trying to refurbish the room becomes impossible.
On one hand, there is a speed at which they have to prep and paint or prep and stain. Then there is a speed at which they can actually solve the problems related to prepping, staining, and painting. Those are out of sync with each other.
They say, "Wait, hold on a second. Not only do we have to partition this system into smaller and smaller pieces, we have to give ourselves an opportunity to solve problems in a slower environment than a fast-moving environment." This ties into how our brains work. When we are locked in, we are dependent on habits, routines, and muscle memory that we have already ingrained in ourselves. When we are facing a situation that is novel, where we do not have a preexisting habit or routine, we have to develop a new one, and that takes time.
What the painters and movers realize, and this is true for the movers also in how to move wardrobes that are awkward and heavy and have loose hinges, is that by giving themselves an opportunity to solve problems offline before they step into the performance environment, they are better equipped. This certainly helps a ton.
There is another issue here, and again it gets back to Gene's issues of coupling and decoupling. If you think about his example of chairs and couches, it is much easier to be asked to move a chair, where you do not have to coordinate with another person, than to move a couch, where there is all this problem solving that has to go on either before or during.
Now Gene and Steve, with the help of the movers and painters, have taken this big problem of hotel down to the problem of room, down to the problem of phases. Even within the phases, they have also done more subdivision: this wall versus that wall versus the ceiling versus the trim by the painters, or this piece of furniture versus that piece of furniture by the movers. They have given themselves another advantage of doing problem solving offline, where they can be thoughtful and deliberate and reflective rather than hurried and fast.
There is another issue that comes up, which is that things still go wrong. That happens with moving and painting, and certainly happens in things more complex and challenging.
At first, Gene and Steve think they are going to be helpful, and the way they are going to be helpful is they say, "Hey, Anna, you are working in room two, but you are not doing anything right now. Go help room one."
Immediately there is an obvious difficulty with that approach from the folks in room two, which is, "Wait a second. Anna might not be doing something right at this particular moment, but we need her in the next particular moment. If you pull her into room one, then the problems they are having there are going to start bleeding over into our room because Anna is no longer available."
Gene and Steve thought getting Anna's help would be helpful, but inadvertently what they did is recouple the system, because now the problems are linking room one to room two to room three to room four. They realize another necessary strategy here is that when you are having any work being done, you have to have "slack resources." Not to insult Anna by saying that she is a slacker, but she is the slack resource who is an expert who can step into the moment of a more difficult problem than the team is already dealing with and pick up that extra burden to locally stabilize the situation without having it spill over to other parts of the situation.
As far as this hotel example goes, it is simple: Gene and Steve managing a few dozen people on a fairly simple project in the scheme of things. But what we found, as we worked it through and started reflecting on our experiences over the last 20 or 30 years each, is that every program or project we have been involved in, Gene's and mine, in manufacturing, heavy industry, drug development, et cetera, had these qualities of an overly coupled system that did not allow people to step back and do slow, deliberative, creative thinking. When it started running into problems, the problems started bleeding over and recoupling the system.
As we started thinking through how this all comes together, often, too often, we put people in a difficult situation, what we have been calling in this book a danger zone, where we ask them to solve problems in environments that are fast moving, which is not conducive to being reflective. The stakes are very high, maybe even hazardous. There is not opportunity to get multiple cycles of learning, so we really cannot learn; we just do and then discover whether we have succeeded or failed. The problems we are trying to solve are really tightly coupled and consequently highly complex: the whole hotel rather than a particular piece of a phase within a particular room.
When we start looking at organizations that are really successful, what we find is that they are not at all accidental; they are really deliberate at changing the conditions in which people have to solve problems. They break big things into smaller pieces so sense-making is easier. They give people the opportunity to solve problems in slower-moving environments rather than in faster-moving environments. Because of all of that, they give opportunity to have iterative learning, multiple cycles versus not. The stakes of getting the wrong answer on the first try are very low relative to the actual performance environment.
These are some tactics that we are starting to develop in the book and wanted to reveal with this case study. The last thing I will ask, as we go through this, is that we are still in the test-drive phase. As Gene said, we have a paperback which represents about one-fiftieth of our word count. So consider this a beta, and betas are wrong. Please tell us how.
Thank you very much.
Gene Kim
Thank you, Steve.