Agile/DevOps Transformation at QRG (Using Team Topologies, Improvement Kata, and Dojos at scale)
Nagesh Kapalavayi of Qurate Retail Group joins John Ediger of DXC Technology to discuss how patterns such as Team Topologies, Dojos, and Improvement Kata (among others) have helped propel QRG's DevOps/Agile transformation journey. This journey describes not only how to take advantage of these leading practices and patterns but also describes key ingredients of the secret sauce for getting outcomes from DevOps/Agile transformations. Leveraging Gene Kim’s original three ways of DevOps (fast flow, feedback loops, and continuous improvement), the leadership team is embracing these principles broadly across QRG's product and technology organization to increase speed, agility and value. Nagesh and John will show the top ten transformation themes and dive into a handful of these, discussing highlights and learnings so they may be leveraged by other organizations for their transformation journeys.
Full transcript
The complete talk, organized by section.
John Ediger
All right, welcome. Glad you can make our session. We're really excited to talk about this. I think the challenges that we're so excited to talk about, we could fill probably two hours with it, so we're going to condense it down and get to the main points.
What this is about is the collaboration with DXC and Qurate Retail Group on our continuous transformation journey, and we're going to talk some specifics around how we use dojos, how we use Team Topologies, and some other patterns.
My name is John Ediger. This is Nagesh from QRG, and that's Ali from QRG.
Nagesh Kapalavayi
A little bit about our companies. Qurate Retail Group: we are the largest player in the video commerce space, serving about 20-plus million customers across five different countries and under seven different brands. We are beyond brick and mortar. We are beyond e-commerce. We have the third way to shop. Our core brands are Zulily, QVC, HSN, and Cornerstone Brands.
John Ediger
And DXC is a $17 billion IT services market leader. We're a consulting company. We're a technology and engineering company. Basically, we help companies transform their business across the overall enterprise technology stack and ways of working for flow of value.
Nagesh Kapalavayi
All right, where did all this start? Back in 2018 is when Qurate Retail Group formed and we had the merger integration between QVC and HSN. The leadership came together and talked about transformation, the need for us to change the way we work together. Obviously, they came up with strategic goals at the company level, at the corporate level, and then you have the inheritance for the technology goals: modernization, protect and accelerate the business, agility and speed to value, and most importantly predictability.
Great goals, great objectives, and you have two different companies all coming together. Transformation is not easy when we're having so many different new things coming together. That's when we wanted to really find a partner that is living and breathing such transformations and help us identify what those transformative ideals should be, and then more importantly provide transformation themes, so we're not really all over the place as an organization chasing a gazillion things, but are structured under guidance in executing the transformation journey incrementally.
John Ediger
All right. We worked together to assess the whole organization, identified where the biggest opportunities were and the biggest challenges. We aligned on a set of ten transformation themes, and they're depicted here.
I'm going to talk through a couple of them at a high level just to set the stage, and then we're going to go into some details on how we actually implemented and made changes across the board here. The very top is the secret sauce, and I'm going to talk more about that in a minute.
Going across the top colored row there, people excellence is around culture and ways of working. We identified leadership, team skills, and team ways of working, and how business and IT work together as one rather than as "the business" and "IT."
The second row is around operating excellence, and this included things like prioritization, interdependency management, how we measure, and then a really important one: that consistent shift from project thinking and project execution to product thinking.
And then on the bottom are the technology delivery excellence categories. The first theme, I don't know if you guys heard Scott Prugh this morning, but that is a big one about decoupling architectures for fast flow. The bottom two on the right, we're going to spend a fair amount of time on this, are around shifting operations and self-service, and leveraging the patterns around platform teams and enabling teams to increase the fast flow of value.
Rather than starting from scratch, we leveraged a lot of the great minds, the great thinking. Here's just a sampling of those. You'll recognize a lot of the material here from the conference: value stream mapping, Team Topologies. Team Topologies we've used a lot in our client efforts and transformation. I consider it probably the most important book in IT transformation in the last decade. It's really that powerful, and it really works. We'll talk a little bit about how.
Sooner Safer Happier, Jon Smart's work, was throughout this from a transformation standpoint and a culture standpoint. DXC immersive dojos, Black Belt Dojos, that we modified from the original great work that was done at Target, and we morphed them into a more immersive, short-term, one-week innovation session. We'll reference those, the Improvement Kata pattern, and OKRs.
Starting with the secret sauce, just to mention this, we won't be able to go into detail, but having that culture and discipline both of continuous improvement permeating through the organization is probably the most important thing we identified.
This is a whole half-hour talk in itself, just this pattern. In a nutshell, this is how we can institutionalize, and how we've been institutionalizing, continuous improvement both at our discipline level and at an overall cultural level. The bottom half of this represents how the teams operate in identifying their top impediments and then using a scientific method with the Improvement Kata process and working through those with hypothesis-based continuous improvement.
This all happens only if the whole leadership team really reinforces this, supports it. The top half of this represents that leadership team getting together on a regular basis, prioritizing the top org-wide impediments, and then working with the teams to help reinforce and actually living and modeling using that Improvement Kata process throughout the organization.
This is the Improvement Kata pattern, how we've modified it and leveraged it. I'll leave this in the deck for people to look at afterwards.
What I want to do is get to one of these examples, this far one on the bottom right here. It is this pattern around leveraging enabling teams and platform teams for faster flow of value. It stems from the great work of Team Topologies.
Just a quick review. One of the great patterns is the three team types. The orange team type at the top is the stream-aligned team. That's where the teams work in value streams and are cross-functional, and their whole goal is to get fast flow of changes, fast flow of value. They're supported by the two other teams: the purples, the enabling teams. They help coach, enable, help close the gap on capabilities of the team, and help that team achieve fast flow of value. And then lastly, the platform team.
This is about creating not the platform that we all grew up with in an organization, centralized, cost-cutting, mandating. This is about platform as a product, from a pull basis, to enable the stream-aligned teams to be able to pull, use these, and have lower cognitive load, and be able to not have to worry about what's underneath, and make it easy for them to get fast flow.
That pattern we leveraged in this case where we wanted to take the central QA organizations, fold them into the actual development teams, the actual cross-functional teams. But that's not enough, to just bring those together and do that reorg. It's really about those other two team types: having enabling teams to coach the teams, to help enable them to shift left, to build quality in, to use all the right practices, and the platform teams to have those quality platforms, those test automation platforms.
I'm not going to have time to talk about this except zooming out. This is the vision we created together of where we want to gradually shift across the organization, of really supporting all those stream-aligned teams in orange at the top by a whole set of these other two team types, enabling teams in purple, and all these teams that typically do infrastructure, do ticket-based work, or sometimes have blocking dependencies for the stream-aligned teams. Automate all those, turn those into platform engineering teams. Long journey. It's continuous, and that'll be ongoing.
Nagesh Kapalavayi
Getting to that previous slide would be nirvana. What did we actually do? As John mentioned, the objective is that quality of a product is a shared responsibility. We have tried probably ten different ways over the years: bringing them from different organizations, putting them into scrum teams, but it was always the dev and the QA, and that shared responsibility objective was really not coming across.
Picking a leaf from Team Topologies, what we did from the left is we actually combined the dev and QA under one roof and allowed them to own the product in a self-sufficient manner, and created two new functions for quality platforms and quality enablement. Just assume how this would have operated had these two functions not been there: somebody was doing part-time in some capacity trying to move the needle. But now these two teams have a charter to actually have these teams be self-sufficient and focus on the progressive architecture of these platforms as a product, as well as enabling the customers to actually move faster in being able to do their work.
We deployed this probably six months ago, and we probably spun our wheels more than four months to actually do it. We were pretty anxious about how the reaction was going to be from dev and QAs as you execute this change. Surprisingly, the teams embraced it on both sides. It's a learning opportunity. I think the QAs got a lot of excitement in joining the dev hands together, and there's not the dev managers and QA managers and somebody else trying to navigate resourcing, capacity, ownership. All of them started coming together.
It's still early, but we have a monthly checkpoint for the next six months to make sure that the right support and the right help are in place, and that we're actually hearing about how the change is happening on the floor.
John Ediger
Early. All right. Next we're going to hand it over to Ali to talk a little bit about the prioritization in that category that we have.
Alison Vernamonti
Just checking. Okay, perfect. Prioritization is actually something I'm quite passionate about. I'm sure some of you have heard it before: it doesn't necessarily always matter if the team is as amazing as possible if they're not working on the right thing.
We had an opportunity, actually, and led a session in 2021. We did an experiment and we had a hypothesis, as you can see, that if we could come up with this prioritization method, it would also help us reduce our WIP. As you probably heard in the Airbnb talk earlier, other companies are probably having the same type of issues, where there might be like thirty number ones. You can't actually focus and get that work done if it's not prioritized.
We did a cool experiment where we actually utilized the WSJF model, weighted shortest job first, and took those different cost-of-delay factors and showed them. We actually had a very interactive session with our leadership, with QVC and HSN plus international leadership. We had this session, and it was very interesting because they got to hear about the different perspectives from each other. Just because you weren't working on something, you may not have realized how high a priority it was or what those cost-of-delay factors were.
It was very fun too because you got to hear those competing ideas, and we actually did it live on a Lucid board. It was a little bit of Hungry Hungry Hippos, where you'd see cards move back and forth, and you could tell that they had a psychologically safe environment to actually have those conversations.
It's not just about the calculation. If any of you have actually used the WSJF prioritization model before, it's about that conversation. Even if it scores in a particular way, you have to have the conversations. You have to figure out the sequencing. You have to say what's dependent on each other.
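As a rough illustration of the calculation side of WSJF described here, the sketch below scores a few initiatives and ranks them. It assumes the commonly used formulation in which cost of delay is the sum of business value, time criticality, and risk reduction, divided by job size; the talk does not say exactly which cost-of-delay factors QRG used, and the `Initiative` class, the initiative names, and all scores are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Initiative:
    """One card on the prioritization board (all fields are hypothetical)."""
    name: str
    business_value: int      # relative score, e.g. 1-10
    time_criticality: int    # relative score, e.g. 1-10
    risk_reduction: int      # relative score, e.g. 1-10
    job_size: int            # relative effort, must be positive

    def __post_init__(self) -> None:
        if self.job_size <= 0:
            raise ValueError("job_size must be positive")

    @property
    def cost_of_delay(self) -> int:
        # Assumed formulation: cost of delay is the sum of the three factors.
        return self.business_value + self.time_criticality + self.risk_reduction

    @property
    def wsjf(self) -> float:
        # Weighted shortest job first: cost of delay divided by job size.
        return self.cost_of_delay / self.job_size


def rank_by_wsjf(initiatives):
    """Return initiatives ordered from highest to lowest WSJF score."""
    return sorted(initiatives, key=lambda item: item.wsjf, reverse=True)


initiatives = [
    Initiative("Checkout redesign", 8, 5, 3, 8),
    Initiative("Weekly release pipeline", 5, 8, 8, 3),
    Initiative("Node self-service", 3, 3, 5, 2),
]

ranked = rank_by_wsjf(initiatives)
for item in ranked:
    print(f"{item.name}: CoD={item.cost_of_delay}, WSJF={item.wsjf:.2f}")
```

The ranking is only a starting point for the conversation about sequencing and dependencies; a score does not settle the order by itself.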
The real learning from that is, this year, it was the first time in, I think, thirty years that we actually had a combined prioritization with the business and IT together, and we're learning from that. Going into next year, we're just doing some small modifications and we're going to be continuing to do that prioritization. One of our improvements was to make that prioritization, especially at the initiative level, more transparent and make sure that we have input from not just the top but also from the bottom as well.
John Ediger
Excellent. Just keep in mind as we go through these different areas, we're only touching on one small example in a few of these. This is a much bigger continuous transformation. Next we'll go to Nagesh.
Nagesh Kapalavayi
All right, one of my very, very favorite topics: evolving and decoupling architectures. I think we have talked probably in the last 48 hours many, many times, and Randy has done an amazing job describing what good architectures do and what bad architectures do. They also really simplified it and explained how eBay and Amazon, when they were in the early stages, built their products, tested it, grew, and then at a point they had to go really redesign and re-architect their platforms to actually do better.
Same thing. You come into this large organization. There are several platforms that we could talk about, and we'll talk here today about one specific platform, which is the most important one that runs a $5 billion business digital platform. We've been operating on it for almost seven years now, and we continue to build a lot of features. We have developers across probably all five different markets contributing to it in three different time zones, so you can imagine how convoluted it gets to actually get anything out of the door.
Do the architects and the engineers not worry about it? They do. Then when you bring the product, they talk about tech debt and continuous improvement, and, yes, let's do it in Q4, and we're running business hard, let's go do all of our tech debt stuff. That Q4 never comes. Sometimes that next year never happens.
But how do you bring some framework and structure for this organization, for this team, to be able to really go do something about it? That is what we talked about, the immersive dojo: bringing those people into the right room and detaching them from that day job and allowing them to be there for 48 hours.
Yesterday one of the lightning speakers said developers say, it's impossible. The leader asks, how close can you get to it? It's the same thing. It's impossible, we can't touch this big giant ship right now, but when you send them into the dojo and detach them from all the distractions, they come out with something more meaningful.
You can see they developed something along the lines of a six-phase approach, and they used the Improvement Kata in the exercise alongside the dojo. Now they not only know what their current state is, they are absolutely passionate about the future state, and they don't have to map out every step to get to the end. The Improvement Kata is helping them move forward from where we stand today. We're at phase two in this journey, and it's continuous improvement.
There are outcomes. There is business value. What disabled us from doing more frequent releases? Now with this change alone we can get to weekly releases to this digital platform, and you don't need the amount of coordination. We just heard that this morning, how coordination is so expensive. Now you start decoupling these teams. You make them independent. You now have smaller releases. It's a journey. It doesn't matter if it takes one year, two years. At every step there is value to the business, there is a happy end customer, and there is high morale for the software engineers. They don't have to slog through working to deploy something to production.
John Ediger
The key there is that, in addition to the technology, technology is important, but more important is what Nagesh was talking about: how you organize the teams around this with dojos, and with getting them to align on a specific path to improvement and measurable improvement.
The next one that we're going to talk through is the self-service operation teams.
Nagesh Kapalavayi
As we talked about the platform definition, and to give you a chance to read: the leaders have described, in its simplest form, what a platform is all about.
I'm sure there's many people in this room that have infrastructure teams. You have your customer-facing teams, and you have this 80/20, or 80% of the time your organization is working towards your customer-facing teams, and there's 20% of the shared responsibility.
Most of the focus, effort, improvement goes into that customer-facing team. They go through agile methodology, they go through transformation, there's a lot of investment into those places. But the infrastructure organizations somewhere get left behind for other reasons. Similarly, in our organization, you have infrastructure that does compute, storage, network, middleware, workload automation, ServiceNow, APM solutions, logging. All these things are different platforms, and they provide a service to these customers, but there's not an investment in how these teams can all work together.
Our focus has been to really invest in infrastructure, in engineering operations, and bring them along in this journey. The end customer is not successful if the shared service team is not successful. I think that's so important.
Now we brought one example of many: node provisioning. I'm a customer, I own the Kubernetes platform, and we're adding 25 nodes to a cluster on-prem in the data center. Traditionally, what we would go through: you would have a compute team, network team, storage team, middleware team, access management team. Everybody over the last decade has optimized each of their silos very efficiently, probably, but they never saw what the customer value was from beginning to end. The customer was never happy. But everybody inside their silos felt they were good because they were passing the ticket from one to the other.
So we really brought the value stream mapping exercise as a framework into this organization and pulled all of those engineers back together, and really brought the actual customer in to talk and said, what do you need? Then do this value stream mapping exercise. They mapped everything that the team goes through, and it was eye-opening for each of them how inefficient the process was for the end customer.
They had never seen it all together. Now you go take them to a dojo next and say, what do we need to do? They come back together. There are beautiful solutions. The solution was never the problem. It's just people understanding and talking to each other. I think Paul Gaffney talked very nicely about human complications: two people just standing up and talking to each other solves stuff.
After a couple of dojos, they came up with an Improvement Kata, and as we speak today, maybe the best way to say it is this: a customer said on Friday afternoon, I need about 150 nodes to add to my cluster because I'm replacing my Red Hat with Ubuntu for my Kubernetes cluster. They turned around and provided all the 150 nodes on the same day, and the following Tuesday those nodes were added to the cluster.
I don't think I have ever seen this organization being that nimble and agile in a data center. In the cloud, it's a whole different ball game, but here we didn't solve technology. Actually, we changed the mindset. We changed the culture, and we really wanted them to transition from being system administrators or platform administrators to platform as a service, platform as a product.
Self-service operations: that mindset and culture and understanding product is not just in the customer-facing value streams. Product is everywhere in what we do, and we should start having everybody follow through that.
John Ediger
When Nagesh is talking about the customer, we're talking in this case about all those engineers and those cross-functional teams and those stream-aligned teams. If you saw that Pareto chart up on the top right, which you cannot really read, it's actually an analysis of those stream-aligned teams: where their biggest dependencies were, what took them the longest to get from the infrastructure teams, from the operation teams, what impacted their speed to value the most. Prioritize those, and that's what the team is going after in order, so really important.
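The kind of Pareto analysis described here can be sketched as follows: total the time stream-aligned teams spent waiting on each type of infrastructure request, sort from largest to smallest, and track the cumulative share. This is only an illustrative sketch; the function name `pareto_of_waits`, the request types, and the wait times are hypothetical and are not taken from the chart on the slide.

```python
from collections import defaultdict


def pareto_of_waits(wait_records):
    """Rank dependencies by total wait time, with cumulative percentage.

    wait_records: iterable of (dependency_name, days_waited) tuples.
    Returns a list of (dependency_name, total_days, cumulative_percent).
    """
    totals = defaultdict(float)
    for dependency, days in wait_records:
        totals[dependency] += days

    if not totals:
        return []

    grand_total = sum(totals.values())
    ranked = sorted(totals.items(), key=lambda pair: pair[1], reverse=True)

    rows = []
    running = 0.0
    for dependency, days in ranked:
        running += days
        rows.append((dependency, days, round(100 * running / grand_total, 1)))
    return rows


# Hypothetical wait times (in days) reported by stream-aligned teams.
records = [
    ("node provisioning", 20),
    ("firewall change", 6),
    ("node provisioning", 10),
    ("storage allocation", 9),
    ("access request", 5),
]

rows = pareto_of_waits(records)
for dependency, days, cumulative in rows:
    print(f"{dependency}: {days:.0f} days (cumulative {cumulative}%)")
```

Working the list from the top down matches the idea of going after the biggest blockers to speed to value first.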
Nagesh Kapalavayi
We have just unlocked the infrastructure and operations team. It's a long journey. I wouldn't say it's amazing at this point, but our learning is not everything is rosy. We only took 30 minutes to tell you the good stories. I'm sure there's another three hours I can tell you what didn't work and the challenges we had.
Transformation is continuous. If we have overused the word continuous, it's still meaningful. You're not going to get it done, and you're going to learn continuously. The last part is patience. That is very important in this journey that we learned. I cannot get upset. I cannot get mad that somebody else is not changing. We're dealing with human complications that Paul Gaffney spoke about. You have to work with the way the customer or the other person can understand, and hang in there.
Somebody else talked about pressurized middle managers in the last 48 hours. Leadership has a vision, they have a strategy, and then there is the engineer from the floor that actually executes. But the communication, the prioritization, battling between business and continuous improvement, I think that's the layer that actually gets sandwiched a lot. That's a journey, and that's where I think working through prioritization is only going to continue to help us.
The next one is DXC's immersive dojos. I don't know how powerful that is. We have not done something like that in the past. Just having that framework where everybody leaves their biases outside the room, and you put them into a group of people to actually go solve a problem. You don't push people to solve a problem. You bring people that are passionate, committed, engaged. All they need is time and allocation and calmness from being distracted from the world.
I think all the things, the Team Topologies, the dojos, the Improvement Kata, these are tools in your toolbox. Every organization is solving a similar problem in a different context. You pick up the tools from that toolbox and apply them to your own environment. We're working through that same journey right now. It's just identifying what tools fit into our ecosystem.
I'll close by saying that in such large transformations, sometimes we think we know everything until we get an outside perspective on how we actually are, how biased we are, and how ingrained in our DNA the things we do are. Opening ourselves up to get a different perspective matters, and DXC has been amazing for that, John especially, working with patience for the last two and a half years and helping us through this journey.
What we had not done two years ago, what we are doing in this organization in the last six months, I can count a lot of things that are happening around the organization. Continuous improvement is a theme, and all of these tools are being used in different parts of the organization, and it's a long way to go for us to continue.
Thank you very much for joining the session. We really appreciate it.
Q&A
John Ediger: Any questions, anybody? If there's time, come on up. You can ask anything. Question in the back. Go ahead, sir. I think so. We're not being kicked out yet.
John Ediger: That's a really good one. I think through coaching. One answer is have coaches work with the team and walk through that and talk about their measures and their goals and what success looks like. I think it's hard to do without some kind of enabling-type function or some kind of coach. I also hand them the Team Topologies book if they're readers. There's some good videos on that too about what makes a really good product mindset.
John Ediger: I think also the leadership is the other key part of that: the leaders understanding that, and how they're talking to their teams and asking the teams about what success looks like. It's a multifaceted thing. Leadership and coaches are the two things that I would say work the best.
Nagesh Kapalavayi: I think that also, it's an investment. Engineering and operations, most of the time you don't find scrum masters. You don't find product owners. But we do have all of those people in the value streams, and we have a whole product organization. It's an investment because essentially they are providing a service to IT as their customers. Our software engineering teams are providing a service to their business and customers, so we can't treat them differently. I think we should invest the same way in both places. The concept of product will follow it, in my opinion.
Alison Vernamonti: As you probably have heard in other sessions about OKRs, if a company is utilizing OKRs, you can leverage that to help you with those cost-of-delay factors. When you're thinking about the value add or even the time criticality, you can refer back to that area's OKRs to say, is this actually going to get us what we're wanting? Think about those as your cost-of-delay factors. It wouldn't relate to the bottom of the formula, your actual job size. But when you're really looking at that, you can be saying, what is the risk if we don't do these things based upon our OKR? If you have a particular measurement, like a KR that is trying to get to a certain goal, then you could say if we don't do this, then that probably would be a higher cost-of-delay factor. It's just one example.
John Ediger: In some sense, WSJF is used for large projects and trying to get the project smaller. But if you completely flip it to funding value streams and product areas, and have the local teams actually do that prioritization, it takes a lot of that pain away, of course, because they're constantly doing small iterations within that backlog. WSJF is especially good for these large organizations that are project-based, but the organization will morph over time to where we want to go.
John Ediger: For the first one, mostly the first, but it turns out when they go through a dojo that they learn new ways of thinking, new ways of working. So that's a byproduct, but it's mostly to get the right people together and solve a problem with coaches there to help facilitate.
John Ediger: Whatever the outcome, whatever the problem is, whatever the goal is, it can be applied in many different ways.
Nagesh Kapalavayi: In our case, in the examples that we're referring to, we used dojos in both places. For decoupling architecture, you bring the people from different value streams that contribute to the product together to talk about a common problem, and detach them from their day jobs. When we did the node provisioning work in infrastructure, you had middleware, you had network, you had storage and compute, and we brought all of those people together to really talk through that focused problem. The facilitation of that dojo helps them leave their biases behind as they come into the dojo.
Nagesh Kapalavayi: You also develop something. Rapid prototyping happens too, so there's a learning that can happen around it. We may not get an outcome explicitly out of a dojo, maybe, or maybe you might actually get a quick win. It depends upon the problem you're solving in the moment.
John Ediger: It actually relates to her question in the back, or the other question, which was how do you get product platforms to act as true product teams? If you have a dojo, you bring your customer, the engineers that are using the platform, into the dojo, and you have the platform team in the dojo forming that platform, working together. That is a great way to switch to that product focus.
John Ediger: I think we've overstayed our welcome here. I think people are leaving. Thank you very much. Thanks for joining.