Las Vegas 2023

Value Stream Reference Architecture — Breaking Free from “The Way We’ve Always Done IT” Mindset!

This session will discuss the challenges that remain in DevOps adoption concerning toolchain sprawl, mid-tier maturity stickiness, and lack of respect for Conway’s Law. We will demonstrate how these challenges can be managed using the Value Stream Reference Architecture (VSRA) to organize teams and software architecture for efficient flow of value to end users. VSRA provides a holistic approach that mitigates flow entropy and improves organizational change management using analytical techniques that draw from graph theory, Team Topologies, and a new approach to reason about flow that we call the FINE equations. A case study at SAS Institute will be presented that highlights how the improvements delivered by VSRA allowed its Customer Intelligence division to go faster and smarter with fewer team dependencies and lower impediments to flow.

Full transcript

The complete talk, organized by section.

Stephen Walters

Welcome. Thank you for joining us here today.

Just a little bit of background before we get started. The Value Stream Reference Architecture is a paper that was published in May, the authors being Craig and myself. We launched it at DOES in Amsterdam. So that gives you an idea of the timeframe from then, when I announced this in our lightning talk, to now, where Craig is going to provide us with a deeper update on what's happened so far and the kinds of successes, I think we can say, that have been achieved so far.

Before we get into that, I'll give you some of the information that came from DOES Amsterdam, that lightning talk, a bit of background into the paper.

First of all, why do we need a Value Stream Reference Architecture?

There are three particular problem areas that Craig and I were looking at. We'd already been having some discussions and talking about value stream management, but there were particular problem areas that we'd identified with DevSecOps transformation.

Problem number one: mid-tier stickiness. Anybody who's read the State of DevOps reports will recognize this one, that for four years in a row there was this mid-tier of 78, 79% of organizations that's stuck in a medium level of maturity. They haven't been able to achieve everything that they want to get from their DevSecOps transformation.

The reason for that, it was discovered, for the vast majority comes back down to this quote from Patrick. In the room or not? Always worrying when you quote someone who's in the room. But this is what he says, his definition of DevOps after 10 years: "DevOps is whatever you do to bridge the friction created by silos. All the rest is engineering."

Typically that 79%, they were doing the engineering. They'd done the tools, they got the CI/CD pipelines in place, but there were other elements. It was the culture. It was the processes. They weren't getting all of the value out of the DevSecOps transformation.

Problem number two: Conway's Law. I find it kind of amusing, I don't know if that's the right word, maybe hysterical, that for the last year AI's come out and we're all jumping about like it's the next great big thing. Yet Conway's Law came along in the 1960s and we're still battling with it. It's a law. It's not a guideline. It actually happens.

The problem is that our organizations are not set up for this. We still set up organizations in a pyramid hierarchy according to military tactics, which is how businesses were set up in the 18th century. We're still using that technique. Or we're doing matrix kind of processes because we believe this great DevOps rule that silos are bad, so we should all be talking to each other.

But the problem is, if you follow Conway's Law, if we're all talking to each other, then in our systems design, all of our systems are going to be talking to each other, and you end up with masses and masses of complicated integrations and dependencies within the system. Or you end up with dependencies that are based upon something defined by.

So good DevOps: identify strong identities, clear responsibilities, high degree of autonomy. We're all agreed on all of those so far. But, and most importantly, well-defined interaction paradigms and communication channels with other teams. Which means where you don't want a silo wall, don't have a silo. But if you don't need it, if the silo's perfectly fine, leave it.

Problem number three: toolchain sprawl. There are two surveys to back this up, and you've heard it at this conference multiple times now, where you've got information that's come from GitLab's own DevSecOps survey where they've identified both development and operations teams that are dealing with a plethora of tools. Now we can orchestrate those, but all of those have got different data repositories. So to get a view across our entire stream is turning out to be near impossible.

Add to that, the results: the way people get around that is with DIY dashboards. And DIY dashboards just mean that either it's not runtime, or it requires a lot of maintenance. There's a lot of work required on your part to keep that in place as upgrades are required, maintenance is required.

So how does the Value Stream Reference Architecture help? Well, there's three things. Those three things have something in common, and it's the word "organization." We're not talking about things at a team level here. We're talking about this at an organization level, the entire enterprise.

Now this is the value stream management implementation roadmap. This is something produced by the Value Stream Management Consortium, for which both Craig and I are influencers. They kindly updated this on the 1st of October, so this version is slightly out of date, but the new step is good: between go and start, they've put in an assess step, which means assessing the current state of your organization.

It's then about defining the vision, identifying your value streams, organizing around your value streams, mapping them. So do a mapping session, a value stream mapping workshop, to map your people against your processes, against your metrics. And then with all of that defined, then you can connect within a tool and provide a toolchain to do the automation that's required. And then it's about inspecting with correct metrics, flow metrics, value realization metrics, and depending upon those, adapt to them.

You can see three areas there. You've got DevSecOps platforms, you've got value stream assessments. These things happen, but there's a gap at the beginning, and we'll go into that in more detail later.

As well as the implementation roadmap, we lean heavily into Team Topologies. You'll have seen that mentioned before. But for us, it's not just about the team types. Everybody concentrates on the team types, and they're important. But because we're talking about this being at an organization level, it's also about those communication types: collaboration and facilitation.

Hey, let's hand over to you.

Craig Statham

Thank you. Thank you. I'm here to share what we've been discussing, with the same excitement I had when we first got together and started talking about all this, right? Do you remember that first conversation?

Stephen Walters

I do, but shall we remind everybody actually how that conversation went?

Craig Statham

Yeah, how that conversation. I think it went something like this. I think I started by going, "Craig."

Stephen Walters

"Craig, you're on mute."

Craig Statham

You know, working remote for two years now, we think we've figured that out already, right?

But you know, Stephen, this value stream management stuff, it really, I think, is great. I think fantastic. Value stream mapping, looking for those bottlenecks is awesome. Team Topologies, wow. I'm mind blown, right? Matthew and Manuel, what have they given us? It's something brilliant.

There's one problem, Stephen. I've got an existing organization. I just can't make sense of my organization. I don't know what my value streams are. I just can't see them. We've got a very complex team. I don't know what team we have. How do you do that?

Stephen Walters

I guess you guess, and you experiment, and you try it, and maybe it fails and maybe it's something.

Craig Statham

No, no, no. Look, you know SAS Institute, right? We're all about analytics. We take data and we create meaningful insight from that data. There's got to be an analytical solution to this.

Stephen Walters

That sounds like we could be onto something really big.

Craig Statham

I think it's going to be fine.

Yeah, that got a bit cheesy at the end, didn't it?

So we should have warned you about the bad acting from both of us, actually. I'm going to give you another warning. I'm going to show you something that you can't unsee, so apologies. Because once you see this, you'll have one of two reactions. You'll either go, "Hmm, this is pseudoscience," or you'll probably go, "Actually, that just makes so much sense. That's really common sense." And we hope it's the latter, that you'll see this as common sense.

But once you see this, you won't ever be able to unsee it. And this really backs up some of the theories that Gene and Steve and Mik Kersten, the things that these guys have been saying. It's kind of like the proof behind all of that. I'm going to show you something that's very obvious in terms of explaining why all of this kind of happens.

And we start with these four dimensions, what we call the FINE dimensions. That's flow, which is flow of work, flow of value. It's the only dimension that has a function of time.

Then there are impediments. These are things that slow the flow down and things that get in the way. That could be a lack of resource. It could be a new security requirement that slows the flow down. It's anything that gets in the way of flow.

And then there's needs. Needs is what we need. It's going to be the requirements. It's going to be the specifications. Hopefully, if we're doing TDD, it's going to be the tests. It's the pulling force. It's what pulls the flow towards the end user.

And then there's energy. What is energy? What is energy in knowledge work? What goes on up here? It's cognitive load, really. Energy is very similar to cognitive load when we think about it.

And these four dimensions are actually all related to each other, through some very simple mathematical equations. And I should give you the other warning here: there's a bit of maths involved. So I just want to do a quick check, because there are some prerequisites for all of this.

Can you just raise your hand if you know how to multiply two numbers? Everybody raising their hand? Good, good. Can you, another raise of hands, tell me, do you know how to divide two numbers? We're good. We're good, Steve. Everybody's good.

So this first relationship: flow is needs over impediments. If you think about this, when the needs go up, when we place more requirements on teams, we want them to do stuff faster. So we're increasing that, we're trying to increase their flow, we're trying to make them do more. So the flow of work is increasing when we place more needs upon them.

So when needs go up, flow goes up. When impediments go up, things are getting in their way, flow goes down. So the relationship is that divide relationship between those two things.

The other equation is this one. Now this one's a little bit harder to get your head around, but hopefully I'll kind of explain this. So energy is a function, it's a product of flow and needs. Because on this one, if flow's going up, if we're doing more work, then we're really expecting the teams to use more energy, more cognitive load, keep things going. Flow's going up. If flow goes down and we're doing slowification, then things are probably just going to get a little easier and energy's going to drop.

Same with needs. If we're throwing more needs on the team, then energy is just going to go up, because they're going to need to get more flow. Remember on that previous equation, flow is related to needs. So these two things are a product. If either goes up, then energy goes up. If either goes down, energy goes down.

So we have these two equations and they're both related to each other. So by simple mathematical deduction, we give you the FINE flow circle.
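Written out, the two relationships just described can be combined algebraically. This is a sketch that takes the spoken definitions literally. The symbols F (flow), I (impediments), N (needs), and E (energy) are shorthand introduced here and are not confirmed as the paper's notation.

```latex
\begin{align*}
F &= \frac{N}{I}  && \text{flow is needs over impediments} \\
E &= F \cdot N    && \text{energy is the product of flow and needs} \\[6pt]
E &= \frac{N^{2}}{I} && \text{substituting } F = N / I \\
E &= F^{2} \cdot I   && \text{substituting } N = F \cdot I \\[6pt]
N &= F \cdot I = \frac{E}{F} = \sqrt{E \cdot I} \\
I &= \frac{N}{F} = \frac{N^{2}}{E} = \frac{E}{F^{2}} \\
F &= \frac{N}{I} = \frac{E}{N} = \sqrt{\frac{E}{I}}
\end{align*}
```

Each of the four quantities can therefore be expressed in three ways from any two of the others, which is presumably what the circle on the slide lays out.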

Stephen Walters

This is where you said they'd all stand up and applaud. And I think you do need to let them know the butterfly net moment as to how we came up with it.

Craig Statham

Yes, we do.

So are there any electrical engineers, electronic engineers in the room? Anybody did electrical science in school physics, right? Recognize this? Do you see what this is, what it sort of resembles? This is Ohm's Law. It's exactly the same model as Ohm's Law.

So poor old Georg Ohm has been lying in his grave now since 18-something or other, kind of screaming at us, saying, "Guys, guys, you're pushing electrons around. That's all you're doing. You're pushing electrons around. I gave you these laws years ago."
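To make the Ohm's Law comparison concrete, here is a minimal Python sketch that works like an Ohm's Law calculator: give it any two of the four FINE quantities and it derives the other two. It assumes only the two spoken relationships (flow = needs / impediments, energy = flow × needs) and positive values. The function name `solve_fine` and the example numbers are hypothetical, and the values carry no standard units.

```python
import math


def solve_fine(flow=None, impediments=None, needs=None, energy=None):
    """Given exactly two FINE quantities, return all four as a dict.

    Assumes F = N / I and E = F * N, as stated in the talk, with all
    quantities positive. Units are arbitrary and illustrative only.
    """
    given = [v for v in (flow, impediments, needs, energy) if v is not None]
    if len(given) != 2:
        raise ValueError("provide exactly two of the four quantities")
    if any(v <= 0 for v in given):
        raise ValueError("quantities must be positive")

    F, I, N, E = flow, impediments, needs, energy
    if F is not None and I is not None:
        N = F * I
    elif F is not None and N is not None:
        I = N / F
    elif F is not None and E is not None:
        N = E / F
        I = N / F
    elif I is not None and N is not None:
        F = N / I
    elif I is not None and E is not None:
        N = math.sqrt(E * I)   # from E = N**2 / I
        F = N / I
    else:                      # needs and energy were given
        F = E / N
        I = N / F
    E = F * N
    return {"flow": F, "impediments": I, "needs": N, "energy": E}


# Hypothetical numbers: 12 units of needs against 3 units of impediments.
result = solve_fine(needs=12, impediments=3)
```

Read against Ohm's Law, flow sits where current does, needs where voltage does, impediments where resistance does, and energy where power does. That mapping is an interpretation of the equations' shape. The talk does not spell it out.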

So when Stephen and I kind of saw this, we did kind of think maybe we're going a bit nuts here because this seems a little too simple, really, in some ways. But we started applying it and started to try and look at this.

But what we need to do, first of all, is we need to understand organizations. Having these equations is great, but it doesn't really tell us anything until we start looking at the organizations.

So the VSRA is way more than just FINE. It's more than FINE. What we started to do is we started to look at the relationship between different teams, and in fact, in particular, the work from Team Topologies, looking at the four different team types.

What we recognized was that there was a generalization in terms of flow of work that happens between the teams. Generally, it's not the only flow that can happen, but generally this was the pattern and the flow. Well, this is a graph, right? This connection between the teams is a graph. So we started thinking, is there something around graph theory that we can use?

SAS is analytics, so graph theory is one of our tools. So we started diving into this and we actually found two centralities in graph theory that were useful: PageRank and betweenness centrality.

PageRank is the centrality that drives the internet, right? It's the billion-dollar piece of code, something like 10 lines long, that helps you find your way around the internet. What PageRank tells you is how important a particular node in the graph is to the other nodes.
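For readers who have not seen it, PageRank can be sketched in a few lines of plain Python using power iteration. This is a generic textbook version, not the code used at SAS. The edge direction (an edge points from a team to the team it depends on), the damping factor of 0.85, and the team names are all assumptions made for illustration.

```python
def pagerank(edges, damping=0.85, iterations=100):
    """Power-iteration PageRank for a directed graph given as (source, target) pairs."""
    nodes = sorted({node for edge in edges for node in edge})
    out_links = {node: [] for node in nodes}
    for source, target in edges:
        out_links[source].append(target)

    rank = {node: 1.0 / len(nodes) for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes with no outgoing links is shared out evenly.
        dangling = sum(rank[n] for n in nodes if not out_links[n])
        base = (1 - damping) / len(nodes) + damping * dangling / len(nodes)
        new_rank = {node: base for node in nodes}
        for source in nodes:
            for target in out_links[source]:
                new_rank[target] += damping * rank[source] / len(out_links[source])
        rank = new_rank
    return rank


# Hypothetical teams: three teams that all depend on one shared team.
ranks = pagerank([("web", "platform"), ("mobile", "platform"), ("email", "platform")])
most_depended_on = max(ranks, key=ranks.get)
```

In this toy graph the shared team collects the highest rank because every other team points at it, which is the "important to the other nodes" reading described above.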

Why is that important? Well, when a node of your organization becomes important, there's one of two things that are happening. Either that node is amplifying value because they're producing something that everybody's using, and that's great. And if it's a duplicate thing that they're providing, a close replica, then that's great. The value amplification really works.

But if they're not, if what they're providing is slightly different for each of the different people using it, then really what this centrality tells you is how much potential they have to impede the flow.

Generally, whenever we centralize things, what we're actually doing is really doing that replication thing. We're really creating impedance in our network because now we have a central point that essentially has a large blast radius. If they don't do what they're supposed to do, then everybody is going to suffer.

But betweenness centrality tells us how much a particular node is on the path to reaching value. So in this case, we can look at that centrality and understand which of the teams are sitting between the other teams. So if they have a high betweenness, then they're kind of in between.
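Betweenness centrality counts how often a node lies on the shortest paths between other nodes. The sketch below is a plain-Python version of Brandes' algorithm for a directed, unweighted graph with no duplicate edges. It returns unnormalised scores. The team names and the edge direction (dependent team to the team it depends on) are hypothetical.

```python
from collections import deque


def betweenness(edges):
    """Unnormalised betweenness centrality (Brandes' algorithm, directed, unweighted)."""
    nodes = sorted({node for edge in edges for node in edge})
    neighbours = {node: [] for node in nodes}
    for source, target in edges:
        neighbours[source].append(target)

    score = {node: 0.0 for node in nodes}
    for start in nodes:
        # Breadth-first search that counts shortest paths from `start`.
        order = []
        preds = {node: [] for node in nodes}
        paths = {node: 0 for node in nodes}
        dist = {node: -1 for node in nodes}
        paths[start], dist[start] = 1, 0
        queue = deque([start])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in neighbours[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    paths[w] += paths[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes, accumulating path shares.
        delta = {node: 0.0 for node in nodes}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += paths[v] / paths[w] * (1 + delta[w])
            if w != start:
                score[w] += delta[w]
    return score


# Hypothetical chain: "channels" reaches "data" only through "journeys".
scores = betweenness([("channels", "journeys"), ("journeys", "data")])
```

Here the middle team is the only one with a non-zero score, since it is the only team sitting between two others.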

What we found was that by looking at these two centralities, we could do classification. We could look at whether they had high betweenness and high PageRank. It's basically a table. You see this table here. If they have low betweenness and low PageRank, they're probably a stream-aligned team. If they have low betweenness but high PageRank, they're probably an enabling team. If they have high betweenness but low PageRank, they're probably a complex subsystem. And if they have high betweenness and high PageRank, they're probably a platform team.
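The table just described can be written as a small lookup. This is a sketch only: the talk does not say how "high" and "low" are decided, so the cut-off parameters below are hypothetical and would need to be chosen per organization (a median split is one possible choice).

```python
def classify_team(betweenness, pagerank, betweenness_cut, pagerank_cut):
    """Map two centrality scores to the team type named in the talk's table.

    The cut-offs separating "high" from "low" are assumptions; the talk
    does not state how they were chosen.
    """
    high_betweenness = betweenness > betweenness_cut
    high_pagerank = pagerank > pagerank_cut
    if high_betweenness and high_pagerank:
        return "platform"
    if high_betweenness:
        return "complex subsystem"
    if high_pagerank:
        return "enabling"
    return "stream-aligned"


# Hypothetical scores and cut-offs for a single team.
team_type = classify_team(betweenness=0.1, pagerank=0.05,
                          betweenness_cut=0.5, pagerank_cut=0.2)
```

The word "probably" in the talk matters here: the output is a starting hypothesis about a team's type, to be checked against what the team actually does.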

Now, when we started looking at this a little bit more, we wanted to also understand what did energy mean? What did cognitive load mean?

In Team Topologies, the interaction styles that exist between teams are of three types. You either do stuff as a service because you're a kind of platform team, or you're collaborating because maybe you're a complex subsystem team, or maybe you're doing facilitation because you're an enabling team.

And what we figured was that there is actually a relationship between how much cognitive load is being taken on in those different interaction styles. And we came up with this idea of what we call cognitive slope.

Cognitive slope's actually a double-sided slope. And what it tells us is how much cognitive energy either side of the interaction is actually taking on. When we measured this, we could apply that into our graphs to give us an average value of cognitive load for any particular node in the system, based upon the interactions that were happening with all nodes. And here's an example of how cognitive load shows up for those different team types.
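One way to picture the averaging step is the sketch below. Each interaction carries a pair of weights, one for each side, and a team's cognitive load is the mean of the weights it picks up across all its interactions. The talk gives no numeric values for the slope, so the weights here are made-up placeholders, and the function and team names are hypothetical.

```python
from collections import defaultdict


def average_cognitive_load(interactions, slope):
    """Average the per-side load each team takes on across its interactions.

    interactions: iterable of (team_a, team_b, style)
    slope: style -> (load_on_team_a, load_on_team_b)
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for team_a, team_b, style in interactions:
        load_a, load_b = slope[style]
        totals[team_a] += load_a
        counts[team_a] += 1
        totals[team_b] += load_b
        counts[team_b] += 1
    return {team: totals[team] / counts[team] for team in totals}


# Placeholder weights only: the talk does not publish these numbers.
example_slope = {
    "as-a-service": (1.0, 3.0),
    "collaboration": (2.0, 2.0),
    "facilitation": (1.0, 2.0),
}
loads = average_cognitive_load(
    [("channels", "data", "as-a-service"),
     ("channels", "scoring", "collaboration")],
    example_slope,
)
```

Keeping the two sides of each weight separate is what makes the slope double-sided: the same interaction can cost one team more than the other.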

The next thing we did was to look at flow in terms of what the relationship meant with the FINE flow equations, and this thing we call flow entropy. Because when you think about flow, flow is made up of things that we want the user to have and the things that we don't want the user to have.

So things we don't want them to have may be a missed requirement, or it could be a defect, or any other things that we see that get to the user that we go, "Oh my goodness, I wish they didn't see that." So there's good flow and bad flow.

But the thing about energy is that just like the real world, it's finite. There's only so much energy to go around in the universe. And that's very true of cognitive load. Who's ever experienced developers that are feeling burnt out? They've reached their limit. They can't do any more. We can't push them any harder. There's no use trying to push them any harder. They've reached their cognitive load limit.

And so what we find is that when teams reach that limit of energy, there's no more flow to be had. And in fact, flow starts to decline. The thing about bad flow is that that becomes rework. And the rework is an impediment to new good flow.

So this is a cycle that happens, and teams that get caught in this cycle find themselves very quickly grinding to a halt, and nothing good's ever getting done. How many of you have found yourself sitting in a team and you're kind of wondering, "Why the hell is this team not doing what it's supposed to do? We just seem to be getting nowhere." That's flow entropy.
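The cycle can be illustrated with a toy simulation. This is an illustrative model built only from the statements above (flow is needs over impediments, energy is capped, bad flow becomes rework, rework impedes new flow). It is not the paper's definition of flow entropy, and every parameter name and number is hypothetical.

```python
def simulate_flow_entropy(needs, impediments, energy_limit, bad_share, steps):
    """Toy loop: a share of each step's flow is bad and returns as rework."""
    good_flow_history = []
    for _ in range(steps):
        flow = needs / impediments              # F = N / I
        flow = min(flow, energy_limit / needs)  # keep E = F * N within the limit
        bad_flow = flow * bad_share
        good_flow_history.append(flow - bad_flow)
        impediments += bad_flow                 # rework impedes later flow
    return good_flow_history


# Hypothetical team: 20% of what it ships comes back as rework.
history = simulate_flow_entropy(needs=10, impediments=2, energy_limit=100,
                                bad_share=0.2, steps=5)
```

With any non-zero share of bad flow, good flow falls step after step even though the needs never change, which is the grinding-to-a-halt pattern described above.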

So let me talk a little bit about how we applied this at SAS, because this is a talk about how we've improved things at SAS by using these things.

I'll give you a brief history of SAS. First of all, this is tailored towards my own division at SAS, which is Customer Intelligence. But you can see that we've been around for some time. 1976 is when we were founded.

But the bean counters in the room are looking at those revenue values, right? Those revenue values you can see climbed pretty quickly over the years. In fact, we prided ourselves on having double-digit growth for a long, long time.

But in the 2000s and 2020, when DevOps was kicking in and we should be accelerating, growth kind of didn't follow the pattern. It's like DevOps isn't really doing it for us. We're in that stickiness, that DevOps. It's not really providing the value we'd really hoped for.

Why is that? Well, this is because flow is messy. Organizations are messy and flow is messy. This was the Customer Intelligence division teams. We have 120 developers across 12-ish teams, but they're all interconnected with each other and we have no idea how.

So I said to Stephen, I have no idea where my value streams are in all this. We treat the teams the same. They're all just teams. We don't know what team type they are.

So what we do when we apply the FINE analysis, or the FINE flow analysis, is if you can take that graph that you've got of your teams and at least put them in this matrix, which shows the dependencies between the teams, you've got data. When you've got data, you can do analysis. And when you can do analysis, you can create insight.
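A dependency matrix of the kind described can be built from a simple list of "team A depends on team B" pairs, as in this minimal sketch. The two team names come from the talk, but the dependency between them, the third team, and the function names are hypothetical.

```python
def dependency_matrix(teams, dependencies):
    """Square 0/1 matrix: row i, column j is 1 when teams[i] depends on teams[j]."""
    index = {team: position for position, team in enumerate(teams)}
    matrix = [[0] * len(teams) for _ in teams]
    for dependent, provider in dependencies:
        matrix[index[dependent]][index[provider]] = 1
    return matrix


def matrix_to_edges(teams, matrix):
    """Turn the matrix back into (dependent, provider) edges for graph analysis."""
    return [(teams[i], teams[j])
            for i, row in enumerate(matrix)
            for j, cell in enumerate(row) if cell]


# Illustrative data only; "journey design" is an invented team.
teams = ["digital channels", "data engineering", "journey design"]
matrix = dependency_matrix(teams, [
    ("digital channels", "data engineering"),
    ("journey design", "data engineering"),
])
edges = matrix_to_edges(teams, matrix)
```

Once the dependencies are in this form they are ordinary graph data, which is the point being made: with data you can do analysis.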

What we did is we created insight around our teams by doing the analysis and looking at what team type they were. You can see from the names that actually what you see is, you go, "Oh yeah, of course. Of course the digital channels team are a stream-aligned team. They're closest to the customer. Of course the data engineering team is a platform team. They provide services as a service to all the other teams." That's at least what we should have them be doing.

And when we look at these values, the flow, what do we notice about the stream-aligned teams? The flow's the highest in the stream-aligned teams. What do we notice about the impediments? Impediments are high for those platform teams. If they don't do their job properly, they're getting in the way.

Needs, again, the needs we place are high for the platform teams because we've put all that work on them. And complex subsystem teams are coming out in the middle. But the stream-aligned teams, they're kind of having a bit of an easier time with it, but they're still getting a lot of good work done.

And energy, we can see where the cognitive load is in all of this. So these numbers just make sense. It's important to realize that these numbers, they're not standard-unitized numbers. There's no standards institute for these numbers, and you shouldn't look at these numbers and compare them to other organizations that are not connected to the one that you're focused on. These only apply to the organization under scrutiny.

But they tell you something. What did this tell us? What it told us was that this was our architecture, this was our team topology, this is what our teams really looked like. This is much easier to understand. It's much easier to view and see those team types.

But more importantly, only 27% of our organization was stream-aligned. That means only 27% of our organization is actually providing value to the end user, the end user we really care about.

So we had to change this because we had too few stream-aligned teams, too many complex teams, too many enabling teams. So that's what we did. We maximized the flow with the Value Stream Reference Architecture, and we got all these great advantages when we did that.

Just to show you outcomes that happened from this, this was a quote from our VP of our division recently. Once we started to do this, he made a statement: "We've delivered these net-new revenue-generating capabilities. Absolutely when we said we would. We delivered them on time."

So it's really a lot of confidence building within the business. The business now has confidence that we can deliver. And actually just recently, I'll share this, he did share the Q3 results. I can't give you all the details, but we now have growth. We have double-digit growth, and the first number doesn't begin with one. That's pretty powerful.

So I'm going to hand back to Stephen.

Stephen Walters

How quickly can I get through this last bit?

So what feedback have we had on the Value Stream Reference Architecture so far? Four key areas.

Agile organizational change management. This now gives the capability for us to implement Conway's Law in a positive way with the inverse Conway maneuver. As we're making agile, iterative changes to our system design, then there is a threat under current structures that our system design will slowly deviate away from our hierarchical structure.

If we can make incremental changes to our organization that reflect the changes in our system design, then there's no more big-bang reorganization every three years, or even worse, a big re-architecture exercise. We keep the two in line. We use Conway's maneuvers as a tool.

Optimize continuous improvement, using the FINE flow equations to identify when there's potentially cognitive load stress on a particular team, or when flow seems to be impacted within the team. When is flow entropy getting too much within a team? That's when we need to remove impediments.

Cognitive load measurements and analysis, identifying the teams that are under stress. What can we do to reduce that stress? Create models of different types of organizational structures through the FINE flow analysis to find the optimal way that our organization can work.

And finally, team identity alignment. How many of those complicated subsystem teams do not have to be complicated? They could be stream-aligned teams that are delivering value to an end user or to a customer.

So what's next? Pilots. We have some underway. You've heard about SAS. We're in conversation with a major European telecommunications provider who wants to implement the VSRA as a proof of concept. And from that we want feedback and results, the good, the bad, and the ugly, plus case studies and the calculations that you see on there.

There is a FINE flow toolkit. There is open source code available. That's about us. Quick, take a picture. Please do connect with us on LinkedIn if you want to know anything more.

You can download the paper from this address. Don't worry about getting the link. I've put it in the chat channel on Slack for this room, and also for the FINE flow toolkit, which is available from GitLab. Please, everyone can contribute. Get involved. Do your part. Have fun. See what happens. Your organization, we advance.