Deliver Faster: Rethinking Software Team Velocity
Everyone is looking for ways to deliver quality software faster. In this talk, Kate will cover models, patterns, and ideas to help you think about your org design, communication structures, and system parameters to optimize for velocity. You will come away with new ideas for thinking of collaboration, strategies to help you navigate matrix organizations, and tips to improve your teams' execution.
Full transcript
The complete talk, organized by section.
Kate Matsudaira
Hello. Hopefully everybody's doing well right before lunch. I'm super excited to talk to you today about software team velocity, and in particular, how do we build products faster?
A quick bit about me. I am the VP of Technology at SoFi, responsible for all of the engineering and product teams that build the financial services products. Before that I was at Splunk, and before that I was in startups. And I think no matter where I've worked, the one question I always get from people is, how do we do it faster? How do we execute better?
I'm going to tell a story about a time I took over a really large organization, in the hundreds of people, and my manager told me, 'Kate, we have to fix it. This team is not executing well.' And the question became: how do you fix a team that isn't delivering? What do you need to do? I always think about it like the book The Goal (which Ian referenced in his presentation) and the theory of constraints: you have to find the bottleneck and then you have to fix it.
So I'm going to go through the story of five different teams — what the problem was and what we did about it. I'm going to try to weave in lots of tips and tricks that I've built over the years around managing large organizations.
01 Team One: perception vs reality, and Direction (not Speed)
The first team — the feedback was: they haven't delivered anything in the whole last year. But when I got on the ground and started meeting with people, they were doing a ton. People were checking in code and lots was happening. The thing is, what they were working on wasn't actually moving the business forward. There was an issue of perception and reality.
When we think about velocity, velocity is two things: speed and direction. In this case, they had a problem with their direction. They were running towards a light at the end of the tunnel, where you can't really see what it is, but then you get to the other side and it's not working. They were doing a ton of work and building a ton of features, but when they launched them, customers weren't using them. They weren't engaged. So they weren't having any impact on the business. And so when upper leadership or people outside of the organization look, they're like, what is this team doing? There are 50 people and we're not seeing any progress.
What you have to do as a leader, especially a senior leader, is figure out how to connect the people working on problems with the customers. If any of you have ever worked in a really large organization, one of the biggest challenges is that the people working on the product are so far removed from the people actually talking to your customers. You have to build systems to fix this.
So we did things like:
- Created a monthly listening-calls meeting, where we would play recordings from people calling in to talk about their challenges with the product.
- Created a shadow rotation where engineers and product managers could go sit with the operations team and actually hear from customers using the product.
- Created weekly reports that went out to the whole organization with feedback and a digest of all the tickets coming in.
This was really helpful in giving people a better signal of what customers are experiencing.
The reason this is so important — and I love this equation — is that your success is the summation of all the decisions everyone is making on your team every day. If they make good decisions (should we build this to scale to support millions of users, or do we just need to get it out as an MVP?), the better decisions they make on the ground, the more successful you will be. So as a leader, your challenge is: how do you make those decisions good? How do you build alignment from top to bottom?
Some of the ways we did it: a clear mission and vision; bringing people close to the customers; OKRs and roadmaps. I'm a huge fan — every team should have an 18-month sliding roadmap. I'm not a big fan of annual planning. My teams plan every single quarter, and we build brainstorming, innovation, competitive research into that all the time, all year. When we're asked to do our annual plan, it's the easiest because we already have it. Think about how you can build systems and do that all the time.
02 Team Two: name the project (or you'll never get credit)
Team two — same feedback: they aren't having an impact. But in this case it was a totally different problem.
The team was delivering a ton of work, but they had a legacy product. Their product was 10 years old and they were a well-oiled machine. They were doing Agile, shipping every week, doing tons of work. But a lot of it was very iterative. They didn't have a big splashy launch, they didn't have these big things to speak to.
As a leader, one of your best tools in your arsenal is naming a project. One of the biggest mistakes I see from senior directors and directors — they have to learn this lesson the hard way — if you have a project like 'operational excellence' or a 'quality push,' or any number of these things that is a whole bunch of small tickets, put a name to it. Project Orion, Project Apollo, whatever. Use that to communicate status upward, group this work together, and then at the end the amalgamation of all this work — you can measure the impact and talk to it. It also then gives you something to put in people's promotion packets versus, oh, I worked on all these little tickets.
You've got to package your work in a smart way. We did this and it drastically changed the opinion of this team. And all we did was name the projects and report status differently.
03 Team Three: Speed, throughput, contention, and the Universal Scalability Law
With Team Three, the problem was actually really different. The feedback was: 'the team is putting in all this work, but it really shouldn't take as long as it takes them.'
The interesting thing here, if we go back to our velocity equation, is that their challenge actually was with speed. Their direction was just fine — they were just taking too long. If you think about the things that go into speed, you have distance (or the work you're doing) over time.
When you think about speed, there are two factors: throughput (how much work your company can get done in an amount of time) and latency (how long a piece of work takes to get through the system).
This team suffered from throughput. These are my definitions, not someone else's — bear with me. I think about throughput as the things that are universal with every project the team does: tools, long build times, tech debt. It's a thing no matter what project you're doing — it's going to impact the throughput.
How did this apply to Team Three? Well, let's go to the Universal Scalability Law. If you're not familiar with it, Neil Gunther wrote a book about it. He talked about how in an ideal system the throughput scales linearly with the load on the system: as you add more capacity, it works linearly. Adrian Cockcroft wrote a blog post where he took the same concept but applied it to organizations. I'm shamelessly stealing their work, but I love this model.
Some tasks do scale linearly with people. I always use the example of wrapping Christmas presents: with two people, it goes twice as fast.
So in Team Three, when you added more people, it didn't make a difference. Why? Every decision required a meeting. The manager was really, really involved — he was almost a micromanager. There was contention for his time and his approvals. But this can show up in a lot of different ways. It doesn't just have to be a leader who doesn't delegate well — it can be a senior engineer or subject-matter expert that has to review MRs, or access to an environment, or an infra team's tickets. There are all different ways contention can show up in your system. As a leader you've got to think about how you remove those bottlenecks.
If you look at this applied to our graph, when you have contention, as you add more people, it flattens out. You can't add more people because there's contention for a resource.
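To make the flattening curve concrete, here is a minimal sketch of Gunther's Universal Scalability Law with only the contention term switched on. The function name, the 10% contention value, and the team sizes are illustrative assumptions, not numbers from the talk.

```python
def relative_capacity(n: int, sigma: float) -> float:
    """USL with the coherence term set to zero: C(N) = N / (1 + sigma * (N - 1)).

    n: number of people working on the task
    sigma: fraction of the work that queues on a shared resource (assumed value)
    """
    return n / (1 + sigma * (n - 1))


if __name__ == "__main__":
    for people in (1, 2, 5, 10, 50, 100):
        ideal = relative_capacity(people, sigma=0.0)       # linear scaling
        contended = relative_capacity(people, sigma=0.10)  # hypothetical 10% contention
        print(f"{people:>3} people: ideal {ideal:6.1f}, with contention {contended:5.2f}")
```

With 10% contention the curve can never exceed 1 / 0.10 = 10 units of capacity, however many people are added, which is the flattening described above.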
The way we solved it: we pulled a page from Amazon's book — the one-way and two-way doors framework. A two-way door is a type of decision that is reversible — you can change it, you make a decision, and if it's not working out well you can go back. Those decisions should be made very quickly and pushed down as low in the organization as possible. But for decisions that are not easy to reverse — hiring someone, vendor selection with a big contract — those decisions should be made with a lot more scrutiny and maybe take a bit more time. As a decision-making framework, you can actually empower your teams to go faster and really think about what decisions do I as a leader or my leaders need to be involved in?
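One way to picture the framework is as a routing rule. The sketch below is only an illustration: the `Decision` fields, the owner labels, and the example decisions are hypothetical, not Amazon's or the speaker's actual process.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Decision:
    title: str
    reversible: bool  # True = two-way door, False = one-way door


def route(decision: Decision) -> str:
    """Return who should own the decision (labels are hypothetical)."""
    if decision.reversible:
        # Two-way door: cheap to undo, so decide quickly and as low as possible.
        return "team"
    # One-way door: hard to undo, so it gets more scrutiny from leadership.
    return "leadership review"


if __name__ == "__main__":
    examples = [
        Decision("Rename an internal API field behind a feature flag", reversible=True),
        Decision("Sign a multi-year vendor contract", reversible=False),
        Decision("Hire a new team member", reversible=False),
    ]
    for d in examples:
        print(f"{d.title}: {route(d)}")
```

The point of keeping the rule this small is that anyone on the team can apply it without a meeting; only the one-way doors need a leader's time.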
The other problem in this team was that every meeting required follow-up. The interesting thing about process is that there can be really good reasons — if you don't have enough process, it's chaos. But if you have too much process, it can really impact your ability to execute. Process is a Goldilocks theory — you've got to be somewhere in the middle.
In large organizations that have a lot of memory, you get process and approvals and controls the way you get barnacles on a ship. At first they don't really slow you down — they might even be invisible. But after a while you get so many that you're actually going really slow. As a leader, you've got to figure out how to remove your barnacles. One of the things we do in my teams: every six months we use part of our staff meeting to retrospect on our process. We used to do it faster when I was newer to the organization — we would ask: are the meetings working for us? Do we have a good agenda? Are we making progress? Do we find the meetings valuable? We would use fist-to-five — Google it — to gauge value, and if the meetings weren't valuable we'd say, okay, what do we need to do as a leadership team to fix it? Institute this sort of retrospective process in your leadership organization to really think about whether you're working effectively, and how you remove those barnacles.
So what we did: pushing decision-making down, and improving our process.
04 Team Four: Latency, distributed systems best practices, Conway's Law, and people problems
Team Four — totally different problem, but one we see a lot in big companies: working with Team Three is making us slow, we have a dependency and they're not giving us what we need. How many people have seen that happen? Everyone.
I like to think of this as latency. The difference between throughput and latency, in Kate's definition, is that latency is the things that are maybe inconsistent between projects — the unplanned things that come up and are harder to predict. Incidents that have a lot of aftermath, people going on leave, complex team structures.
In Team Four's case, the challenge was technical. Their systems were really tightly coupled. They had one big monolith — to do a deploy of either team, they had to work together. These teams reported to the same VP. It was very problematic.
The interesting thing about best practices in systems design and distributed systems is that if you apply them to your team and your architecture, it makes you move fast. I don't have a lot of time so I'll whip through this — my slides will be available:
- Smart APIs and interfaces.
- Being smart about your data: a single-writer principle.
- Embracing the single-responsibility principle for all of our services. This is one of the design tenets in every single design doc in my organization.
- Standardizing your platform, with people using common technology.
- Small iterative releases and investing a lot in testing.
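As a small illustration of the single-writer principle from the list above, the sketch below lets exactly one owner mutate a piece of data while everyone else only reads it. The class and service names are hypothetical, and a real system would enforce ownership at the service or database boundary rather than inside one process.

```python
from typing import Dict, Optional


class SingleWriterStore:
    """Key-value store where only the designated owner may write."""

    def __init__(self, owner: str) -> None:
        self._owner = owner
        self._data: Dict[str, str] = {}

    def write(self, caller: str, key: str, value: str) -> None:
        # Reject writes from anyone except the single designated writer.
        if caller != self._owner:
            raise PermissionError(f"{caller} is not the writer for this store")
        self._data[key] = value

    def read(self, key: str) -> Optional[str]:
        # Reads are open to every caller.
        return self._data.get(key)


if __name__ == "__main__":
    store = SingleWriterStore(owner="accounts-service")  # hypothetical service name
    store.write("accounts-service", "user:1", "active")
    print(store.read("user:1"))
    try:
        store.write("billing-service", "user:1", "suspended")
    except PermissionError as err:
        print("rejected:", err)
```

Because only one team's service can change the data, other teams do not need to coordinate with each other about writes, which is the coupling this practice is meant to remove.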
These all work well when you're designing systems, but it doesn't really help if you have a current bottleneck. When you're dealing with tightly-coupled teams, the fastest way to fix velocity is to start with the people problems.
What was the people problem in Team Four? Any change required people from both teams to get in a meeting and make a decision.
Why is this bad? If you think about communication pathways in one team, there's a lot. With two teams, there's a lot more. You have to keep everyone aligned, and there's a quadratic (N²) increase as you add more people. Going back to our scalability model: we had the ideal system; with contention it flattens; with coherence (keeping people on the same page) it actually gets worse as we add more people. This matches our intuitions about how big companies work: the more stakeholders you have around the table (risk, legal, security, infra), the harder it is to get things done.
As a leader, how do you fix that? How do you streamline those interactions? How do you create systems and templates? How do you design your organization? I think a lot about org design to make it easy for teams to work well together. Use distributed-systems best practices both in systems and in teams to minimize the amount of coordination between people.
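The growth in communication pathways and the shape of the coherence curve can be checked numerically. This is a sketch under assumptions: it uses the standard pairwise-channel count n(n - 1)/2 and Gunther's USL formula, and the `sigma` and `kappa` values are made up for illustration.

```python
from math import sqrt


def pathways(n: int) -> int:
    """Number of distinct person-to-person channels among n people."""
    return n * (n - 1) // 2


def usl_capacity(n: int, sigma: float, kappa: float) -> float:
    """USL: C(N) = N / (1 + sigma*(N-1) + kappa*N*(N-1)).

    sigma: contention (queueing on a shared resource)
    kappa: coherence (cost of keeping everyone in sync)
    """
    return n / (1 + sigma * (n - 1) + kappa * n * (n - 1))


def peak_size(sigma: float, kappa: float) -> float:
    """Group size at which USL capacity peaks: sqrt((1 - sigma) / kappa)."""
    return sqrt((1 - sigma) / kappa)


if __name__ == "__main__":
    for people in (5, 10, 20):
        print(people, "people ->", pathways(people), "pathways")
    sigma, kappa = 0.05, 0.01  # hypothetical values
    print("capacity peaks near", round(peak_size(sigma, kappa), 1), "people")
    for people in (5, 10, 20, 40):
        print(people, "people -> capacity", round(usl_capacity(people, sigma, kappa), 2))
```

Doubling a group from 10 to 20 people takes the pathway count from 45 to 190, and once the coherence term dominates, total capacity falls rather than merely flattening.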
05 The matrix-org probability math
How many people work in matrix organizations? A lot. I've worked at Google for a long time — heavily matrixed. One of the big challenges you see is 'we can't get the support we need from our partner teams. There are competing priorities.'
Let's go through a math exercise. If you have one person working on a project, their probability of on-time delivery is roughly equal to their effort. I don't believe in 100% in anything — maybe it's dealing with SLOs for a long time — so 99%.
- Two people both giving 99% = 98% chance of on-time success
- Three people all giving 99% = 97%
Now imagine one of those three people is on a different team. You need 100% of their time, and you budgeted for that, but something comes up (a competing priority) and they can only give you 80%. Now the chance of success drops to 78% (0.99 × 0.99 × 0.80).
The more people you add to your project, the lower the probability gets. It falls off exponentially with the number of people:
- 15 people at 99% each = 86% on-time chance
- 15 people at 95% each = 46% on-time chance
95% sounds really good — but you only have 95% of a person. As a leader, you've got to think about this when designing your teams. How do I bring teams closer together? How do I create virtual or permanent structures that will help me achieve my goals?
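The arithmetic in this exercise is a product of each person's commitment, which a few lines of Python can reproduce. This is a sketch of the model as described, treating each person's share as independent; the function name is an assumption, and the 78% case is reproduced by assuming three people with one of them at 80%.

```python
from math import prod
from typing import Sequence


def on_time_probability(commitments: Sequence[float]) -> float:
    """Multiply each person's committed share, treating them as independent."""
    return prod(commitments)


if __name__ == "__main__":
    scenarios = {
        "2 people at 99%": [0.99] * 2,
        "3 people at 99%": [0.99] * 3,
        "3 people, one at 80%": [0.99, 0.99, 0.80],
        "15 people at 99%": [0.99] * 15,
        "15 people at 95%": [0.95] * 15,
    }
    for label, commitments in scenarios.items():
        print(f"{label}: {on_time_probability(commitments):.0%}")
```

Because the shares multiply, one partially available person pulls down the whole project, and small per-person gaps compound quickly as the group grows.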
A lot of people have talked about value streams in different ways. This is yet another way of visualizing the same thing — where you align for the outcome versus the activity they're doing.
The most important lesson, though, is that it doesn't actually matter what your structure is. If you ask me what's the best organizational structure to move fast? I would say: it totally depends. You have to understand your goals. Sometimes you'll move fastest by having deep domain experts that go deep on a product. Other times this aligned-for-outcome model can work well. But this model can also go really wrong. If you catch me on the side, ask me to tell you the story about the team that has a small product with 300 microservices but should only have like seven. When you build things in a silo and you're really focused on execution, you're not always thinking about those around you. There are pros and cons.
What you want to do is take what are my goals for the next two years, and who are the people. I always think about org design as a two-year horizon — if you can design an org and not reorg for two years, that's really good. Design your organizations and your teams to optimize for internal cohesion and aligned goals.
06 Team Five: Escalations, friction, and locality
Team Five — this one's really interesting. They had a lot of dependencies, but for them, everything required an escalation.
If you have any pair of people working on a project, there's always the probability of some friction. Friction can come from personalities, work styles, disagreement on approach, uncertainty in a problem space. 'There are wrong answers, but there are seldom right answers' is the quote I always tell people.
These two people working together can have a disagreement. If they're on the same team, this is pretty easy: they can just go to their manager, who resolves it.
But let's say they have a disagreement and they report to two different managers. What happens? They meet with their managers — that's two meetings — and then their managers meet with their manager to resolve it. The escalation goes up. And of course, Batman can fix anything (according to my children).
The further up you go in the org chart, the more overhead is required to collaborate. The further removed from the teams, the more costly it is to do an escalation. Just think about it — if it's just a manager, you can maybe have an informal conversation. If you go to a director, maybe it's a little more formal. But going up to a VP or multiple VPs, you put a lot of effort into those presentations and conversations — and the teams do too.
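How far an escalation has to travel can be read straight off the org chart: it is the distance to the lowest manager the two people share. The sketch below assumes a made-up org chart and hypothetical function names; it only measures distance and does not model the cost of each meeting.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical org chart: each person maps to their manager (None at the top).
ORG: Dict[str, Optional[str]] = {
    "vp": None,
    "director_a": "vp",
    "director_b": "vp",
    "manager_1": "director_a",
    "manager_2": "director_a",
    "manager_3": "director_b",
    "alice": "manager_1",
    "bob": "manager_1",
    "carol": "manager_2",
    "dave": "manager_3",
}


def chain_of_command(person: str, org: Dict[str, Optional[str]]) -> List[str]:
    """Managers above `person`, nearest first."""
    chain = []
    boss = org[person]
    while boss is not None:
        chain.append(boss)
        boss = org[boss]
    return chain


def escalation_point(a: str, b: str, org: Dict[str, Optional[str]]) -> Tuple[str, int]:
    """Return the lowest shared manager and how many levels above `a` they sit."""
    b_chain = set(chain_of_command(b, org))
    for levels_up, boss in enumerate(chain_of_command(a, org), start=1):
        if boss in b_chain:
            return boss, levels_up
    raise ValueError("no shared manager found")


if __name__ == "__main__":
    for pair in (("alice", "bob"), ("alice", "carol"), ("alice", "dave")):
        resolver, levels = escalation_point(*pair, ORG)
        print(f"{pair[0]} and {pair[1]}: resolved by {resolver}, {levels} level(s) up")
```

In this made-up chart, two people on the same team resolve a disagreement one level up, while two people in different directors' orgs have to go three levels up to the VP.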
I'm a big believer that escalation is a good thing, but high-up escalations can really slow things down. As a leader, how do you make it very easy to remove friction? Make sure your leaders know how to do this well and are trained on how to resolve it, because you don't want to let it fester — it will poison the well and make collaboration even harder going forward.
07 Conway's Law as a tool
I'd be remiss if I didn't say anything about Conway's Law. The way I think about Conway's Law is maybe different from other people: Conway's Law happens to you. I always think about it as how do I use Conway's Law to further my goals? If I know two teams are going to be working really closely together over the next two years, I might go to the other organization and say: how do we make these teams closer together? How do we put them under one leader?
Having these long-term roadmaps that I talked about at the beginning, and being able to really think about what is the right org design — it doesn't even have to be a formal design, it can be virtual. How do we bring them together? How do we create an identity around this project and working together to build something?
As a leader, you've got to figure out how to increase locality and proximity and really minimize the amount of hierarchy in your organization so people can move quickly. That's the TL;DR for org design for speed.
08 The two most useful slides: Tufte-style roadmap and resourcing trend chart
That's the end of our tale. But the really important part is that this was a multi-year journey to diagnose all these problems (technical debt, operational issues, all the people problems). There was one thing we did really well that I want to share with you, and they're probably the two most useful slides of my whole presentation: the slides we actually used to bring a lot of non-technical stakeholders along on this journey.
If you go back to the beginning of my presentation, a lot of it was about perception — how do we help people see that actually we're doing really great work and it just takes time?
09 Tufte-style roadmap
So we moved to presenting our roadmaps in this way. We took a page from Edward Tufte's book — how do we fit as much information in a visual as we can that will resonate with a non-technical audience? You can have any categories you want (they can even be project names). But this notion of what was in progress and what was planned — there are little icons that show compliance work or risk; there's actual text that shows how much money it is, or if it's designed to improve NPS comments.
One of the misunderstandings we had came from representing projects how a lot of people do: 'here's our quarterly plan, here's what we're doing next quarter,' as a list of projects with maybe what was below the line. The nice thing about this format is that it actually helped leadership that was not technical understand these longer projects and how they were staffed.
10 Resourcing trend chart (replace the pie chart!)
The other slide, and this is probably my favorite — if you do your resourcing, a lot of leaders do this pie chart: 'here's my percentage working on KTLO, here's my percentage of, fill-in-the-blank.' We moved to a different model — it has that exact same information, but it allows you to actually see trends. In this example you'll see the yellow bar (compliance and loss reduction) going down — the team is doing a bunch of work to bring that down, and you see the trend. You also see the green (grow-the-business investment) trending upward.
The other nice thing versus the pie chart: it gives you a lot more information than percentages — you can represent hiring and attrition by actually seeing the numbers. Doing this consistently for all the different teams in my org was really helpful.
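The data behind a chart like this is just headcount per category per quarter. The sketch below shows one possible shape for it; the numbers, the quarter labels, and the function names are all invented for illustration and are not the figures from the slide.

```python
from typing import Dict, List

# Hypothetical headcount per quarter, by investment category.
QUARTERS = ["Q1", "Q2", "Q3", "Q4"]
HEADCOUNT: Dict[str, List[int]] = {
    "KTLO": [10, 10, 9, 9],
    "Compliance and loss reduction": [12, 10, 8, 6],
    "Grow the business": [8, 11, 14, 17],
}


def trend(series: List[int]) -> str:
    """Label a series by comparing its last value with its first."""
    if series[-1] > series[0]:
        return "up"
    if series[-1] < series[0]:
        return "down"
    return "flat"


def totals(headcount: Dict[str, List[int]]) -> List[int]:
    """Total headcount per quarter across all categories."""
    return [sum(column) for column in zip(*headcount.values())]


if __name__ == "__main__":
    print("category".ljust(32), *QUARTERS, "trend")
    for category, series in HEADCOUNT.items():
        print(category.ljust(32), *(str(n).rjust(2) for n in series), trend(series))
    print("total".ljust(32), *(str(n).rjust(2) for n in totals(HEADCOUNT)))
```

Keeping absolute headcount rather than percentages is what lets the same table show both the mix shifting between categories and the total changing with hiring and attrition.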
11 Close
So with that — how can you go faster? I hope I gave you some ideas and lessons. And to help me, I'd love to hear what worked for you or other leadership lessons you have managing at scale. I think a lot about: how do I have better systems and better designs to produce higher-quality software faster — and a great culture that people love to work in. Thank you so much for coming. I appreciate it. Have a great rest of your conference.