Reinventing your SDLC for the GenAI Era
Full transcript
The complete talk, organized by section.
Host Intro (Gene Kim)
So the next speaker is Stuart Pearce. He is the Portfolio CTO of Hg Capital, which backs 50-plus software and services companies with an aggregated enterprise value of $160 billion, made possible by over 126,000 employees.
I met him in January, thanks to Dr. Itay Nathan at GitHub, and like Rie's talk, his observations have very much shaped how I view what agentic and vibe coding will do to technology organizations.
He's in a unique role where he has regular contact with technology leaders at all his portfolio companies. He will share some of the dynamics of how AI-native companies can appear out of nowhere and potentially pose threats in what were thought to be defensible and slow-moving niche markets. He'll share some of the dynamics of how those organizations responded and how they're mobilizing their portfolio companies to imagine what is possible for the customer, and the need to accelerate even further the pace of software delivery.
Here is Stuart.
Stuart Pearce
Thanks, Gene. Good morning. I'm pressing the wrong buttons already.
I'm Stuart Pearce, Portfolio CTO of Hg. Gene already gave a little background, but for those unfamiliar, Hg is a private equity investor specializing in software and tech-enabled services businesses. We help businesses scale from regional champions to global leaders.
Like all investors, we have a lot of slides with big numbers on them. I sit on the operations side of the house. I am technical, so I'm going to stay focused today on how we're using GenAI to accelerate that journey for our software businesses.
I think it's fair to say that, as a firm, we are all in on the potential of GenAI to transform productivity. We think we know enough about the underlying models, the ecosystem around those models, and everything is pointing to continued acceleration. We're really betting on GenAI to transform productivity, especially for the white-collar knowledge workers that we serve.
We're right at the start of seeing how GenAI can augment the intelligence of users and drive tangible leverage in people's day to day. I think dev tools have really led the way, and we're seeing that crystallize in better products being delivered faster than ever before.
From a customer perspective, GenAI lets us offer levels of intelligent automation that have just never been possible before. This is where the technology is still incredibly young. We're expecting, as the models keep improving, more and more use cases will unlock.
It's also really clear that the risk of disruption to incumbents is equally high. All of our competitors have access to the same technologies, as do a flood of really well-funded AI startups. The barrier to entry for startups has never been lower. Your features can be reproduced in weeks or days. We've got a head start, but the need to innovate has never been stronger.
We see this crystallizing in our own portfolio. Gene mentioned that one of our businesses, in what we thought was a very niche, slow-moving, very defensible sector, had lots of deep workflow complexity and the kind of sticky relationships that we love when we're looking for businesses to invest in.
A startup appeared on the scene, and in six months, really from when they were formed, they brought a product to market that was automating huge chunks of users' daily administrative workload. It had incredible wow factor for those customers. Then they started actively targeting our customers and moving to displace us.
You can imagine the CEO reaction as that starts unfolding. He pretty much did look like that.
This is a space where roadmaps typically get measured in years, maybe quarters. The obvious reaction from the CEO is: how on earth are we going to respond to this? How do we catch up and get in front of this before our entire customer base is gone and we drop to zero?
We mobilized a Tiger team with resources from the portfolio company and from Hg. We got to work on our own agentic capabilities, and that whole team rose to the challenge. I'm really pleased to say that we soon had our own response in front of customers. Customers love having that agentic capability integrated inside their existing platform, so crisis averted.
Now, startups coming to challenge incumbents isn't anything new. The wake-up call for Hg here was seeing in practice how incredibly powerfully customers are drawn to this kind of GenAI capability that solves their day-to-day struggles, but also the speed at which an AI-native startup can move. Not just on the shiny new things, but on that long tail of complexity that historically people might have considered a moat. We certainly did.
It was really clear to us we needed to go and reimagine what was possible for our customers across the entire portfolio, and dramatically step up our pace of delivery.
We weren't starting from zero. We'd been an early adopter of copilots, and people were pretty familiar working with code assistants. We felt pretty good about the 10% to 15% productivity gains people were getting with them.
Coding agents started to appear in the market, and they were iterating and evolving far faster than we'd seen copilots move. It became clear to us that they were going to graduate from interesting experiments to valuable teammates pretty quickly.
The nice thing about the Hg portfolio is that it gives me access to a really large-scale lab where I can test out pilot programs and theses at scale, find out what works and what doesn't, and what we can scale. I want to share some of those results because the results we saw there are really pretty different from the patterns we've seen when we deployed copilots.
First of all, size of the prize. We saw that some teams were able to extract many multiples of leverage, and that for some tasks we're talking 20x or more in terms of how much faster they could go. But we also saw a really wide span of outcomes. There were lots of teams where people were getting stuck at the 20% to 30% level.
When we deployed copilots, there was a learning curve, but pretty quickly everyone converged on similar levels of impact. We were all in that 10% to 15% range. It was pretty level across most task types.
With agents, we were seeing that leverage was very uneven. Some teams could make huge step changes in their speed of development, but others really tapped out. Normally, a 30% boost in productivity on top of the 10% we'd already captured, I'd be really happy. But when you know that 5x or more is on the table, I wanted to drill in and understand what was setting those 10x performers apart.
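To put the numbers above side by side, here is a minimal arithmetic sketch. It assumes the gains compound multiplicatively (the talk says "on top of", which could also be read as additive), and the function name is purely illustrative.

```python
def combined_speedup(*gains: float) -> float:
    """Compound fractional productivity gains; 0.10 means 10% faster.

    Assumption: gains multiply rather than add.
    """
    total = 1.0
    for gain in gains:
        total *= 1.0 + gain
    return total


# 10% from copilots, then a further 30% from agents used as a tooling upgrade.
stuck_team = combined_speedup(0.10, 0.30)
print(f"{stuck_team:.2f}x")  # 1.43x

# How far that leaves a team from the 5x level mentioned in the talk.
print(f"{5.0 / stuck_team:.1f}x gap")  # 3.5x gap
```

Under that assumption, a team at the percentage-gains level is still roughly three and a half times slower than a team operating at 5x.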
I spent a lot of time talking to everyone who had been in our pilot projects. As I spoke to the teams at the percentage-gains level, it became really clear that they had approached this predominantly as an evolution in tooling. That got them so far.
But the people getting the multiples had taken a far more transformative approach to how they think about the SDLC in general. It's the usual transformation slide: naturally, any new technology has a bunch of kinks that need to be worked out. If you solve those, that goes a long way to the amount of leverage you can get.
If we're successful there, we start generating a lot more code a lot faster, which means we need to rethink our processes for the end-to-end SDLC. Then the biggest hurdle of all, as always, is the human change: establishing that culture for your teams to experiment, adapt to a very different way of working, and develop the new skills that are required.
If I start with technology, what I see is that there is a real minimum bar of technology enablers in order for agents to start working and deliver meaningful value over and above a copilot. Below that threshold, for some teams, they might actually hurt you and slow you down a little bit.
Everyone's own list will vary. What I'd call out is that mostly what I see is the same list of development enablers that boost productivity for human engineers: how quickly can I validate a change and get feedback that this is working and not causing regressions?
The difference is that the very limited context of an AI agent means it relies far more heavily on automated feedback loops, on documentation, and on the process rigor around this. Any gaps you've got in those areas are really magnified.
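As an illustration of what an automated feedback loop for an agent might look like, here is a minimal sketch. Everything in it is hypothetical: the check names, the two example checks, and the feedback format are assumptions, not anything described in the talk. The idea is only that each check returns a machine-readable result the agent can act on without a human relaying it.

```python
from typing import Callable, Dict, List, NamedTuple, Optional

# A check takes the proposed change and returns an error message, or None if OK.
Check = Callable[[str], Optional[str]]


class CheckResult(NamedTuple):
    name: str
    passed: bool
    detail: str


def run_checks(change: str, checks: Dict[str, Check]) -> List[CheckResult]:
    """Run every check against the change and collect structured results."""
    results = []
    for name, check in checks.items():
        error = check(change)
        results.append(CheckResult(name, error is None, error or "ok"))
    return results


def feedback_for_agent(results: List[CheckResult]) -> str:
    """Summarize failures as text that can be handed straight back to the agent."""
    failures = [f"{r.name}: {r.detail}" for r in results if not r.passed]
    if not failures:
        return "all checks passed"
    return "fix before merging:\n" + "\n".join(failures)


def compiles(change: str) -> Optional[str]:
    """Hypothetical check: does the change parse as Python at all?"""
    try:
        compile(change, "<change>", "exec")
        return None
    except SyntaxError as exc:
        return f"syntax error on line {exc.lineno}"


def no_eval(change: str) -> Optional[str]:
    """Hypothetical check: a crude stand-in for a real security scanner."""
    return "uses eval()" if "eval(" in change else None


checks = {"compiles": compiles, "no_eval": no_eval}
proposed = "result = eval(user_input)\n"
print(feedback_for_agent(run_checks(proposed, checks)))
```

In practice the checks would be a real test suite, linter, and security scanner rather than string matching. The sketch only shows the shape of the loop: fast, automated, and readable by the agent.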
The good news is that the vast majority of things likely to be on your list are essentially solved problems from the DevOps community. There are established patterns for how you tackle them, and the AI tools are really good at helping you fill any gaps really quickly.
As you clear the tech barriers, code production starts to accelerate pretty dramatically. I don't know if there are any bikers in the room, but anyone who has ridden a motorcycle for any length of time probably knows that period where you have enough control over a bike to start going really fast, and that's a lot of fun. But you've not yet really learned how to read the road, understand what's going on, and understand the risks that you're exposed to, which can get you into all sorts of scenarios that create great stories.
Unfortunately, the out-of-the-box training for AI coding agents is similar. They know enough to write code quickly, but not enough to keep themselves out of trouble.
As a problem, this is actually getting worse because the models are improving. The errors that I see coming through are ever more subtle. It's not like before, where it simply wouldn't compile and you'd very easily spot the flaw. Code often passes basic tests, but it might be insecure or fragile under load. We want to go fast, but we're working on mission-critical software here, and we can't tolerate those errors getting to production.
We could double down on code review and all the established quality gates that people have in their SDLC, but the extra cycles we then spend cleaning up errors simply swallow any of the gains that we'd made earlier. What we went in search of was the equivalent of advanced rider training for our agents.
The models are really good at language syntax and frameworks. The area where they need additional training is in defining what robust software engineering practice looks like.
I speak to a lot of different organizations through my portfolio role and in diligence when we're looking to invest. Everyone always says that they have a well-documented SDLC, and I've never seen one that covers all of the fine detailed topics that engineers have to make decisions about on a day-to-day basis.
They typically set out high-level coding standards, principles, and patterns. What they don't do is say: these are the very specific architecture choices, tech stack choices, approved libraries, quality strategies, all of those things taken to a very granular level and adapted to your specific product context.
There's a lot of room for interpretation in those documents usually, which is great for humans with deep domain context. It's really bad for AI agents.
If you don't have this granular context, every task that you give to the agent is exposed to risk as the agent tries to infer your standards from existing code. Often, for whatever task you've set it, there are probably multiple legitimate options it could go with. You can see how pretty quickly that compounds into a situation where you get code that functionally works, but it falls a long way short of expectations for a production-quality, mission-critical product.
A codified process narrows down the span of options that an LLM has to work with, and it greatly improves your chances of success.
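One way to picture a codified standard is as something a machine can check. The sketch below is an assumption-laden illustration, not the approach used at any Hg company: the approved-library list is invented, and the function name is hypothetical. It encodes one granular rule (which libraries are approved) and flags agent output that steps outside it.

```python
import ast
from typing import List, Set

# Hypothetical allow-list for one product; a real standard would be far longer
# and would also cover architecture, patterns, and quality strategy.
APPROVED_LIBRARIES: Set[str] = {"json", "logging", "datetime", "requests"}


def unapproved_imports(source: str, approved: Set[str] = APPROVED_LIBRARIES) -> List[str]:
    """Return the top-level modules imported by `source` that are not approved."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            found.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # level == 0 skips relative imports, which refer to the project itself.
            found.add(node.module.split(".")[0])
    return sorted(found - approved)


agent_output = "import json\nimport pickle\nfrom yaml import safe_load\n"
print(unapproved_imports(agent_output))  # ['pickle', 'yaml']
```

The same list can be given to the agent up front as context and enforced afterwards as a gate, so the agent is not left to infer the standard from whatever happens to be in the existing code.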
If I turn to the people side, which I think is the most interesting area, we have to recognize that if you want to get the exponential returns we're looking for, you need an equally large change in how people work.
I think of the working model in three broad categories. We've got the copilot model that slotted in very easily to how everybody had traditionally worked: no real changes to your day to day, but the gains were pretty modest.
There's another category of tools that I'm going to call an AI-first model. With this, the engineer role is very hands-on, but now it's more like pair programming with an AI pair. You're asking for some behavioral change to delegate microtasks, but it's still very familiar. It gives lots and lots of micro-control, very tight human-in-the-loop feedback. The gains are much bigger, but it's ultimately throttled by the fact that you are so tightly in that loop.
To break out of percentage gains, we need a radical change to the role. The agentic model is a fundamentally different role. We're asking engineers to think strategically about the architecture, the quality of the acceptance criteria, and to break work down into units that can be handled by an AI agent and give them leverage.
When the outputs come back, they need to pragmatically review the output in the product context. They need to think about where the code needs to be perfect and what defines good enough for a particular task.
This probably sounds a lot like an engineering manager role, because there's a lot of overlap there. The difference is that now we're asking junior team members to perform that same level of pragmatism.
Gene rightly challenged me that this sounds too good to be true and asked how we evidence it. I have a case study from one of our leading businesses, Litera, a market leader in legal tech. As they grew, the usual scale and growing pains that slow down delivery came through.
Hg has a great relationship with Cognition, and Litera built on that to develop groups of specialist agents that could operate beyond coding. They could drop in for test strategy creation, test case creation, test automation, test execution, and those kinds of things that created space for the QE team to escape the treadmill of just running to keep up and shift onto higher-leverage activities.
Very quickly, we got 40% more automated test coverage, regression cycles 93% faster, and they could support five major product launches in three weeks, which is just unheard of in that space.
The QE team really leaned into this. Very quickly those people shifted into agent builders, building out a library of ever-richer and deeper agents that could give them leverage on more and more tasks.
That cultural change is so important as it spreads through the whole org. It builds a flywheel of productivity, and the end result is really transformative to the pace of innovation at Litera and how engineers think about the value that they deliver.
This level of transformation is pretty hard. A few people have mentioned it already: there is definitely a subset of engineers who just intuitively understand and like this way of working. But it is a pretty radical change, and there's a steep J-curve that goes with it, so it's not surprising to face some pushback and challenge along the way.
A lot of those initial objections are probably summed up as needing to catch up on the enabling foundations. You just need to work through that. The more challenging emotive ones take longer to resolve.
This is a fundamentally different role in how we're asking people to behave. People who used to be your top performers may now be struggling, and that's not a nice place to be.
A lot of the time, I see that they've just had a bad initial experience, or they have some lovingly crafted personal workflows that are not transferring well. In many ways, that experience counts against them in terms of how they transition. They've got to get used to a bigger difference than, say, a smart grad who's not had time to develop those things.
It's absolutely worth persevering with. Once they get it, this group can really fly. They have all the skills to generate incredible levels of productivity. I've seen one person who can now deliver 80 PRs in a single day, which is just insane to me. But they do need careful managing during that transition period. They can be extremely loud detractors, so it's important to get them on board quickly.
I normally spend a lot of time talking about managing constraints. I don't think I can really add much for this audience that Gene hasn't said before, far better. The only thing I will say is the amount of leverage you get as you go through this transition is very uneven across tasks. That is clearly going to cause a lot of churn as you think about where your bottlenecks are in your process. You can expect some thrash at each stage for an extended period of time.
In summary, our leading businesses are showing that when the foundations are in place and teams lean into this new way of working, GenAI can give us some incredible leverage on the investments that we make in software delivery. It allows us to deliver better products faster, greater wow faster to customers.
I can't pretend it's an easy journey. There's a lot of hard work and some challenging conversations along the way. But the destination, I think, is a world of software abundance.
In terms of my asks of this group: if you are looking for a sympathetic investor to support your journey, please get in touch. More directly, if this is a journey that's familiar to you and you've got skills that can help us up-level the entire portfolio to get this very broadly adopted, then I would love to speak to you some more.
Thank you.