Exorcism of the Haunted Codebase — Modernizing Our Oldest, Scariest Code with AI
Edith Harbaugh, Co-Founder and Executive Chair at LaunchDarkly, and Zach Davis, Principal Engineer at LaunchDarkly, deliver a hot-off-the-presses experience report on using AI to rewrite the company's oldest, most feared frontend codebase — a 66,000-line React targeting system carrying twelve-plus years of accumulated technical debt. Working within a six-week timeline, a $10,000 inference budget, and a strict no-customer-disruption constraint, the team put agentic development patterns to the test on a real, business-critical legacy system rather than a greenfield project. In this talk, you'll learn how to make a large legacy codebase agent-ready, where autonomous AI workflows break down and human steering becomes a force multiplier, and why the true unlock of AI-assisted development is not velocity but ambition.
Full transcript
The complete talk, organized by section.
Host Intro (Gene Kim)
One of the things that you hear all the time: everybody knows they can use AI only to tackle new software projects, the greenfield projects with their own constraints. Because everyone knows that when you give it a real code base, it'll just get confused, choke, and probably ruin it and delete the repo.
I think some of these stories were probably grounded in real experiences, which made me so excited to meet Zach Davis, one of the principal engineers at LaunchDarkly, the company that brought feature management to the masses and is beloved by so many in this community. He's been at LaunchDarkly for nine years, starting as their first engineering manager.
And as a super switched-on engineer, he wanted to tackle one of the oldest and most used parts of the code base, which is called targeting. This is part of the code base that even the most experienced engineers were at times afraid to touch. Many entered, few left.
He will tell the amazing story of why he chose to tackle targeting, why it was him — did he do something really good or bad in the past? — and what resulted. And I'm so delighted that he'll be co-presenting with Edith Harbaugh, founding CEO of LaunchDarkly, who will share the story of why she rejoined as CEO because of AI, and how and why she wants to transform her company.
Here are Edith and Zach.
Edith Harbaugh
Thank you, Gene. So this is a hot-off-the-presses experience report, to use Gene's term.
Gene came and visited our office. We started talking to him, and he said, "You should come and present what you're doing." So this is the first time we're giving this talk, and it is our lessons learned from it.
So I'm going to kick it off and then Zach's going to join me after.
So first, a little bit about LaunchDarkly, if you don't already know it. I co-founded LaunchDarkly in 2014. I started it because I was very frustrated with the state of software at the time. I'd been a product manager, and before that I'd been an engineering manager, and releases were so painful.
I got super excited about the DevOps revolution, but I saw that it missed this key component of feature management: if you have something that you want to ship, you want to be able to cut it into smaller pieces, which are features, roll them out, launch them, and then after release, control them.
I know that we have a lot of customers in the audience, so you're probably nodding your head. "That seems very basic, Edith. Everybody does that now." It wasn't at the time. It was a lot of going out and just convincing people there is a better way to build software.
So here is where we are today. We have 5,000 customers. We serve 45 trillion — with a T — features every single day. And one of the things I'm proudest of is just the love that our customers have for us.
Here is a quote: "Few things have fundamentally changed the product development life cycle as much as LaunchDarkly. Once you have it, you realize that you can't live without it." That feels wonderful.
But now we're in an AI era. You used to be able to brag about a release every day, and now people are just like, "Uh-huh." And there's also this world of AI, which is pushing changes faster than a human can even comprehend. And we are now, I'd say, somewhat like a lot of our customers. We have a lot of legacy code.
So here is a screenshot that made me cringe. This is from a case study so old that I wrote it myself back in 2015: a screenshot of what our customers were doing back then. They were running a pilot with some of their own customers on one version and some on another. And you can see that we let them target different users and roll out.
This is kind of the heart of what LaunchDarkly does: allow you to segment out in real time who sees what when. And what had happened over time is that this code base had become haunted.
It was the oldest, the most complex, most business-critical front end. Twelve-plus years of accumulated debt. When Zach joined, this code base was already about a year and a half old. Just layers and layers and layers of code. A decade of outdated patterns and files.
I compare it a little bit to the Winchester Mystery House. If you're familiar with San Jose, there's this famous house where a rich person just kept building rooms and rooms and rooms on. This was basically this part of our code base — when we tried to layer something on, we were like, "Oh, let's try to jam it in here, do this, do this." And it had just become completely unwieldy.
A team recently wanted to make a change to our rollout menu and just kind of gave up: "We can't do this." And this was a real tax. People were spending weeks on a change that should take an hour. It should have been something very easy for a person, or even an agent, to do, and instead people were just trying and trying and trying.
So the challenge was: can we use AI to rewrite this?
There were some conditions on this that made it easier and some that made it harder. The easy one was: let's not do scope creep. In my past life as an engineering manager, I'd been in projects where it's like, "Well, let's modernize the UI and also add these four features." Those projects usually went about as well as you might expect, where suddenly you're like, "Well, if I'm adding this feature, let me do this and do this and do this," and then you end up not getting as far as you want. So that was, I think, a benefit that the team got — it wasn't "rewrite it so that it looks much nicer also," it was just rewrite it.
They also wanted to scope it to six weeks, which seemed like a palatable amount of time that would create some pressure. Again, this is following Agile best practices — longer projects get less done sometimes. If you have a contained scope of six weeks, we hope that we will see real progress.
And then when we made the slides (Zach and I made them together), the $10,000 budget was actually a constraint from him. I was willing to spend as much as it took to get this thing fixed. If it had cost $100,000 or $200,000 but we could make meaningful progress on these screens again, that would have been fine with us.
And again, no disruption for our customers. We have 5,000-plus active customers using these screens every day. We can't afford for them to not be able to access it.
So with those constraints, I'm going to turn it over to Zach to talk about how it went.
And I will say that one of the funnest things I have done lately is just hearing these stories about how AI is changing things — in fun ways, stuff that seemed impossible before can now be possible.
Zach Davis
Thank you very much, Edith.
If you're lucky enough to have a successful product, you've run into the same problem, right? You have a huge code base. Its success begets complexity. And so you may want to do something similar.
If you're just getting started, don't start with what we're doing. I promise you, you need to walk before you run, as our friend Toggle here is doing. Maybe you need to stretch a little, and make sure you're ready.
So what did this mean for us? It meant that over the last year or so, there were a few things that we needed to do to be ready.
One of those things was to make sure that we actually understood the tooling. You need to understand the limitations of the modern AI tooling in order to know whether something's possible, not possible, almost possible. You really need to steep yourself and explore. I've heard other people say in the last few days that nobody knows what we're doing. You do not know what's possible unless you try it.
I went looking to see whether people had been doing dark factory patterns in enterprise legacy code bases. I couldn't find anything. There's nothing out there. It's all sort of greenfield stuff that people are doing.
So you have to understand the tools, and you also have to make sure your code base is agent-ready. We pulled in context to make sure the agents knew how to operate. We spent a bunch of time improving the feedback loops. We added better guardrails. We were early in agentic code review. We onboarded a tool called Meticulous — I'm a big Meticulous fanboy. If you're looking for visual regression testing, it's like my favorite product that I've found in the last however many years. And we just made it easy to verify changes, to add guardrails, stuff like that.
And one of the things that I loved about this is that it became very obvious early on that the things that make agents more effective are the same things that make humans effective — and it's just all the stuff that we've been putting off for years and years and years, that we knew we should've been doing, and we just never made time for. And now, not only is there more of a reason to make time for it, it's actually way easier to make some of these changes in your code base.
So now that we're ready — simple idea, right? We're going to rewrite 66,000 lines of React code, and we're going to do it in six weeks.
I pulled in one other engineer, the only front-end engineer who's actually been there longer than me, and came up with a plan. It was a beautiful plan. Week one, we plan. Week two, we build this kind of autonomous system.
I honestly, truly, and really thought that we were going to be done in four weeks. I was convinced. Spoiler: we did not finish in four weeks. We also did not finish in six. We will get to that.
So we had this plan, but then we ran into reality.
What are some of the things that we ran into? I knew it was a big problem, but I thought, "We have agents on our side." And agents are great at scale, but even agents struggled. What we did is we looked at the old code, because we had this great source of what actually happens in our application: the legacy implementation, the legacy code base. And so we just pointed agents at that. We said, "Pull out all of the stuff that happens on this targeting view." And it would go do it, and it'd come back and say, "Great. I did it. Here you go." And then you'd say, "Well, can you double-check and make sure that you got all the stuff?" And it'd be like, "Oh, okay, I missed some stuff. Here's some more things." And then you do that over and over. So it's a lot of stuff to actually try and wrap your arms around.
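The "double-check and make sure you got all the stuff" loop described above can be sketched as a repeat-until-stable procedure. This is only an illustration, not the team's actual tooling: `ExtractionPass`, `extractUntilStable`, and the stub `fakePass` are hypothetical names, and a real pass would be an agent call rather than a canned list. The sketch stops once a set number of consecutive passes turn up nothing new, or when a pass cap is hit.

```typescript
// Hypothetical: one extraction pass returns the behaviors it noticed this time.
// In practice this would wrap an agent call; here it is just a function type.
type ExtractionPass = (alreadyFound: ReadonlySet<string>) => string[];

interface ExtractionResult {
  behaviors: Set<string>;
  passes: number;
  converged: boolean;
}

// Re-run the pass until several consecutive passes add nothing new,
// or until maxPasses is reached (so a chatty agent cannot loop forever).
function extractUntilStable(
  runPass: ExtractionPass,
  quietPassesRequired = 2,
  maxPasses = 10,
): ExtractionResult {
  const behaviors = new Set<string>();
  let quietPasses = 0;
  let passes = 0;

  while (passes < maxPasses && quietPasses < quietPassesRequired) {
    passes += 1;
    const before = behaviors.size;
    for (const behavior of runPass(behaviors)) {
      behaviors.add(behavior);
    }
    // A pass that found nothing new counts toward convergence; otherwise reset.
    quietPasses = behaviors.size === before ? quietPasses + 1 : 0;
  }

  return { behaviors, passes, converged: quietPasses >= quietPassesRequired };
}

// Stand-in for the agent: each call "notices" the next batch of behaviors.
const batches: string[][] = [["add rule", "remove rule"], ["percentage rollout"], [], []];
let callIndex = 0;
const fakePass: ExtractionPass = () => batches[callIndex++] ?? [];

const result = extractUntilStable(fakePass);
console.log(result.passes, result.behaviors.size, result.converged);
```

Note that convergence here only means the passes stopped reporting new items. As the talk makes clear, that is not the same as the inventory being complete.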
Another thing that we ran into was just human bandwidth. You are in some ways limited by how much you can parallelize and think about. I'll talk a little bit about the autonomy trap in a second.
But basically, by the end of week six, we had written 36,000 lines of code. We generated most of that over the course of less than two weeks. But we weren't even close to done. Because despite all of that planning, despite all of the stuff that we had done up front, we were still missing a bunch of stuff. This is week eight — literally right now, this is week eight — and we're still finding little things that the agents have missed, and we're trying to get through all of that.
So I'm going to talk a little bit about what I call the autonomy trap. I know Steve Yegge is here in the audience, and he was the inspiration for a little bit of how I decided to approach this project.
I read a couple things leading up to this project, and a lot of them were by Steve. I read "The Grug Brained Developer" — if you have not read it, it's fascinating. And I also read his thoughts on code being throwaway. That was part of my inspiration for this: okay, if code is eventually going to be throwaway, then we should be able to recreate the entire LaunchDarkly front end in some amount of time. What does that look like today?
The thing that I ran into is that it feels like Rube Goldberg development. You spend all your time trying to create this little mousetrap-type thing where this thing has to run into this thing, and then you need some stuff to check all the agents that are going awry. And it's really easy to get sucked into that, and you end up trying to perfect the Rube Goldberg machine instead of actually trying to get to your goal.
Focus is hard. Multitasking is great until you're trying to rewrite tens of thousands of lines of code over the course of multiple weeks, and then you lose track of the goal sometimes.
And then maybe two more controversial takes. One is: I believe that right now, despite all of your best efforts, human steering is a force multiplier. I just haven't seen it where the agents are actually making enough good decisions, even with as much context and upfront steering as I can throw at them. I find that if I'm in the loop, I still get way better outcomes.
And then I also believe that friction is signal, not noise. When you're working on something, you feel the places where you slow down too much. Even if you're doing agentic pair programming, you feel where the agent's getting stuck, and you can actually course correct. You can make better decisions.
So what did we do? Remodeling room by room.
We broke it down. I said, "I'm spending too much time trying to chase this autonomous dream, and we just need to break it down a little bit." We had actually already defined 22 discrete phases. The thing I was trying to do is build them contiguously, or sometimes in parallel, basically trying to one-shot it, more or less, by building this complex system around it. And so we stopped doing that and basically went phase by phase. Again, these phases were still very large. They were entire features in the targeting front end, and they were thousands of lines of code.
The system that we had built worked at a slightly smaller scale and with more human-in-the-loop interaction.
So what did we end up with? As of earlier this week, we have shipped internally. Everyone at LaunchDarkly is using this new version of the front end. We're still finding small inconsistencies, and we're working towards customer rollout. We've written 39,000 lines of code across hundreds of files. Our 22 phases actually ballooned into 34 by the time we were done and had found all the things that we had skipped. And of that $10K budget (as Edith said, that wasn't really a goal, it was just something that I thought was interesting to track), we spent about $7K in inference.
So a few more things on what we learned.
One is garbage in, garbage out. I believe that AI is an intent amplification machine. If you have vague intent, maybe you'll get something great, but chances are you're not going to get something great. And so you really have to understand what you're doing. You have to do a lot of thinking upfront. It does not replace the thinking that you need to do.
The bottlenecks don't vanish — they move. I think we're all aware of this, but we really felt it. They moved downstream to code review. Because we isolated this and put everything behind a feature flag, we could really just shove code into the code base. We still wanted to make sure it was good, and so we were still doing agentic code review. You get stuck in the code review loop. The bottlenecks don't disappear.
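The isolation described above, with the rewrite living behind a feature flag while the legacy view stays the default, might look roughly like this. It is a framework-free sketch under stated assumptions: `FlagLookup`, `selectTargetingView`, `makeLookup`, and the flag key `new-targeting-ui` are all hypothetical, and the lookup is a plain function rather than the LaunchDarkly SDK.

```typescript
// Hypothetical flag lookup: returns the flag value, or the fallback if unknown.
// This is NOT the LaunchDarkly SDK API, just a stand-in with a similar shape.
type FlagLookup = (flagKey: string, fallback: boolean) => boolean;

interface TargetingView {
  render(): string;
}

// Stand-ins for the two implementations that coexist during the rewrite.
const legacyTargeting: TargetingView = { render: () => "legacy targeting view" };
const rewrittenTargeting: TargetingView = { render: () => "rewritten targeting view" };

const NEW_TARGETING_FLAG = "new-targeting-ui"; // hypothetical flag key

// Choose the implementation at the boundary. The fallback is `false`, so if the
// flag cannot be evaluated, users keep the legacy view.
function selectTargetingView(isEnabled: FlagLookup): TargetingView {
  return isEnabled(NEW_TARGETING_FLAG, false) ? rewrittenTargeting : legacyTargeting;
}

// Build a lookup from a fixed map, e.g. one per audience (internal vs. customers).
function makeLookup(flags: Map<string, boolean>): FlagLookup {
  return (flagKey, fallback) => flags.get(flagKey) ?? fallback;
}

const internalUsers = makeLookup(new Map([[NEW_TARGETING_FLAG, true]]));
const customers = makeLookup(new Map<string, boolean>());

console.log(selectTargetingView(internalUsers).render());
console.log(selectTargetingView(customers).render());
```

Keeping the switch at a single boundary is what lets unfinished code merge without customers seeing it. It does not remove the review bottleneck the talk describes; it only lowers the cost of a mistake.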
We talked a lot about the autonomy trap, and my biggest thing is that I think we talk a lot about speed, we talk a lot about velocity. And I think the true unlock is not velocity — it's ambition. The reason we think this is great is not because we can just move faster, it's because we can do more ambitious things. And in order to do some of the ambitious things, you really have to rethink all of your preconceived notions about the way things work.
So if I could go back and do this again, I think we would approach it very differently. Even though we were using AI, we followed a very familiar pattern for how we approach the problem — writing specs, writing plans, doing all of this. And I think if I could do it again, I would go back and just use the existing code as a source of truth. That's the best spec you could ever have. And I would lean more heavily on that instead of trying to do these intermediate steps.
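One way to treat existing code as the source of truth is a characterization check: run the legacy and rewritten implementations on the same inputs and report every disagreement. The sketch below is an assumption about how that could be done, not a description of what the team built. `findMismatches` and the two label functions are hypothetical, and real UI code would need a richer comparison than serialized return values.

```typescript
interface Mismatch<I, O> {
  input: I;
  expected: O; // what the legacy implementation produced
  actual: O;   // what the rewrite produced
}

// Treat the legacy function as the spec: any input where the rewrite
// disagrees with it is reported, without judging which side is "right".
function findMismatches<I, O>(
  legacy: (input: I) => O,
  candidate: (input: I) => O,
  inputs: readonly I[],
): Mismatch<I, O>[] {
  const mismatches: Mismatch<I, O>[] = [];
  for (const input of inputs) {
    const expected = legacy(input);
    const actual = candidate(input);
    // JSON comparison is a simplification: it is sensitive to key order
    // and drops undefined values, so it suits plain data only.
    if (JSON.stringify(expected) !== JSON.stringify(actual)) {
      mismatches.push({ input, expected, actual });
    }
  }
  return mismatches;
}

// Hypothetical example: a rollout label with a special case the rewrite missed.
const legacyLabel = (pct: number): string =>
  pct === 100 ? "Everyone" : `${pct}% of users`;
const rewrittenLabel = (pct: number): string => `${pct}% of users`;

const diffs = findMismatches(legacyLabel, rewrittenLabel, [0, 25, 100]);
console.log(diffs);
```

The check never needs a written spec. The special case for 100 surfaces only because the legacy code still encodes it, which is the kind of small miss the talk says agents kept leaving behind.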
And I will bring Edith back up to wrap us up here.
Edith Harbaugh
I don't know if it's a joke, but what Zach might not have realized is I really wanted to test all this dark factory stuff.
If you're not familiar with dark factory, it's this concept that you have a factory where robots are basically doing all the work and humans aren't really in the loop at all. So part of why we approved this project was: let's take pretty much our two most experienced engineers. Lexi, the other person who worked on this, was our second engineer and is still at the company a decade later. These are the people who not only know where the bodies were buried, they probably buried them themselves.
So it was a learning experience for us, in that dark factory could only get us so far, but it was really for the purpose of what Zach just said: what could we do next time.
And the thing I also really liked about this project is that I don't think we're that different from our customers. That screenshot I showed you was from when we were a startup, four people in a co-working space. Now we have 5,000-plus customers, and we have to be a lot more careful with changes, and that's just like a lot of our customers. We have healthcare companies, we have banks who have the same constraints. We can't just YOLO and vibe code. We have to make sure that it works.
So the thing I think about a lot coming back as CEO is that the reason why I started LaunchDarkly was because I loved building software, and I wanted to make it easier. And all these new tools around AI and new ways of thinking — I think we're all excited about them because it makes it easier to do what we love, which is building software.
So thank you all, and Gene asked me to ask for help at the end. If you are also thinking about dark factory or dark software factory models, or have experiences of your own, we'd love to hear from you during the break, or you can send us emails. Mine is super easy — it's edith@launchdarkly.com. And Zach's is zach@launchdarkly.com, because the benefit of being very early at a company is you get the best email addresses.
Thank you, everybody.
Host Outro (Gene Kim)
Thank you, Edith, and thank you, Zach.
By the way — I'm not sure you saw the Slack message — one of the funniest parts when Zach told me the story was he knew it was going to work when he would start mixing up the old version and the new version, which I think is awesome and also terrifying.
Thank you, Zach and Edith.