Win the DevOps Race with Feature Flagging

Las Vegas 2023

Director of Architecture and Delivery · Shelter Insurance

Create a strategy to win by moving your features to production faster than your competition. You will learn how to use feature flagging to win the race by removing the hand-off that keeps most teams from being successful in feature delivery.

Full transcript

The complete talk, organized by section.

Travis Fritts

Welcome, everybody. Thank you for attending Feature Flagging.

So we're going to talk about flags today, just to make sure everybody is in the right place. We're going to be talking about winning the DevOps race with feature flags and thinking about what that looks like.

I'll go ahead and introduce myself. I'm Travis Fritts. I'm the Director of Architecture and Delivery at Shelter Insurance. Shelter Insurance is a Midwest insurance company, and so if you're in the Midwest, you've probably heard of us. If you're not in the Midwest, you probably have not.

Today I definitely want to spend a little bit of time talking about feature flagging. But in general, at Shelter, I run our cloud center of excellence. I run our enterprise architectural practice. And I'm also over delivery across all of our domain teams. We have agile teams, so across that space we have lots of different teams with different domain knowledge, and I'm in charge of the delivery across all those teams. That's why feature flagging is really interesting to me.

I'll take you down the journey that we've had, but we'll go ahead and get started here in thinking about winning the DevOps race with feature flagging. I'll also take you through some of the stuff that we're doing currently today and how we've integrated that into our pipeline and how that's helping us today.

I would like to start by encouraging us, as a group of people attending this conference, to be thinking about adopting practices that allow you to win. That might sound a little bit weird, but I think the language that we use sometimes is really important in this space, because I spend quite a bit of time with our teams trying to encourage them to use practices that allow us to win.

I hope that resonates with everybody, that we're competitive people and that we really want to understand which practices are really helping us win. I think that's really, really important for us, especially in leadership. My expectation is everybody in this room is in some type of leadership position at your organization, or if not a leadership position, a position of influence. I think that's really important as well: to really encourage teams in the delivery space, the feature delivery space, and prepare them to win.

Today we're going to talk about winning the DevOps race and really what that looks like for us. But I wanted to share that with you.

The other thing I'll share is I do think today's talk is very practical. If you're like me, when I come to a conference, I'm constantly trying to find those nuggets of information that I can take back to my organization to make a difference, to help us be better, to help us improve. Today, I'm going to go through some things that I think could be very valuable for you.

This might be the session for you to take something from this to go back to your organization and say, "Hey, this is interesting. I heard something that might make a difference for us," and try to investigate that, for sure. If you're doing no feature flagging at all, you have none of that in your infrastructure today, I definitely think it's something that you should probably look at. Maybe this talk will give you that opportunity to go back to your organization and say, "Hey, maybe we should investigate." If nothing else, just start an investigation and see what this would look like for you.

Now, my experience is not going to be your experience. You are at your organization. I'm going to show some examples of some of the things that we've done with feature flagging, but those are specific to us, to the needs we had in our organization. You guys are closer to your space, but I also think that you guys have influence. When you go back to your organization, you can speak into this and say, "This might be something that's important for us."

Because the reality is, it truly is a race today. Let's think about this: speed is a competitive advantage in your organization. Speed is actually a force in your organization, and everybody today needs to move faster. Everybody's got to move faster. I have these conversations with my teams, and it's, "Okay, faster, faster, yada, yada, yada." That's not enough.

We're going to talk today about what it really looks like to be faster in a delivery pipeline. I'm going to show you an example of where we've been as an organization and where we've moved to, with still more work to be done. That's with the help of LaunchDarkly and how we've implemented that in our systems.

But the reality is, everybody needs to understand that we're in a competition today. The nice thing from a DevOps perspective is you guys are people that are leading those charges, and you guys can make a huge impact within your organization when it comes to speed. Now yes, we need to understand the value that we're delivering, no doubt about it, but there's also an opportunity here to really become efficient in how we deliver software. Those two things combined, I think, help us win in that regard.

But that's really what it's about: trying to figure out the speed piece, where everybody is competing. If you're not moving the needle on speed, your competition is. What does that mean for you over time?

So I really want us to think about what are those practices that we can start to implement that are going to help us win. I'm not looking for third place. I'm looking for first place. That's what I'm telling my teams, because if you go into that mindset, you're going to find growth. But if you're coming to the game saying, "I'm okay with 10th place," that's not going to work. Not in today's market. It will not work.

There are opportunities with feature flagging to actually increase your ability to deliver software. That's some of the stuff that we'll be looking at. But that reality of moving faster, I think, really is a driver for everybody. I'm not too sure there's anybody here that doesn't fall into that category, that everybody needs to move faster. You need to move your software at a different pace than even you did last week.

In the insurance industry, it's one of those things for us where we're always seen as kind of a monolith and moving slow. But for us, the reality is we're finding those same market demands as well. You have to compete.

For my teams, I actually tell them it's an internal competition as well. We want to make sure that we understand what it looks like internally for us to compete and to build and deliver software, versus externally. Those are always things that I think you need to consider.

But I think everybody needs to understand that speed's a driving competitive force in your organization.

When I think about this race, it's interesting to take a real race, a multi-lap race, and think about what that looks like. One of the challenges for us as an organization is we find ourselves in a relay race. Hopefully everybody understands what a relay race is. It's a race where you pass a baton. You're passing a baton at the end of the lap.

For us, that's a race that we've fallen into that we actually really didn't even know we were in. I'm going to go through this a little bit, but it's interesting, some of the analogies around a real race. This is a foot race, but there are opportunities to be passing the baton, and we found that those were challenge spots for us. Those are opportunities for us to lose, to not win that race.

So we want to work on removing the handoff. In a relay race, you're either going to win or lose the relay race in the handoffs. So we're looking at our delivery pipeline, and trying to find all the places that we think there's a handoff happening.

You'd be surprised, if you really go back and start to look at your delivery pipeline, you might find that you have lots of handoffs going on that you didn't even realize were happening. We're trying to take an opportunity to look at every one of those, what I'm going to consider handoffs. We can think of these like a relay race, passing the baton.

Sometimes we have these types of handoffs in our delivery pipeline. I'm trying to figure out ways to get rid of those handoffs. I do not want those handoffs, because the reality is you either win or lose at the handoffs. So if I can take an opportunity to remove that component, I increase my chances of winning.

The question really comes up: do you really need it? What was the reason that it's really in your pipeline anyway? I'm always amazed, absolutely amazed, when you really start to look at these handoffs that are in your pipelines when it comes to delivery and start asking questions: Why do we have this in here?

"Well, it was because Joe really thought we needed to have him, and he retired three years ago."

Or, "Mike said that that's the part we have to have." Mike went to another company.

So why do we have that in there? It's because no one is really actively overseeing the handoffs to make sure that we have the most efficient process. That's when we've kind of started a transformation effort to say we're going to start to reset when it comes to delivery. It's become that important for us that we're really going to start to transform how we deliver software when it comes to pipeline, making sure that we're using the right platforms.

We're using the right platforms because we really want to try to figure out: how do we remove the handoff? How do we just get rid of it? I don't want to run the risk of dropping the baton, because it happens. Actually, I've gone back and I've looked at it, and sometimes we've lost races and we didn't even realize we dropped the baton. It wasn't until the race was over that we figured out, "God, there was a huge gap here where nobody could pick it up." We just let the baton fall.

Sometimes that happens to us when we're delivering software. That's why I want to remove those handoffs completely.

So let me talk just a little bit about how we're removing that handoff. The first one is choosing best-of-breed solutions.

When it comes to best-of-breed solutions, I'll talk a little bit about what we've done around feature flagging. I have a story for you. My delivery teams would come to me and say, "Hey, Travis, I think that we've got a need here in this delivery space. We've got feature flagging. It just doesn't work very well."

"Okay. Well, that doesn't sound like feature flagging to me then. So tell me more."

"Yeah, we've done some things. We've got some tables that we can store some information in, and we can kind of turn things on and off, and so it works and it does things."

"Okay. Well, that's probably not good enough for us."

It's like, "Okay, it's not really good enough for us."

I said, "So do this." This was internal teams, and we've got internal teams that like to develop, that like to write code. I support that for the right reasons. I said, "I'll tell you what. Why don't you go out and help me understand, come back with an understanding of what it's going to take for us to build a feature flagging platform that would be good enough that I could sell to someone else. Just, generally, how long do we think it's going to take that we can have something in place to do that? Because I need it. I don't need it today. I needed it last week. What we have is not keeping pace."

It didn't take very long. Development's back again: "Travis, we're not going to. The timeline is probably not going to work for us."

Because they knew the amount of effort that we would have to spend in building a platform that effectively really would be a good platform for us to use, to really try to figure out: how do we flag our features? How do we roll things out in such a metered way? How do we have the right tooling around that? How do we have the right visuals around that? How do we expose this to the business from a product owner perspective? How do we do that?

Well, we quickly figured out that that was not the business that we're in. We're insurance. So we needed something different. We went out to market and said, "You know what, if we're not going to build it ourselves, we're going to select best of breed." And so we selected LaunchDarkly.

We had a good experience with LaunchDarkly, and that is the opportunity that we had. We were either going to build something ourselves internally that we felt like would be good enough that we could sell, or we were going to go after best of breed. This space was really important for us, to figure out how do you bring a best-of-breed solution into your pipeline? How do you allow that to help you transform your ability to deliver?

This was a decision we didn't make lightly, but we absolutely believe that LaunchDarkly has, to date, helped us transform our ability to move features and be much, much better in pipeline delivery.

So that's the first part of this: choosing best-of-breed solutions. But this is really an effort to remove the handoffs. We do not want those handoffs. We do not want those gaps in our pipeline. We were finding that we had some gaps in our pipeline, and we decided to go out to get a solution to help fill those gaps. That's what LaunchDarkly does for us today.

Reducing the risk of failure. We knew that we needed to take an opportunity to make sure that when we continue to deliver software and we deliver it faster, we don't want to just create more problems. We need to try to figure out: how can we mitigate that risk? How can we bring a solution to bear to help us understand and mitigate that risk at scale, to not be caught in a situation where we're doing things faster, but we're just causing more problems faster?

Because it's pretty easy to fall into those kinds of traps. You want to automate everything, and you're automating problems. So one problem last week has 10 problems springing up this week, and next week it's going to be 40 problems. That's the way those things happen. So you want to make sure that you understand, as you move forward, you're trying to mitigate risk.

We understand this pretty well. We're an insurance company. So we wanted to take a good approach to say, what can we bring into our technical ecosystem to help us in that regard?

The last one here is transforming our practices. That's why I'm saying we took an opportunity to say, "You know what? This is such a big deal for us right now, having to increase our ability to deliver, that we're going to go out and we're going to get the right feature flagging platform for us, and we're going to move that forward and transform the way that we're delivering software."

I'll show you part of a pipeline and what that looks like in just a few minutes.

All right. So let's look at kind of our DevOps architecture. I'll tell you the big pieces here: Jira, Bitbucket, JFrog Artifactory, Jenkins. That is our DevOps architecture. Those are the pieces that we have, and most likely everybody at least recognizes some of those. We're probably no different than anybody else.

Now, this is a bigger picture of kind of our stack and what it looks like. I'll start at the top here and walk you through. There's a business need. Something happens. There's a business need. We move it to create stories and epics. Those feature stories actually get into a backlog, get pulled out for feature delivery, and end up in the code repository.

Then we have Jenkins down here, those Jenkins builds with our continuous integration. Then we have Jenkins deploy that moves us through different environments with gates on them around testing. So we have application testing, integration testing, IST testing, and smoke testing. All of those.

Then the last part of that, out there on the end, is there is a trigger step here prior to production. Trigger step prior to production.

Now, we're a regulated industry. We're insurance. So auditing is a huge deal for us. Hopefully everybody saw the auditing talk earlier today, this morning. But auditing is a big deal for us. We've got a separation there of getting all the way to our pre-production, and then there's a trigger outside of that that gets you to production. That was our pipeline, kind of our DevOps architecture.

Now, I do want to speak a little bit though about removing this handoff and how that's going to impact our architecture.

All right. So around feature flagging and cloud migration flagging: those two pieces. These are two examples. They're very high-level examples, just to give you a sense of both these pieces, how we're using these things for true feature flagging and also cloud migration flagging.

Now, I'm sure everybody in here is doing some level of cloud migration. Maybe you're further down the road on this than we are, but I think everybody is finding out that there are some challenges with cloud migration. I'll talk a little bit about that and how we're flagging those things, because it can be a real bear, and you might not even understand it until you get there and start to experience a little bit.

Let's walk through these.

The first one's feature flagging. Anybody using feature flagging in here right now? Ever? Anybody? We've got some folks that are using feature flagging, some folks maybe that know a little bit about it. So let me just kind of cover it at a very high level. Again, this is really high-level architecture stuff. This is just Travis and some drawings here to try to explain the point. There's a little bit more when it comes to cloud migrations.

This first one is, on the left here we've got an application, and on the right side we've got LaunchDarkly and a flag-on, flag-off state. There's an application there. LaunchDarkly is a SaaS application. This is a custom application on the left that's got a menu item.

Now there's an SDK that allows you to wrap features within your applications that are going to point back to LaunchDarkly and be able to manage that flag effectively from their system into your system.

So we've got an application. It's got a menu item. We've got an SDK that connects to LaunchDarkly. We've got "show a menu item." If the flag's on, show that. If the flag's off, don't show that. In the simplest case possible. Simplest case.

The more important thing, however, is no deployment needed. No deployment needed. So once I instrument my code and it goes into the production environment, I'm managing that through LaunchDarkly, and I'm not touching that code again.
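
The on/off case described above can be sketched in a few lines. This is a minimal illustration, not the LaunchDarkly SDK: `FlagStore` is a hypothetical in-memory stand-in for the hosted flag service, and the flag key and menu labels are made up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class FlagStore:
    """Hypothetical in-memory stand-in for a hosted flag service."""

    flags: dict = field(default_factory=dict)

    def is_enabled(self, key: str, default: bool = False) -> bool:
        # Unknown flags fall back to the default, so a missing flag
        # never breaks the application.
        return self.flags.get(key, default)

    def set_flag(self, key: str, value: bool) -> None:
        # With a hosted service this change happens outside the
        # application, not in application code.
        self.flags[key] = value


def build_menu(store: FlagStore) -> list:
    menu = ["Home", "Policies"]
    # The new menu item ships to production wrapped in a flag check.
    if store.is_enabled("show-claims-menu-item"):
        menu.append("Claims")
    return menu


store = FlagStore()
menu_before = build_menu(store)                 # flag off: item hidden
store.set_flag("show-claims-menu-item", True)   # flipped at runtime
menu_after = build_menu(store)                  # same code, item shown
```

The same deployed `build_menu` code produces both menus; only the flag value changes between the two calls.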

That's the high-level picture of feature flagging, but it really gives you a lot of flexibility and the ability to move some of that, abstract the management of releasing your features out into a SaaS solution that you don't even touch once you go to production.

So what it allows you to do is really instrument your code in such a way that, when your development teams are done developing, you move it to production. Move it to production. Then give some of your business folks the opportunity to manage it the way they want to manage it.

Let me move on to the next one. This is around cloud migration.

Now this one's a little bit more interesting. So we've got a browser application with a menu item, and we've got a service. We've got a microservice out there running. Well, I'm going to move this microservice to the cloud. I'm going to move this out there. Pretty simple. LaunchDarkly, I'm going to put a flag in here to say, "Okay, why don't you then route over here to the cloud?" No big deal. Seems pretty simple.

But I've also got up here, if you look at the top here, there's bookmarks. So in the browser there, there's bookmarks, and I've got a service that was connected. But I've also got to try to figure out: how do I manage the browser bookmarks that go to the service that I want?

So at some level, I have to run dual infrastructure. Let's back up here just a little bit. So I've got this service that comes out. Now I figure out I'm going to run dual infrastructure right now, because the architectural approach here is that I'm going to replatform the service. I'm not rearchitecting the service. This is a replatform play. So I'm moving that out to cloud. I'm just replatforming. I've duplicated that service.

So I do have now infrastructure running in two different places that I now can decide how I want to gate traffic to. I can gate traffic to either one. But the challenge that I have at the top is the bookmarks that I have are still bookmarked to the original service I had out there. I'm trying to manage that with low fanfare.

So now I take an opportunity to say, "You know what? I'm going to go ahead and make a connection to the service in the cloud, use LaunchDarkly to put up a banner to say, 'Hey, the bookmarks are changing. Why don't you go ahead and update your bookmarks?'"

We can do that. We can show that banner to users of the systems that have internal bookmarks. That was something that's kind of built into our system, so we needed a way to make sure that we could gate both sides.
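
The routing and banner idea above might look roughly like this. It is a sketch under stated assumptions: the URLs, flag keys, and function names are hypothetical, and the flag values are passed in as a plain dictionary rather than fetched from a flag service.

```python
from typing import Optional

# Hypothetical addresses for the same service running in two places.
ON_PREM_URL = "https://services.example.internal/quotes"
CLOUD_URL = "https://quotes.cloud.example.com/quotes"


def resolve_service_url(flags: dict) -> str:
    """Pick the backend for a request based on the migration flag."""
    if flags.get("route-quotes-to-cloud", False):
        return CLOUD_URL
    return ON_PREM_URL


def bookmark_banner(flags: dict, requested_url: str) -> Optional[str]:
    """Return a banner for users who arrived through an old bookmark."""
    arrived_via_old_bookmark = requested_url.startswith(ON_PREM_URL)
    if flags.get("show-bookmark-banner", False) and arrived_via_old_bookmark:
        return "This page is moving. Please update your bookmark to " + CLOUD_URL
    return None


flags = {"route-quotes-to-cloud": True, "show-bookmark-banner": True}
backend = resolve_service_url(flags)
banner = bookmark_banner(flags, ON_PREM_URL + "/123")
```

Keeping routing and the banner behind separate flags lets each side be gated on its own, so the banner can stay up after traffic has already moved.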

Now the interesting thing is, when this happens and you move it forward, then you figure out, "Well, I've got a service out here in production now, in the cloud. I don't need any of the extra things in there." So you take them out.

But the thing that wasn't obvious here was, if you go back and you look at these pieces, I can figure out how much traffic is still going to service A, the one that was on-prem that I moved out to cloud. To figure out when I dial that back, I can actually take custom attributes and feed them back into the flag, based on the usage metrics that I'm collecting out of other systems, to determine whether I turn that on automatically or turn it off automatically.
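
The decision logic behind that kind of automatic toggle could be sketched as follows. This shows only the threshold check, not how metrics reach the flag service; the flag key, the 1% threshold, and the function name are assumptions made for illustration.

```python
def update_on_prem_flag(
    flags: dict,
    on_prem_requests: int,
    cloud_requests: int,
    max_on_prem_share: float = 0.01,  # assumed cutoff: 1% of traffic
) -> dict:
    """Return a copy of flags with the on-prem service flag recomputed.

    The on-prem service stays enabled while its share of traffic is
    above the cutoff, and is switched off once usage drops to or
    below it.
    """
    updated = dict(flags)
    total = on_prem_requests + cloud_requests
    if total == 0:
        # No usage data collected: leave the flag as it is.
        return updated
    on_prem_share = on_prem_requests / total
    updated["on-prem-service-enabled"] = on_prem_share > max_on_prem_share
    return updated


current = {"on-prem-service-enabled": True}
still_busy = update_on_prem_flag(current, on_prem_requests=300, cloud_requests=700)
nearly_idle = update_on_prem_flag(current, on_prem_requests=5, cloud_requests=995)
```

Returning a new dictionary instead of mutating the input keeps the decision easy to test before it is wired to a real flag update.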

So there are opportunities there as well to instrument these things. Lots of capabilities here around this. But it's just a picture of how you can run almost like dual infrastructure. If you're a software developer, you'll recognize this: you're just strangling the other infrastructure that you had out there.

Now all of this can be managed external to your application. It's just a matter of turning on, turning off, and having the right pieces in place.

So just showing a little bit around cloud migration. Essentially, I don't expect you can read this, it's small print, but you define a new flag and then you define the method for referencing that flag's value, and then you can move forward and use it.

But again, you have to instrument your code. So when you build new features, you want to make sure that you understand, "I'm going to wrap a feature here so I can turn it on and off at will."
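
Since the slide itself is not reproduced in the transcript, here is a hedged sketch of the two steps just named: defining a flag, then defining one method that reads its value. `FlagDefinition`, `flag_value`, and the flag key are hypothetical names, and the current flag values are passed in as a plain mapping instead of coming from a flag service.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class FlagDefinition:
    """Step 1: define the flag once, with a safe default."""

    key: str
    default: bool
    description: str


CLOUD_ROUTING = FlagDefinition(
    key="route-quotes-to-cloud",
    default=False,
    description="Send quote requests to the cloud copy of the service.",
)


def flag_value(definition: FlagDefinition, current_values: Mapping) -> bool:
    """Step 2: one method that every caller uses to read the flag."""
    return current_values.get(definition.key, definition.default)


def choose_backend(current_values: Mapping) -> str:
    # The feature is wrapped so it can be turned on and off at will.
    if flag_value(CLOUD_ROUTING, current_values):
        return "cloud"
    return "on-prem"


backend_default = choose_backend({})
backend_flagged = choose_backend({"route-quotes-to-cloud": True})
```

Reading every flag through one accessor keeps the key and its default in a single place, which makes the wrapper easy to find and remove once the migration is finished.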

Okay. So again, back to the picture that we had. This is the part where I'm saying I need some way to transform what we're doing here. This is not good enough for me. There are some challenges here that I'm trying to work past. This is not good enough. That's what I'm telling my teams. I want to win. I'm not looking for third place. I want to win.

We come back and say, "What's this? Why do I have this handoff? What's going on here? You know what? This is where I'm going to lose." Actually, I found out that this is where we lose, because there are times that the business is involved in this race as well. We hand the baton off to them. But if I don't have separation here, my development teams get tagged with metrics that look like we're a lot worse than we really are. A lot worse than we really are.

So I'm going to bring LaunchDarkly to bear here to help me manage the ability to separate those two things. So I'll pull it back into our DevOps architecture.

Now I'm adding LaunchDarkly into the mix, and I'm really building in the separation of deploy from release. That's what I get to in the next slide here. It helps us simplify our process and our path forward because I'm consolidating some things for the right reasons. I'm removing those baton handoffs.

And so this is what my end game result ends up being. In this picture, it's very similar to the architecture we looked at before, but just realize at the top part here around feature development, there's opportunity to be using feature flag configuration and SDKs. That's from a development perspective. On the far right side out here, that's from a LaunchDarkly perspective around release. That's more around strategic release and trying to figure out the feature release, but it's really an opportunity to separate both of those things.

Now, the part that got compressed is that I really have true continuous deployment: I can move through every one of my environments and get to production and defer turning on the release. I'm separating my deployments and releases. Before, I did not. There was no way to do that. We actually had a gap there. We had a manual process.

So now I go to each one of my development environments all the way to production, assuming that all the gate checks pass, get all the way to production, my development team is done. Then some amount of time happens out here on the far side, and then I get into a strategic release scenario with my product owners. They get to decide when to turn on the release.
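
The separation described here, deployment finishing first and release happening later, can be sketched as two independent checks. The environment names, flag key, and function names are illustrative assumptions, not the actual Jenkins pipeline from the talk.

```python
# Assumed environment order; the real pipeline's stages may differ.
ENVIRONMENTS = ["dev", "test", "pre-production", "production"]


def promote(gate_results: dict) -> list:
    """Move a build through environments in order.

    Promotion stops at the first environment whose gate check did not
    pass. The returned list holds every environment the build reached.
    """
    reached = []
    for env in ENVIRONMENTS:
        if not gate_results.get(env, False):
            break
        reached.append(env)
    return reached


def is_deployed(reached: list) -> bool:
    # Deployment is the development team's finish line.
    return "production" in reached


def is_released(reached: list, flags: dict, flag_key: str) -> bool:
    # Release is a separate decision: the code must be in production
    # and the product owner must have turned the flag on.
    return is_deployed(reached) and flags.get(flag_key, False)


all_gates_pass = {env: True for env in ENVIRONMENTS}
reached = promote(all_gates_pass)

dark_in_production = is_released(reached, {}, "new-quote-flow")
live_for_users = is_released(reached, {"new-quote-flow": True}, "new-quote-flow")
```

Because `is_deployed` and `is_released` are measured separately, a product owner who waits to turn the flag on does not make the development team's delivery metrics look worse.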

So there's a separation there. Now, there obviously has to be some coordination across teams if you have multiple development teams all working on one initiative. But that is the reality.

There were times before where the development team would get beat up because their metrics didn't look very good. It's because actually their business partner said, "I'm not running this race that fast. I actually don't want to put it into production. I'm going to wait."

That's why it's really important to try to figure out the separation of those two things: deployment and release.

So we now have empowered our product owners to be responsible for turning on the releases, which literally means go flag, let users use it. Now again, that's a very simplistic scenario, but that is where we win with this last picture of being able to have continuous deployment, where we can move through every one of our environments all the way to production when we're done with the development.

In the past, there were times where we did not do that, and then we actually got backed up. We had multiple branches sitting out somewhere because someone else's team didn't finish on time. Then you get to a day where everybody's trying to merge, and it's two days' worth of nightmare for development staff.

But this is how you win: trying to figure out how do you bring a platform to bear to solve these kinds of problems for you.

If you are not using feature flagging today, I guarantee you there's some efficiency to be gained. That would be my encouragement for you today: take that away from this talk, take that home with you, investigate. Just look at this and see what it could do for your pipeline.

It has made a difference for our pipeline, I guarantee you that, in our ability to deliver features for our customers. It's something that we worked really, really hard on to try to figure out: how do we close the gaps?

It wasn't until we brought a platform to bear that we had that opportunity to separate deployment and release, to allow our product owners to manage a strategic release. Developers develop, move their code to production, let the product owners.