Biz & Tech Partnership Towards 10 "No Fear Releases" Per Day
We are currently transforming how work gets done, more specifically how we deliver software in a consistent, secure, reliable and efficient manner.
This experience report focuses on a large customer-facing value stream, servicing over 80 million accounts. It is made up of 25 Agile teams with approximately 150+ engineers, geographically distributed.
We were previously using a vendor product running on a mainframe and manually releasing core capabilities to production on a monthly cadence.
Today, we have transitioned to a reactive, microservices-based architecture in the cloud that has enabled us to achieve far greater throughput. With close to 60 pipelines frequently releasing in small increments (~4300 releases over the last 12 months), we have minimized risk by maturing our engineering practices and increasing our adherence to automated control gates.
This talk is centered around our close collaboration between Business, Engineering and Product to understand user needs as well as the alignment on shared roadmaps. The goal is to partner on achieving business AND tech outcomes - both delighting our customers while also being well managed.
Was this easy? We are here to share our learnings, challenges faced, and how the team overcame these to achieve the goals set in place!
Full transcript
The complete talk, organized by section.
Host Intro (Gene Kim)
I met Dr. Tapabrata Pal in 2012, and I was immediately in awe of what he was trying to do at Capital One. He has presented about their amazing DevOps journey here at DevOps Enterprise almost every year of the conference.
Topo has presented solo, he's presented with the Director of IT Governance, and amazingly, with one of their internal counsel who helped support their open source efforts. And this year, he's helped make possible one of the most amazing experience reports yet.
This experience report is from their credit card division, one of the largest business units inside of Capital One. Their story is told by a trio of their business leader, their engineering leader, and a product owner of an enterprise shared service who supports them.
Biswanath (Bose) Basu is currently Senior Business Director of Anti-Money Laundering for Machine Learning and Fraud. His previous role was as the business and product owner responsible for building the credit card servicing platform that this experience report is about.
Rakesh Goyal is Senior Director of Technology Engineering, who was brought in as Bose's technology counterpart. And Jennifer Hansen is Director of Product Management for Delivery Experience, whose charter is to help developers across the entire enterprise be productive and secure. And their platform, offered as a shared service, supports over 50,000 builds per day. Here's Bose, Rakesh, and Jennifer.
Jennifer Hansen
I hope everyone's enjoying this awesome DevOps Enterprise Summit. I wanted to start by thanking Gene and the program team for bringing this community together yet again to successfully learn and grow from each other's experiences.
We are here today to share just what can be achieved when business and engineering teams come together, unite, and collaborate to improve the overall customer experience.
I'll start by introducing myself. My name is Jennifer Hansen. I lead the product teams for internal delivery platforms. I am passionate about empowering our engineers in building great products for our customers. Over to you, Bose.
Biswanath (Bose) Basu
Yeah. Thank you so much, Jennifer. My name is Biswanath Basu. I go by Bose. I'm a Senior Business Director in Capital One and currently lead the fraud strategy in the anti-money laundering area using a lot of machine learning and AI techniques in the service of improving our anti-money laundering practices.
I was also the business leader and product manager leading the effort in US Cards that we will be talking about today. So that's a little bit about me. I'll hand it over to Rakesh here to introduce himself.
Rakesh Goyal
Hey, folks. I'm Rakesh Goyal, and I currently lead our consumer identity platform. Today, I'll share with you my experience leading delivery for the significant three-year transformation of the cardholder servicing platform, and more importantly, how I influenced my initially skeptical business partner, Bose, in this journey.
So for those of you that are not as familiar with Capital One, I thought I would share a few fun facts about who we are. We turn 25 years old this year. Still a baby in the financial services industry, but known for our compelling ways and determination to disrupt and innovate. We have millions of accounts, over 90-plus million of these, but for us, every single customer counts. We are passionate about the experience.
We have close to 49,000 employees, and 10,000-plus of these are engineers. We've made the Fortune 500 list for 20 years, which is amazing. And we are known as one of the top 10 DevOps leaders because of all the passion we have around the entire transformation journey. And finally, I think we are one of the largest digital banks today.
So as I move to the canyon, one of the things I wanted to call out is it's an amazing experience when you try and cross a canyon. But it doesn't happen overnight. It requires a lot of planning, preparation, and core fundamental elements to get to the other side.
So within Capital One, we've been on an agile and DevOps transformation, and this has been underway for the last seven-plus years. One of the notable things in our transformation that was also impactful during this particular journey has been moving from Waterfall to Agile: a fundamental change in how we work.
In addition, we had a lot of outsourced vendor software, and we made the commitment that we needed to insource. We needed flexibility, we needed agility, and we love open source. We're a company where open source first is a mindset that exists all the way from the top right down to every engineer on the team.
In addition to that, we had a lot of monolithic architectures, and we needed to move to microservices. And then finally, we had really optimized our data centers, but then we realized that we could never quite match the value proposition of being in the cloud. And I'm extremely proud to say that today we are all in on the cloud. No more data centers at Capital One.
So as we think about this journey, I want to call this out as these are foundational elements that drive transformations within business units as well. The most important passion that we have retained is that for doing the right thing for our customer.
I wanted to play a little short audio clip for you on the value the transformation can bring to your business when you have customer information at your fingertips, how you can improve the service experience for a customer who is looking for our support. So go ahead and listen to quite an amazing story during this current time.
Customer Call Audio
Capital One, this is Sarah. Who am I speaking with and what can I help you with today?
I'm calling because I just tried to use my card so that I can put gas in my car and... Well, I was just at a gas station, and I want to use my card. So I have what they call that Overdraft Protection Plus. So if I go ahead and overdraft it or it's the next day grace period, but-
Mm-hmm.
... it doesn't let me do it.
Sure. No, I completely understand your frustration, everything. I can't do anything about the overdraft line of credit option that you have. But what I can do is I see there was four fees that you received in March. I can go ahead and refund those. That'll give you a little bit more available balance so that you can get gas.
Oh, wow. Wow. I didn't call for-- Oh, wow.
It's the least I can do. I just feel bad that your overdraft line of credit option that you have for the next day grace isn't helping you when you need it to.
Thank you.
Hey, no problem at all.
Thank you.
Yeah. No problem. Capital One, it feels great to have you.
Jennifer Hansen
Speaking of customer, let me now transition over to Bose. He will tell you all about the journey from a business leader perspective.
Biswanath (Bose) Basu
Thank you so much, Jennifer. Before I start, let me give a little bit of a context on the business problem we were facing.
The problem at the very core is that of an aging customer servicing platform. And to give you a sense of the scale, this platform is servicing tens of millions of Capital One credit card customers and, of course, generating hundreds of millions of dollars in annual bottom-line value to the business. So a really critical component, a really critical platform, both in terms of the value it provides to the customer, as well as the shareholder and economic value it provides to the business.
The challenge was this was an aging platform, and it's got the classic problems of aging that one experiences on any platform. In fact, as I grow older, I'm experiencing some of those. But if you go to the next slide, you could actually see some of those problems that we were facing, primarily around the fact that, look, aging platforms, so we have problems with not meeting customer needs. Customer needs have evolved over a period of time. So we are not meeting them. We have a bunch of batch processes that are slow, inefficient. Those are really things that we really don't expect in the current day and age, especially given where we are.
And last but not the least, it's also like Capital One, the primary horsepower that Capital One runs on is data. So we want to use the power of data to make strategic, intelligent, real-time decisions. And the platform that we had was not really allowing us to do that. It was sort of constricted in that way.
So when you think of the problem here, or the objective, not only are we solving a technology/cyber risk problem with an aging platform, there's also a lot of business value in terms of tens of millions of dollars in NPV, as we kind of try to solve these problems for the business.
As we started this work, what were some of the principles that we agreed to? The first one, as you can see, is around sort of working backwards from the customer need. So customer needs are absolutely quintessential. We exist because of serving our customers, and our customers might be varied customers, but one of the core principles we worked on was making sure that we are meeting the customer needs, meeting them where they are, and quite frankly, we wanted to get an A+ in terms of where we wanted to go.
The second objective that we had in mind was to make sure that as we do this, we are both iteratively delivering value, maximizing learnings, minimizing risk, and we knew this was going to be a long project, at least a couple of years. So we wanted to make sure that as this project goes on, we are able to deliver value to the business as well, both in terms of customer experience as well as in terms of sort of the actual economic impact to the business.
And the last one that I would call out here is a bias that we all face when we've been used to working in an old system for a long time, which I call the anchoring bias. What we were very conscious of right from day one was to make sure that we are not building a faster and stronger horse. What we are really trying to do here is solve a problem, and that is, in this analogy, to move from point A to point B in the fastest and most efficient way possible.
So with these guiding principles, now let me tell you a little bit about how we actually went about doing it. The first thing we did was essentially look at our platform or the set of customers that we had within our portfolio and divided them into different segments. Segments and groups of customers based on, A, what their needs are, and B, what functionalities they needed to be serviced. And as we identified them, we sort of graded them on the sequence in which we would deploy or rather test these as we go on.
This was by no stretch a big bang. It was a pure risk-based approach. And what we did essentially was take a thin slice, just like the slice of cake in the picture shown here. We would try that thin slice, see what works, what doesn't work.
There's a key point that I mentioned here as we talk about thin slices. As much as we were looking for MVP, we were not looking for the least common denominator here. We were looking for the minimum viable experience that we would give to our customers, and not just any small product that we could come up with.
Now, once we tested that piece out and it worked, the next thing we would do is essentially scale it up, and that's what you see on the next slide. And this is an actual slide for the first few months of delivery. In fact, in the second month, our first release was that of just converting two accounts. And trust me, we had tons of learning as we were doing it.
It's almost like sending a man to the moon where you don't know what's going to happen from the old system to the new one and making sure those people are safe. So, a lot of great learnings in this very carefully calibrated, agile approach, which was giving us the learnings, was minimizing risk, at the same time was giving us a lot of value.
And I'll finish my part here about the business problem with what I mentioned earlier with the customer back thinking. And a lot of it is about strategically thinking about who our customers are. Our customers are not just the credit card holders. They are our regulators. They are the business analysts within the company who are working on it. There are customer servicing agents. There's a ton of people who are actually the users of the system. And we used very heavy human-centered design to ensure that we are actually meeting the needs and not just replicating what was there in the old system.
So with that said, now that I've given you the business context, I'll hand it over to Rakesh, who will lead you through some of the technology work that went in the service of this.
Rakesh Goyal
Thank you, Bose. Wow. So when I started in this role and got introduced to the goals that we just saw from Bose, I was truly struck by the transformation plans. I had led various large-scale transformation initiatives. However, in this case, we had to migrate from a system frozen a few decades ago in terms of technology.
What we had was a mainframe-based vendor product that had been bandaged to the point where the workaround systems and operational teams were as large as a product in themselves. You see this old beat-up car? That's what we had been used to driving and maintaining for years and years.
So what we needed was a modern system to deliver on the business promise. And yes, we had to run everything in the cloud. So you see the shiny car, and you may have heard various common debates about building a Chevy versus a Cadillac or a Prius versus a Tesla. Regardless, we needed a new core to host these tens of millions of accounts, serving a wide array of business segments that Bose just shared.
It was going to be a multi-year journey with a large number of engineering teams. Our business partners wanted reassurance. While we changed the engines of these cars, we could not in any way jeopardize this heavily regulated billion-dollar business.
So where do we get started? Well, early in the cycle, we recognized the need to invest and evolve the tools in our toolbox. To borrow from Lincoln, we needed to sharpen our saws before we took on this journey, and we decided that investment in tooling was really required. We also had to invest in reskilling engineers and provide them the appropriate tooling to be agile in this transformation journey.
So when we shared the decision with our business partners, of course, there were skeptics about investing in DevOps, as no business features would be delivered for an initial period. Today, as I share this journey and its outcomes, you're going to see that the benefits of such an approach were manifold.
So to have a future-state platform meeting the needs of our customers, we settled on building an API-driven, microservices-based architecture. The goal was to be sustainable, to build incrementally, as Bose suggested, and to expand into various business strategies. You can think about this as having a fleet of smart cars built for specific workloads rather than one futuristic car.
To normalize the developer experience across various teams, we spent a couple of months investing in DevOps. We took a pragmatic approach, leveraging proven enterprise tools that would work for us as opposed to getting distracted by shiny technologies. There are always many options, and standardizing helped us keep the engineering staff fungible. We could react faster in situations where engineers needed to contribute to other teams or move from one team to another, et cetera.
We got laser-focused on building our CI/CD pipeline as it would provide a tremendous lift. It would empower the teams in reducing cycle time and, importantly, reduce risk by enabling small, fast, and frequent releases. We also had to address regulatory and compliance controls. There, the pipeline itself would block a release when certain controls were not met.
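To make the idea of a pipeline that blocks releases concrete, here is a minimal sketch of a control gate. It is an illustration only and not Capital One's pipeline: the `ReleaseCandidate` fields, the three example controls, and the function names are all assumptions chosen for this example.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ReleaseCandidate:
    """Hypothetical facts a pipeline might know about a build."""
    tests_passed: bool
    open_critical_vulnerabilities: int
    change_ticket_approved: bool


# A control is a name plus a predicate that returns True when the control is met.
Control = tuple[str, Callable[[ReleaseCandidate], bool]]

# Example controls (assumed for illustration, not taken from the talk).
CONTROLS: list[Control] = [
    ("automated tests pass", lambda rc: rc.tests_passed),
    ("no critical vulnerabilities", lambda rc: rc.open_critical_vulnerabilities == 0),
    ("change approved", lambda rc: rc.change_ticket_approved),
]


def failed_controls(rc: ReleaseCandidate) -> list[str]:
    """Return the names of every control that is not met."""
    return [name for name, check in CONTROLS if not check(rc)]


def can_release(rc: ReleaseCandidate) -> bool:
    """The release is blocked as soon as any control fails."""
    return not failed_controls(rc)
```

Because each control is just a named predicate, a team could see exactly which gate stopped a release rather than only learning that it was rejected.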
So once built, the pipeline became an integral part to the project operating model as we started to build and release incrementally. It enabled the teams to focus on product features while the pipeline tools were just a utility to leverage rather than requiring investment from each individual team.
This core infrastructure allowed us to scale the number of teams working on the project. Our teams are distributed geographically across four of our people centers. And at the height of this effort, we had 25 teams working and contributing simultaneously. Just imagine the scale: using this approach of agile incremental delivery and microservices, we were able to scale out and have all the teams deliver at the same time, each at its own pace.
A big part of such a transformation is also how you organize the people, the people front, right? You need to make sure we follow the two-pizza principle by mixing subject matter experts with legacy skill sets and those skilled in target-state technologies. This also helped with reskilling the engineers with legacy skills and keeping them motivated.
To bring out the contrast, if you think about the past, next slide please. For a vendor product, you have to deal with collecting requirements, lengthy product development cycles followed by test and release cycles. It was very formal with handoffs amongst the various stakeholders. To overcome such shortcomings, we had to build costly workarounds.
Let me share with you how we operated on this project. So while we scaled across geographies, we took an approach where we co-located the product owners and engineers to make sure they are able to work together closely. This allowed us to stay agile and release frequently with a rapid feedback cycle.
I wanted to share one unique situation where all of these came together remarkably well. The customer-first approach, the tooling, the co-location, and the agile principles. Our servicing agents were facing some difficulties with the release. The product owner would collect that input overnight, get the team to work on it, accept the release, and the agents would see the updates the following day. While this was awesome compared to what we saw with the vendor product, when we couldn't even make a change to the product by ourselves, it wasn't good enough for us.
The team decided to take a different approach. The entire team, the product owner and the engineers, decided to travel to the call center. They worked side by side with the agents, made several releases at the site itself, receiving continuous feedback as they addressed all the concerns. I can say it was truly liberating to make changes and releases at will. Just think about how you can just make a change and release all in the same sort of window, if you will.
For all of this, we relied on Jennifer's enterprise delivery experience team and her support in enabling this rapid release cycle.
Jennifer Hansen
Thanks, Rakesh. So I have to tell you, growing up, I enjoyed success at many a track and field event. But the one that made me most nervous was the relay. Like any other team event, it requires strategy, teamwork, practice. But success is so critical when you do that baton handoff.
In a similar manner, if we could not translate all the verbal and non-verbal needs flawlessly on the day, then despite all the awesome work Bose and Rakesh had done, we became the hold-up.
In the spirit of shared services and doing the right thing, and I know a lot of you out there work and live and breathe shared services each day. We have this passion to empower our engineers to meet the business needs to get speed to market. However, at the same time, we have this need to be well managed, to demonstrate it to our board, our regulators, our risk partners, and our shareholders.
So this is always a challenge, and what we had done within shared services is design a meticulous, well-maintained, streamlined process on how to review risks, categorize them, escalate them up all the way for the right level of review and oversight. As you can imagine, this wasn't quite what Rakesh and Bose needed. The automation, the thoughtfulness that they had put through had now become manual checkpoints.
We needed these to avoid the non-compliance. Segregation of duties is a critical component, especially when you think about financial systems. And we had this understanding on how do you meet these SOX controls? You do need this segregation in role to occur.
But then we realized that this is not the partnership that we needed to be. Yes, these pre-release controls were essential, governance was essential, but what was happening is this well-oiled assembly line was running into manual checks. We were doing those pre-deployment, pre-release checks, identifying issues at the last minute and sending them back into the assembly line.
Now, if you had a good day, you could just get cleared instantly, and maybe in the next one to three days, you could get on that road and get moving. But you remember these inspections that took you back? At times, these could result in days and weeks of delay. So this was not a fun process and not the optimal partnership we wanted.
So what we needed to do is think about the entire partnership end to end, come together, understand what is it that we want collaboratively to achieve. So we wanted to promote growth and innovation. We knew we wanted to deliver high-quality working software, but faster. And we also knew that if we were to be an organization that could attract and retain talent, we had to foster that culture of innovation.
I think Rich, our leader, says it best: We take so much time to recruit great people, and we were actually holding them back in some ways on the opportunity on how we could partner collectively and be great.
So what we needed to do was build trust. However, how do you build trust? I have to steal from Jez and David's book. It's an awesome quote: How can you automate something that isn't repeatable? And just because we put all our checks in manually, it doesn't mean that we are error-free. An issue could still come up, because a manual check is done differently each time.
What we needed was continuous delivery. We needed to figure out how to simplify and standardize our patterns. We needed to reimagine the entire experience. We wanted to get to an automated release. We needed to mitigate risk. So how could we build in some preventive checks?
Security, vulnerability, remediation, we needed to get ahead of these. These were critical enterprise objectives that we couldn't compromise on. But we also wanted to improve the engineering experience. We wanted to increase productivity. We wanted to truly believe and trust in the journey. And to do that, we had to empower our engineers. So the more we shifted left, the more they understood the value, they appreciated building and fixing, incorporating the changes that we needed, and being able to move faster.
So these were some of the thought processes that were underway in the shared services area as we were transforming. So how are we going to do that? By now we had thousands of teams. We had multiple lines of business. We wanted to give them the autonomy and the flexibility that they needed. So we needed a software delivery clean room.
Yes, very much building upon the clean room analogy, we needed to ensure that things that were important to us, things that made a difference in terms of quality, speed, vulnerability, remediation, reliability, were built into the pipeline. This is just some of the examples of the checks that we have within our software delivery clean room. It allows you to go ahead and rethink the control differently. Yes, we had a SOX control. Yes, we needed segregation. How could we achieve this in a more automated manner?
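One way to picture an automated segregation-of-duties check is the sketch below. It assumes a simple rule, namely that a change needs at least one approver other than its author; the `Change` type, the rule, and the names are hypothetical and are not a description of the actual clean room controls.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    """Hypothetical record of who wrote a change and who approved it."""
    author: str
    approvers: frozenset[str]


def independent_approvers(change: Change) -> frozenset[str]:
    """Approvals from the author do not count toward segregation of duties."""
    return change.approvers - {change.author}


def satisfies_segregation(change: Change, required: int = 1) -> bool:
    """True when enough people other than the author approved the change."""
    return len(independent_approvers(change)) >= required
```

A check like this runs the same way on every change, which is what makes it repeatable enough to replace a manual pre-release review.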
So these became the combined discussions that we started to have as we built out the clean room. The good thing is we knew what the target state looked like. We wanted to get to the no-fear release for the business. We knew that we needed a push-button deployment, the ability to deploy at will, as Rakesh called out, to production based on business demand. But we needed those monitors. They needed to be integrated into the pipeline. They needed to be recording and monitoring compliance and quality and security, all the elements that we needed assurance on.
Once we were able to achieve this, I'm happy to say we could offer this capability back to the team, which was a really exciting part of the journey.
So coming back, what were the key outcomes as a result of this partnership? We definitely saw the improvements, as you'll see, in terms of experiences. Bose called out the human-centered design thinking that was at the core of his entire strategy. Rakesh was focused on the cloud-based real-time analytics and servicing platform to help the business move faster.
We ended up with operational efficiency, and we have today a scalable, resilient platform. We've improved, and we have a more sustainable risk management. In addition to that, we've been able to empower our engineers. Our auditors are delighted that we have thought through compliance, and we've been able to automate this in real time.
The testimonials I really want to share are the ones of the call center agent, the agent that's excited about focusing now on servicing and meeting customer needs. The business analyst who had been waiting for two days on data could now work real time and support our customer base. The back office agent, no more of those Excel sheets and notes for him to sift through. And then finally, our leadership, who were equally excited in this tech transformation journey because it's serving our customers better.
Now, I know we always ask: if you're a high performer, do you have to trade off throughput against stability? Well, this is where we got to: from a once-a-month release to 240 times the deployment frequency. I'm going to use some of the DORA benchmarks on elite performers to give you a comparison of how we're doing.
Our delivery lead time has improved significantly, both for feature and infrastructure releases. Our mean recovery time, equally important, we're getting faster at improving that. And what about change failure rate? It's a lot lower than where we were. So you can see that both throughput and stability are possible.
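For readers who want to track the same four DORA measures (deployment frequency, lead time for changes, change failure rate, and time to restore), here is a minimal sketch of how they could be computed from deployment records. The `Deployment` record and its fields are assumptions for illustration; the talk does not describe how these numbers were actually collected.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Deployment:
    """Hypothetical record of one production deployment."""
    committed_at: datetime
    deployed_at: datetime
    failed: bool = False
    restored_at: Optional[datetime] = None  # set only when a failure was fixed


def deployment_frequency(deploys: list[Deployment], days: int) -> float:
    """Average number of deployments per day over the observed window."""
    return len(deploys) / days


def mean_lead_time(deploys: list[Deployment]) -> timedelta:
    """Average time from commit to production."""
    total = sum((d.deployed_at - d.committed_at for d in deploys), timedelta())
    return total / len(deploys)


def change_failure_rate(deploys: list[Deployment]) -> float:
    """Fraction of deployments that caused a failure in production."""
    return sum(d.failed for d in deploys) / len(deploys)


def mean_time_to_restore(deploys: list[Deployment]) -> Optional[timedelta]:
    """Average time from a failed deployment to restored service, if any failed."""
    durations = [
        d.restored_at - d.deployed_at
        for d in deploys
        if d.failed and d.restored_at is not None
    ]
    if not durations:
        return None
    return sum(durations, timedelta()) / len(durations)
```

The first two functions describe throughput and the last two describe stability, so looking at them together shows whether one is being traded for the other.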
Now, some of the key learnings that I want to come back to, and I would love for Rakesh and Bose to come in and let's talk about this, what we thought and what we did. So over to you, Bose.
Biswanath (Bose) Basu
It all starts with having a vision and sort of the paradigm shift that we wanted to make, and which is a very common trap that business leaders fall into, is thinking of the constraint first and then the objective. We wanted to think big. We were looking for an A+.
So big learning here is we have to think big and we have to think bold. We started, we had a ton of constraints. We had never done this before, but that wouldn't stop us from thinking. So if there's one key element of learnings that we got was around just the ability to just think as big and bold as we possibly could.
And of course, iterative delivery, something that I've mentioned: we cannot underestimate the value of that. Tons and tons of value in terms of maximizing our learnings, delivering the NPV or the business value that the business is looking for, as well as minimizing the technology risk here.
I don't think we would have gotten there without the tremendous partnership, but most important of all, I think what really struck me was that entire culture of acceptance and empowerment. We took a shift and we started to go towards blameless postmortems. Let's try to understand what we need to do to improve and iterate through it. So I think that was huge for me.
I think this is an important one and creates the right dimension for the problem in the sense that the change curve, it's about transformation. And transformation, not just in terms of technology, which is extremely challenging, but also the human element. Think of agents that Jennifer talked about. You have thousands of agents who've been working for years with an old system. How do you get them comfortable using the new system? How do you get business analysts comfortable using new systems that, in some ways, are completely different from the way we've worked in the past?
So in terms of the key learnings and probably something that we would, as a team, love to hear from the group is all about the experiences people have had in terms of not just the technology transformation, but also transformation in terms of winning people's minds and hearts. Because I think that's an equally important challenge as we go through this technology transformation, both within the company as well as, I would say, the industry overall.
You need transformation, new technologies. People need to learn new ways. People get habituated, right? And we really need to figure out how we can get help and get people to start thinking changes for the better, and any help would be appreciated there.
And I think the most amazing thing is you can't undermine the need to transform, but at the same time, you have to understand the human change curve. It's equally important for your business. And looking at it multiple times, the change management and communication strategy is a critical part of your transformation.
So I guess we are really open. We hope you've enjoyed this session. In the spirit of Halloween, which will be here soon, we don't want you to live with fearful releases. We want you to feel empowered for your engineers to build trust, to think about how you too can get to a no-fear release push-button deployment model.
And finally, we are extremely excited to share the transformation, but we would love to hear about your journey and things that you would have done differently. So you've got our names. Go ahead and reach out, and let us know what you think.
Jennifer Hansen, Rakesh Goyal, and Biswanath (Bose) Basu
Absolutely. Thank you all.
Thank you.
Thank you.
It's a pleasure talking to all of you.