DevOps at Barclays: From Oil Tankers to Speedboats
In this talk, Jon will share the story of how Barclays, a 325-year-old organisation in a heavily regulated industry, with breadth, diversity and complexity, is adopting Agile and DevOps at scale (130,000 employees in 50 countries) and at pace. Jon will share lessons from the organisation-wide transformation so far.
- How to go from oil tankers to speedboats at scale
- How to have agility, innovation and compliance to controls
- What are Agility Levels and how do they help?
- Why a holistic approach is important
Full transcript
Jonathan Smart
So I think this is an awesome event. This is absolutely fantastic. So a big thank you to Gene and the programming committee for organizing this mutually exothermic event. I had to Google that.
My name's Jonathan Smart. I'm leading on agility at Barclays. We started this journey about 18 months ago, at the beginning of last year. I've been working in an Agile manner for over 20 years. I started as a developer on the trading floor, and really, you wouldn't work any other way. At some times in the past, it was fragile. At other times, it was Agile. And I'm really delighted to have probably the best job in Barclays, to be leading on Agile across the whole firm.
So first of all, some context. Barclays is a financial services firm. If you haven't heard of Barclays, quite unlikely, I would imagine. We're 325 years old. Barclays was founded in 1690, four years before the Bank of England existed, by two goldsmiths. So we're going right back to the origins of money.
And paper notes, so paper currency in the Western world originally came from goldsmiths. So you had some gold coins, you would deposit your gold coins with a goldsmith, and they would give you a note which would say, "I promise to pay the bearer on demand the sum of 10 pounds of gold." And so the founders of Barclays go back to the beginnings of paper money in the Western world. China had a head start. They were using paper money about 2,000 years ago. But in the Western world, we go back right to the origins of paper money.
So we're quite an old company. We now have 130,000 employees in 40 countries. We cover all aspects of financial services, and we have 48 million customers.
We have a history of innovation. The reason I have this slide is really to make a cultural point. As a firm, there is a culture of innovation, and in doing the role that I've been doing for 18 months, there is a willingness to embrace change, which is great for a large financial services firm.
We had the first female bank manager. We had the first credit card outside of the U.S. The first cash machine, so a great example of technology and financial services. We had the first mobile payments transfer only using a phone number. The first for mobile check imaging.
And to do that, we actually had to change the law in the United Kingdom, and we lobbied the government, and it's now got royal assent. And so there's a new law coming in in 2017, and we alone led the change of that law. We invited other banks to that conversation, but the other banks said, "No, thanks."
There are currently 330 trucks driving around in the United Kingdom with paper checks in them, with physical paper checks. There are 11 planes in the sky at every moment in time with physical paper checks. And now, what you can do with the Barclays check, right now we have, I think, 58,000 customers who can take a photo of that check and it instantly clears. Instantly.
So in terms of innovation and technology, that's another great example.
The Barclays Accelerator. So in terms of fintech and many small companies, we have six fintech accelerators globally. We have three cohorts of 10 fintech startups that we sponsor. So that's 180 fintech firms that we are partnering with and sponsoring, giving them access to senior leadership, helping them to develop their ideas and pivot their ideas. So we're really working with the ecosystem of fintech startups.
We have the world's first blockchain smart contract. So this is a legal use case to legally agree trades using blockchain. We have the world's first working prototype of the R3 Corda platform. R3 is 42 financial services firms all coming together on blockchain. So again, this is a world first, and we've had the interest from the Ministry of Justice. So we presented this to the chief legal person in the United Kingdom, and there's interest from the government.
And embracing Agile and DevOps. So we are embracing a new way of working across the whole firm. Barclays, as I said, is 130,000 people, and our remit is the whole firm. We are not just doing this in technology.
Mission critical. We process payments worth 30% of the UK gross domestic product every single day. That is 220 trillion pounds per annum or 600 billion pounds a day. With Netflix, if there's an outage, sorry, you can't binge-watch Orange Is the New Black. We, instead, will probably go bankrupt if we have 24 hours of complete outage, and we'll probably bankrupt a bunch of other companies at the same time.
So absolutely mission critical. We need to be Agile, but we have to have a large amount of control because it's absolutely mission critical. And we are the most regulated industry. I did some research. There are 222 financial regulatory authorities globally. I counted every single one of them on Wikipedia. Financial services is the most regulated industry.
We have hundreds of internal standards. Literally hundreds. Between 300 and 400 internal standards. No one human being can possibly keep 300 to 400 standards in their head. And when you think of the age of the company, you think of the archaeological layers of audit reports and audit findings and regulatory findings, and every audit item somehow gets embedded into standards with the right intent and with the right reason.
So really, what we're doing now is probably the biggest thinking about how we run ourselves that we probably have ever done, in terms of taking a step back, looking at everything we've got and the way we work, and actually then challenging it.
Where we're going from, so the departure point, is we had a waterfall lifecycle. So our change lifecycle, as agreed with the regulators, it is waterfall. It has seven gates. It used to have 22 gates. There are 28 mandatory artifacts. So even if you want to deploy a one-line piece of code that says, "Hello, world," and this is a new piece of code, a new binary that says, "Hello, world," you will have to fill in 28 artifacts.
We did some value stream mapping on this. The average elapsed time to go through the process is 56 days. So that's nearly three months to get "Hello, world" shipped, potentially. And in that 56 days of elapsed, there's 20 days of effort. So we have a large number of project managers historically who have been spending 20 days of effort filling in forms.
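The value stream figures above can be turned into a simple ratio. This is a minimal sketch, assuming the common lean measure of "flow efficiency" (hands-on effort divided by elapsed time); the talk itself does not name that metric, and the function name is hypothetical. Only the 56 and 20 day figures come from the talk.

```python
def flow_efficiency(effort_days: float, elapsed_days: float) -> float:
    """Share of elapsed time spent on hands-on effort (hypothetical helper)."""
    if elapsed_days <= 0:
        raise ValueError("elapsed_days must be positive")
    return effort_days / elapsed_days


# Figures quoted in the talk: 20 days of effort inside 56 days elapsed.
ELAPSED_DAYS = 56
EFFORT_DAYS = 20

efficiency = flow_efficiency(EFFORT_DAYS, ELAPSED_DAYS)
waiting_days = ELAPSED_DAYS - EFFORT_DAYS  # time the work sat idle between steps

print(f"flow efficiency: {efficiency:.0%}, waiting: {waiting_days} days")
```

Read this way, roughly two thirds of the elapsed time is waiting rather than work, which is where the room for improvement sits.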
So plenty of room for improvement.
Why agility? Why not? It's a better way of working. We know it's a better way of working. We don't need any type of survival anxiety to know that it's a better way of working. We know that it reduces risk, delivery risk. We know that it increases quality. We know that it reduces the concept to cash time. The DevOps surveys 2014, 2015, 2016 all show the evidence to that, and we have evidence as well from our own work.
Also, it's to delight customers and to engage colleagues, and this is the reason for companies. The reason for companies is to create customers and to delight customers. It isn't for shareholders. Shareholder value will follow. It will come. Returns will come. Above and beyond everything else, we have to delight customers. That's the number one job of any company. So by taking an approach with more agility, we can delight our customers and we can engage our colleagues.
Also, we're in a period of very disruptive innovation. So we have new entrants. We have non-traditional competitors, such as the unicorns. So we have Apple, Google. There's a currency symbol in Gmail. You can transfer payments. Messenger, Facebook, you can transfer payments.
There are non-traditional competitors, so fintech startups. The investment in fintech startups is £10 billion per annum. That is an enormous amount of money, and records are being broken every single quarter for the amount of venture capital money going into fintech startups. That's a huge investment.
So in addition to that, we also have governments who are increasing competition in financial services. So the Payment Services Directive 2, which is coming in next year, will force all banks to expose their systems through APIs. So then this opens up a whole new category of disintermediation, where companies can come in in the middle between the customer and the large financial firms, and they can call the APIs of all of the banks, and they can see what your balance is, with your permission.
You can also look at the product. So you could have a set of companies that could do smart current accounts and/or smart savings accounts and will automatically move your money between banks. So we have that type of competition coming up. That's going to squeeze the profit margin down. There will be increased transparency, easier switching. So challenger banks, Mondo Bank, Atom Bank, mobile-only banks. There's a huge amount of disruptive innovation at the moment, and companies that do not change will not survive.
And survival. So what is it? Survival of the what?
"Fittest." Not the fittest.
"Fastest." Not the fastest.
"Biggest." "Most adaptable." Most adaptable.
Well done, whoever said that. The most adaptable. It's survival of the most adaptable.
So agility at scale. 130,000 people, 300 years old. Are we crazy? Are we trying to teach elephants to dance? Am I mad? Are we mad?
First of all, a definition. This is the definition we're using. Agility equals Agile plus DevOps. So we're using the core definition of Agile, the core definition of DevOps to pursue a business strategy of agility. We are not doing Agile for Agile's sake. We are not doing it only in technology. We are pursuing a business strategy of the whole business exhibiting agility. And when I say the whole business, I mean HR, audit, security, compliance, investment bank, the retail bank, everything.
My charter. So I'm a servant leader on agility across Barclays. We have an awesome team across Barclays who are leading on the adoption of agility. Like I said, it's across the whole organization.
So this is the organizational construct. Due to legislation, we have a ring-fenced bank, Barclays UK: personal banking, credit cards, wealth. We also have corporate and international. So we have corporate banking, investment banking, credit cards, and international wealth. So it's across all of those business areas.
And again, like I said, it's holistic agility. It includes audit, HR, finance, real estate. So we have our real estate department using Kanban boards for nothing to do with technology. They are visualizing their work in progress, and they're doing it in iterations. Audit are doing audit planning and audit work in an Agile manner. They have Kanban boards. They're doing it in iterations. What they used to do was time-slice the auditors onto multiple audits all at the same time and do them in a waterfall manner with a ka-chunk. Here's the audit report at the end of it. So this is not just technology.
How are we doing after 16 months? So after 16 months of doing this across the whole firm, we have gone from 4% of our strategic change, the spend on strategic change, to over 50% of our spend on strategic change being spent with Agile practices. That is enormous. We spend over a billion pounds per annum on strategic change, and over half of that we are now spending in an Agile manner in the space of 16 months.
We have the equivalent of over 800 teams who are now working with Agile practices who weren't at the beginning of last year. That's over 10,000 people who are now working in an Agile manner. We've had over 30,000 training attendances. That's computer-based training, that's classroom training, that's webinars, TED Talk-type sessions.
As far as we know, and I'd love anyone to come and grab me at some point and correct me if this is not correct, as far as we know, it's the world's largest and fastest Agile adoption.
Fifty-six percent of our strategic applications deploy every nought to four weeks. We have thousands of strategic applications, so we have hundreds which are deploying every nought to four weeks. Our lead time has come down, and we can see that both through the release cadence going up and the throughput going up.
So throughput, what this graph shows is the average number of stories completed per month, per application, for a sample size of 145 applications. It's gone up to 300% of what it was. We are delivering three times as many stories per month, per application. That will be both increased productivity and more, smaller stories, both of which are good. We want more, smaller stories.
Quality has gone up. So from a sample size of 87 applications, and this is using Sonar, the code complexity has come down 50%. The code complexity measure is half, on average, of what it used to be, and that's the average per app.
Also using Sonar, looking at the test code coverage, test code coverage has gone up 50%. It's 150% of what it used to be, and that is both new code and old code. And that's with a sample size of 50 applications.
And this graph I like the best. This is hard data which shows that the more frequently you deploy, the lower the level of absolute incidents you have in production. So the dark blue line, which is climbing up to the right-hand corner, that is the percentage of applications in each business area that is deploying with a cadence of nought to four weeks. So in the top right, that is a business area that is deploying all of their applications every nought to four weeks. In the bottom left, very small percentage deploying frequently.
The light blue line that is trending down to the right, that is the count of incidents in production. So these are not defects. These are logged incidents in production. These are bugs in the wild that have manifested themselves. And you can see a direct correlation that the more frequently you deploy, the lower the level of incidents. This is a very useful slide internally. You can't argue with that.
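The relationship described on this slide can be checked with a correlation coefficient. The sketch below is illustrative only: the numbers are made up to mimic the shape of the two lines (they are not Barclays data), and the `pearson` helper is a hypothetical name for a plain Pearson correlation.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant data")
    return cov / sqrt(var_x * var_y)


# Made-up illustrative values, one entry per business area (NOT real data):
# percentage of applications deploying every 0-4 weeks, and incident counts.
pct_deploying_frequently = [5, 20, 40, 60, 80, 100]
production_incidents = [120, 95, 70, 50, 30, 10]

r = pearson(pct_deploying_frequently, production_incidents)
print(f"correlation: {r:.2f}")
```

A value close to -1 corresponds to the picture in the talk: as the share of frequently deploying applications rises, the incident count falls.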
Satisfaction. Satisfaction has gone up, so we've run a survey. It's based on the Yahoo survey that Gabrielle did, probably 2004. We kind of used the same questions that Gabrielle used at Yahoo. And the Net Promoter Score is plus 21. We can see in terms of, would you recommend an Agile way of working, we can see that 79% of people said it was better or much better. When we look at the more mature Agile teams, it's an even higher percentage. It's something like 84%. So we can see engagement has gone up.
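For readers unfamiliar with the measure, here is a minimal sketch of how a Net Promoter Score is usually computed, assuming the standard 0 to 10 "would you recommend" scale (promoters score 9 or 10, detractors 0 to 6). The talk does not say how the Barclays survey was scored, and the sample responses are invented purely to reproduce a score of plus 21.

```python
def net_promoter_score(scores):
    """Percentage of promoters (9-10) minus percentage of detractors (0-6)."""
    if not scores:
        raise ValueError("at least one response is required")
    promoters = sum(1 for s in scores if s >= 9)
    detractors = sum(1 for s in scores if s <= 6)
    return 100 * (promoters - detractors) / len(scores)


# Invented sample of 100 responses (not survey data):
# 41 promoters, 39 passives, 20 detractors.
sample = [10] * 41 + [8] * 39 + [5] * 20

nps = net_promoter_score(sample)
print(f"NPS: {nps:+.0f}")
```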
An example: a derivative system in the investment bank. Testing duration has gone from eight days, so this is a regression test, has gone from eight days down to 20 minutes. It's 192 times faster for testing.
The release frequency has gone from seven times a month, so about once every three days, to 70 times a month. So that's three times a day. On this system, £70 billion worth of payments flows through this system every single day. This system can't have an outage. If there is an outage, we have to report it to the regulators. And it's deploying three times a day.
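The figures for this system line up if they are read in working time. This is a sketch under two assumptions that are not stated in the talk: an eight-hour working day and about 21 working days in a month.

```python
HOURS_PER_WORKING_DAY = 8    # assumption, not stated in the talk
WORKING_DAYS_PER_MONTH = 21  # assumption, not stated in the talk

# Regression test duration: eight working days down to 20 minutes.
regression_before_min = 8 * HOURS_PER_WORKING_DAY * 60
regression_after_min = 20
speedup = regression_before_min / regression_after_min

# Release frequency: 7 per month before, 70 per month after.
days_between_releases_before = WORKING_DAYS_PER_MONTH / 7
releases_per_day_after = 70 / WORKING_DAYS_PER_MONTH

print(f"test speedup: {speedup:.0f}x")
print(f"before: one release every {days_between_releases_before:.0f} working days")
print(f"after: about {releases_per_day_after:.1f} releases per working day")
```

Under those assumptions the 192 times speedup, the "once every three days" and the "three times a day" in the talk are all consistent with each other.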
And this system has gone from being a monolithic system. Go back about six or seven years, and it used to deploy every six months. It used to be monolithic. It's now a microservices architecture. There's been an evolutionary revolution, and it has evolved using the strangler pattern into a microservices architecture, enabling this degree of agility.
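The strangler pattern mentioned here can be sketched in a few lines. This is an illustration of the general technique, not of the Barclays system: a facade sits in front of the monolith and hands routes over to new services one at a time. All names (`StranglerFacade`, `legacy_monolith`, `new_pricing_service`, the route names) are hypothetical.

```python
from typing import Callable, Dict

Handler = Callable[[str], str]


def legacy_monolith(request: str) -> str:
    """Stand-in for the existing monolithic system."""
    return f"legacy handled {request}"


def new_pricing_service(request: str) -> str:
    """Stand-in for one newly extracted microservice."""
    return f"pricing service handled {request}"


class StranglerFacade:
    """Routes each call to a migrated service if one exists, else to legacy."""

    def __init__(self, legacy: Handler) -> None:
        self._legacy = legacy
        self._migrated: Dict[str, Handler] = {}

    def migrate(self, route: str, handler: Handler) -> None:
        # Moving one route at a time is what keeps the change evolutionary.
        self._migrated[route] = handler

    def handle(self, route: str, request: str) -> str:
        handler = self._migrated.get(route, self._legacy)
        return handler(request)


facade = StranglerFacade(legacy_monolith)
before = facade.handle("pricing", "trade-1")      # still served by the monolith
facade.migrate("pricing", new_pricing_service)
after = facade.handle("pricing", "trade-1")       # now served by the new service
untouched = facade.handle("settlement", "trade-1")  # not migrated yet
print(before, after, untouched, sep="\n")
```

Callers only ever talk to the facade, so the monolith can shrink route by route without a big-bang rewrite.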
Quality: zero recent production incidents. Go back five, six years ago, a release was a disaster. There would be lots of incidents. Now, none. No incidents.
And in terms of business outcome, it was the first system to clear $1 trillion of notional in OTC clearing. In fact, Barclays were the first to clear $1 trillion of notional, even ahead of the OTC clearing houses themselves, for any of you that are familiar with this business.
So we've made a great start. You don't have to be a unicorn. You can be a horse. You can be an old horse.
Lasting culture change takes years. We know it takes years. We have lots to do. We know we have lots to do. We know this is the beginning of the journey.
And what are our lessons learned so far? So I'm going to share some of our lessons on this journey. And the journey never ends.
Culture is huge. Above everything, culture is huge.
And there's a great martial arts term which is very applicable, which is aiki. So aiki is to pull when you're pushed. It's to push when you're pulled. It is the spirit of slowness and speed, of harmonizing your movement with your opponent. And the kanji for aiki is joining spirit. So this is using the energy of the organization to move the organization to where it needs to be, not fighting against the organization.
There are three components to aiki: blending, not clashing; use of internal strength; and leading the assailant. So how are we using these three components to aiki?
First of all, blending, not clashing. So blending with the culture. One size does not fit all. Very important. For a large organization, there is so much diversity, one size does not fit all.
I would never apply SAFe, for example, across the whole organization, because personally, I find SAFe to be quite prescriptive and quite one size fits all.
In terms of scaling, first of all, don't. Don't scale. The very first thing to do is to descale. Descale the work.
For example, we had an area, we had a change project taking place in the bank. There were two attempts to do it in a waterfall manner: 100 people, three years, and it didn't succeed. Instead, that was replaced by 10 very good people, and within five months, they deployed it to production and they had customers using it.
So descale the work first, especially going from waterfall to Agile. Start with 10 people. No teams of 100 people.
Enterprise scaling is breadth, diversity, and complexity. It isn't more of the same. It isn't cookie-cutter. It's wrong to apply one pattern across a large organization. It's breadth, diversity, and complexity.
And you also need to allow for Shu Ha Ri. So there will be teams in any organization who are experts in Agile. I'll be very surprised if there aren't. There will be islands of Agile. There will be people who've never heard of it and will resist it.
So again, the approach to scaling needs to cater to: are you a beginner? Is the team a beginner? And therefore, they need prescriptive practices, need to be told what to do because it's very frustrating when a coach says, "It depends."
Or are they experts? It's like, "Well, we just do it this way. I don't know. It just seems to make sense, really."
And then there's the product team cardinality. Do you have one product with one team? Do you have one product with multiple teams? Kind of maybe a bit more like a LeSS-type pattern. Do you have multiple products with multiple teams? Do you have multiple products with one team? Hopefully not that last one. Again, it will alter your approach to the adoption of agility.
And then in terms of the size, is it tens of people, hundreds of people, thousands of people, or hundreds of thousands of people? The word enterprise and enterprise scaling means very different things to different people. Some people will use enterprise scaling to mean hundreds of people. We're using it at the 130,000-people size.
And then very importantly, practices equal principles plus context. So principles remain the same. The principles are principles. Context varies, and given the different context, the practices will differ. And this is how we're approaching it across the organization. We expect the practices to differ.
Three common scaling frameworks: Disciplined Agile, SAFe, and LeSS. Our overarching approach is Disciplined Agile. The reason for that is because it is goal-based. It provides enough of a framework, enough of an umbrella, and it's not one size fits all.
So in terms of us working in a very regulated industry, we have to have a lifecycle that the regulator has signed off. We need to have consistent role names. So we're using Disciplined Agile as our departure point, not the destination, but our departure point, and we're using it to guide how we do this in Barclays.
It's not mutually exclusive. So within that, teams are welcome to use LeSS, they're welcome to use SAFe, if it suits their particular context.
Now, I'm going to contradict myself. In some cases, one size does fit all. The reason why: why are we doing this? And that's unique to any company as to why you are working, why you're changing. We're changing because of the reasons I said earlier: significant disruptive innovation in financial services, and if we don't change, we won't survive.
Principles should be consistent. The change lifecycle should be consistent. Roles should be consistent. So we're using the term Agile Team Lead, not Scrum Master, because not everyone is following Scrum. We also use the architecture owner role from Disciplined Agile, which I think is a great addition, to have an architecture owner on every team.
And targets. Need to tread very carefully here. Here be dragons. However, at the same time, across an organization, there need to be clear targets. And in terms of our culture, and playing to our culture and blending, not clashing, targets work.
Use of internal strength. Top-down meets bottom-up. This would not be possible if we didn't have top-down meets bottom-up. So how this all started was we had a new Chief Operations and Technology Officer, Michael Harte. And Michael joined the firm, and the vision was Agile and Lean with the customer at the center. At that point, a bunch of passionate people put their hands up to say, "We'd like to help run this," me being one of them.
Without having that, and Michael reported to the group CEO, without having him at the top and having the groundswell at the bottom, it wouldn't be possible. And I speak to many people in other firms, in other industries, and in financial services, who are trying to move the needle on Agile and DevOps, and they're not succeeding because they don't have the buy-in from the top. It's absolutely critical.
The bottom-up bit: communities of practice. We have 35 communities of practice with 10,000 members of staff in those communities. They're voluntary, they're grade-agnostic. The law of two feet applies. We have 2,500 people in the Agile community of practice. And so we have that groundswell of passionate practitioners to help on this journey.
We have a champion network. So deliberately, in terms of how we're organized, we have champions, named champions for each business area, who then have their own champions for each business area. So we have almost like a cellular-type organizational construct without a solid reporting line.
So in the center, we have a very small team, and then the investment bank has its transformation team. Barclaycard has its transformation team. The retail bank has its transformation team. And they have autonomy, purpose, and mastery. They are empowered, within our guardrails and within the overall framework, to take whatever approach suits the culture of those parts of the business, because the culture is different. The culture in the investment bank is very different to Barclaycard, is very different to the retail bank.
Storytelling. "Communicate three more times than you think you need to and you're a third of the way done," was what a professor of marketing once told me at London Business School, and I completely agree with it. We can never communicate enough. We can never do enough storytelling. We're not doing enough storytelling.
Training and coaching. Like I said, we put a lot of effort into training last year. Training's carrying on. Survival anxiety needs to be up, but the learning anxiety needs to be lower. If the learning anxiety is higher than the survival anxiety, it's gridlock. No one's going to move. So there needs to be an element of survival anxiety, but lower the learning anxiety. So provide help. So you can go and get CBT, computer-based training. You can go on a classroom course. You can go to an Agile Lean Coffee. You can go to an Agile surgery. And coaches, we had about 140 coaches last year in Barclays. I think we sucked up the market for Agile coaches in London last year.
Leadership training. This is something we're not doing enough of. We need to do more of this. It's the same for any culture change. It's the pressurized middle. Senior leadership get it. The troops get it. But then the poor people in the middle who've got to deliver come hell or high water, no matter what. You've got to deliver, and by the way, everything you thought was true is no longer true. You've now got to work in a completely different way.
So we need to do more in terms of the middle management, the middle to senior management training. We need to do more there.
Leaders at all levels. Firmly believe in that. We want leadership at all levels. Turn the Ship Around!, David Marquet, intent-based leadership. Really like that theory, that way of leading. Push the authority down to those with the information. Don't push the information up to those with authority.
And there's a whole bunch of things that we need to focus on. We need to focus on technical excellence, and you've seen some of the benefits of that in the previous slides. We need to focus on Agile architecture, for example, having a microservices-based architecture which supports agility.
Agile HR. We are working with HR in terms of roles. There's more we need to do here in terms of how we appraise people, how we hire people.
Portfolio management. We're currently piloting Agile portfolio management. So we historically have spent about six months of the year, starting around about now, finishing around about March, planning next year in a very waterfall manner, historically. What we are now starting to do and experimenting with is quarterly rolling-wave planning and Agile business cases focused on business impact measures, not on are you on time, on scope, on schedule of a fictitious project plan, which was put together when you knew the least.
And working environment. We have also done a fair amount of work in terms of working environment as well.
DevOps at scale. We have a DevOps leadership forum. Again, there's autonomy, purpose, and mastery. It's not a solid reporting line. It's really a community. It's federated, not centralized. We have a DevOps champion network.
DevOps is not a role, in my opinion. We can have DevOps champions, but DevOps is not a role. And you can't buy DevOps. It's a practice. You can buy tooling, but you can't buy DevOps.
Leading the assailant. In large organizations, Dev and Ops is not enough. It has to be holistic agility. It has to be biz, finance, PMO, HR, legal, compliance, audit, Dev, Sec, Ops. Some people will say that that's what DevOps means. You ask 10 people what DevOps means, you'll get 11 answers.
Our definition of DevOps goes back to the original meaning of DevOps. And really what we're looking at is holistic agility, and I think this is a trend for the next 10 years, businesses actually working in an Agile way. There was a Harvard Business Review article very recently, a month ago, two months ago, around Agile for business.
And holistic feature teams.
Agile and control. So we've done a lot of work, but like I said, we're heavily regulated, mission-critical. We've done a lot of work around control. So we've created control tribes. So normally what would happen is you do your waterfall piece of work, you'd get towards the end of it, and then you'd have to engage multiple different control owners. So information risk, compliance, security, audit, so on and so on.
Now we have control tribes, and you engage them at the beginning, and you have a conversation with them. We've created a control tool, and in the control tool, you put your piece of work in at the beginning, your initiative, your epic, you're about to do your funded piece of work, and it will assign a control tribe to you. That control tribe know the domain you're working in, which could be equity trading, and you'll have a conversation at the beginning, not at the end.
The control tool has leaned the process. So we've gone from seven control points down to two. We have one place for everything, where previously we had multiple sites for control. It's 20 questions, not 700 questions as it used to be. And it's gone from 58 days elapsed to one day elapsed. So we've massively leaned the process around control.
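The assignment step described above can be pictured as a lookup from business domain to control tribe. This is a hypothetical sketch only: the talk does not describe how the control tool is implemented, and every name here (`Initiative`, `CONTROL_TRIBES`, `DEFAULT_TRIBE`, the tribe labels) is invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Initiative:
    """A funded piece of work entered into the tool at the start."""
    name: str
    domain: str


# Hypothetical mapping from business domain to the tribe that knows it.
CONTROL_TRIBES = {
    "equity trading": "Equity Trading Control Tribe",
    "retail payments": "Retail Payments Control Tribe",
}
DEFAULT_TRIBE = "General Control Tribe"  # hypothetical fallback


def assign_control_tribe(initiative: Initiative) -> str:
    """Pick the control tribe for an initiative based on its domain."""
    key = initiative.domain.strip().lower()
    return CONTROL_TRIBES.get(key, DEFAULT_TRIBE)


epic = Initiative(name="New order router", domain="Equity Trading")
tribe = assign_control_tribe(epic)
print(f"{epic.name}: talk to the {tribe} before work starts")
```

The design point is the timing rather than the lookup: the assignment happens when the work is registered, so the conversation with control owners is at the beginning, not the end.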
We've also defined agility levels, and this helps us to have a view in terms of how much of the organization is at an immature level and how much is at a mature level. And there's a J-curve, and we expect teams to get slower before they get quicker.
At the beginning, at level one and two, it's more practice-based. Level three and four, it's more principles, and we can't tell people what the practices are because they're unique.
It allows us to forecast and measure, and that enables us to plan. Planning is indispensable; plans are useless. It allows us to plan ahead in terms of what we need around test automation, around co-located teams. It allows us to think ahead.
So to summarize: one size does not fit all. You need champions, you need leadership bought in, and you need to communicate. Agility levels we find to be very helpful. They can be misapplied.
An important point here: the agility level is not the reason why. The reason why is separate from the agility level. They're a bit like a base camp. They're a point on the journey. And holistic agility.
So last two slides.
The forward path from business to technology and product, from CIO to CPO. Controversially, perhaps, the CIO role shouldn't exist. The CIO role should become the chief product owner, apart from maybe infrastructure. Nearly all change involves technology. So it's really about product teams coming together, blurring the boundary between business and technology.
Holistic business agility. It's everybody. It's HR, it's real estate, it's security, it's compliance, it's everybody. It's the whole firm.
Interconnected ecosystems, fintech startups. Like I said, we sponsor 180 fintech startups. We are standing on the shoulders of giants, where those giants are comprised of lots of little fintech companies.
Evolutionary revolution. It's not revolution. Architectural gardening. If you build something and leave it and come back to it later, the weeds have grown to the point where often you need to kill it and start again. So it's evolutionary revolution. Evolve your way to a revolutionary state. The example I gave earlier, going from a monolithic system to microservices, it wasn't a slash and burn. It was an evolution with a strangler pattern to microservices.
Experimental revolution. This is the wave two and the wave three. This is where we're doing some funky stuff around blockchain smart contracts, and we're looking at the wave two and wave three around innovation.
And continual and sustainable change as a competence. It bothers me when people say they've got change fatigue. They're doing it wrong. It's not about sprinting. It's about iterations continually.
And last slide. Here are some areas where we would welcome input.
In a highly regulated environment, information security in the public cloud. We'd welcome input on this. There's some stuff around GovCloud that I saw AWS published a couple of days ago, FedRAMP and GovCloud. So we know it can be done. We know it's being done with the U.S. government. Interested in any input on that.
Containers at scale. Interested in things like auditability, debuggability, traceability. As a highly regulated firm, we need to know who ran what when and what's out there.
And Sarbanes-Oxley and developer access to production. I think we all have CAB processes and segregation of duties, and we want to move to a more peer-review-type process and dev access to production and "you build it, you run it" within the regulation.
And we also want to hear any stories of holistic agility at scale, which is why we're here today. This is a totally awesome event. I can't wait to hear all the other stories. Thank you very much.