We Connect for Good
BT's agile journey.
Full transcript
James Moverley
Hi there. My name is James Moverley, and thank you for coming to watch my session, which is going to be about BT and our agile journey and the road we've taken to introduce continual delivery in the mobile part of the business.
Now, I want to talk a little bit about BT. So what is BT? BT's slogan is "We connect for good." They provide television services to over 2 million subscribers; 14 million homes are connected via broadband and fixed line; 27.5 million mobile phone users; 1.2 million enterprise customers; and over 5.8 million public Wi-Fi hotspots connecting people all over the UK. We have global enterprise customers in over 180 countries, and we provide some of the UK's critical national infrastructure, rolling out our next generation of emergency services network, which will support all the UK's blue-light services.
Prologue. I've decided to structure my presentation in the realms of a very familiar book that you should all know. See if you can guess what it is. Let's talk about the status quo. In nearly every company I've worked in to date, most of them, both operators and vendors alike, have always delivered in a proven model. That is waterfall. We work on projects. We use funding that funds the projects that deliver change to the network where identified necessary. There's a program board that identifies which projects take priority and where the capital should go. Usually, that leads to limited resources, a strain on people, and cognitive load between teams. We also see the long wait times as we hand over work between departments. There's a test department, there's a design department, there's an operations department. It's classic territories. Until recent years, I didn't realize that there was a problem. This was referred to as business as usual.
Chapter one. At this point in my career, I was an automation engineer. I'd taken on board operating Unix systems and virtual machine environments. I'd been looking at NFVI and the deployment of systems and vendor-driven solutions into our network. A lot of this stuff was kind of manually done. There was some automation here for standing up bits and pieces, and we were trying to push forward full automation, self-service portals, this kind of stuff. In doing so, I was looking across the business and all the different areas to see who else had been adopting similar tooling and mindset. It was quite interesting because a few names had popped up, and I could see that some departments may be two years ahead of us, and some even may be a year behind us. By reaching out and contacting these individuals, I was able to establish several working interest groups, which pulled in people from different parts of the organization. The biggest one that we established was an Ansible group. Ansible seemed to be quite well adopted, and also Python. A lot of people were bringing in Python to create their own tool sets. This small band decided to get together every Friday, and in those sessions we would hammer out some of the business's underlying issues and challenges. It was great just to meet really smart folk who were really keen on solving business problems with their code. Hearing other stories inspired me to have a look to see what could be done, what platforms could be provided, and it was good to see that there was a need for improvement across the board.
Monday the 30th of November was the time an email was sent out to my directorate. It was the announcement of a new position to be filled: Director of DevOps. Filling that position was an individual whose name had cropped up across the organization, and I was very keen to see him start. So keen, in fact, that I actually sent an email. I don't usually contact my senior management. They're busy doing senior management stuff, but in this instance I couldn't resist. I felt compelled. I had to announce the fact this is great. This was the thing I was looking for. We'd been trying to get various platforms and services introduced, but to no avail. We didn't have the traction that we needed, and all of a sudden, we were going to get that traction. This was our moment of change. Within the next two weeks, I'd spoken to the individual, and they'd asked me to join them on the journey. Little did I know the adventure that I was about to endure.
Chapter two. We formed a new cross-functional squad. I'd never heard of squads before. I was told to unlearn and get used to learning. The good news was that was already something I really enjoyed, so I looked forward to that. Most importantly, he introduced me to The Phoenix Project. See this book behind me? I always keep it to hand. It's good to review this book, and if you haven't read it, you really need to read it. We were eased into the agile and scrum ways of working. A scrum master was appointed, and agile work sessions were put together where we got to see how things can work and how we should be aiming to work. I felt, as a 47-year-old, very well indoctrinated and institutionalized in the way of waterfall projects. All of a sudden, things started to click into place. I'd actually found my Erik, and my Erik, every time he spoke, was making more sense than a lot of things had done for many years. So I was compelled to follow this journey.
Chapter three. Friday the 15th of January 2021: we'd just completed our first-ever sprint, two weeks. We had a massive sprint planning session that lasted nearly two days, and the squad were keen. We had all our new starters, a fantastic group of early adopters, people that really wanted to see the success and were eager for change. We established collaboration tooling on the various platforms: wikis, ticketing systems, messaging boards, where we would chat, where we'd do our whiteboarding. We managed to get exposure to other agile areas of the business, see how they operated, get an idea, be inspired. We were also given the luxury of our own equipment. We stood up our own sandpit environment, an area where we could safely experiment without causing too much grief to the rest of the business. There's a little ode to the cowbell there, because we always needed more cowbell. The journey had well and truly begun.
Chapter four. From that point onwards, February through to May, we spent the next couple of months inviting others to read with us. We had lots of internal sharing and discussions of the materials that were presented to us. We managed to generate a really good groundswell. There were various other areas and other squads being spun up at the same time. We were encouraged to be inspired. We were led to read lots of fantastic documentation and shown lots of interesting projects. Most importantly, we were given psychological safety. Within our own team, we made sure that everyone felt welcome, we made sure that everyone could speak their own mind, and we made sure it did not matter about failure or success or any of that stuff. We just wanted to learn. We wanted to celebrate what we were doing. Rapid learning can only be achieved by doing.
Influences and inspirations: at the very early stages, and thanks to our director who guided us on our path, The Phoenix Project was where we started. I consumed that book in two weeks. It's the most reading I've done end to end of a book in a long time. Then we moved on to The DevOps Handbook to understand the principles, the ways of flow, the ways of learning, the iterations. The Unicorn Project made a fantastic follow-up story in the world of software delivery. Sooner, Safer, Happier, again, teaching us about patterns and anti-patterns. Suddenly, things were becoming very crystal clear. We could see those things popping out at us. Continuous Delivery: I had to list this book. This book, as a technical architect and as a technical developer, really led me and shone the light on the actual mechanics behind the scenes of what we were trying to achieve. Nicole Forsgren's Accelerate refined that view and gave us the image of where we need to be. It really buffered up and helped crystallize The DevOps Handbook for us. Last but not least, I have listed here the novel version, or the graphic novel version, of The Goal. The Goal itself is a very extensive book, but the graphic novel was something I could breeze over in the weekend to get the gist of the story. The kids loved it as well. Recommend it.
Here are a few inspirations as well. I wanted to list these because I think they're very poignant. The Big K, Kedar, was really the seed of it all and has to be acknowledged. Here he's holding my friend Elmo. We used to use Elmo a hell of a lot: "Enough, let's move on." I'd like to acknowledge GitLab. They also introduced us to quite a few workshops. They helped get us along and introduced us to other developer DevOps conferences and facilities that were around in our region. UK National DevOps: fantastic conferences. The Better Value Sooner Safer Happier community: they've offered numerous great talks by various inspirational people, and they've also done quite a few good workshops. Let's not forget the DevOps Institute. They've recently been doing lots and lots and lots of skill-up days. There are skill-up days and skill-up hours, and people can dive in and see what's needed to do what you need to do. I have to call out Jon Smart; his delivery and absolutely hilarious video on Certified Really Agile Practitioner just had us in stitches, and sometimes, and for some, it was a bit too near the truth to bear. We originally proposed showing this video as part of one of our sprint stand-ups. It was decided against because we figured at that time it may have offended too many. I have to call out Bryan Finster. This guy is absolutely the real deal. You might have heard it before, but speaking with Bryan really inspired me, and that allowed me to move forward on a number of aspects of what I was delivering. This guy has really been there, and he's really done it. Check out his SADMF framework and check out minimumcd.org to find out more about that. I can also list Dave Farley. His work on Continuous Delivery, the book, and all his videos on YouTube have been an absolute inspiration. Last but not least, let's look at the DevOps Enterprise Summit itself. It was the actual turning point, I think, in terms of inspiration.
Via that summit, I got to meet a lot of fantastic people: publishers, vendors, users, people who just wanted to share, but moreover, absolutely great characters. If you were there last year, you'll remember us hanging out at the bar. We were chatting to everybody, and hopefully we'll be there this year too.
Chapter five: the vendor swap-out. BT had been challenged to remove one of the vendors from our network for various reasons, and it was an opportunity to move from bare-metal deployment to actual cloud-native. The area that was being replaced was our packet core. Also, as part of that mix, there were evolving new standards from the European Telecommunications Standards Institute, ETSI, on network functions virtualization and something called CSAR, which is the Cloud Service Archive. Those standards allow the delivery of software between the vendors and the operators that wish to deploy it. This is opening the door to a lot of multi-vendor continual delivery because, of course, each vendor will have its own CI/CD pipeline, and each vendor will be delivering software to us on a much higher cadence than we've ever seen before. Also, there was a new shift with this delivery to move from projects to products, and for the first time, specifically in my squad, we started to build out a product. I think it was a real eye-opener for the operations team to understand that the stuff we were building, we would also be supporting.
Let's wind back a bit. What is a packet core? When you use your mobile phone over the wireless network, or should I say the radio access network, all your data and all the control is done by what we call the core network. Within that core network, there's something called the packet core. That's basically set up to control access to the internet, how much data flows, and provide quality of service. The packet core basically is your pipe to the internet.
Moving back to the ETSI story, ETSI standards were pretty much ratified. It was actually identified very early on that CI/CD would exist, and you can see here by the diagram that's part of the ETSI standard that you will have multiple providers. Each provider will have their own development and test execution cycles. The idea here is that CI/CD and DevOps principles are built into the industry. Then there's the identification of a validation and operator stage. Actually, for our side and for the real-world experience, the validation and operator are kind of the same thing. We always test the software before it goes into production. The trick here, of course, is we need to make that fully automatic. The question is: how can we cope with multiple vendors coming through? How can we cope with multiple vendors delivering their software? We knew we had to build a high-level framework to help that delivery.
Together with the squad, and after all our understandings and inspirations and readings, we slowly started to piece together what our pipeline would look like. This would be our continual delivery pipeline. This would be the high-level north star to which we wanted to deploy. You can see here the various phases we wish to break down: ingestion, scanning, validation, an independent testing phase, followed by a promotion into production, then with the deployment and the monitoring. Classic phases of continual delivery. Subtly within the validation, we have our vendors supplying their own software pipeline tooling as well. Big vendors are coming along with their own CI/CD-type tool sets because they're expecting to deliver an off-the-shelf capability. Their capability allows automatic deployment of their software regardless of environment, so our level zero pipeline needed to interface with the vendor pipelines that they were delivering.
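As an illustration only, the phases named above could be sketched in Python as an ordered list of gates, where a package stops at the first phase that rejects it. The phase names come from the talk; `PipelineRun`, `run_pipeline`, the `checks` mapping, and the package filename are hypothetical names invented for this sketch and say nothing about how BT's actual pipeline is built.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

# Phase names follow the talk; the rest of this sketch is hypothetical.
PHASES = [
    "ingestion",
    "scanning",
    "validation",
    "independent_testing",
    "promotion",
    "deployment",
    "monitoring",
]


@dataclass
class PipelineRun:
    """Record of how far one software package got through the pipeline."""
    package: str
    completed: List[str] = field(default_factory=list)
    failed_at: Optional[str] = None


def run_pipeline(package: str, checks: Dict[str, Callable[[str], bool]]) -> PipelineRun:
    """Run each phase in order and stop at the first phase whose check fails.

    Phases without an entry in `checks` are treated as passing (an assumption
    made only to keep the sketch short).
    """
    run = PipelineRun(package)
    for phase in PHASES:
        check = checks.get(phase, lambda _pkg: True)
        if not check(package):
            run.failed_at = phase
            break
        run.completed.append(phase)
    return run


if __name__ == "__main__":
    # A scan that rejects the package stops it before validation starts.
    result = run_pipeline("vendor-a-1.0.csar", {"scanning": lambda pkg: False})
    print(result.failed_at, result.completed)
```

Modelling the phases as data rather than hard-coded calls is one way to let a vendor-supplied step be slotted into the validation phase without changing the surrounding flow.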
Once the vendor had delivered their software, we then needed to do our own testing and validation. Classic phases such as security, regression, integration, load testing, and end-to-end testing can all be run in parallel. A lot of the tooling now in our test beds and labs is fully automated, and we're expediting the creation of new tools to help support this activity. One of the biggest things about the pipeline, as anyone that's worked in continual delivery knows, is the shifting left of security and controls. Under normal delivery circumstances and the old way of working, security was always left to the last. Once the system was stood up and built and ratified as working, security were then rolled out to check that the system was actually secure. This means that a number of things could have gone wrong. The project could be late, and security issues or defects that were found might have to be waived to keep the project on schedule. By shifting security left, we've been able to stop bad software or defects from even entering our network at all.
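A minimal sketch of the two ideas in this passage, under stated assumptions: the security check runs first and blocks everything else if it fails (one reading of "shifting left"), and the remaining suites run concurrently. The function names and the shape of the results are hypothetical; the talk does not describe the real implementation.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict

# A check is any zero-argument callable returning True on pass (assumption).
Check = Callable[[], bool]


def run_suites_in_parallel(suites: Dict[str, Check]) -> Dict[str, bool]:
    """Run independent test suites concurrently and collect pass/fail results."""
    with ThreadPoolExecutor(max_workers=max(1, len(suites))) as pool:
        futures = {name: pool.submit(check) for name, check in suites.items()}
        return {name: future.result() for name, future in futures.items()}


def validate_release(security_scan: Check, suites: Dict[str, Check]) -> Dict[str, bool]:
    """Security goes first: if the scan fails, no other suite is started."""
    if not security_scan():
        return {"security": False}
    results = {"security": True}
    results.update(run_suites_in_parallel(suites))
    return results


if __name__ == "__main__":
    outcome = validate_release(
        security_scan=lambda: True,
        suites={"regression": lambda: True, "load": lambda: False},
    )
    print(outcome)
```

Threads are used here only because test suites in a lab are typically I/O-bound calls to external tooling; that is an assumption of the sketch, not a claim about BT's test beds.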
Moreover, what we developed here was a single framework allowing for delivery management and the visibility of all the software coming into our network. Most critically, and a lot of people miss this, especially the vendors, was actually generating a rapid feedback mechanism. We need to get back to our suppliers the fact that there's a problem so they can react. They can turn around the software fixes as fast as possible. They can turn around the security fixes quicker.
Chapter six: lean. For us, this was an aha moment. We figured out via various proof-of-value experiments within our own sandpit that there are certain mechanisms that would help us, specifically the scanning and the security side of things. We would always brainstorm every week about the different possibilities and how we could set things up. But the penny finally dropped for us when we realized there needs to be an MVP. What can we deliver to the business that will add value right now? We decided to make that a very simple scanning and delivery mechanism. Software has to come into the company. Everything was set up to do the scanning and to provide the support and get the required sign-off to allow that ingested software to be used internally. That's what we made our MVP.
We invited our vendor along to help build us the capability. They took part in our squad as a squad member, and that was invaluable because they saw our troubles day to day. They were able to guide us and steer us and also input to the development part as well. We used the show-and-tell sessions of the Scrum ceremonies to show our stakeholders where we were going. To start with, it became very obvious that lots of people were very interested. But as the sprints moved on, the audience started to shrink. The show-and-tells became more of a "look at what we've done" to ourselves. So it took a while to get the stakeholders really to engage and to remind them that they have to be there to see what we're doing. Our first MVP demo was absolutely awesome. We were able to invite a member of the operational directorship or leadership team to come and operate a software delivery pipeline themselves. With minimal information, they were able to kick off the pipeline, ingest the software, produce the scans, produce the reporting, and allow that software to be promoted into a lab area ready for testing.
Another aha moment for us was to put our end pipeline into continuous execution. Not quite continual delivery. Continual delivery would mean we were running it quite frequently. But continuous execution, and there's an asterisk there: we just started off lightly. We'd started to run our pipeline once a day, overnight. We found out that continuous execution gave us visibility the next time the pipeline broke, not the next time we needed to use it. This is a key differentiation.
This is where GitLab comes in again, and as they've been kind enough to invite me to speak on their behalf, here is a little section about how they enabled my team to develop. Via the GitLab environment, which we were able to stand up quite quickly using a Docker container, we were able to set up a rapid development environment. We were able to lint, make sure our code was secure, do unit testing, and build and push our actual deployment stuff into our artifact repositories. The GitLab DevOps platform really sped us up. The ease of use of the GUI and the rest of it allowed us to coordinate the team and set people up in our sandpit environment very rapidly. Using the rapid CI development cycle, we could quickly get feedback to the developer if there were any problems, if the code wasn't satisfactory, if it didn't pass the test, if they'd left a password in the file somewhere. This allowed for quick collaboration and immediate deployment, as the successful build artifacts were instantly available to our own delivery pipeline. We also used the output of the builds for code health status. We knew if the code was clean, if the code conformed, and if the code was secure. Contribution statistics from the GitLab engine allowed us to analyze squad health as well. We don't believe in marking squad velocity via lines of code. No. This was more just to analyze how the squad was collaborating. Are certain contributors doing more than others? Does someone need some help? Have there been specific sprints or parts of the problems we've been solving that generate lots more code? It was a fascinating insight.
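The "left a password in the file" feedback can be pictured with a deliberately naive Python sketch. This is not how GitLab's own scanning works; the regular expression, keyword list, and function name below are assumptions made up for illustration, and a real scanner would be far more thorough.

```python
import re
from typing import List, Tuple

# Hypothetical, deliberately naive pattern: a sensitive-looking key name
# followed by ':' or '=' and some non-empty value.
SECRET_PATTERN = re.compile(
    r"""(password|passwd|secret|api_key|token)\s*[:=]\s*["']?([^\s"']+)""",
    re.IGNORECASE,
)


def find_hardcoded_secrets(text: str) -> List[Tuple[int, str]]:
    """Return (line number, matched key name) for each suspicious line."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        match = SECRET_PATTERN.search(line)
        if match:
            findings.append((lineno, match.group(1).lower()))
    return findings


if __name__ == "__main__":
    sample = "user: admin\npassword: hunter2\nport: 8080\n"
    for lineno, key in find_hardcoded_secrets(sample):
        # In a CI job, a non-empty result would fail the build and
        # give the developer feedback within minutes of the commit.
        print(f"line {lineno}: possible hard-coded {key}")
```

A check like this will produce false positives (for example, a value read from an environment variable), which is one reason to treat it as a sketch of the feedback loop rather than a usable scanner.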
Chapter seven: outcomes. All this work and all this effort a year later. When we started up our squad, we were told that it could take around two years for transformation. It could take two years for our way of working to really bed in, and for the management to understand and see what we were capable of. On the traditional waterfall path, the target's already been set, whereas with agile, we were finding our way. That was very difficult for management to stomach. There was quite a bit of resistance or misunderstanding of what it is we were doing; no idea what it is we were achieving. We knew what our goals were. We knew we had to build our framework, and we knew roughly how we wanted to do it. So we were drawing the rough outline of what it was we were going to deliver. Without those concrete deadlines and maybe plans, the management were a bit skeptical about what it was we were going to achieve. In one comment, we were very showy and tell-y. How different that is a year later.
Now we've worked as a team. We've managed to deliver a new security framework. The business is really starting to open up to it. Not only are we passing our container solutions via it, but we're also starting to pass quite a lot of our virtualization VNF solutions via it. We're able to deliver SBOM-level visibility that enables rapid tracking of problems, faults, and security holes. We've been able to build out, and we are building out, rapid dashboards. This information is readily available as the software passes through our pipeline. We're able to ingest it, pull out metadata, pull out the release notes, pull out the testing information. We're able to scan for changes. We can throw up anything that's not looking right and immediately flag it. Another outcome, which has been inherent to the adoption of DevOps practices via the use of infrastructure as code and continuous environment health checks, is that we've been able to improve the stability of environments. No longer were people just going in and changing anything willy-nilly. Even though the control mechanisms were in place in our test environments, we still had vendor engineers coming in, making subtle changes. We still had JFDI situations happening. They're all hallmark situations of an old way of thinking. Introducing infrastructure as code and version-controlled configuration meant that we knew exactly what should be deployed, and if anyone had tinkered with that, it was clear what had changed. Deterministic environments are key for stability and automation. The continuous health checks gave us the confidence that the environment was safe to move forward with critical testing.
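The point about version-controlled configuration making tinkering visible can be sketched as a simple drift check: compare what the repository declares with what the environment reports. The function name, the flat key/value shape of the configuration, and the example settings are all assumptions for illustration, not details from the talk.

```python
from typing import Any, Dict, Tuple

MISSING = "<missing>"  # marker for a key present on only one side


def detect_drift(declared: Dict[str, Any], observed: Dict[str, Any]) -> Dict[str, Tuple[Any, Any]]:
    """Return {key: (declared value, observed value)} for every mismatch.

    `declared` stands for the version-controlled configuration and
    `observed` for what a health check reads back from the environment.
    """
    drift = {}
    for key in sorted(set(declared) | set(observed)):
        want = declared.get(key, MISSING)
        have = observed.get(key, MISSING)
        if want != have:
            drift[key] = (want, have)
    return drift


if __name__ == "__main__":
    declared = {"mtu": 9000, "ntp_server": "10.0.0.1"}
    observed = {"mtu": 1500, "ntp_server": "10.0.0.1", "debug": True}
    # An empty result would mean the environment matches what was declared.
    print(detect_drift(declared, observed))
```

Run on a schedule, a check of this kind is one plausible form a "continuous health check" could take: an empty result means the environment is in its declared state and safe to test against.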
Just to summarize some of the aha moments the best I can. As you can imagine, it's been an interesting year and a half of experience. I can tell you now, one thing that opened my eyes was the DORA metrics. Bryan Finster has pointed out quite validly, it's not always about the DORA metrics. There's tons more metrics that support the DORA metrics. But those DORA metrics are actually a north star. If you can look towards delivering them, or at least make a start, you're heading in the right direction. Quite simply put, the facts don't lie. Every effort's been made by DORA to analyze the state of DevOps and how it actually affects companies and how it does impact high-performance companies.
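For readers who have not met them, the four DORA metrics are deployment frequency, lead time for changes, change failure rate, and time to restore service. The Python sketch below shows one way they might be computed from deployment records; the `Deployment` record, its field names, and the choice of medians are assumptions for illustration and not DORA's official definitions or tooling.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median
from typing import List, Optional


@dataclass
class Deployment:
    """Hypothetical record of one production deployment."""
    committed_at: datetime
    deployed_at: datetime
    failed: bool = False
    restored_at: Optional[datetime] = None


def dora_summary(deployments: List[Deployment], period_days: int) -> dict:
    """Summarize the four metrics over a reporting period of `period_days`."""
    if not deployments or period_days <= 0:
        raise ValueError("need at least one deployment and a positive period")
    failures = [d for d in deployments if d.failed]
    restore_times = [
        d.restored_at - d.deployed_at for d in failures if d.restored_at is not None
    ]
    return {
        "deployments_per_day": len(deployments) / period_days,
        "median_lead_time": median(d.deployed_at - d.committed_at for d in deployments),
        "change_failure_rate": len(failures) / len(deployments),
        "median_time_to_restore": median(restore_times) if restore_times else None,
    }


if __name__ == "__main__":
    start = datetime(2022, 1, 3, 9, 0)
    records = [
        Deployment(start, start + timedelta(hours=1)),
        Deployment(start, start + timedelta(hours=2)),
        Deployment(start, start + timedelta(hours=3), failed=True,
                   restored_at=start + timedelta(hours=3, minutes=30)),
        Deployment(start, start + timedelta(hours=4)),
    ]
    print(dora_summary(records, period_days=2))
```

As the talk notes, these four numbers are a direction of travel rather than the whole picture; other supporting metrics sit underneath them.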
Also, I'd like to call out this summit. The DevOps Enterprise Summit is one of the best summits to be inspired by: lots of fantastic speakers, lots of fantastic people, and also authors and material coming out. When we learned about agile and lean, we talked about small batches, cross-functional teams, Conway's Law. It's all true. You might have heard of these things. You might have discussed them. But I can tell you from experience now, looking at these ways of implementing and looking at these ways of functioning, it really expedites the outcome. I could talk another hour about Conway's Law, but for sure, the way we structured our squads and the way we structured our architecture had a massive impact on how we delivered. Another simple one here, and a little acknowledgement for work in progress: less is more. Beware of the cognitive load on the teams. You've got a squad full of smart people; they think they can boil the ocean. The moment we started restricting our sprints to one or two items, the more we got done, and the more broadly we were able to look at the solution. Lastly, I think it's true, exactly as The Phoenix Project and The Unicorn Project state: joy can be brought back into software delivery and development.
Thank you for listening to me. I hope you've enjoyed the talk. Feel free to drop by the bar. We should be there. If not, maybe I'll catch up with you at a community event soon. Thank you very much for listening.