Las Vegas 2022

DevSecOps Journey @ SWIFT

Our DevOps CoE is a transversal organization that supports the DevOps journey of our squads. In this presentation, we will look back at the approaches and techniques used over the last 3 years to develop an 'automate everything' mindset, as well as to equip and empower them in owning their DevOps transformation journey. Building on those foundations, we will present how we are now shifting gear to increase agility thanks to a consistent approach to DevOps.

Full transcript

The complete talk, organized by section.

Christophe Bolle

[00:00:17.500] So I'm Christophe Bolle, leading the DevOps Center of Expertise. The goal of our team is, as I was saying, to help develop practices throughout the company. Before jumping into the topic for today, a few words on SWIFT itself.

[00:00:33.600] So we are a member-owned cooperative. We run a financial messaging platform that interconnects more than 11,000 banks and corporates around the world, across more than 200 countries.

[00:00:48.100] The volume of transactions that is exchanged on a daily basis on our systems is close to 40 million, which makes us a systemic part of the financial ecosystem.

[00:01:03.900] Where does my story fit into transformation? Well, basically, we decided to move to an agile model in 2018. Typically, it was the Spotify model. And after the first tribes got created, it became rapidly obvious that next to the agile transformation, there was a need for more technical support to the teams and more technical improvements on the way we are flowing our features and our products to production. And that's how I created the DevOps Center of Expertise team, and I will detail how we are working today.

[00:01:38.700] So you will see the presentation goes into very practical things that we have done, no big presentation figures between facts.

[00:01:47.900] First thing when you create such a team is to align on what are the expectations for the team. And when I went to my peers to know, "What do you expect from the new DevOps Center of Expertise that we are creating?" I was receiving nearly as many answers as I was having peers: "I want you to implement new gates. I want you to implement that tool." Ultimately, I think that DevOps is around increasing business-value delivery. And that's what we took as a reference for the motto of the team: inspire and enable squads to increase value delivery through DevOps practices. And all the words in that sentence are important.

[00:02:24.400] We are a small team. We don't do DevOps on behalf of the teams. We are really there to inspire and make DevOps organically grown in the teams.

[00:02:36.100] More precisely, where are we focusing? If you look at DevOps, we can split it into four different quadrants. On one side, you have the culture, all the aspects of collaboration, breaking silos, being transparent.

[00:02:50.200] If you think a bit further, as we were already engaged into an agile transformation, there is a lot of overlap between those values that are supported by DevOps and the agile values. So we already had an army of coaches working on the mindset aspects, and so we decided not to focus on that aspect but to collaborate with the coaches community.

[00:03:11.300] Another important aspect to DevOps is the architecture, the tools, infrastructure that you need to support your acceleration. There also, we were quite well equipped with dedicated platform tooling, global security, architecture organization. So it's also a place where we decided not to focus, but to collaborate with those existing teams.

[00:03:31.500] The two dimensions that were really our points of attention were the practices and the measurements: how do you accelerate and streamline the flow of delivery of your applications? And how do you put in place the measurements and the metrics that make sure that you stay on track, and don't take risks when accelerating? And it's really on those two dimensions that we have been focusing most of our efforts so far.

[00:03:59.300] How did we create the team? And apologies, it's not the very last version of the slides. Typically, the team is created with two different types of profiles. On one side, we have DevOps coaches that are senior profiles that have been traveling a bit on different positions in the IT engineering practices. So people that have been in development, that did testing, that did security activities, that did operations activities, and that are still heavily motivated by doing hands-on. Because I'm strongly convinced that if you want to preach something, you need to go demonstrate it.

[00:04:32.300] So the main part of the team are those DevOps coaches that have an objective to do 50% of their work hands-on, working with the tribes. The other part of the team is composed of automation specialists. So technical specialists that can automate, and we have them focused on test automation, deployment automation, pipeline building, and dashboards for reporting.

[00:05:02.700] And those work in a consultancy model. The size of the team is mapped on the size of the agile organization, which means that we have, or at least that's the goal, one DevOps coach per tribe, who works together with the leadership of that tribe to develop and follow up the whole DevOps roadmap, and acts as a single point of contact for all the squads in the tribe.

[00:05:27.800] The model that we have been exploring with success is the model of megaphone DevOps, or DevOps Champion, where we elect in each of the squads members that share that passion to move forward, and we act with that subset of the team as an acceleration group to support evolution.

[00:05:51.900] And as I was saying, we have those consultants a bit on the side, the red dots that we can inject into any squad at any moment to support a particular initiative, to basically break the loop in which the teams are so busy delivering features that they don't have the time to step back and implement the change that will make them more efficient in the future.

[00:06:14.400] What are the different tools that we are using to work with the squads? The first thing is that if you want to develop a structure, you need to know where you are going, you need to know where you are, and you need to have a plan. And for that, we are developing roadmaps. We have been developing a capability assessment, and we are using some metrics that I will detail in the rest of the presentation.

[00:06:35.100] Another important aspect is those very concrete technical supports that we provide to the squads. And last but not least, all the more human aspects, all the people, deserve specific attention, whether it is in terms of skill, whether it is in terms of awareness. You can't be successful without being clear on what are the ambitions, what are the values that you're serving, and what is DevOps, basically.

[00:07:04.600] Focusing on the first element, which is the capability model and the way we structure the roadmap. We defined a model that matches our need in the company. So with experimenting with a few models that were available on the market, we didn't find one that was fitting our needs, so we developed our own that is using basically two dimensions.

[00:07:23.600] On the left side of the pyramid, you have all the practices that support the delivery of features. So all the development path from feature to production. On the right path, you have all your operational aspects. And typically, depending on where you are in the level of governance that you have, in the level of segregation you may have between your development and operations, some of those steps may or may not apply to your own journey.

[00:07:56.400] In SWIFT, we have been focusing on the areas where we have those little stars, which basically means that we are working to optimize the flow from the moment the feature is ready for development until it is in production. On the visibility, which is all the observability, monitoring, logging, understanding how your customers are working, and access to a bit of infrastructure, which is between pale yellow, as we are developing that as we are speaking.

[00:08:22.300] The existence of an air gap or strictly controlled procedures to run or to deploy in production prevented us from working directly on the other elements, the other steps of the pyramid. So the capability model that we developed is our own, and behind each of those steps we have a set of questions, very few, three or four questions that are quite holistic, because we want this model to apply to all our squads, whether they work on full-stack applications, on our enterprise services that depend very often on software-as-a-service type of solutions, or on platforms and delivering infrastructure.

[00:09:08.700] How do we engage with teams and with squads? We've been experimenting a lot on the best way to interact and to help the team figure out what is the best roadmap in the end. That's what came out of those experimentations.

[00:09:22.400] When we engage with a team, the first thing we want to do is understand how the team is working, what is really their business. And so we sit with the squad for a session of two to three hours to build together their value stream: how do they move from a story that is ready to be developed to having it running in production? Focusing on all the elements where the team is happy and things go well, all the elements where there are dependencies or waiting time, and also all the elements that are still manual and slowing down the process.
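The value stream session described above can be captured in a very small data structure. The following is a minimal sketch, not the tooling used at SWIFT: the `Step` fields, the step names, and all the hour figures are hypothetical, chosen only to show how waiting time and manual steps stand out once the stream is written down.

```python
from dataclasses import dataclass


@dataclass
class Step:
    """One step of a squad's value stream (all fields are illustrative)."""
    name: str
    work_hours: float   # time actively spent on the step
    wait_hours: float   # time spent waiting on a dependency or hand-over
    manual: bool        # True if the step is still done by hand


def summarize_value_stream(steps):
    """Return lead time, share of waiting, manual steps and the longest wait."""
    if not steps:
        raise ValueError("at least one step is required")
    work = sum(s.work_hours for s in steps)
    wait = sum(s.wait_hours for s in steps)
    lead = work + wait
    return {
        "lead_time_hours": lead,
        "waiting_share": wait / lead if lead else 0.0,
        "manual_steps": [s.name for s in steps if s.manual],
        "longest_wait": max(steps, key=lambda s: s.wait_hours).name,
    }


# Hypothetical stream, from "story ready for development" to production.
stream = [
    Step("develop", 16, 2, False),
    Step("code review", 2, 6, True),
    Step("test", 8, 24, True),
    Step("deploy", 2, 20, True),
]
summary = summarize_value_stream(stream)
print(summary)
```

In this made-up example most of the lead time is waiting, and the manual steps are the candidates a squad would discuss first.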

[00:09:59.700] Once we have that state and we agree, and the team understands how the team is working, the second step is to go through that capability assessment, which we run in a facilitated manner. That is also something we experimented with, including sending it as a survey: we found in the end that the best and most efficient way was to work together with the team. We ask the squad to come in a room, maybe not always a full squad, but at least representatives from each of the engineering families that we have in a squad, to go through the different questions and have them assess together.

[00:10:37.600] So typically what we do is ask them to vote on a Menti or that type of supporting platform, so that there is some isolation on the votes. And if we see an agreement around the level of maturity that the team thinks they are on a particular question, good, we know where the team is. If we see that there is a lot of variation in the answers, the fact that we are in the room with them opens the door to a lot of collaboration, engagement, and we learn a lot on how the team is working by working that way.
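The voting step above boils down to checking whether the squad's answers to a question agree or diverge. The following is a minimal sketch under stated assumptions: the 1 to 5 maturity scale, the use of the population standard deviation as the measure of variation, and the `DISCUSSION_THRESHOLD` value are all hypothetical and not taken from the talk.

```python
from statistics import median, pstdev

# Hypothetical threshold: a spread above this value triggers a discussion.
DISCUSSION_THRESHOLD = 1.0


def summarize_votes(votes, threshold=DISCUSSION_THRESHOLD):
    """Summarize anonymous maturity votes (assumed scale 1-5) for one question."""
    if not votes:
        raise ValueError("at least one vote is required")
    spread = pstdev(votes)  # population standard deviation of the votes
    return {
        "median": median(votes),
        "spread": round(spread, 2),
        "needs_discussion": spread > threshold,
    }


aligned = summarize_votes([3, 3, 4, 3])   # squad broadly agrees
divided = summarize_votes([1, 5, 2, 4])   # squad sees things very differently
print(aligned)
print(divided)
```

When the votes are aligned, the median can be recorded as the squad's level; when they are divided, the number matters less than the conversation it opens in the room.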

[00:11:07.700] So at the end of this exercise, we know how the team is working. We know how they estimate themselves, with an external view on that in our capability model. The next item that we add to the equation is some measurements or metrics.

[00:11:25.400] There also, with a lot of discussions on what are the metrics that matter, we came back to very simple ones. For all the stages that the teams are going through, we ask them, or we measure when this is available, at which cadence they go through that phase, at which cadence do they integrate, do they go through continuous testing, do they go through continuous delivery? We ask also what is the time to get feedback from that phase, and how often do they succeed, or how often do they need to redo?
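The three questions asked per stage (cadence, time to feedback, success rate) can be computed from a simple log of stage runs. The following is a minimal sketch assuming such a log exists; the `StageRun` record, the choice of the median for feedback time, and the sample data are hypothetical and do not describe the dashboards shown in the talk.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median


@dataclass
class StageRun:
    """One execution of a stage, e.g. a CI build or a test run (illustrative)."""
    started: datetime
    finished: datetime
    succeeded: bool


def stage_metrics(runs, period_days):
    """Compute cadence, time to feedback and success rate for one stage."""
    if not runs or period_days <= 0:
        raise ValueError("need at least one run and a positive period")
    feedback_minutes = [
        (r.finished - r.started).total_seconds() / 60 for r in runs
    ]
    return {
        "runs_per_week": len(runs) * 7 / period_days,
        "median_feedback_minutes": median(feedback_minutes),
        "success_rate": sum(r.succeeded for r in runs) / len(runs),
    }


# Hypothetical week of runs for one stage: one run per day, one failure.
start = datetime(2022, 10, 3, 9, 0)
runs = [
    StageRun(start, start + timedelta(minutes=10), True),
    StageRun(start + timedelta(days=1), start + timedelta(days=1, minutes=20), False),
    StageRun(start + timedelta(days=2), start + timedelta(days=2, minutes=30), True),
    StageRun(start + timedelta(days=3), start + timedelta(days=3, minutes=40), True),
]
metrics = stage_metrics(runs, period_days=7)
print(metrics)
```

Running the same function for integration, continuous testing, and delivery gives one row per stage, which is roughly the kind of per-squad view that can then be compared across squads.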

[00:11:53.600] That gives the fantastic color that you have on the side, which allows us to see a bit where the strengths and the weaknesses of each of the squads are. When we started, we wanted to keep that secret and a bit confidential between us and the squads, because we did not want those materials to be used as a way to differentiate and judge the performance of the squads. But very quickly, the squads themselves asked to open the data, because they were interested to know where their counterparts were, and to benefit from people that were more advanced in some areas. And so today the model is completely open, which I'm very proud of.

[00:12:34.600] So once we have all those elements, as a DevOps coach organization, we analyze and we review a bit what are the best next moves that we would recommend the team to work on, and we go back to the team with that opinionated view that we come with. And we ask them for feedback on that prioritization that we see.

[00:12:55.600] And from the discussion, again, we get a lot of benefits because we have, on one side, the opinionated view from the DevOps theory that is looking at optimizing more at enterprise level and standardizing the evolution of all the teams, and on the other side, the local priorities of each of the squads that we also need to understand and take into consideration.

[00:13:17.800] So the outcome of that second meeting is an agreed list of priorities for the squad that spans, depending on the team that you're working with, from one quarter to up to one year. And those elements will be used to fuel the backlog of the squad for the next quarter in terms of enhancements.

[00:13:39.200] Is that working? Well, yes. What we have seen is that over a period of 18 to 24 months, all the squads that we have been working with have been evolving from a situation where they were strong on the CI, on the foundations. CI was good most of the time, not always, with difficulties at the level of continuous testing because we are coming from a waterfall model with a separate testing organization that was testing at the end and still with many manual tests, and deployment was still highly manual.

[00:14:16.400] So we have seen that over 18 to 24 months the teams were really growing in that sequence from left to right, adopting continuous testing, adopting automated deployments. The only element that stays a bit behind, and that's on the blue curve on the right side of the screen, is the visibility that the teams have on their monitoring aspect.

[00:14:41.500] And that is driven down by the fact that the teams are improving the way they monitor the application and the way they get visibility on how the end user is using their application, and adopting common logging formats, et cetera, how those logs correlate across the applications. But in that dimension, we also put an element that the team should pay attention to their own performance, so to their own efficiency, their value stream. And we have seen that very few teams are interested in that. Teams are very interested in improving and experimenting, but not really in measuring the actual results of their experimentation. That's an area where, if you have any experience or hints for me, I would be fantastically interested to hear about it.

[00:15:27.100] So conclusion on that capability model: I think it works very well to initiate a transformation because it allows us to identify the gaps and to bring all the teams from where they are when they enter the transformation to a common stage. Once they become self-sustainable, I think the model needs to evolve and needs to go away, and that's the position where we are at the moment. So we want to evolve from this capability model to just measuring metrics and measuring the acceleration.

[00:16:00.900] Something that was also related to the four dimensions that I presented at the beginning, and that I underestimated and that we are also trying to address at the moment, is the collaboration between the agile coaches and the DevOps coaches. We noticed that while we have the same values, the level of focus on local optimization versus global optimization was very often not aligned. We saw that there was a lot more push from the agile community to push for local autonomy at the expense of more transverse opportunities. And that's something we are also trying to realign to make sure that as a company we are efficient, not just as isolated entities.

[00:16:45.600] Next to this capability assessment and model that we use to model the roadmap of the teams, we have been also working on upskilling programs, so very linked to the presentation that was just done before. We have been looking for DevOps skills, and it was very hard to find them on the market while we had a huge quantity of skilled engineers that know the way we are working, that know our products by heart. And so we created an academy, which is typically a period of nine months, 30 people that we onboarded.

[00:17:22.400] And the goal was really to train them on all the technical skills that are needed all along the infinite cycle of DevOps, so that we really make full-stack engineers in the scope of our company. Together with that, it came with a number of certifications that we had those people pass. It came with projects at the end because we wanted to really get the benefits of that upskilling. You need to practice to know how you do.

[00:17:52.200] And that was also a fantastic opportunity for us to advertise and raise the awareness around what we are trying to do and what we are trying to achieve with DevOps through a number of conferences and networking events that we organized for the academy, but that were open to all. So on a monthly basis, we have been inviting external speakers. For the kick-off and the closing, Gene himself was there to highlight the importance of DevOps. He was there also to facilitate a number of sessions with an extended leadership audience.

[00:18:25.700] We had specialists of different patterns and different areas that came to explain how DevOps was applied outside of our company, or sometimes to challenge established practices. And all that resulted in a quite positive set of added value. First of all, the mindset and the awareness in the full company went up, not only the people that were technical and that we were coaching and teaching knew about DevOps, but all the POs, all the people from marketing started to understand what we are trying to achieve.

[00:19:02.100] We also increased the internal mobility for all the people that were part of the DevSecOps Academy. And very interesting is that those people have been moving largely towards the projects that were critical for us at that time. So it was really seen as a positive incentive to promote the right skills where you need them.

[00:19:25.200] As I was saying, we also had those sessions with the extended leadership and the academy members around: what do we need to do to increase our predictability and business value on one side, and how do we get a real benefit of all the tools and all the standardization that we try to have in the company? That came out with four big recommendations that we are calling out. So it took time to get them rolled out.

[00:19:53.200] But the first recommendation was to revise our OKRs a bit: to introduce shared OKRs as a way to kill the dependencies between teams when there was a common goal. Without those shared OKRs, what we noticed is that very often we have teams depending on each other, which was making a loop of dependencies and slowing down the progress. By putting shared OKRs, we saw that it was helping progress toward a common goal.

[00:20:21.700] We had also an alignment on the need to have a paved road to production, so putting the link between the strong CI that we had historically with all the automatic deployments that we've been putting in place, all the automatic testing that we have been developing. So gluing the CI and the CD together in an orchestrated and controlled pipeline that will deploy it to production. That's something that we are busy with at the moment that should deploy by the end of this quarter, and that will be a real game changer in my opinion.

[00:20:53.500] And the last one is around standardization, positive standardization in a sense that we are not looking at standardization for the sake of standardization, but for the benefits that teams can have. There also, we have a number of standard solutions that we have been promoting positively, and that's something that's running quite well now.

[00:21:13.200] The one that we could not yet start, and that was also put quite high on the list, is to create a process to remove or delete processes. So as you are progressing in a company, sometimes you have governance rules or processes that have been there for a number of years, and you don't know exactly why they are still there or whether they are still relevant. To propose a structure for teams and engineers to challenge them more easily, and to really understand if they are still needed in our context, that's something that we still need to develop.

[00:21:44.900] Not everything was green or whatever. And as we are launching now a second academy, there are a number of points that we are taking into account and that we are trying to address. First of all, we saw that when you equip your engineers with skills that are highly demanded on the market, there is attrition that you will see. People will leave you to go somewhere else. The number that we got was not that high, it was around 15%, but that's still more than we would have liked to see.

[00:22:18.600] So for this second academy that we are starting now, we are shifting from pure skills and certifications to skills applied in industry context. And while we had some doubt at the beginning on that, it was very well received by the academy members for the second academy, as the certifications were also felt as putting a lot of pressure on them. The pressure of failing certification was putting a lot of stress. So people are very happy that they can focus on applying the skills, which is fantastic. We still offer the certification as optional for the ones that are really motivated and want to go for it.

[00:22:54.200] And the second big element that we also want to do better with this second academy is: how do we keep a community of practice that goes far beyond the academy? What we have seen from the first academy is that at the end of those nine months people went back to their team or moved to a new team. I would say that 30% of the teams are still collaborating and actively promoting DevOps transversally in the company, but for the others, we also have 30% that went back into their BAU, and so they don't really make good use of the skills that they were taught.

[00:23:30.600] So that's something that we really want. We want to work better with the chapter leads and people managers to build longer-term development plans and opportunities for the people that we are upskilling at the moment.