Dutch Railways Scaled DevOps Journey
The DevOps train is running at Dutch Railways. It departed in 2016 and is now increasing speed and reach in 2019, heading toward 63 DevOps teams in 2020. But the DevOps journey at scale is full of technical, organizational, and operational hurdles, with lessons to be shared and learned from.
In becoming a more responsive organization to customer needs, the focus is on becoming Better, Faster, Cheaper and Happier. Experimenting, Metrics, Tools, MVPs, Continuous Learning, Team Performance and Impediment Management are some of the ingredients used to make DevOps work and overcome the organizational silos.
Over the last ten years of his professional career, Ard has been heavily involved in agility topics, starting out as General Manager at an IT incubator company supporting start-ups. This was followed by an assignment as IT Director for an education platform, releasing a new version every three weeks to 1.2 million users and breaking down the monolith. In the past three years, as Manager Software Development at NS, he became actively involved in Agile, Lean, Continuous Delivery, and DevOps practices, further evolving into an Agile leader instead of a manager.
Full transcript
Ard Westerik
Nice to have you here on this journey, the journey of DevOps at scale for the Dutch Railways. My name is Ard Westerik. I'm an IT manager at the Dutch Railways, responsible for about 28 teams working in the commercial area: information for customers and selling tickets. But I'm also a DevOps lead for the NS IT organization, and this story is about the journey that the NS IT organization is making toward DevOps.
A little bit of background. Our organization is about 150 years old, and of course that DevOps journey didn't start then. It started around 2016. We're doing that with about 1,500 IT people in an organization of about 26,000 people, so you can imagine we are not an IT company. Most of our assets are in very expensive trains. We are operating the busiest network in the world. We are not the best in operating that; punctuality is better in Japan or in Switzerland, but our network is the busiest. To make a comparison, we will never be able to beat the Japanese on this, because the Japanese minister apologized for a train leaving the station 10 seconds late. Well, that won't happen in the Netherlands.
Another interesting figure is the train station of Amsterdam: 165,000 people every day are passing through there. So that's about the organization I work for, and that's about the organization that's making this DevOps journey.
What is our challenge? The role of IT. We are a government-owned organization, but we are supposed to make a profit. We are supposed to do that at a good cost level. One of the most important KPIs for us is customer satisfaction. This is a score that the government measures us on as a condition for operating the Dutch railway network. It sounds actually quite good, because our customer satisfaction is at 84%, and last month it was actually 89%. It's an all-time high. We are proud of it, but it is by far not enough.
Public transport will get stuck in the Netherlands. Why is that? The problem in the Netherlands is that there will be no extra railroads for the coming years, but there will definitely be more passengers. By 2040, we expect a 27% to 40% increase in customer traffic on the infrastructure, but the infrastructure will not increase. That is because it is very expensive. In the Netherlands, we have a lot of rivers. If you build a bridge over a river and a train has to go over it, it will cost about a billion euros. So passing three bridges means three billion euros, and the government can't afford that.
The answer needs to be more trains on this same infrastructure. To be able to do that, we need to improve our IT. That is the big challenge we are looking at, besides the fact that mobility as a service also means that the traveler is using, besides the train, a taxi or a bike or whatever else may come.
In order to get there, we created in 2016 something we called a vision on agility. The vision on agility is this book. It's not big. It's only a couple of pages. But it's quite important, because we found out that companies -- and it's in the slides -- McKinsey found out that if you're going this way toward agility, then it means that you're able to be better, faster, cheaper, and happier. That was, of course, something that was appealing to us.
For example, complexity of projects meant that we could go from project-based budgeting to program-based budgeting. Joint business-IT development meant for us that business owners from the real business were becoming product owners for our teams. Reducing defects up to 60% is about technology, the adoption of the cloud, and the adoption of cool tools to support the OTAP environment (the Dutch term for development, test, acceptance, and production). Increased motivation of employees is also a very interesting point. As we are not an IT organization, it was quite difficult for us to attract IT personnel. But a month ago, we found out that we are the number five organization that people want to work for in the Netherlands, and that's quite astonishing because two years ago we were at number 23.
What makes it so important and so interesting to work for a company like the Dutch Railways? Of course, there is a lot to do because 26,000 does not equal 1,500, but it's also about what we are doing in our IT environment and what is written in here. The way we created this -- and it was not a revolution maybe for you, but it was a revolution for our organization -- was a co-creation of the IT management with these 1,500 people. They reached out to these people to find out, as you are an employee of the Dutch Railways, how are you looking at agility? What do you think needs to be changed? That was quite different from everything we did before, because before that we were just in one room thinking of how something efficient could be created for the rest of the organization.
Let's go back to 2016, the start of our journey. What happened then is that we were facing a challenge. Our IT director said, "I want to do something with continuous delivery." Not knowing exactly what it was, we started an experiment with two teams. In doing this experiment in a year, we learned that if you are going on this agility journey, then you need to do value stream mapping. That's maybe not so astonishing, but if you are in a team, you have no focus on what's going on around you; you have more focus on what's in the team. By value stream mapping, you create something from the start of an idea to the implementation of an idea, and you learn with whom you are working in that area. By value stream mapping, we found out the easy things we could improve in processes.
We used a maturity model in 2016. It worked very well because it's a start, and you can help a team find out at what level they are or what next step they need to take to improve in getting agile, getting better in continuous delivery, or getting better in DevOps. I will come back to that later because nowadays that doesn't work anymore.
Feedback loops meant for us that we needed screens like this. The teams were actually working without screens. They had a lot of metrics, but it was not visible. We said, okay, if you have feedback loops, it's easier to work together with other teams, and also with management, on the things you are seeing. So it's visibility. Continuous learning meant we had to work on adopting new tools and new technologies, so learning about the new things in this world.
In 2017, with these learnings, we went from two to eight teams, and we supported these teams with, of course, very cool metrics on the upper side. From these eight teams working in that way, we also learned that we had to change from project to product and what it meant. We learned that change-run, which for us was the normal way of working, was not working anymore. We needed platforms. Central command to mission command meant that we could not ask managers to tell the teams what to build. It should come from a different direction: product owners.
By learning that, and by the teams learning on that journey, we were asked, "So where do we need to go as an organization if we want to be 100% agile?" That meant everybody was agile and 50% DevOps. In my opinion, I thought that 50% DevOps was about a great number of teams working DevOps on a journey. But for our CIO, it meant something different. It meant half of 126 teams. So I thought, half of 126 teams actually is 63. Actually, he expects us to have 63 teams in 2020 working DevOps. If he means DevOps, he means in an agility format that you can score being at base camp.
We have very interesting discussions on that, because that is not what teams expect from management in their route toward more agility. In 2019, we created this based on what I learned last year here at the DevOps conference. Nicole Forsgren was explaining things about metrics and outcome-related metrics. It meant that you had to do something with release frequency, so bring out your stuff faster and easier. It also had to do with how good you are at achieving the work that is in your backlog. There are a lot of numbers about incidents in production. Incidents in production for us meant that if there are teams, then there should also be people from operations in these teams so that the team can take responsibility for the product instead of only for the change.
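As an illustration of the outcome-related metrics mentioned above (release frequency, how much of the backlog is achieved, and incidents in production), here is a minimal sketch in Python. The record layout, field names, and example figures are assumptions made for this illustration; the talk does not describe how NS actually stores or calculates these numbers.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class TeamPeriod:
    """Hypothetical record of one team's activity in a reporting period."""
    team: str
    start: date
    end: date
    releases: int              # production releases in the period
    items_planned: int         # backlog items planned for the period
    items_done: int            # backlog items actually completed
    production_incidents: int  # incidents in production in the period


def outcome_metrics(p: TeamPeriod) -> dict:
    """Compute three illustrative outcome metrics for one period."""
    weeks = ((p.end - p.start).days + 1) / 7  # period length, inclusive of both dates
    return {
        "releases_per_week": p.releases / weeks,
        "backlog_completion": p.items_done / p.items_planned if p.items_planned else 0.0,
        "incidents_per_release": p.production_incidents / p.releases if p.releases else None,
    }


# Invented example: a four-week period with 8 releases, 15 of 20 items done, 2 incidents.
example = TeamPeriod("team-a", date(2019, 1, 7), date(2019, 2, 3), 8, 20, 15, 2)
metrics = outcome_metrics(example)
```

Numbers like these are meant as a starting point for a conversation with the team about outcomes, not as a score imposed from outside.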
That was a little bit of a mind-shifter also for our CIO, because from that moment on we could have a discussion not only about metrics -- because he loves metrics, and he definitely wants to know whether we are at 63, 40, or 23 teams -- but we also shifted the discussion toward a real valuable outcome. This was made by the Reporting Rebels. It's easy for me to use and find out what teams are experiencing, and it's easy for me to ask questions about where we are going.
But it's not enough, because I told you something about agility in a way that you use the levels that you can achieve. If maturity is measured by someone other than the team itself, then you get resistance. If maturity is measured by the team itself, then something else happens. We found that out through a DevOps team journey -- a DevOps team journey is something like a customer journey, but then for the teams in their route toward DevOps. By focusing on their journey, we found out that they made all kinds of achievements. They got all the achievements for every step that they made. They got all the achievements for their route toward base camp, so they really are a DevOps team in the context of NS.
That's quite cool, but also a little bit disappointing because it is only one team. There are teams queuing up just before the summit, just before base camp. The interesting thing about that -- and we can see that because this is the portal that these teams use -- is that they are queuing up just before base camp, and actually it means that we get insight into their impediments. It's quite difficult to talk about impediments, we learned, but this helps because it shows why these teams can't make their achievements just before base camp. Actually, it has something to do with our culture. It has something to do with managers working together to solve problems. We can address that. By now, we know that when we solve this impediment, our teams will reach base camp on their way to the summit.
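The idea of spotting a shared impediment from teams queuing up just before base camp can be sketched as follows. This is only an illustration: the achievement names, the team data, and the helper functions are hypothetical, and the talk does not say which achievements the NS portal actually tracks.

```python
from collections import Counter

# Hypothetical achievements required for "base camp"; the real list is not given in the talk.
BASE_CAMP = ["value_stream_mapped", "automated_pipeline", "metrics_visible", "ops_in_team"]

# Hypothetical self-reported progress per team.
progress = {
    "team-a": {"value_stream_mapped", "automated_pipeline", "metrics_visible", "ops_in_team"},
    "team-b": {"value_stream_mapped", "automated_pipeline", "metrics_visible"},
    "team-c": {"value_stream_mapped", "automated_pipeline", "metrics_visible"},
    "team-d": {"value_stream_mapped", "automated_pipeline"},
}


def missing(achieved: set) -> list:
    """Return the base-camp achievements a team has not reached yet."""
    return [a for a in BASE_CAMP if a not in achieved]


def teams_at_base_camp(progress: dict) -> list:
    """Teams that have every base-camp achievement."""
    return sorted(team for team, done in progress.items() if not missing(done))


def blocking_achievements(progress: dict) -> Counter:
    """Count how many teams are still missing each achievement."""
    counts = Counter()
    for done in progress.values():
        counts.update(missing(done))
    return counts


reached = teams_at_base_camp(progress)
top_blocker = blocking_achievements(progress).most_common(1)
```

In this invented data set, one team has reached base camp, and the achievement that most teams are missing points to the impediment that management could look at first.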
That is maybe interesting, but it's also important to find out what we learned from that. In order to speed up, because we are behind schedule -- we are behind schedule of the 63 teams reaching the DevOps end state in 2020. Leadership is not about numbers, and it's easy to state, but it's difficult to execute. If your organization is big, where should leadership look? Actually, they have to go and find their own way around, but it's classical to look at numbers, and it's difficult to get away from that. I showed you that. But it's important, because it's not motivating for teams that somebody else tells them whether or not they are DevOps or whether or not they are fast enough.
CI/CD was for us an important start. We started out with continuous delivery because this appeals to technical stuff that teams can adopt. By giving them these tools, the road started to open up itself. But it was not enough for working together between teams. DevOps did. The discussion on DevOps opened up the discussion about collaboration. It opened up the discussion about culture. So it's for us an important step in getting to the summit.
We also learned that ownership of the subject -- I said here, it's not Dev or Ops or architecture. We try to point out where the ownership of this subject is, and the ownership shifted a little bit around in our organization. That doesn't help, because DevOps is not only for Devs, it's not only for Ops, it's not only for architecture. It's about working together. Solving the problem of ownership is an important one, and we didn't quite succeed there yet.
The last one is that the most important thing for our managers is to focus on impediments. It's easy to look away from the impediments, but actually it should not be about impediments that are coming to you. It should be about impediments that you find out because you're looking for them. You should go and find these impediments. That's also quite challenging for us to establish, but it's one of the most important things if you want to change something in the culture of the organization.
The most important lesson that we learned -- and that's something between the difference in working on a project or working on a product -- is: tell them why, be patient, and get surprised. When you work on a project, you have a fixed goal. My experience with projects is that while working on that goal, you get there, but you get there late, and you get there maybe half, or 90%, or close to 100%, but you never go over it. What happens when you are starting to focus on a product: it might still be late, but because the teams are working in an agile way, the end state of the product is different than you would expect as a manager, because you can't predict where it's going. Once it's always about the next best step, it might take a little bit longer, but when you get there, you will be surprised. I will show you an example of what that meant for us at the Dutch Railways.
Last year, we introduced a new service. When you fly in from anywhere around the world and come to Schiphol, the airport, then you can make a visit to the city of Amsterdam. When you go on a visit to Amsterdam, probably the best way to get there is by train. I'm telling you that because I'm from the Dutch Railways, of course. It's actually quite easy, but we as an organization did not make it very easy, because if you wanted to do that, for example, you couldn't buy a ticket with Visa. Stupid, but it was not possible. What a foreigner had to do is buy a ticket, go to Amsterdam, to Schiphol, find the Trekpleister -- ever heard of the Trekpleister? It's an amazing store, but nobody from outside knows that. There you have to get a coupon to get on the train. It's a little bit of a nasty process. Customers rewarded it with a 2.3 out of five. We were not able to improve until this moment.
What we did is we did an experiment with Booking.com, and we said we can improve customer experience and we can improve the use of the train by making it easy to buy this ticket. How to do that is when you book a venue on Booking.com, you can also order a train ticket. We were enthusiastic about it, but we didn't actually know how big or how good it would be. In the old days, we would have said, okay, this is a big project. To get to this project, you should be able to enter the Netherlands via Amsterdam, you should be able to do it in Rotterdam, you should be able to do it in Eindhoven, and you should be able to go to any city in the Netherlands, and you should be able to get back. But then it's huge. We didn't know if that would work.
So we created an MVP with a hypothesis because we wanted to know whether this is something that customers would value. Doing that meant also that it is not enough to create the service; you also have to measure the service, the success of the service. It should be possible to have insights on the pipeline you're creating the product with. You should be able to see something about the product in production, and you should see something about the customer experiencing the product.
This was a close cooperation between business and IT. It was a small product because you were only able to book it on Booking.com if you go from Schiphol to Amsterdam, and not back -- only from Schiphol to Amsterdam. What we saw from it was really astonishing. People loved it. We got a 4.5 out of five for the customer experience, and 1,206 tickets were sold in the first month. We expected it to be a little bit less than 1,000. You might think that 20% over the expectation is not that much, but normally speaking, when we do these big projects, we also have the hockey stick: somewhere in the future, we are going to have a great amount of profit from it, but not now. What you saw here is that already in the first month, we sold more tickets than the people working on this service had expected. The numbers exceeded the expectation, and that was great, so we are continuing on this service.
Everything in the example I give you could only be done because we are on this track for DevOps at scale at NS. Everything that was in place makes it possible to do these kinds of things. If we had done this four years ago, we would not have been able to measure success, we would not have been able to deliver success, and we would not have been able to make success small. So that is an important step for us. It's only a small example. I have more. I also have examples of things that did not work so well, but this was one of the examples I think is important to share at a summit like this.
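The MVP-with-a-hypothesis approach can be written down as a small check of measured results against expectations stated in advance. The sketch below uses the figures from the talk (1,206 tickets sold in the first month and a 4.5 rating, against the 2.3 rating of the old process). Two things are assumptions for illustration: the ticket threshold is rounded to 1,000, because the talk only says "a little bit less than 1,000", and the success criterion for the rating (beating the old 2.3) is invented. The class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    """Expectations written down before the MVP goes live (hypothetical structure)."""
    description: str
    min_tickets_first_month: int
    min_rating: float  # out of 5


@dataclass
class Measurement:
    """What was actually measured in production."""
    tickets_first_month: int
    rating: float  # out of 5


def evaluate(h: Hypothesis, m: Measurement) -> dict:
    """Compare the measurement with the hypothesis."""
    return {
        "tickets_ok": m.tickets_first_month >= h.min_tickets_first_month,
        "rating_ok": m.rating >= h.min_rating,
        # Relative difference between actual and expected ticket sales.
        "tickets_vs_expectation": m.tickets_first_month / h.min_tickets_first_month - 1,
    }


hypothesis = Hypothesis(
    description="Travelers value buying a Schiphol-Amsterdam ticket with their booking",
    min_tickets_first_month=1000,  # assumption: the talk says "a little bit less than 1,000"
    min_rating=2.3,                # assumption: at least beat the old process rating
)
result = evaluate(hypothesis, Measurement(tickets_first_month=1206, rating=4.5))
```

With these assumed thresholds, both checks pass and ticket sales come out roughly 20% above expectation, which matches the outcome described in the talk.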
To finish, I have one more example where you can see how flexible the Dutch Railways as an organization has become. And it's this one. Isn't it awesome? Okay. Thank you for your attention.
Q&A
There is one minute for questions. I can also imagine that if people want to go for lunch, please go, but if there are questions, please go ahead.
Question: With so many individual teams, how do you make sure they all drive in the same direction and don't go...?
Ard Westerik: They don't. Actually, I think that acknowledging that teams will not all go in the same direction at the same speed or at the same pace helps. The DevOps playbook helps us to make sure that if the direction is toward delivering software faster, then the goal is the same, but how to get there and at what speed is quite different. Acknowledging that it's not a one-size-fits-all thing helps us make steps in that direction as an organization.
Question: So that means they're not sharing a common platform. Is that it?
Ard Westerik: That means that we want to go to a common platform, but we accept that at this moment it's more important to find out what needs to be in that platform than telling them now what the platform should be. It's not that all these teams are completely free in choosing, but by experimenting we accept that teams are using tools that might not be in our Topaz environment for this moment. But when it works for them, it might also work for others, and it might be the best tool for the purpose at that moment.
Question: So that means there's a vision for a common platform in the future?
Ard Westerik: You're correct about that, but we don't know what will be exactly in there. We know how to get there. Correct. Any more questions? Thank you for your attention, and have a good lunch. Thank you.