Europe Virtual 2024

Introducing a Product-based Approach in an Engineering-focused Container Platform Team (DKB)

Your container platform is up to speed technically and your most advanced developers are already running production workloads. But – legacy developers see you as a new operations team, are skeptical about cloud-native concepts – and on top of it, your finance team is politely asking: why do we actually need a team of platform engineers? After all, “you build it, you run it”?


This session will present the speaker’s learnings in platform engineering, consulting and sales engineering roles and show how adopting product management practices can improve the situation.


It will explain how to clarify the value and benefits of your container platform for developers and other stakeholders, especially by creating a platform architecture that is not focused only on technical features. It will also explain how to solve some of the specific challenges of container platforms, for example how to make platform capabilities visible and understandable to a broad and varied audience.

Full transcript

The complete talk, organized by section.

Host Intro (Gene Kim)

So the next speaker is Stéphane Di Cesare, platform engineer and architect at DKB, one of Germany's 20 largest banks.

I love that we have so many talks at this conference about platform engineering, such as the ones from Legal & General this morning, VELUX, and KPMG Switzerland earlier today. So like Henrik and Josephine talked about in their talk, Stéphane analyzes the problem and really starts from first principles. He derives what a platform could and should do for its customers, and explores a world where there are many platforms and how one might judge whether a platform is actually doing what it's supposed to be doing for its users.

So with that, here's Stéphane.

Stéphane Di Cesare

Thanks for the introduction. I actually really enjoyed the presentation by Henrik as well. That was quite interesting. I'm going to cover some of the same topics, but from a slightly different view.

What I'm going to look at is also platform engineering, but coming from an engineering background, and how we could bring a product-based approach to it.

I'm working for DKB. DKB is Deutsche Kreditbank, so that means the German credit bank. It was founded in 1990, so it's actually the first private bank that was founded in East Germany, just before the reunification. We are both a retail bank, so you can get a credit card at DKB, and an investment bank with a specific focus, for example in renewable energy.

I joined DKB a year ago now. I've been working in different technical roles, with quite a broad scope, and at the moment I'm trying to become a platform-as-product specialist. I'm a member of the CNCF Working Group Platforms, which is also looking at platform engineering from the CNCF side, where there is a lot of activity at the moment. And I'm very interested in linguistics and languages.

This is where we started from: we created a group of teams called the Standard Operations Platform, which was basically formed from different infrastructure teams with the goal of making infrastructure easier for developers. The core of that team in the beginning contained the platform using Crossplane, so quite cloud native. We were joined afterwards by different groups which had different backgrounds and which were not necessarily as cloud native. The vision was to bring this into a common standard platform which is used as a product.

As I was saying, there was focus on engineering. Basically it means we had something that worked technically. We were able from the beginning to run workloads on the platform, which is a good start, because at least you can iterate. But the main thing was the communication was quite focused on engineering. It was not very easy to use for people who don't have a strong technical infrastructure background. This is what we wanted to improve with time. Here you can see the scope of our platform.

The main goal, as I was saying, was to decrease the cognitive load for the product teams. That was actually something that Henrik explained quite well in the VELUX talk. We wanted to make the complexity transparent, so that the developers need to know as little as possible about infrastructure, and also make topics like security or compliance easier. Compliance is of course very big for us, since we're a bank and we are under many regulations. We wanted to provide this as a standard with standard processes and tools, and we also wanted to improve as much as possible the way we communicate, on one side with the users, so with the developers, and on the other side with the business.

One big challenge for us was: we have this platform, it's running, how can we show which value it has? How can we explain to leadership, this is what it brings, more concretely?

I'm going to talk here about five topics: what we did to define the platform so that it's understandable from the outside, what it actually is; looking at platform maturity, how to clarify it; looking at how we worked with focusing on product; and then two other topics, the communication and how we worked with documentation and information.

Regarding the product definition, we are working on so-called platform architecture, which is detailing which capabilities the platform has. This is divided into a user layer of the capabilities which are user-facing. These are things like: what are the typical user interfaces? What are the APIs? Which resources can you order? And also non-technical topics like which processes are related to the platform and how do we communicate?

Then we have under that a platform infrastructure layer, which basically details how everything is implemented: which VPC, how many EC2 instances, and things like that. And there is a common layer where we have more conceptual topics like security, compliance, observability.

If I take an example in observability, in the common layer you would have things like: how am I supposed to do logging with this platform? Which kind of logging does the platform provide? What can I use? In the user layer, you would have: what is the interface? Where can I look at my logs? And also which logging resources do I have available? Do I have components that I can reuse? In the infrastructure layer, you would have what is actually the implementation: how does logging work? How are logs being sent from A to B so that they can be read in the end?
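The three-layer view of a capability can be written down as a small data model. The following is only a sketch of the idea, not DKB's actual tooling: the names `Layer`, `CapabilityView`, `LOGGING_VIEWS` and `questions_for` are assumptions made for illustration, and the questions are paraphrased from the logging example above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Tuple


class Layer(Enum):
    """The three layers of the platform architecture described in the talk."""
    USER = "user"                      # interfaces, APIs, orderable resources, processes
    COMMON = "common"                  # conceptual topics: security, compliance, observability
    INFRASTRUCTURE = "infrastructure"  # how things are actually implemented


@dataclass(frozen=True)
class CapabilityView:
    """What one layer has to answer about a single capability (hypothetical type)."""
    capability: str
    layer: Layer
    questions: Tuple[str, ...]


# The logging example, one entry per layer.
LOGGING_VIEWS = [
    CapabilityView("logging", Layer.COMMON, (
        "How am I supposed to do logging with this platform?",
        "Which kind of logging does the platform provide?",
    )),
    CapabilityView("logging", Layer.USER, (
        "Where can I look at my logs?",
        "Which logging resources and reusable components are available?",
    )),
    CapabilityView("logging", Layer.INFRASTRUCTURE, (
        "How are logs sent from A to B so that they can be read?",
    )),
]


def questions_for(views: List[CapabilityView], layer: Layer) -> List[str]:
    """Collect every question that a given layer has to answer."""
    return [q for view in views if view.layer is layer for q in view.questions]
```

Calling `questions_for(LOGGING_VIEWS, Layer.USER)` returns only the user-facing questions, which is the part of the architecture most platform users ever need to see.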

There is a lot of work at the moment on platform architecture, so I've given a few examples here. This is what we did. It was one possibility; there are many others, so it's interesting to compare.

Within the capabilities, we tried to work as much as possible with use cases. As a team with an engineering focus, people mostly thought about the solution directly. We're trying to think more about the use cases: who wants to do what, for what reason, and what is the value? Then we follow this up by saying: this use case could be addressed by this solution, it's in this status, here is where it is documented, and this is how the responsibility works.
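A use-case record along those lines could look like the sketch below. The field names and the example values are invented for illustration; the talk only names the kinds of information (who, what, why, solution, status, documentation, responsibility), not a concrete schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UseCase:
    """One platform use case, stated before any solution is chosen (hypothetical schema)."""
    who: str            # persona that has the need
    wants: str          # what they want to do
    reason: str         # why, i.e. the value
    solution: str       # the solution that could address the use case
    status: str         # where that solution stands
    documentation: str  # where it is documented
    responsible: str    # who is responsible for it

    def summary(self) -> str:
        """Render the need in one sentence, without mentioning the solution."""
        return f"As {self.who}, I want to {self.wants} so that {self.reason}."


# Hypothetical example values, not taken from DKB's platform.
example = UseCase(
    who="a product team developer",
    wants="see my application logs in one place",
    reason="I can debug without knowing the logging infrastructure",
    solution="central log viewer",
    status="in progress",
    documentation="(link to the platform documentation)",
    responsible="(owning team)",
)

print(example.summary())
```

Keeping `summary()` free of the solution field is a deliberate choice in this sketch: it forces the need to be readable on its own, which is the shift away from solution-first thinking that the talk describes.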

Finally, we started to work with a heat map. Don't be confused by the architecture drawing, because the architecture was a bit different when we started than what we have at the moment. They are not exactly the same layers, but broadly it's the same. Each capability is a square. The left side of the square shows the implementation stage, so basically how well this capability works. The right side shows how well it is defined, so how clear it is what it actually does. We can see that in the platform infrastructure part, we were in a situation where a lot of the things work, but it's not very clear from the outside what is actually there, and this is what we wanted to improve.
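The heat map boils down to two scores per capability. A minimal sketch, assuming a 0 to 3 scale for both scores (the scale, the thresholds, and the capability names below are assumptions, not values from the slide):

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class HeatMapCell:
    """One square of the heat map (hypothetical structure)."""
    capability: str
    layer: str
    implementation: int  # assumed scale: 0 = not started .. 3 = fully working
    definition: int      # assumed scale: 0 = undefined .. 3 = clear from the outside


def works_but_unclear(cells: List[HeatMapCell],
                      min_implementation: int = 2,
                      max_definition: int = 1) -> List[str]:
    """Capabilities that work technically but are hard to understand from outside."""
    return [
        cell.capability
        for cell in cells
        if cell.implementation >= min_implementation
        and cell.definition <= max_definition
    ]


# Hypothetical cells for illustration only.
cells = [
    HeatMapCell("networking", "infrastructure", implementation=3, definition=1),
    HeatMapCell("logging", "common", implementation=2, definition=2),
    HeatMapCell("self-service portal", "user", implementation=1, definition=3),
]

print(works_but_unclear(cells))
```

With these made-up values only `networking` is flagged: it works well but is poorly defined, which is the pattern the talk reports for the platform infrastructure layer.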

It's important for us as well that this is not something that's fixed. You've seen that it changes with time. It has to evolve. It must be kept up to date. We need to listen and get as much input as possible from users. Developers are not your only users. I often hear that the users of the platform are the developers, but there are also people coming from the side: from the business, from the finance team, or security people who might want to push some updates. These people are also users and they're actually quite important. I don't need to explain to anyone that it's a good idea to keep in mind how useful the finance team can be in the future.

The challenge we had, as I was just saying, was that it was not very easy for everybody to work at that kind of abstraction level. People were used to working more with solutions and thinking less about what is the use case behind it, what is actually the value, and to include this better in the workflow. For example, even someone fixing an issue would think: this issue happened because the user interface was not very clear, so maybe we should put a use case to make this user interface clearer.

Then we also ran into the issue of how to communicate upwards, because leadership wants to know: this platform is very nice, when is it finished? We have to explain that it's not as simple as this. We came across the platform maturity model from the CNCF, which was quite useful because it shows where we are on the platform journey and we can use these squares as a basis for discussion. On the investment part, if we move from dedicated team to as-product, what does it actually mean? What does it cost? Which resources do we need and what does it bring? We found it really useful to have that discussion.

I'm not going to talk about the actual details for us, but this is something I recommend to use. There is a very good introduction to the maturity model in the Nicki Watt talk that's mentioned in that slide.

Then regarding focus on product: everybody agreed that product focus is important. It was less simple to agree on what it actually means to have a product focus. We also came to the realization: what is actually our product? Maybe we have several products. Coming from an engineering background, I would look for the certification for that topic, and then you have a list of concepts. Unfortunately, it's not as easy for product management. There are a few resources I recommend, but in general there is no common consensus as there is in engineering topics.

What we agreed on was that the platform team has the final say on platform features: we decide what is being implemented, and not, for example, someone from a compliance team. They would rather come with their needs, we translate that into use cases, and then we decide how it is implemented. Compliance usually has a very good way to define value anyway. They can explain: if we don't do it, the CEO will go to prison. Then it's quite clear how we prioritize. But I think it's important that the implementation is in the hands of the platform team.

We also need the products to be clearly identifiable. This is what we're using to define the products: it must be very clear for users that when I want to do this, I need to use this product. If we need training for that, it means we haven't defined the products well and we need to do so. Then we agreed as well that the features and the usage of the products must be as streamlined as possible and well-defined. They must be documented somewhere so that they are clear.

This is a kind of example of what we did. Because we ended up thinking maybe it's best that we talk about several products in particular. We have part of the team which is working with VMs, so we thought about working with a container platform and VM platform. Then we look at the capabilities which are in the architectures, and some of these capabilities can be implemented in different ways depending on the product and be wired to different teams. The boxes on the top are purely interfaces.

That has the advantage that users communicate at the product level. They don't need to know how the teams below are organized. There is an interface here, and in between we take care that the request goes to the right person. One advantage of dividing into products is that we can take decisions faster because the responsibility is clear, and also that we are basically decoupled. We can organize ourselves more flexibly in teams and responsibilities without user impact, without needing to tell users that there is now a new person working on this topic.
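The decoupling described here is essentially a lookup table behind a stable interface. The sketch below is an illustration under assumptions: the product names follow the talk (container platform, VM platform), while the capability names, team names, and the `route` function are hypothetical.

```python
from typing import Dict, Tuple

# (product, capability) -> team currently responsible. Team names are hypothetical.
Routing = Dict[Tuple[str, str], str]

routing: Routing = {
    ("container-platform", "observability"): "team-a",
    ("vm-platform", "observability"): "team-b",
    ("container-platform", "networking"): "team-c",
}


def route(product: str, capability: str, table: Routing) -> str:
    """Resolve a user request, addressed to a product, to the responsible team."""
    try:
        return table[(product, capability)]
    except KeyError:
        raise LookupError(
            f"No team is wired to {capability!r} for product {product!r}"
        ) from None


# Users always address the product; they never name a team.
before = route("container-platform", "observability", routing)

# An internal reorganization only changes the table, not what users do.
routing[("container-platform", "observability")] = "team-d"
after = route("container-platform", "observability", routing)

print(before, "->", after)
```

The same capability (`observability`) is wired to different teams depending on the product, and reassigning it changes nothing in the call the user makes, which is the "without user impact" property mentioned above.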

Then let's switch to communication. Since we had people coming from different backgrounds, it was important to find a common language. If you have topics like service, when you have people coming from ITSM, Kubernetes, cloud native, everybody will understand service as something different. It's important to find a common basis. It's really important as well to focus on the agreement, not focus on who has the right definition and who wins against the ones who have the wrong definition. We have to be pragmatic.

We found it's often useful to prefix words that are very overloaded, like service: are we talking about an ITSM service or a platform service? That way it's clear what the context is. This borrows a bit from the idea of the bounded context in domain-driven design.

Regarding communication channels: communication is really a feature of the platform. It should be treated like any technical feature. We should think about how to streamline it as much as possible while being sure that we cover all the use cases. It's important to think about all the communication use cases and not to block any channel because we want to standardize too much.

There are cases, for example, where someone would like to have a quick question. They are interested in having a quick answer, but they're not necessarily interested in having a guarantee that this is 100% correct. That would be a different channel than, for example, an incident. For many people, communication is incident service management, but there are also other topics like platform consulting. Do we need someone for a day to look at our architecture and tell if it works well with the platform? Or also the communication from the platform to the outside. This is something that's sometimes forgotten in the beginning.

I found this book, The Async-First Playbook, to be very useful. It is about remote work, but a lot of it boils down to: how do we make the information visible and keep it? This is very useful even in other ways of working.

Then last thing about documentation and information. Same thing: this is also part of the product. We thought about what are the different use cases and personas. This is not only the user documentation. What is in yellow, this is what people in general see as the documentation. But there are different kinds of documentation for different personas. You might have people who don't know about the platform and want to learn about it; they will need different information than someone who is already deep into the project.

Here, our focus was on keeping the overview. We had a lot of existing documentation, so we were not in a position to force everyone to use one particular tool or format. It was important for us to focus on keeping the overview so that we don't have documentation we don't know about, written for example for a product team.

Host Interjection

Stéphane, Gene has asked me to keep the trains running on time, so if you could maybe just make your last final point, any help you need, and then we'll start to wrap up.

Stéphane Di Cesare

Yeah, yeah, that's the last slide.

I'm interested in any kind of feedback about this because, as I said, there is not a lot of existing information. I'm really interested in talking about this, especially about how to document the platform's scope and how to foster a product mindset in everybody on the team.

If you have questions, feel free to contact me on Slack or LinkedIn. We have openings as well, so you can have a look there. We need a database specialist right now. Thank you.