Crossing the Platform Gap
Teams delivering high quality experiences to customers are critical to your business and they need to be as productive, efficient and supported as possible. But with multiple teams in your organisation with differing requirements, how do you balance flexibility with complexity and support a platform that provides the tools required on-demand, whilst also minimising duplication and cognitive load?
In this talk, Paula will describe this “Platform Gap” challenge that many organisations face and provide recommendations on how to solve this by reviewing and improving team interactions. She will explain how your internal platform can be used to support this model whilst still focusing on delivering a delightful experience to the users of the platform through applying product management practices.
Full transcript
The complete talk, organized by section.
Paula Kennedy
[00:00:14] Hi there, and welcome to my talk, Crossing the Platform Gap. Thank you to the DevOps Enterprise team for having me, and thank you for watching.
[00:00:23] I'd like to start with a brief introduction. My name is Paula Kennedy. I recently co-founded a company called Syntasso with a great team of colleagues. I've worked in the technology industry for more than 20 years: 10 years in software as a service, and then for the last 10 years I've focused on the platform space, most recently working at Pivotal and later as part of VMware, working with customers who are running large internal platforms.
[00:00:54] What will I be talking about? I'm going to describe the platform gap, the challenge that my team and I have seen at many companies we've worked with in the past, and which might sound familiar to some of you in the audience. I'll explain it in terms of why you should worry about it. I'll make some suggestions of things that you could do about it. I'll also be providing some examples where customers have made significant progress in solving some of these challenges. And then hopefully, I'll leave you with some things to think about going forward.
[00:01:29] So, what is the platform gap? Let's say you have a team, we'll call them Team A. They are responsible for a product; let's say it's a mobile banking application. And because you're following the DevOps principle of you build it, you run it, this team is managing their own Kubernetes clusters. They're also managing a database. Let's say they've got some machine learning in there as part of the stack. And then finally, they're using Knative and Kpack for their application.
[00:02:03] Now, if you're a slightly bigger team, maybe you've got a separate Team B. Team B might be responsible for a mortgage calculator. Let's say that their stack is similar to Team A, but in this case, they have a CRM tool as part of the stack. And then as a bigger organization, you might also have Team C. Team C might be taking care of credit cards, and their stack looks very similar to Team A, but they've got things configured just slightly differently. Maybe they're running some different software versions. Finally, you have Team D. They're running the legacy payment system, which is running on bare metal on premises, looks very different to the other teams, but is still a core part of your business.
[00:02:50] Maybe you've got hundreds of other similar but separate teams. Maybe your organization has made acquisitions in the past, and you've got lots of separate teams doing similar things but not working together. This is what it looks like when you've got a platform gap. Each team is juggling all of the components of the stack themselves. They've all got to cross this gap going from infrastructure to actually delivering value to end customers with their applications.
[00:03:23] What are the implications of this platform gap? There are several problems with this model. The first thing to notice is the cognitive load on each team. Cognitive load is a term found in the field of psychology and is defined as the amount of information that working memory can hold at one time. In this case, we can see that Team A is managing quite a lot. The more things that they're managing, the more information they have to remember about each part of the stack, and the less they are focused on the actual application that is delivering value to the business.
[00:04:02] As my old colleague James Watters used to say, the application team should be focused on delivering business value above the value line. The value line can be defined as separating what's core to your business versus what you can and should outsource. In this case, we mean it to say: what is the core specialism of this application team? What should they be focused on instead of also trying to wire all of these infrastructure elements together?
[00:04:31] Because this team has high cognitive load and reduced headspace to focus on the application, this could be having lots of impacts. This could be reducing the speed at which they can react to customer feedback, reducing the speed at which they can add new features or iterate, or patch and keep their application secure. Generally, they're maybe not keeping up with your competition. If you look across the business, you've got all of these teams all facing those same cognitive load challenges, slowing down the whole organization.
[00:05:08] It's clear when you look at this diagram that there's lots of duplication and wasted effort. If we remove Team D and their bespoke stack from the visual, you can see that across each team, there are common items in their stacks where your organization could be pooling resources to save time, effort, and money. By sharing these services, your organization could benefit from economies of scope, which can be defined like this: the average total cost of a company's production decreases as the variety of goods produced increases. In this case, it means that if some of these common parts of the stack could be grouped together and shared, the cost to add new teams and new products would be incrementally lower, thus reducing the average cost per business application.
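As a rough illustration of the duplication point (not part of the talk), here is a minimal Python sketch. The team stacks loosely follow the descriptions of Teams A, B, and C above; the assumption that Team B's CRM takes the place of machine learning, and the flat `COST_PER_COMPONENT` figure, are hypothetical and chosen only to make the arithmetic visible.

```python
# Illustrative sketch only: stacks loosely follow Teams A-C from the talk.
# Assumption: Team B swaps machine learning for a CRM tool.
stacks = {
    "team_a": {"kubernetes", "database", "machine_learning", "knative", "kpack"},
    "team_b": {"kubernetes", "database", "crm", "knative", "kpack"},
    "team_c": {"kubernetes", "database", "machine_learning", "knative", "kpack"},
}

COST_PER_COMPONENT = 10  # hypothetical cost units to run one component


def shared_components(stacks):
    """Return the components that appear in every team's stack."""
    return set.intersection(*stacks.values())


def average_cost_siloed(stacks):
    """Average cost per team when every team runs its whole stack itself."""
    total = sum(len(stack) for stack in stacks.values()) * COST_PER_COMPONENT
    return total / len(stacks)


def average_cost_with_platform(stacks):
    """Average cost per team when shared components are run once, centrally."""
    shared = shared_components(stacks)
    platform_cost = len(shared) * COST_PER_COMPONENT
    team_cost = sum(len(stack - shared) for stack in stacks.values()) * COST_PER_COMPONENT
    return (platform_cost + team_cost) / len(stacks)


print(sorted(shared_components(stacks)))
print(average_cost_siloed(stacks), average_cost_with_platform(stacks))
```

Under these made-up numbers, the four components common to all three teams are paid for once instead of three times, so the average cost per application falls. Real costs are not flat per component, so treat this as a way to think about the shape of the saving rather than a model of it.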
[00:06:04] In this model, what's happening with team communication and collaboration? We can see it looks like the teams are working in silos. There's some pretty thick brick walls between them. If this is the case, then they're not working together. They're not sharing best practices or knowledge or skills. You might have multiple people across the organization struggling to solve the same problems without any shared learning or pooling of resources. What are the rules of engagement between these teams? How should they, or could they, even work together?
[00:06:40] Maybe there are other hidden elements that your broader organization is not aware of. There could be pieces within the stack of each team that are hidden in the murky world of shadow IT. Teams could be using tools that don't meet internal governance or compliance requirements, which could possibly be leading to security risks. Maybe one team is using the next greatest tool, which would massively benefit 10 other teams if only they could access the service without themselves having to take on the extra cognitive load.
[00:07:14] So what does all of this mean? To summarize, at Syntasso, we define the platform gap as the chasm each team must cross between the infrastructure and delivering their meaningful product value. By having this platform gap, you're facing issues such as high cognitive load on each team, duplication and waste, team silos with little or no collaboration, potentially shadow IT, and also potential security and compliance gaps. Generally, it's just a nightmare all over the business to audit what's going on.
[00:07:55] Aside from our team's combined years of experience working with customers who've experienced some or all of these issues, is there any other data out there? There was a recent report from Dynatrace, which surveyed 700 CIOs globally in large organizations with over 1,000 employees across multiple countries. The report found that, out of the 700 CIOs who responded, nearly half say that their IT teams are stretched more thinly than ever. Half of the CIOs say that their business and IT teams work in silos. Forty percent say that limited collaboration is disrupting IT's ability to respond to change. Three-quarters say that they are fed up with having to piece together data from multiple tools to assess the impact of IT investments. So you can imagine what type of sprawl of technology those CIOs must be dealing with.
[00:09:00] With all of these platform gap issues within these large-scale organizations, how can we go about attempting to cross this platform gap? As with all good questions, there are no easy answers. But it really comes down to two areas of focus: at an organizational level, what changes should be made? And because this is the platform gap, I've got a specific focus on the platform team level with regards to platform team structure and practices.
[00:09:34] Let's look at organizational level first. A great place to start is with Team Topologies, which is an excellent book published in 2019 by Matthew Skelton and Manuel Pais. The book itself covers four team types or topologies and three interaction modes, and it was heavily featured in this year's State of DevOps Report. It's an excellent book, and one of the key benefits is that it provides a common vocabulary for people across your business to understand organizational models and team interactions, which can help to increase understanding and therefore flow of change across the business.
[00:10:18] To answer the questions around the platform gap, I'm going to focus on just some specific parts of their model, but I would strongly recommend that you go read this book because it's absolutely fantastic. The parts I'm going to focus on are two of the three interaction modes: collaboration, defined within Team Topologies as teams working together for a defined period to discover new things, and X-as-a-Service, defined as one team provides and one team consumes something as a service, for example an API.
[00:10:57] Secondly, on the team types, I'm focused specifically on two out of the four. Stream-aligned teams are defined as a team aligned to the main flow of business change with a cross-functional skills mix. This is often synonymous with an application team, such as the Teams A, B, and C that we looked at earlier. A platform team is defined as a team that works on the underlying platform, providing a compelling internal product to accelerate delivery by stream-aligned teams. It's within these two areas that I've focused heavily in the last few years. But what have I been doing, and why do I view it as so critical for success of companies?
[00:11:44] The pattern that my team and I have been encouraging at customers for a number of years now is to start with collaboration. This could be described as a period of discovery or user research, where the platform team needs to work with the stream-aligned teams to understand and define the requirements. This collaboration provides a shared responsibility for both teams and allows the teams to increase their knowledge, their understanding, and have empathy for each other's needs.
[00:12:17] Whilst collaboration is critical upfront to define the boundaries of the services that the platform will be providing, it's not just an upfront activity. It's also useful to maintain just a lightweight, ongoing collaboration. This is to ensure that the platform continues to deliver value as needs can change, tools can change, technology can change. The one constant we have is change.
[00:12:43] The next step, once requirements are understood through collaboration, is to move to an X-as-a-Service mode. This sets clear responsibilities between teams, reduces friction, and reduces communication challenges. It enables faster delivery as the stream-aligned team has autonomy to self-serve the services that they need on demand whenever they need them, particularly if the platform team can make their service easy to consume.
[00:13:17] The problems we previously looked at in the definition of platform gap can begin to be tackled if we have a platform team supporting the application teams, providing the services that those teams need, and if those teams have clearly defined interaction modes.
[00:13:38] I mentioned earlier that the platform team is defined as a team that works on the underlying platform. But what do I mean when I say platform? Am I talking about one specific type of technology? Back in March 2018, Evan Bottcher wrote this great article which contained this definition of platform: "A digital platform is a foundation of self-service APIs, tools, services, knowledge, and support, which are arranged as a compelling internal product. Autonomous delivery teams can make use of the platform to deliver product features at a higher pace with reduced coordination." You can see this definition is technology agnostic. It doesn't mention any specific type of technology. Rather, it's a curated set of tools and services which are arranged together as an internal product.
[00:14:34] This leads to the second part of how to tackle the platform gap, and that is specifically at the platform level and the set of activities that the platform team should be doing as part of delivering their platform and making it easy to consume. This specific set of activities that we have seen drive real success within organizations can be grouped under the heading Platform as a Product. At its very simplest, this means taking everything that you consider for external products that you build and applying that same product mindset and same set of practices to your internal platform.
[00:15:16] To give you some concrete examples, it can look something like this. Firstly, the platform team needs to understand who their customers are. If we go back to the original organization model that I had, you've got Teams A, B, and C working on their applications, and those teams are the customers for the platform team. The platform team needs to treat those people like they're customers. There might be developers, product managers, designers in those application teams. Those people are the customers for the platform team.
[00:16:03] The first thing to understand is: what are your customer needs? To figure that out, we need to use collaboration. This could look like user interviews or a journey mapping exercise. It's really any mechanism that you can use to discover and understand the customer requirements. That then lets you figure out how you might go about meeting those requirements and delivering value.
[00:16:33] Once those requirements are defined, the platform team needs to start building the platform, that set of services that they've identified that their customers need. Again, the practices that the platform team should follow, those development practices, should really be similar to other product teams. They should be building in small batches, seeking user feedback early and often, checking that the requirements are still valid, and making sure that they're minimizing risk along the way of the build process. At this point, it would be very beneficial to have a product manager for the platform, someone who can outline a clear product vision and roadmap, prioritize the backlog in response to user needs, and support external stakeholders.
[00:17:22] Once the platform has a service that's ready to be used by a customer, we know that the next step forward is to move to the X-as-a-Service mode. This means providing a method by which the customer can access the service on demand without having to have any communication or intervention by the platform team. This could look like an API provided to the development team or a developer portal, some way to make the platform easy to consume and self-serve.
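To make the X-as-a-Service idea concrete, here is a minimal sketch (not from the talk) of what self-service could look like from the stream-aligned team's side. The `Platform` class, its `request` method, and the service names are all hypothetical; a real platform would expose this through an HTTP API or a developer portal rather than an in-memory object.

```python
from dataclasses import dataclass, field


@dataclass
class Platform:
    """Hypothetical in-memory stand-in for a self-service platform API."""

    catalog: set  # services the platform team has agreed to offer
    provisioned: dict = field(default_factory=dict)  # team name -> instance ids

    def request(self, team: str, service: str) -> str:
        """Provision a service on demand, with no ticket or human hand-off."""
        if service not in self.catalog:
            # Unmet needs go back into the ongoing, lightweight collaboration.
            raise ValueError(f"{service!r} is not offered by the platform yet")
        instances = self.provisioned.setdefault(team, [])
        instance_id = f"{team}-{service}-{len(instances) + 1}"
        instances.append(instance_id)
        return instance_id


platform = Platform(catalog={"database", "kubernetes-cluster"})
db_instance = platform.request("team-a", "database")
print(db_instance)
```

The design choice worth noticing is the boundary: the team asks for a named service and gets an instance back, while everything about how it is provisioned stays on the platform team's side of the interface.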
[00:17:53] As mentioned earlier, there needs to be ongoing lightweight collaboration, maybe a check-in on a regular basis to ensure that the platform is continuing to meet user needs. Because as needs change, the platform needs to develop and adapt.
[00:18:13] As well as this core set of processes that the platform team should be following, there are other steps a platform team can take to really drive value for their platform and deliver to the rest of the business. One part of the platform definition from Evan Bottcher stated that the platform should be a compelling internal product, meaning that the application teams should want to use it. It shouldn't be mandatory for them. They should be excited to use it. And the platform itself should be the easiest path for teams to get their applications to production.
[00:18:48] This could look like lots of different things. It could look like evangelizing the platform, maybe branding it, giving it a cool name, and marketing it internally such that teams get curious and excited to try it out. Maybe it looks like including inside of your platform the security and compliance requirements that are specific to your business. Build those into the platform so that teams know that if they're running their application on your platform, then it will meet the requirements of security and compliance, and it means there'll be fewer hurdles for them to worry about than if they were trying to run this themselves.
[00:19:34] It could look like your internal platform providing a specific bespoke tool as a service to your teams, an internal tool that they couldn't get from a cloud provider or an external vendor. That would give your platform a competitive advantage. Maybe you want to be focusing on developer experience, sunshine and rainbows for your developers: making sure that your platform is as easy to consume as possible and is delightful for developers to use.
[00:20:11] All of these steps combined could lead your internal platform to being easy to use, which would lead to it being adopted by multiple teams. You'd have reduced cognitive load on each team, which would then free up developers' time, and this platform would fit the exact requirements of your business. Through the lightweight collaboration model, the platform can continue to remain up-to-date, secure, and relevant. Your platform team, through this collaboration, can add tools and services as they're needed, remove services that are no longer being used, and standardize, so you end up with a platform that is fit for purpose and being used across your business.
[00:20:57] At this point, you might be asking, how do we know this to be true? Or what does good look like? Let's take a look at this year's State of DevOps Report. It states that highly evolved firms use a combination of stream-aligned teams and platform teams as the most effective way to manage cognitive load at scale. These highly evolved firms strive to create a compelling value proposition for application teams that is easier and more cost-effective than building their own solutions. An internal platform, in nearly all cases, isn't something you can buy outright. It's something that is built and tailored to the needs of your technology organization. Finally, not every platform team is automatically successful, but the successful ones treat their platform as a product.
[00:21:46] Aside from this excellent research, there are plenty of publicly available case studies where organizations that I've worked with in the past have talked about the successes they have had. In 2019, I spoke at SpringOne Platform Conference in Austin, and I shared just a few statistics of some customers we had worked with in the past who saw huge improvements in their flow of change by implementing some of these practices. We had one customer who was able to support 1,500 developers with just four operators. We had one customer who was able to, over the course of just 10 months, scale their platform to support 29 teams across four countries and also increase their deployment frequency by more than 3X. We had one customer who saw a 90% improvement in release velocity from around 30 days to two to three days to release. Finally, we saw one customer have an 89% reduction in security patching lead time from 45 days to five days.
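The two lead-time figures quoted above can be checked with a couple of lines of arithmetic. This sketch is not from the talk; it takes the upper end of "two to three days" (three days) for the release figure, which is an assumption.

```python
def percent_reduction(before_days: float, after_days: float) -> float:
    """Percentage by which a lead time shrank, relative to where it started."""
    return (before_days - after_days) / before_days * 100


# Release lead time: around 30 days down to about 3 days (assumed upper end).
release = percent_reduction(30, 3)

# Security patching lead time: 45 days down to 5 days.
patching = percent_reduction(45, 5)

print(round(release), round(patching))
```

Rounded to whole percentages, these come out at the 90% and 89% quoted in the talk.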
[00:22:47] It's clear that for these customers, the work of introducing a central platform team and implementing some of these practices significantly benefited them in several ways.
[00:22:58] To summarize how to tackle the platform gap, we need to organize teams for fast flow, specifically ensuring we have a dedicated platform team; begin the learning process through collaboration; and then drive towards the platform meeting those user needs as a set of services available on demand. We need to take everything we've learned from product management of external products and apply those same practices to the internal platform. We need to understand that driving the full value out of the platform means treating it as a product to be maintained. It's not built and finished. It needs to be maintained, improved, and developed over time to ensure it remains relevant and fit for purpose.
[00:23:42] So, what's next? At Syntasso, we're really keen to continue this conversation with other folks in the community, and we would love to collaborate with you. You can find us at our website. We would love to hear stories of similar pains that you've felt, and we'd love to answer questions. We've also been working on a tool called Kratix, which is a framework for enabling platforms. Basically, it enables a contract between platform and stream-aligned teams to be codified. It's very early in the project. We only released it in mid-September, and it's still in beta. If anyone would like to play with it, it's available on GitHub. We'd love to collaborate. Please take a look and let us know what you think.
[00:24:20] If you would like to provide feedback on the tool or on this topic, please feel free to email us at feedback@syntasso.io or message me on Twitter. Finally, I would like to say thank you very much for listening, and I hope you have a wonderful rest of the DevOps Enterprise Summit conference. Thank you.