Accelerating Towards Serverless-First with the Value Flywheel Effect

Amsterdam 2023

Many organizations have migrated to the cloud, but the modernization journey is the next challenge. Orgs must combine business goals with technology goals to minimise time to value and accelerate with the cloud. The Value Flywheel Effect, recently published by IT Revolution, is a playbook on how to make that journey.

https://itrevolution.com/product/the-value-flywheel-effect/

Using experience from case studies in the book such as Liberty Mutual, BBC, A Cloud Guru and more, Dave will discuss the four phases of the Value Flywheel (finding Clarity of Purpose, creating a safe environment for challenge, Serverless First & Next Best Action and Long Term Value with the Well-Architected Framework) and how customers can use the Wardley Mapping technique to create an effective modern cloud strategy.

There will also be some real life stories around serverless myths and pragmatic ways to ease the modernization journey.

Full transcript

The complete talk, organized by section.

David Anderson

Hi folks. My name is Dave Anderson, and today I want to talk to you about accelerating towards serverless-first with the Value Flywheel Effect. Thanks for coming to the talk. I really appreciate it. I'm an architect at GP, Globalization Partners, so I'm actually a practicing architect. I practice what I preach.

So really, serverless-first. I want to preempt this by saying there is a lot of interest around cloud migration, but for me the big incoming wave of disruption is actually modernization. That's the thing I see on the horizon for many organizations. We've moved to the cloud, we've got the cloud socks, we've done all that. How do I modernize my stack now that we're actually off-prem? For me, the answer is the serverless mindset, and I'll get onto that as we go.

What's my experience in this? I worked with Liberty Mutual for many years, and when I was a director of technology there, like a CTO role, we started the journey in 2013. I was lucky enough to be at the table when we started speaking with AWS about moving a lot of our workloads. There were a couple of fundamental principles that we put in place. We used AWS as a partner, not just a cloud vendor, a partner.

We had this idea of serverless-first and code as liability. Part of my team drove this messaging: how can we get our engineers to write less code, think more about the capability, and build it as cheaply and efficiently as we can? How do we build in really good quality controls, and how do we give teams the autonomy to move fast?

I call this the second transformation, the modernization piece. I wrote some of this up as a case story before I moved on from Liberty, and I captured a lot of this thinking. I had a model that I used called the Value Flywheel, with this idea of how we join business and technology. I spoke to the president of the business. He was saying, I don't want to hear the word Lambda or AWS. I don't care about any of that. Just how can we go fast, go safely, and at a decent cost? That was it. So we coupled this with the idea of how we have a nice flywheel effect, how we join these things together.

I'll talk through these. The first central idea is Wardley Mapping. In my team, myself and the architects used Wardley Mapping as a way to create perspective. How could we assess and make sense of the landscape?

Then there is this idea of clarity of purpose. A lot of teams were building, but they didn't know why. Leaders would say we need to have certain KPIs, we're working on north stars, and the teams were completely unaware. So how do we create a common clarity of purpose?

Then the idea of challenge: how do you create an environment within the teams where people are happy to say, why are we building this? Is this really the metric that matters? Is there a better way to do this? Create that environment of challenge.

Serverless-first: once you get clarity of where you're going, how can we help developers build really quickly and add quality? Then finding long-term value: how do we ensure we're not just making quick builds that we throw away, but building sustainable practice? This is coined as the Value Flywheel Effect.

We've captured this in the book, The Value Flywheel Effect, which came out from IT Revolution just before Christmas. I wrote it with my two co-authors, Mark McCann and Michael O'Reilly, and the three of us are architects at GP practicing that same model. The book is really about how you accelerate your organization towards the modern cloud. Today I want to share some key learnings about those five phases and give you some takeaways.

First I want to hit something that's been popular, the recent Twitter storm around serverless-first and Prime Video. It seems like every six weeks there is a Twitter storm where someone declares a technology dead or alive. The thing about serverless-first is that it's not serverless-only. People think it's Lambda or nothing. I'd be the first to say that if your technology strategy is "use Lambda," then you're not doing it right. You are not doing it right. I thought this was funny: only Sith deal in absolutes. Ask any architect over the past couple of decades, should I use technology X? The answer is always, it depends. Where did we lose that perspective?

It depends what problem you're solving. If you're running on-demand video with millions of subscribers, Lambda is probably not a good option. That was a very well-written article by the Prime Video team. If you really want to hear about that serverless-first assessment of that article, I would go and read Adrian Cockcroft's article on Medium, where he breaks it down.

What serverless-first is, is using the cloud provider's managed services to build quickly at quality. Then you're in that evolutionary architecture state where you can make choices as you move forward. There's a nice principle: don't build yourself a one-way door. Always make technology choices that you can pivot away from. That idea of evolution is really important. It challenges me when architects say this is the only way we'll do it and we're going to back ourselves into a corner. That's not a responsible way to behave for your business. Think of serverless-first as a starting point, not the only place you need to get to.

I want to hit each of these five areas. First, Wardley Mapping. For me the question is: how did a couple of engineers sitting in Belfast influence a Fortune 100 company? The answer was Wardley Mapping. We were able to sit and assess the landscape and decide what were the best moves to make. With any transformation or big technology drive, there are a hundred things you could do. What's the thing that you should do to really make a difference?

If you haven't heard of Wardley Maps, I would definitely look them up. There is a bunch of stuff in the book about them, but I want to give an illustration. What's the role of the architect in this? This is a really scrappy map that myself and a few of the team drew probably eight years ago when we were trying to assess the landscape. Maps are ugly. They're not correct. They're just something that you throw up. It helps a team align on a way of thinking.

For me, what the architect often did here was facilitate. The architect would maybe say, what happens if this technology evolves? Or what happens if there is an inertia point here where we can't evolve that? Ask leading questions and find out what's the really important thing to do. You'll spot weak areas: we have a key area of our business in custom build held together with sticky tape. Let's address that. Let's focus on the right areas and align on the right thing to do. A nice thing about mapping, which is often hidden, is alignment and collective effort. You're deciding what's the best effort to go forward.

A really nice way to illustrate this is this map here, which sits very well with Patrick's talk this morning about ChatGPT. The way you read a Wardley Map is you usually start at the bottom and work your way up. This technique is by Simon Wardley, the researcher. Visibility is on the y-axis: things at the top are more visible, things at the bottom are less visible. There is an evolutionary axis from left to right: Genesis, something brand new, never been seen before; custom, we know how to build it but we're not quite sure; product, this thing is now in demand; and commodity, this thing just exists.

ChatGPT exploded on social media around the start of February. Like many of us, I'm probably the same: you're on the edge of AI, you deal with it, you understand it, but you're not a specialist. Six weeks later, Mark Craddock put this map out on Twitter, and within a minute I was able to say: I can see through the ChatGPT Twitter storm and I can see what's important.

The bottom right has things that are invisible and commoditized: OpenAI, LLMs, API, and cloud. There is a value chain that ends in tools. This tells me that the toolsets are mostly commoditized. These things have been inside companies like Facebook and Google for many years. We are now just accessing them. So building a better tool for ChatGPT is not a good avenue to go down.

There is a more interesting value chain up the middle: vector DB, text splitter, embedding vector, prompt templates, QA prompt techniques. Prompt engineer, what is that? That's interesting. I can now see that the focus I need as a consumer of ChatGPT is prompt techniques and this prompt engineering field starting to come about. Within a minute of seeing this on Twitter, I could see through all the activity around ChatGPT and think, right, that's where I need to focus. Even some of these pipelines around embedding vector DB, Pinecone, these are things that will evolve. Someone else will look after that. I can focus my time high up the value chain and really leverage ChatGPT.
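As a rough illustration of how a map like this can be read, the sketch below models components by visibility and evolution stage. It is not taken from the talk or from Mark Craddock's map: the class names, helper functions, and the numeric placements are assumptions chosen only to mirror the description above.

```python
from dataclasses import dataclass
from enum import IntEnum


class Evolution(IntEnum):
    """The four stages on the x-axis, ordered left to right."""
    GENESIS = 1
    CUSTOM = 2
    PRODUCT = 3
    COMMODITY = 4


@dataclass(frozen=True)
class Component:
    name: str
    visibility: float     # y-axis: 1.0 = most visible, 0.0 = least visible
    evolution: Evolution  # x-axis stage


def read_bottom_up(components):
    """Order components the way a map is usually read: bottom first."""
    return sorted(components, key=lambda c: c.visibility)


def commoditized(components):
    """Names of components at the far right of the evolution axis."""
    return [c.name for c in components if c.evolution == Evolution.COMMODITY]


# Hypothetical placements, loosely following the description of the ChatGPT map.
example_map = [
    Component("prompt techniques", 0.9, Evolution.GENESIS),
    Component("vector DB", 0.5, Evolution.CUSTOM),
    Component("LLM", 0.2, Evolution.COMMODITY),
    Component("cloud", 0.1, Evolution.COMMODITY),
]

print("Commoditized:", commoditized(example_map))
print("Read order:", [c.name for c in read_bottom_up(example_map)])
```

Filtering on the commodity stage is one simple way to separate what someone else will look after from where a team might choose to focus.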

That's the power of something like Wardley Mapping. You can sense-make these very complex technical areas. I would follow Mark on Twitter. He's got a bunch of brilliant maps and great prototypes. For me, that's the power of mapping: you can straight away make sense of a very complicated environment.

The next phase is clarity of purpose. One of the things that I did quite a lot, and Mark McCann, one of the co-authors, does this quite a lot, is use the North Star Framework from Amplitude to ask teams, what is their north star? What is their metric that matters? You would ask, what's the north star metric for this team? Say it's number of deploys. That's good, but it's not the north star of us as an organization. If you're Spotify it could be minutes of music listened to. There is a north star metric, mid- to long-term impacts, input metrics such as breadth and frequency, and then the work to drive this. It's a little like impact mapping.

What you're starting to do is find out who are the users, what are the needs, what's the scope. You're asking questions about the work that we're doing and getting a good picture. This is a great way of distilling down a piece of work and having something nice and clear. You may be thinking, should the product manager do that? That's not my job as an engineer. Again, that's the wrong idea. We are the business. You want to be sitting down as a team, product, engineering, legal, whoever else, and figure out what we're all working towards and agree on a north star metric.

It's not to say the other metrics aren't important. You still want to think about uptime, time to market, et cetera. But as engineers we often obsess over the things on the right. They outweigh the thing on the left, which is the business goal. For most of our leaders, the stuff on the right is table stakes. The stuff on the left is the thing we really want to talk about. It's about finding that balance.

A brilliant example was a sketch from Simon Wardley. He met the A Cloud Guru team, the cloud training company that was acquired by Pluralsight, at an event in 2017. They were doing a lot of serverless events and said, Simon, this mapping thing seems interesting. Can you do one? Literally in the green room at a conference like this, he scribbled out this map and basically mapped out their entire company strategy. They realized they were focusing too much on the community, on conferences and other things, and needed to focus on their learners and engineers. They effectively pivoted their model and created focus. I spoke to a few leaders in A Cloud Guru, and they have since been acquired for I think $2 billion. A simple napkin sketch helped them say there are some things evolving, but you need to focus on your user need, which is your developer who's looking to learn. There is a proper case story in the book about how they did that.

The next one is challenge, and this is really important. Remote work has completely changed the environment as we work. The idea that a team is a bunch of people who sit in that bay has been blown apart by remote working. The things that bind us together in the physical location may not exist in the remote location, so we need to think differently about what that looks like.

I always think the system is more than the code. There is that socio-technical aspect: how your engineers work with each other, how you work with other teams, whether they understand where they are in the technology system as well as the organizational system. There are a whole bunch of things that were maybe implicit when you were physical, but not quite explicit when you're remote.

If you haven't already, I would certainly look at Team Topologies as a way of thinking about teams and technology. Don't assume everyone understands what this is. Take the time to explain Team Topologies and these systems to other peers in your organization, because getting that right is absolutely priceless.

That creates a nice environment where you have enabling constraints, where you're thinking about what the absolute must-dos are. If you've got a team API, it must be correct. Maybe there are certain ways that teams go to production. Maybe there is a build-it-own-it-run-it constraint, whatever those constraints are to keep your teams operating. I don't know how many teams I talk to that say, yeah, we're a stream team and we're building a platform and we're helping other teams. You're like, ah, okay. Putting enabling constraints in your organization will help your teams move quickly.

A great example is from a startup called Workgrid, where the CTO, Gillian McCann, had probably the best enabling constraint. The idea of inviting challenge in the organization as a CTO was: our technical strategy is really simple. If you get an idea, let's talk about it as a team. Any idea has value, but we all know what our enabling constraints are from a technology perspective. Let's just evolve quickly. No big process. Let's fire that up. The best way to introduce challenge is to give people ownership and empowerment.

From a serverless-first perspective, the next best action is such an important concept. There was a nice slide in the previous talk about the AWS Shared Responsibility Model. You often see companies that migrate to AWS infrastructure, maybe with separate availability zones or edge locations for enhanced performance. That's a powerful move and there's loads of value in it, but I would describe that as basic cloud migration. Now you're treating AWS as a data center. In fairness, you probably have a really nice data center run by professional people, but you still effectively have a data center elsewhere.

To really leverage the power of the modern cloud, you need to lift that value line. What can your cloud provider do around compute, storage, database encryption, platform management? This is the same not just for AWS, but also Google, Azure, whoever. Can your cloud provider be your platform team? Then you focus on what you want to build on top of that. Helping teams move up that line is powerful. They really move at speed.

One of the things about serverless-first that is not understood too much is that it's effectively an operational construct. When you say serverless, people think of Lambda. There is a spectrum. On the left, more operations: virtual machines, MySQL on-prem, storage on-prem, EBS, Hadoop on-prem. On the right, things completely abstracted away behind a service: Lambda, DynamoDB, S3, Step Functions, EventBridge. These are basically an API call away. What I would say to teams is start on the right. Don't think you need to start on the left and slowly evolve to the right. Start on the right. If you're using Lambda and it's not fast enough, or your throughput is quite high, fall back to Fargate or managed containers. It's okay to fall back, but start on the right. You see people installing massive complex solutions and locking in high operational overhead. Think of serverless-first as an operational construct, not just as a single service.
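The "start on the right, fall back when needed" idea can be sketched as a small decision function. This is an illustration under stated assumptions, not a method from the talk: the `Workload` fields and the throughput ceiling are hypothetical, and only the 900-second figure reflects AWS Lambda's documented maximum timeout.

```python
from dataclasses import dataclass

# AWS Lambda's documented maximum function timeout (15 minutes).
LAMBDA_MAX_TIMEOUT_SECONDS = 900

# Illustrative assumption only: a team-chosen point at which sustained
# throughput makes containers the more sensible option. Not an AWS limit.
ASSUMED_LAMBDA_THROUGHPUT_CEILING = 5_000


@dataclass(frozen=True)
class Workload:
    name: str
    max_task_seconds: float               # longest single unit of work
    sustained_requests_per_second: float  # steady-state load


def choose_compute(workload: Workload) -> str:
    """Start with the most managed option; fall back only when it does not fit."""
    fits_lambda = (
        workload.max_task_seconds <= LAMBDA_MAX_TIMEOUT_SECONDS
        and workload.sustained_requests_per_second <= ASSUMED_LAMBDA_THROUGHPUT_CEILING
    )
    return "Lambda" if fits_lambda else "Fargate"


print(choose_compute(Workload("thumbnailer", 3, 50)))
print(choose_compute(Workload("video-encode", 3600, 1)))
```

The point of the ordering is that the fallback is an explicit, reasoned step rather than the default starting position.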

Since the book came out, I've been speaking to lots of companies and leaders about these concepts. One question I keep getting asked is: what's the impact if we go serverless-first? What's the impact on our teams and engineers? What are the things that are different from traditional work?

There is definitely something around your engineers needing to be really close to cloud principles, not just the ability to deploy something to cloud. They need to understand true cloud-native principles, and leaders need to understand those as well. System design and event-driven architecture become really important, so people can think in systems and not just low-level components. There is a commercial piece: FinOps, cost and value, build cost, run cost, and value delivered. These are now part of what engineers understand.

There is also something around learning, teaming, inter-teaming, and the interpersonal skills of your engineers, plus growth mindset. The Modern Software Engineering book by Dave Farley is a brilliant book that describes these concepts and what this new norm looks like. There are solid principles like build it, run it, own it, rapid feedback loops, a lot of things we've been talking about in the DevOps movement.

To illustrate that, I put together a quick map. If you're a VP of engineering or technology leader, your main need is probably a high-performing team. The first dependency is technical skills: cloud expertise and system design. People often stop there: if I hire a bunch of really smart technical people, I have a high-performing team. But there is more to it. The next dependency is interpersonal skills. You want people who can work well within the team and interface well with other teams. It's a bad sign if a team is fighting with another team. That shows a lack of maturity in engineers.

You want a growth mindset: people not afraid to tackle new technology and new problems. After interpersonal, you've got commercial thinking. You want people value-aligned to what we're trying to do as an organization. You want people aware of run cost and build cost. You don't want people refactoring something to save ten bucks when it costs ten grand to do the refactoring. There is something about empirical evidence. You don't want people doing work based on gut feel or "I think this will be good." You want to be data-driven.
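The refactoring example comes down to simple payback arithmetic. A minimal sketch follows, assuming the "ten bucks" is a monthly run-cost saving (the talk does not say over what period) and using a hypothetical helper name.

```python
def payback_months(refactor_cost: float, monthly_saving: float) -> float:
    """Months of savings needed before a refactor pays for itself."""
    if monthly_saving <= 0:
        return float("inf")  # the refactor never pays back on cost alone
    return refactor_cost / monthly_saving


# Figures from the talk; treating the saving as monthly is an assumption.
months = payback_months(refactor_cost=10_000, monthly_saving=10)
print(f"Payback: {months:.0f} months, about {months / 12:.0f} years")
```

Under that assumption the refactor takes 1,000 months, roughly 83 years, to pay for itself, which is the kind of evidence that can replace gut feel in the conversation.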

Then from commercial thinking you've got modern software engineering: things like build it, run it, evolutionary architectures. For me, the two things I'd look to evolve to the right are modern software engineering principles and commercial thinking. A map like that helps me and my peers talk about where we need to focus as we grow the organization.

There was a nice talk by Werner Vogels, the CTO of Amazon, at re:Invent last year on event-driven architecture and the world being event-driven. When you get to this idea of modern cloud, this modern software engineering, everything becomes event-driven. That's the path we go on. It's more async than synchronous.

What does that mean for the role of an architect or technology leader? I'm a practicing architect at GP, Globalization Partners. What I've found is architects are more of an enablement function. They're not sitting in a team drawing UML diagrams or trying to code things for the team. What you're really doing is facilitating thinking and making sure the teams are moving on.

We see lots of event storming, sitting down and trying to figure out what this landscape looks like, explore, and find the things we don't know. Alberto Brandolini has some great stuff on event storming. If you're older, you'll realize it's just fancy BPM, but we'll call it event storming for now. It's a brilliant facilitated collaborative technique. The North Star exercise is something we sit and do with groups of people, with cross-functional teams: what are we trying to achieve? Let's map this out, usually on Miro or something, and figure out the metric that matters.

When you start into event-driven concepts, you use something like EventBridge as your event backbone. You get into principles of eventing design, bounded contexts, domains, a lot of domain-driven design, and principles around building a large distributed system built to scale. Then example mapping: use behavior-driven development and example mapping to make sure teams understand the right behaviors they're trying to build. That's a very different picture of an architect from 10 to 15 years ago. The technical complexity is still there, but you're trying to offload some of it, or help teams get through it themselves and help them think bigger.
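To make the event backbone idea concrete, here is a minimal sketch of publishing a domain event to EventBridge with boto3's `put_events` call. The bus name, event source, detail type, and payload are hypothetical and not from the talk; `publish` needs boto3 installed and AWS credentials, so the example below only builds and prints the entry.

```python
import json


def build_event_entry(source, detail_type, detail, bus_name="example-bus"):
    """Build one entry in the shape EventBridge's PutEvents API expects."""
    return {
        "Source": source,              # which bounded context emitted the event
        "DetailType": detail_type,     # the event name consumers match on
        "Detail": json.dumps(detail),  # the payload must be a JSON string
        "EventBusName": bus_name,      # hypothetical bus name
    }


def publish(entries):
    """Send entries to EventBridge and return how many of them failed."""
    import boto3  # imported lazily so the builder works without AWS access

    client = boto3.client("events")
    response = client.put_events(Entries=entries)
    return response["FailedEntryCount"]


# Hypothetical domain event, named in the past tense as in event storming.
entry = build_event_entry("claims.service", "ClaimSubmitted", {"claimId": "c-1"})
print(entry)
```

Keeping the entry builder separate from the AWS call is a design choice for this sketch: the event shape can be agreed and tested by teams before any infrastructure exists.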

For me, what you're trying to do in this serverless-first organization is ensure engineers have high agency with low barriers. The system you're building, the platform, has low barriers so they can move fast, and the teams are empowered to move quickly. You're trying to remove friction.

For the last phase, long-term value, I call it a problem prevention mindset. How can we remove problems before they happen? The hero engineer who went in, sat on support all weekend, and fixed something, that's brilliant. But the question is, how could we have stopped that before it brought down production all weekend?

There are two areas I want to talk about: SCORP and engineering excellence. SCORP is an acronym we use for Well-Architected. What combines the two is the Four Disciplines of Execution, or 4DX. It's a model to help teams empower themselves to create their own direction. I find this to be brilliant. There are four stages.

First, the discipline of focus: work with the team to make sure they have wildly important goals. Ask the team, what do you need to fix? What's important for you?

Second, the discipline of leverage: work on lead measures, not lag measures. Help the team identify lead measures, consistent with the North Star Framework. What are the lead measures we need to focus on?

Third, the discipline of engagement: use a compelling scoreboard or scorecard. That's a way for the team to visualize something. When we first started, we had fancy scorecards and scoreboards, but we've simplified it. It's effectively a Confluence page with tables. The simpler the scoreboard, the better, because that empowers people to change it. It's not fancy graphs.

Fourth, the discipline of accountability: a cadence of accountability. At the end of every sprint, every two or four weeks, we'll check in and review these stats. The important thing with these metrics is we always focus on the trend. We don't compare teams. There is no such thing as one team being better than another team. You look at a team and their trend over the past couple of months and see if they're trending in the right direction.
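One way to focus on the trend rather than on comparisons between teams is to fit a line through a single team's check-ins and look only at its direction. This is a sketch under assumptions: the function names and the sample recovery times are hypothetical, and a least-squares slope is just one reasonable choice of trend measure.

```python
def trend_slope(values):
    """Least-squares slope of a metric over equally spaced check-ins."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two check-ins to see a trend")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    denominator = sum((x - mean_x) ** 2 for x in range(n))
    return numerator / denominator


def direction(values, higher_is_better=True):
    """Describe a team's own trend without comparing it to other teams."""
    slope = trend_slope(values)
    if slope == 0:
        return "flat"
    improving = slope > 0 if higher_is_better else slope < 0
    return "improving" if improving else "declining"


# Hypothetical recovery times in minutes across three sprints (lower is better).
print(direction([60, 45, 30], higher_is_better=False))
```

The `higher_is_better` flag matters because some measures, such as deployment frequency, should rise while others, such as recovery time, should fall.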

For SCORP and Well-Architected, we use the AWS Well-Architected Framework, which has six pillars: operational excellence, security, reliability, performance, cost, and sustainability. Each pillar usually has about 12 questions, and each question has best practices. We might focus on reliability and dive into what the team is doing around DR, resiliency, chaos testing, outages, recovery times. This is a framework where we can define what we mean by good architecture. In the past I've heard people say, architecture to me is... no, architecture is a thing. We have strictly defined that these six things are what we mean by good architecture.

For the SCORP process, the single dashboard contains tables. At the top there are business metrics specific to that value stream. SCORP is the acronym for the five key pillars. We haven't added sustainability yet; we're getting there. It's security: threats, mitigations, threat models, vulnerabilities open. Cost: cost per component, per cloud service, where the big costs are, what you've improved. Operational excellence: usually DORA metrics, deployment frequency, change, MTTR, etc. Reliability: what incidents happened and what recovery was like. Performance: errors, ingestion stats, processing times. Finally, if we do a deep dive, a Well-Architected review, are there big findings?
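A scoreboard along these lines can be as small as one table row per pillar. The sketch below is illustrative only: the function names, the layout, and the sample measure are assumptions, not the actual dashboard described in the talk.

```python
SCORP_PILLARS = [
    "Security",
    "Cost",
    "Operational excellence",
    "Reliability",
    "Performance",
]


def acronym(pillars):
    """First letter of each pillar, in order."""
    return "".join(p[0] for p in pillars)


def render_scoreboard(measures):
    """Render one plain-text row per pillar from a {pillar: {name: value}} dict."""
    width = max(len(p) for p in SCORP_PILLARS)
    lines = []
    for pillar in SCORP_PILLARS:
        entries = measures.get(pillar, {})
        cells = ", ".join(f"{k}: {v}" for k, v in entries.items())
        lines.append(f"{pillar.ljust(width)} | {cells or '(not yet tracked)'}")
    return "\n".join(lines)


# Hypothetical figures for a single value stream.
board = render_scoreboard({"Cost": {"monthly spend (USD)": 1200}})
print(acronym(SCORP_PILLARS))
print(board)
```

Showing untracked pillars explicitly keeps the table simple enough for a team to edit by hand while making the gaps visible.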

Having that in a simple wiki page where we ask the team to drive it lets the team decide: we want to focus on performance because that's where we think we need to focus. Let the team drive it. I think of SCORP as, once you educate the teams in what this is, a bit like eating fresh fruit and vegetables, your five a day. You know what works, you know what's good and not good. You let the team decide and drive the conversation.

The second thing is engineering excellence. People ask, what's good software engineering? It depends. One thing I was able to do, covered in the book, is take Dan Pink's Drive, about motivation and the idea of mastery, autonomy, and sense of purpose, and tweak it slightly. Mastery is technical mastery: how good are the skills within the team? Autonomy: is the team functioning well so it can move quickly with minimal friction? Customer obsession: are they very clear on those north star things? Usually a high-performing team will focus on customer obsession because the other two are nailed. A brand-new team might start with technical mastery: maybe they need to lift their cloud skills, maybe do serverless learning.

Ideally, you'll have many teams at very different stages. This is what I would use for our concept of engineering excellence. We dashboard some of these things and celebrate the improvements the team has made using 4DX.

For long-term value, you have organization-wide definitions of good architecture. We mean SCORP, and that's what that is. Engineering excellence has a very specific idea of what these things are.

A nice example in the book is the BBC. People often say serverless is fine, but it doesn't run at scale. There is a brilliant case study in the book around the BBC News team completely deploying that solution using serverless technology. They have eye-watering requests, like 2.3 billion requests served in the month, 3.3 billion serverless functions invoked, and 1,700 releases during a month. Anybody familiar with BBC knows that when there is a spike in news, that thing is absolutely hammered. The team said performance is beyond anything they've ever had before. These techniques do work, but you need to put these things together.

To summarize, I tried to put together five practices. Create an atomic habit around Wardley Mapping: learn the shape of Wardley Mapping and how you can use it within your team to sense-make. Be very clear on your north star, and use that technique or something similar to align business and technology goals. Look at the socio-technical picture and start with Team Topologies; don't be afraid to explain Team Topologies to your peers, and don't assume it is understood. Lean into serverless-first as a concept and mindset, not blindly using Lambda, but the broader approach I call modern cloud. And create a heartbeat within your organization for engineering quality, engineering excellence, and good architecture practice.

Those are some of the key messages in this work and model of the Value Flywheel. The idea of the flywheel is that once this starts to turn, you can keep that turning. It's a continuing cycle of improvement.

People often say serverless must be easier because there are fewer things you have to do and it's higher up the value chain. There may be fewer things you need to do from a technology perspective, such as fine-tuning hosts or virtual machines, but there is a wider scope of responsibility. I would say it's not easier, but it's better. Gillian McCann has famously been saying this for years: it's not easier, but it's better. When you create that high-performing team, you give them ownership, empowerment, and responsibility, but you also need to give them a platform where they can move very fast, with quality.

That's pretty much me. The book has been out since Christmas. We have a blog called The Serverless Edge, where myself and two co-authors create content. We also have a podcast, Serverless Craic, which is a bit lighthearted.

The help I would need from this event is: as we move on to modern cloud and serverless-first, I'm really interested to hear about that modernization journey. Where are you in that modernization journey? What are the obstacles, the inertia points you're starting to deal with? There is a broader discussion to be had around how, as a community, we overcome these. The DevOps mindset and approach is absolutely perfect for where we're going, but we're starting to move up the value chain and there are different techniques we need to apply. I'm interested to hear your opportunities, challenges, and learnings from that modernization journey. Thank you.