Transform the Invisible Wall
The problem statement: at least 8 teams across departments are involved in delivering a new feature, which takes 4-6 weeks on average. This becomes an issue once the senior management team has decided to move to a cloud service. Should they lift and shift, or should they adopt DevOps to revolutionise the way of working? Lift and shift seems to be the default option if the team has no clear idea; if you want DevOps, you have to know what your goal is and fight hard. In our practice, senior management has played an important role in getting people to agree.
Top 5 priorities to be addressed along the DevOps journey:
1. Different goals;
2. Ownership: Access and Permissions;
3. Job security;
4. Organisation structure;
5. Compliance requirement.
As well as how to address these priorities: this is a long process and it can go back and forth; people need to find their feet in the new world and they want to be valued.
Finally, I am going to share the methodologies to measure the results for the project as well as the team, to ensure it’s a sustainable process. The tool set includes: quantifying the value of automation; a DevOps skill matrix.
Full transcript
The complete talk — auto-generated from the talk's captions.
Good afternoon, everyone. Can you hear me okay in the back? Yeah? Cool.
Awesome. So I'm Mei, a consultant from ThoughtWorks. Most of you might have heard of ThoughtWorks. So we're primarily working with our clients to go through their agile journey to make them able to do continuous delivery.
I'm actually coming from ThoughtWorks Australia Sydney office. So I think in just two days, we have heard a lot of good talks about unicorns, about horses, but today I'm going to talk about a story about the koalas. So we have quite a giant koala, so we'll see. Press your forward button.
All right, cool. This right side here. Yeah. This one?
Yeah. Awesome. Can you see that okay? Yeah.
So just before we talk about DevOps, I actually want to throw two questions to you. So one of the questions is, as we all read in the books, so for example, "The Phoenix Project" from Gene, why do people have different takeaways from that? Another question is, why do people have different reactions when they hear DevOps? The same word, why do people react differently?
So I think, as the saying goes, our brain is really acting like a program, so it has input and it has output. So what is deciding the output? What is impacting the behavior of people? That's our mindset, that's the perspective.
So this is also a disclaimer from myself as well. So what I'm going to share with you today is really my perspective regarding to my understanding of DevOps, my journey regarding to what was the lesson learned, what's the accomplishment. So I hope you will see that. All right.
So this is a very simple agenda for that. So first, I'm going to give you an introduction to the context of the client I'm working with. So what the organization looks like, just to confirm for you whether it's a giant koala. Another one is really one of the big challenges we ran into when we started working; the first question is really, what is DevOps?
What is DevOps? So then another reason is when we talk about that to say, why do we need to adopt DevOps? So these are the two biggest question when we start to work with our clients regarding to say we need to go this way, but why? What is DevOps and why?
Then next step, I'm really going to go through the detailed journey from start to the end and really about the future. I would like to make it right at the beginning. So I think in the beginning, I can give you an abstract regarding to what the story look like. We actually have the story.
So we started to work on this DevOps initiative for a very big finance organization, starting from a year and a half ago. So when we started, nobody in the organization knew what DevOps is or what it looks like. And most people were saying, "This won't work here because we are finance, and the regulators won't allow us to do this." And a year and a half later, just a week before I came here, I was participating in one of the big company-wide workshops, which was really looking into how to employ DevOps in company-wide initiatives. So that's it, pretty much.
Yeah. All right. So this is what the koala look like. So it's a top 20 ASX listed finance company.
You can ping me afterwards if you can guess what the company name is. And it has 15K employees, and it has very good agile and lean culture. What does that mean is actually if you talk to any business of this organization, they can easily tell, understand the user stories and know how to split in granularity that everybody can easily deliver those kind of things. So business, they have story boards, they have the backlog, so they know how to do that stuff.
So this is an organization with very good culture. And technology-wise, because it's giant, historically, they have had several acquisitions. So that means they have several strategic legacy systems they need to deal with. So for this one, we actually started from a system, actually, they have 41 websites.
It was actually managing all the line of business including personal insurance, life insurance, banking, commercial insurance, all those kind of stuff. So that's where we get started. All right. So look at these beautiful pictures.
Hopefully, you can figure out where they are come from. I think the, yeah, the first one is about, it's actually a spider house in Philippines. And the right top one is a tree house in New Zealand. I believe most of you can tell what the bottom one is.
I searched this from Google because at that time, this is first time I'm coming here, so this is a picture from San Francisco. So what can you tell from this? So these house look different. They're built in different material, and they are in different countries.
They're in different environments. So you have sea house, you have tree house, and you have the skyscrapers. So how is this related to DevOps? So this is actually from maps.
I really think there's a great similarity between the houses and the current DevOps because when we ask the question, what is DevOps? When I asked, some people said DevOps is about collaboration between dev and ops. Some people talk about it as automation. I think there's truth in all of those answers.
But what does this mean for an organization that's new to DevOps? So I'm going to introduce you to my definition of DevOps. So this is my house of DevOps. So if you look into this, I can actually structure this into three parts.
So the first top one is the business value. So any time when you want to adopt DevOps, there's reason for that. But you just cannot have DevOps as a goal because DevOps cannot be a goal and it is not a goal. So it's really approach to enable certain business value.
And at the bottom is the core, which acts as the base, and the base includes the environment. So the environment includes the country, the regulations, the media. Actually, you might wonder why I put media there. We'll talk more later.
And then above that is about people, because we know people is really, when we talk about DevOps, talk about technologies, practice, I think that's not own... For me, they are all less important than people. And then above that is organization. Organization is every organization is different.
So that means we need to treat that different. So that's why I have those kind of thing as a base. And build upon that, I actually have three pillars. So one of the pillars is about principle.
Principle pillar is talking about what's value, what's your belief of the organization? You have to because these are the guidelines regarding to when people have confusions, have concerns regarding to, oh, shall I go this way or that way? These are principles. These are the things that give you the direction.
And then in the middle is about the team. How do we construct team? So then this last one is about practice. I pose that last one because mostly once you have the principle, you have the team, so the team will decide what kind of practice they're going to adopt.
All right. Regarding to why adopt DevOps, I think there's so many people are talking about the reasons. So for me, that's regarding to the first organization we're working together with, it's mainly regard when we get started, it's not very clear. It's not very clear because it was triggered by application team in the first state.
Oh, that's all the things that the cool guys do. We shall do this as well, but why are we doing this? So, again, to talk more about this in the later slide, but I think for this one, the main takeaway from me is really about stay relevant. So because the world is fast changing, so how do you stay relevant to your customers?
How do you stay relevant to the industry? How do you stay relevant to your employees? Because if your employees really, if you want to retain, hire some cool kids, so that's really important for you to stay relevant to this. All right, so this is a simple steps regarding to build the DevOps, your house of DevOps.
You can have your own house of DevOps. You can define your own practice, you can define your own team structures, and you can define your own principles. And certainly, the number one thing you have to do is to identify the goal. So why do you want to adopt that way?
Yeah. So then that's quite easy: really understand the base, then develop the pillars. The last one, you have to remember this: this is not a one-off thing. That means you just cannot do this one time and say, "I'm done.
I'm done." So you have to have iteration, keep iterating. All right. So now I'm going to share the detailed story about the journey when we started. So this is about discover the business value about why we're doing this.
This is quite interesting because this comes from when the organization actually made a commitment a year and a half ago. So they made a commitment to have a partnership with AWS. Then they come and say, "Oh yeah, we want to have our first batch of sites going to AWS in three months." And then they made another big commitment to say, "Within 18 months, we want to migrate all our applications into the cloud." So this is the context when it happens. When it happens, certainly this kind of project is usually driven by the infrastructure guys.
So the infrastructure team is really just gets application team into the meeting room to say, "Yeah, this is easy. Give it." And say, "So really this is a cost reduction initiative. Let's do this way. Let's lift and shift." All right.
So what does the application team say? Application team say, "All right. We already identify quite a few pain points of the current process, current working environment. We want to have all those kind of shining new architecture in AWS.
Can we do it in different way? Can we go to DevOps?" Certainly, we have certain kind of... Initially, I think I'm part of the delivery team. I really just, I meet with them and say, "Can we do this?" Actually it's no.
It's a fail. It's failed try because we found that, oh no, we just cannot resolve this at the same level. The way we tried is we have to figure out another workaround, how to approach this, how to get infrastructure team to agree to buy in. So the approach we are taking is really we started to talking with executives.
So we work out some business case with some amazing numbers and then kind of presentations just really shows them what's the key benefits we're addressing. One of the key one is really about reduce the time to market, improved resilience and improved quality. So luckily enough, I think that the executive is very supportive. So we go, because at that time, the delivery team and infrastructure team are actually in two departments.
So actually we need to go through two executives to get that in. But luckily enough, we just got the two executives to reach agreement. And then, in that case, it's not a top-down approach. Then finally the team agreed, yeah, we'll go this DevOps way, and this is the business value we're going to try to achieve. So that's the first step regarding this.
I think overall, if we look back, this is really a process: initially, I think the team thinks they know the business value, but then we discover, when we figure out the conflict between the two teams, that it's actually not known. Then after certain discussions and meetings, we finally have one agreed, shared business goal. Yeah. So once we have the business value identified, then we start to understand the base.
So the first thing is about the environment. So this picture is showing that is, I think the right-hand side of this, the first top one is from Sydney Morning Herald. This is talking about the robot is taking away jobs. It's a big statement made by the largest online employment website.
So it's saying the robot is taking away jobs. And the other one is from MIT Technology Review. It's basically saying that technology is destroying jobs faster than it is creating them. So imagine that this is really the kind of news that people are reading.
And then in the right side, it's all about bad news about AWS, about all the outages. So you kind of say, yeah, so now it's really to say understand why media is important because we talk about just so many talks of talking about people are feared. So what does this fear come from? It's all coming from here.
So we are living in this environment. This is the environment we're looking into. In order to, I think it's just quite natural for people to fear because they have no knowledge regarding to what it looks like, and this is all the information they're getting. So to get this right, it's really just to say, I think one of the big thing we learn is change management is really, really important in adopting DevOps.
Provide enough the right information, provide the right communication channels is really, really important. Yeah. So after we understand the environment as people, the third level of the base is really we need to understand the organization. So this is a organization definition from the business dictionary.
It's quite interesting because if you samples, first one, organization is a group of people with a specific purpose. It's about the people and the purpose. What is more interesting about to say organization have a management structure that determines the relationship, activities, all those kind of thing because DevOps is supposed to change the way that people are working. So this is actually indicating management structure has to be changed in order to adopt DevOps.
All right. As part of that, we're actually talking about, say, invisible. So what's invisible? Because when we started, it's actually still happening.
Does not mean to say we still get quite challenging questions from people regarding to say, "Yeah, this won't work here. Why do you guys ask going this way? Because we used to work this way. It has been working very well." So it's really regarding to say what's behind all of this.
So I think a professor has done research on this, so people can be divided into two kinds of mindset. So one is the fixed mindset, the other is called the growth mindset. So the fixed mindset is you think intelligence cannot be developed. That is to say, certain kinds of capability cannot be developed.
And that people with growth mindset is actually this can be changed. So you can learn by doing things and you can have new capabilities. The number one rule of the fixed mindset is really the number one, what's most important thing for them is look good. I want to look good all the time.
I want to perform good, in that I don't want to look stupid. So what does that mean? That you avoid challenges. So if there's something they know they're not sure whether they are going to perform good, they'll probably say no.
They just give you reasons to say no. So they just say, "Security does not like that." But is that true? Have you asked why?
So those kind of question. But people with growth mindset is usually, so number one rule is really learn, learn, learn. For them, it's really, that means every time they're facing challenges, really to say, I want to give it tries and learn how does that look like so I can learn it, really. So it's regarding to people have this mindset.
So regarding to this kind of change because both DevOps and cloud are new to the organization, you do need people who have growth mindset in order to get this moving rather than just say no to this. Yeah. So as we all know that people has mindset, the organization has mindset as well. So similar to this, so organization, you can see the way that organizations responding to different challenges, changes.
You can actually figure out what the mindset of the organization looks like. So for this one, I think for some organizations, it doesn't mean that if the organization has a fixed mindset, then everything is fixed mindset. Actually, sometimes it could be fixed mindset, sometimes it's open mindset. For this one, it's really the easy way to develop a growth mindset for the organization.
So number one rule is to enable steady progress. What does that mean? There has actually been research done regarding what motivates people the most. What makes people happy every day at work? That is actually to say, make progress every day, no matter how small it is.
So that's the biggest motivation for people to work. So I think that's very true. Imagine people can say, "I had my check-ins and my check-in is in production," so that will certainly motivate people. But imagine if another person just comes in and says, "Yeah, I'm working on this issue.
I still do not have my access." And seven days later to say, "I still do not have access." Will these people be motivated to come to work? No. So that's the first thing is really important. I think another thing is this is a reminder for the management team.
So how do you enable, provide an environment for your team to make progress? So it's really to say you should really focus on the block and rewards rather than just micromanaging all other things. Another one is really about safe to fail environment. What is this value?
This is actually value to say it's okay to fail as long as you learn from that. And in the meantime, you need to lower down the cost of failure because if the failure has to happen in production, it impacts your customer. That's not reliable. I think another one is to really just say, if you could have your dev environment the same consistent as production environment, so you reduce the cost to failure.
So this is really giving encouraged learnings. The last one I would like to really to mention about is innovation-friendly environment. How can you create that? Because I think right now most of organizations are talking about we are innovative.
We want to encourage innovation in all the areas. How do you encourage that? So it's really about giving people assistance, set up clear goals, provide them enough support when they need it. Just do not create blockers.
So these are the key lesson learned when we're doing the growth mindset for the organization. So, yeah. So right now is really we coming to the principles. So what is most important for the organization?
We had started to wonder, shall we say some of the practices are important? But finally, after a kind of brainstorming session with various teams, we started to reach a common agreement. So number one is really self-directed team over command and control. Self-directed team.
So number two is course correction over perfection. Number three is automation over manual. So I will just give you a quick example regarding how we build teams like that. So for the self-directed team, number one, the team has to be competent enough.
You don't want a team making decisions in a silly way, a stupid way. So in our case, I think the team is really very experienced because they have been working with the customer's business directly for several years, so they know enough about the customer, about business value. And what enabled them to become a self-directed team is actually the experience I just shared before, because we were the first team saying no to lift and shift. So that infrastructure team is working with multiple application delivery teams.
So no other team said no, they're just going lift and shift. We are the first team saying no to that one. I think the team got encouraged when they said, "Oh, all right, the management team actually values that, because they actually say it's difficult. Oh, business value is the only goal that guides what kind of decision is made." So in the future they just started to make more and more decisions by themselves. They actually really started to realize they can lead the decision rather than waiting for the decision to be made.
So I think the management team is really providing good support as well, because normally whenever you share some different opinions, you will start with why instead of say, I want you to do this way. They do not give detailed instructions. Start ask why, why do you want to do this? So this is really cool to have a self-directed team.
It's not a self-organized team. This is really about the team can make decision. Because this will certainly help the team once they just certain because if you have to go five levels up to make a certain decision, just waiting time, the people can do nothing, just wait for the decision to happen. So automation over manual.
The way that we approach this is very interesting. I did not include this in my slides, but we do have an automation roadmap. So we divide the overall things that need to be automated into three levels. The first level is the configuration and infrastructure level, and the second goes into continuous integration.
The third level is about continuous operation. So those are the three levels we drew up. We draw boxes in different colors. So red is manual, yellow is automating, and green is automated.
We show this to the management team. What does this help? It's actually number one is this is going to encourage people when we automate a certain component. When we automate, for example, the EC2 instance, you can see that it become green now.
And number two is really this will help with collaboration, because we had some difficult times engaging with our network security team. They usually just say, "Oh, we don't have capacity to do that, to support you to do the automation." And once I released the first version of that, I actually got a call from our network friend to say, "Yeah, we want to do this one. We want to help you on this." So number three is it's really good for other people to have visibility, because this is going to give people a sense of accomplishment.
Because when we do most of the DevOps work, it's really hard to make it visible. It's really hard. But for having this showing on the wall, demonstrating to other teams, this really makes the team proud. All right.
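The roadmap described above, with three levels and a red, yellow or green box per component, can be sketched as a small data structure. This is only an illustration, not the team's actual roadmap: the level names follow the talk, while every component name other than the EC2 instance and all statuses are hypothetical.

```python
from collections import Counter
from enum import Enum


class Status(Enum):
    """Colour coding used on the roadmap wall."""
    MANUAL = "red"
    AUTOMATING = "yellow"
    AUTOMATED = "green"


# Hypothetical roadmap: levels follow the talk, components are made up.
ROADMAP = {
    "configuration and infrastructure": {
        "EC2 instance": Status.AUTOMATED,
        "load balancer": Status.AUTOMATING,
        "firewall rules": Status.MANUAL,
    },
    "continuous integration": {
        "build": Status.AUTOMATED,
        "unit tests": Status.AUTOMATED,
        "deployment": Status.AUTOMATING,
    },
    "continuous operation": {
        "monitoring": Status.MANUAL,
        "security patching": Status.AUTOMATING,
    },
}


def summarize(roadmap):
    """Count how many components of each colour every level has."""
    return {
        level: dict(Counter(status.value for status in components.values()))
        for level, components in roadmap.items()
    }


def percent_automated(roadmap):
    """Share of all components that are already green, in percent."""
    statuses = [s for components in roadmap.values() for s in components.values()]
    return 100 * sum(s is Status.AUTOMATED for s in statuses) / len(statuses)
```

Flipping a component to `Status.AUTOMATED` and re-running `summarize(ROADMAP)` is the code equivalent of a box on the wall turning green.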
Actually when we start the project, the project is quite funny. We started with, in order to deliver this one piece of work, we had eight teams engaged. So that's actually the number of teams engaged. And actually, certainly don't forget our friends from risk, from auditing, and from security.
So the eight teams involved, I think I was amazed by the number of teams involved in one of the feature delivery. Certainly, for me, that means handoff. You need to go through those many people to get things done. And the way that those many teams talk to each other is quite interesting as well because they are communicating via work orders.
They do not talk. They say, "If you want me to do some work, create a work order for me." And then another one is really the transparency regarding to the team to team. So how to know each team is actually to, for me, so the green team might actually look like a black box for the blue team because they have no idea what the scale look like and the green team look like. So those kind of things is really interesting and usually what does that mean is usually until your end of feature delivery, you really have some people from security or auditing asking you some question and you just feel like, "I don't have answer for that." And unfortunately, if you don't have answer to satisfy those people, you just won't be able to release on time.
So what we decide is really a self-directed cross-functional end-to-end delivery team. So where we talk about how to build a self-directed team and cross-functional here is actually means we need organization structure to support that. In the past 18 months, we already have three reorganization in order to support these kind of activities. The first small way is we have some middleware system engineer coming into the application delivery team.
And after years and they going to realize, oh, this is helpful because we will be able to transfer the knowledge from the middleware people into the application team, and we'll be able to automate those things and say, "Yeah, we should do this in a bigger scope." Then they did another organization change, and later on, just another organization change is really. But I won't say we're already done enough to support that. This is still ongoing. So this is really based on the kind of priorities where you are regarding to that.
So what kind of end-to-end delivery team gives you? For me, I think for that one is really the team is looking over from creating new features to fix the bugs. And then actually if they run into any infrastructure issues, they need to fix that as well. So this really give you a big view regarding what does end-to-end mean because previously this also changed the view for business as well because previous business really know and really only talk to Brian to get this code fixed and delivered, that's it.
And right now the business knows they need to understand that security patching is part of the work as well. Yeah. So when we talk about the DevOps practices, this is talking about the pillar of practice. When we're talking about that, there are just too many.
They're just too many. So I think for any new starters, you really just learn this. What does that mean? Where shall I get started?
So I basically group them into three categories. So one is the essential one. The essential one, I really just think if you do not do this, just do not say you're doing it the DevOps way. This is, for example, infrastructure as code, build for failure, continuous integration, test automation.
This is really as an essential part of DevOps. Then another one is regarding to advanced one. So I think advanced one is quite different from organization to organization, but it's really about visibility, dashboard, everything. You have monitoring and you have operational metrics.
So this is something we are still working on. So another one is really about the visualization. I realize this is crucially important to get this impacted because this is really important to get the team engaged. It's really important to get the business engaged as well because they know this is something real.
And the customized one is really that different organizations might have different practices. Number one is about tools. So I think in just these two days we heard about a lot of tools; people are using configuration management tools like Puppet and Chef. So for us, we're using Ansible.
The only reason that's blocking us from using Puppet and Chef is really they require root password, and we still do not get a yes from security yet for us to have permission to use the root password. And another one I'm going to share a little bit more regarding one of the practice we introduce. I'm pretty sure most of you have heard of the Chaos Monkey from Netflix. So that Chaos Monkey thing is really just go to your production region, randomly kill one of your production instance.
It's really to test the resilience of your production architecture. And for this one, I'm not sure whether you have noticed that you actually have a Chaos Monkey for your organization as well. Have you ever run into an issue where one of your team members is sick, on leave, or just quit? So some work just cannot be done. So once we step into this DevOps world, those people really just come in, managers, directors, executives really come in.
What does team look like after DevOps? What kind of skills we need to retain? So this is really a way that we introduce to try to build the resilience into the organization. So the first step is really identify the business goal to say what kind of goal you want to have, then about the skills.
The skills are different from team to team. So I think this one is really important, and your definition of the skill is different as well, because what Apache means for team A could be different from team B, because they usually do different things with it. Another one is really simple. So you just ask the team for a self-assessment, then you manage this periodically.
You can do a lot of different analysis on this to say which is the weakest skill area. So you can organize some activities, cross-training or pairing or just on-job rotations, those kind of things to just make sure you have quite a few resilience built in your organization. You don't worry when people just go on leave or people just simply quit. Yeah.
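A skill matrix of this kind, and the "weakest skill area" analysis, can be sketched in a few lines. This is a minimal illustration under stated assumptions: the names, the skills other than Apache, the scores and the 0 to 3 scale are all hypothetical, not taken from the client's matrix.

```python
from statistics import mean

# Hypothetical self-assessment scores: 0 = no knowledge, 3 = can teach others.
SKILL_MATRIX = {
    "Alice": {"Ansible": 3, "AWS": 2, "Apache": 0},
    "Bob": {"Ansible": 1, "AWS": 3, "Apache": 0},
    "Chen": {"Ansible": 0, "AWS": 1, "Apache": 3},
}


def skill_averages(matrix):
    """Average self-assessed score per skill across the whole team."""
    skills = sorted({skill for scores in matrix.values() for skill in scores})
    return {
        skill: mean(scores.get(skill, 0) for scores in matrix.values())
        for skill in skills
    }


def weakest_skill(matrix):
    """Skill with the lowest team average: a candidate for cross-training."""
    averages = skill_averages(matrix)
    return min(averages, key=averages.get)


def single_points_of_failure(matrix, threshold=2):
    """Skills that only one person holds at or above the threshold."""
    result = {}
    for skill in skill_averages(matrix):
        holders = [
            person for person, scores in matrix.items()
            if scores.get(skill, 0) >= threshold
        ]
        if len(holders) == 1:
            result[skill] = holders[0]
    return result
```

Here `single_points_of_failure` is the organisational Chaos Monkey check: any skill it returns is work that stops when that one person is sick, on leave, or quits, so it is where pairing or rotation would pay off first.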
So after this month is what kind of business value we have achieved. So I think this is really straightforward. So for the end-to-end infrastructure provision, so we used to have four to six weeks, and right now it's four hours. It actually could be less.
And another one is deployment time: it used to be 30 minutes or more. Right now, 90% of them take less than 10 minutes. The one I really want to talk about is infrastructure testing. I'm not sure.
We have routine regular firewall changes twice a week. So every time there's some firewall change, the network security team goes in there, just adds some firewall rules, removes some firewall rules. And the next day, you might just say, "My application is broken." Then you spend maybe eight hours to figure out why it's not working. Then you figure, oh, maybe they changed something.
So what we have introduced is a kind of infrastructure regression testing using Ansible. We just run this automated regression testing daily, so every night. Especially the day when they make some firewall changes, we check the job. So there's a job telling us whether our jobs are failing or not.
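The idea of a nightly infrastructure regression test can be sketched as a connectivity check over the endpoints an application depends on. The team used Ansible for this; the Python below is a stand-in to show the logic only, and the endpoint names, hosts and ports are hypothetical.

```python
import socket

# Hypothetical endpoints the application needs to reach: (name, host, port).
REQUIRED_ENDPOINTS = [
    ("database", "db.internal.example", 5432),
    ("message queue", "mq.internal.example", 5672),
    ("payment gateway", "gateway.partner.example", 443),
]


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers refused connections, timeouts and DNS failures
        return False


def run_regression(endpoints, probe=can_connect):
    """Return the names of endpoints that are not reachable."""
    return [name for name, host, port in endpoints if not probe(host, port)]


def main():
    """Entry point for a nightly scheduled job; non-zero exit marks a failure."""
    failures = run_regression(REQUIRED_ENDPOINTS)
    for name in failures:
        print(f"UNREACHABLE: {name}")
    return 1 if failures else 0
```

Run from a scheduler every night, a failing exit code the morning after a firewall change points straight at the broken path, instead of someone spending hours rediscovering it. The `probe` parameter exists so the logic can be tested without touching the network.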
We actually introduced this to our network security friends to say, "We introduce some changes around this, so we'll be able to know our application will be broken or not." Yeah. Another one is really about infrastructure security patching. We all know in the old days, actually, they still have infrastructure doing this way. So if there's upgrade, you need to, our system engineers really need to log into hundreds, thousands of machines just to make sure that it's updated, that usually you cannot do that in daytime.
You have to do it at night. And right now, because all the infrastructure has been scripted, what does this mean? It's simply done by deploying a new version of the script. So with just one click, hundreds of hosts can be updated.
So this is something that our ops friends really, really like because this significantly reduced their weekend or night support. I think the biggest accomplishment for this one, what really makes me happy, is starting from zero knowledge of or resistance to DevOps, then coming to a stage of considering that we need to have DevOps as a strategy for this company. So what I would like some help with is really, first one is regarding access control. It's quite interesting because we have a long-lasting debate between the application delivery team and the system engineers.
Application developers really saying, "Oh, I need to have root permission so I can install whatever application I need." And our system engineer friends always come back, "So this is not allowed. According to our security policy, this is not allowed." I just wonder whether someone has some ideas to say, how did you get root access to that? Another one is regarding to what kind of access ownership you can have up to, because right now our automation have reached a certain level, but the team is really thinking about, we want to have automation up to VPC level. So we want to own end-to-end network, not only the application part.
We want to have that big part because we are asked to take care of every risk, security issues for the entire network. Why don't we own them? So that's one of the things I really want to get some help. Another one is about open source.
In the organization, we promote open source internally very highly, but I'm just thinking, if I step back to another level and look at all the enterprise clients who are all looking into pretty much similar things, are we able to open source those kinds of things to a level where we can actually reduce some waste in the industry? Yeah. I think that's it. Thank you very much for the time.