Taming the Hybrid Beast with DevOps
Full transcript
The complete talk, organized by section.
Wesley Pullen
Good afternoon, everyone. I am Wesley Pullen.
Unfortunately, my very close friend here, Gary McKay, could not make it, but his reasons are noble. He had to take his daughter to college. She enrolled and got accepted into Cornell, and so it just so happens that they had something going on early. She's active in band and some other things, and so he elected to take his youngest, I guess his last one, he might be an empty nester now, his last child to college.
So I'm going to represent what we did collectively. This is the work of both Electric Cloud and Somos, but there are some contractors and some others involved. So this was a large team collaboration. I get the honor of being able to share some of these insights in our DevOps journey together with Somos.
This is all Somos proprietary stuff. I've had to mask some of the stuff I'm going to show you in the end that we're doing with them. But I think it hopefully will be informative for everybody.
Okay, so let's begin.
So, taming the beast. Just a little bit about Electric Cloud. I don't want to make the assumption that everyone knows who Electric Cloud is. I'm the chief strategy officer, and quite a few of my colleagues are sprinkled throughout the room.
My focus has been release automation from day one, before Gartner and Forrester would even mention the words. So this has been very exciting. Electric Cloud is all about helping companies adapt to accomplish the needs of DevOps.
We believe that companies make large bets on DevOps, very significant bets: people's time, their energy, their wisdom, their thoughts, the tooling. And when you're making those big bets, what can happen is collision between development trying to move fast, operations trying to keep up the pace and keep control.
I've tried to liken it to my 16-year-old when he first got his iPad and his iPhone. His attitude was, "Dad, just give me an unlimited credit card, and I promise I'll only get these type of games and this amount of money."
That did not work. I'm more of the operations parent that says, "No, there's control. I want to check the rating of the games, how much they cost, how frequently you're downloading, what you're doing." So, absolutely not. I was more operations-based. He was Agile, wanted to download a new game every two hours, and we had to kind of figure out how we're going to pace it together.
Electric Cloud's all about helping companies bridge that gap, and that's our focus. We did a very similar thing with Somos, as you'll see.
Somos is very interesting. They used to be called SMS/800. They are the progenitors, if you will, of the Service Management System, the database, and the registry that handles 1-800, 1-866, 1-888, 877 numbers. Very significant. They just rebranded and renamed themselves in 2015. They've been around since 1993. And it's a very interesting registry, the largest, the main registry for all of the 800 numbers or toll-free numbers in the United States and Canada.
They're very customer-centric. A lot of contractors, a lot of employees, but it's all about their customers. They work with what we call RespOrgs, or Responsible Organizations, and service providers to ensure that they get the registry and the information that they get.
It started off with all being mainframe, and you'll see how that journey still continues and where we're at thus far.
So this is what I decided to do. Oh, this is not the most recent one.
Ah, that's okay. This is not the most recent one. Is there any way to upload the most recent one? I was figuring you could build off some visuals.
Okay, could I plug in? Because I defined some of these, and it'll make it easier instead of just throwing up terms and...
Is that possible?
Sorry, everyone, I thought this was the most recent slide, but it looks like it is not. So we're going to make a quick ad-lib shift, and in the meantime, Sam is going to do a quick dance for you in the background. No, I'm just kidding.
Okay, let's make this happen. This will be a quick change. Hey, if this breaks my laptop, are you guys getting me a new one?
Yeah? Sure?
All right, no worries. We can make that happen.
Let's see if this will work. Don't worry about the guy behind the curtain here. Will this kind of flow in from you guys? Let's see if it works.
It's not up yet.
They're going to change the feed. The reason why is I added in some content to help define terms, because if you're not in registry or telephony or messaging, it can be a little jarring. I try not to share information without defining what we're talking about. It helps to define it.
There we go.
So, to define some of the terms: SMS/800 is the former name of Somos. They rebranded. You can see that it stands for the 800 Service Management System database. That's the core.
The TFN Registry is where they got their start. The registry's all about the database. It started all on mainframe. So taming the hybrid beast starts with this beast, this very large registry, and with starting to peel services off of it and expose services to it.
So we're at the stage now where it's no longer all mainframe. We started to lift some of those services off of the mainframe. The mainframe is still in existence. It's still a database, but with API management and other layers in place, no future development happens on the mainframe anymore. So they're migrating, and the reason why is that people are retiring, and they want to start moving off of the mainframe so that tribal knowledge moves over. And you'll see some of this in a second.
So that's the TFN Registry.
The TSS Registry is about texting and smart services. You'd never be aware that for 1-800, 1-866, most of the collaboration... There's some videos here that I was trying to provide, but in the interest of time, I couldn't cover them. It shows kind of the growth.
Does anybody know when the first 1-800 number hit the world? Anybody pick the timeframe, a year?
Okay, I'll give it to you. Is it in the '60s, the '70s? How many people think it's the '70s?
'70s.
Okay. How many people think it's the '80s?
'80s.
How many people think it's the '50s?
All of you are wrong. 1967, AT&T released the first 1-800 number in the world, and it started to route. It was a manual person plugging it in and trying to do the routing, and now all of this is done through technology. Major shift.
RouteLink is part of the system that allows API access to expose all of this information. It was never exposed before. So RouteLink is a Somos technology. TSS Registry is Somos technology, and the TFN Registry is one of their hallmark technologies.
CT Services is all about the training and consulting services that they launched in order to expose access to what we call Responsible Organizations, or RespOrgs, and service providers.
And this is what it looks like when we start peeling it together. Now, this makes a little bit more sense. The green is the messaging. That's where the TSS Registry comes in. The pink that you see is all the voice stuff.
You see SMS covering the top and CT Services on the bottom. One is about the database that spans their whole network there, where they draw all the details from. The bottom layer is all about how they consult and train and continually get people updated and exposed via API.
And part of that journey wasn't just DevOps solutions like Electric Cloud's ElectricFlow. There are some other pieces to this puzzle, as you'll see. It's rather elaborate.
So this is how it began. I didn't want to cover all this stuff. This is very complex. This is just a small piece of the SMS/800 toll-free number registry, and what you see on the left versus the right is that a lot of this pink that we shaded in eventually became things taken away from the mainframe.
The complexity was that they had all kinds of technologies. It has been said it's like a Noah's Ark there. They got two or three of everything, right? There's lots of technology there. And so the goal was to start offloading and orchestrating a lot of this stuff.
So this was the TFN Registry side.
Then we have the TSS Registry side. Again, yet more technology, the difference being that you have message providers, you have outside sources having to go through the registry in order to get routing data. You got messaging providers that are coming in. You do query and response through hubs, and then this is how the toll-free messaging, text messaging works, and a lot is going through text today. An enormous amount goes through text.
I think it's like 80% is sending things through text versus tablet. There's a wonderful video that Somos has provided. It's tracking over the years, like a 30-year span of how text messaging has now taken over in terms of the largest communication system. And what's interesting is the way this all gets routed. They're in the middle of making that happen through the registry.
Finally is the RouteLink. RouteLink is an open API communication bridge. It's all about making sure that you get access to query things so that you don't have to go and call in, say, "Well, I need to get a list of X, Y, and Z." They started to open up the APIs so that it would make it easier for consulting, for them to get their products out, and things of that nature.
So that's kind of the high level.
The final thing is CT Services. This is really about the training services, but this is how they began to expose it. You see the registry is a part of that. They train on SMS/800 and how it's powered. They train on RouteLink for the service providers, how it's queried. All of this is about the training platform that they provide for their customers.
Now, all that, this is the journey. This was the situation that they went through.
In the beginning, the business wasn't able to respond. They were a 30-year-old mainframe shop. I think part of the DevOps journey for Somos was not just the tech. It wasn't about the software. It was about the mentality coming from COBOL on the mainframe. It is not easy to just uplift that and say, "Hey, we're going to start doing Java and distributed microservices." It's a large leap to start doing it.
So, as you'll see once we get to the solution, we started with what we called PLLs: pizza lunch and learns. They really started doing pizza lunch and learns. The most popular was the meat lovers pizza. I think the second was a veggie with extra mushrooms and olives. I personally like plain cheese, so I always got outnumbered, and I had to pick off the stuff I didn't like. But that's how they got started. The menus expanded, but it all started with PLLs, pizza lunch and learns, to get people involved from the top.
This was not a, "Hey, we got some techie people that want to do this." This started with the CEO on down, that we need to make a shift. We have to start with our mentality.
It went back to a framework that we put in place, which you'll see, called the Somosian DevOps Framework, SDF.
So it wasn't future-proofed. It was difficult. Impending retirements, as I spoke about, increasing demand. You guys are probably familiar with this, but this was the challenge.
We came up with a quote that kind of describes something that I think is very important to DevOps that we don't hear a lot: you take great people working within poor systems, and you're going to get poor results. It's no different than having a bad process and automating it. Then you're just going to have automated bad processes. So you got to start in the beginning.
This is what those lunch and learns were about. They had great people, very solid. The average tenure there, five, seven, 10 years or longer. So the team was very solid, but they weren't working with the right set of tools, the right mentality, and the right processes.
And I'm not speaking just software, I'm talking HR. How are you compensated? How are you taken care of? What are the performance reviews at the end of the year? If you're measured one way, but you're being expected to deliver on something else, it becomes a mismatch. So it starts off from there, too.
Then from there, moving off of the mainframe. This is a process. They are still not off of the mainframe. Let me make that very clear. They did migrate things off of the mainframe, but it still has a position at Somos as a database service. So it's a process.
The journey is that they have cloud data center and on-premise environments. Their migration from the mainframe continues. They stopped development on the mainframe. That was one. It's still used as a database server, but that's it, just to extract data. And now they have APIs to manage the middleware. So they started slowly lifting and shifting and saying, "We're going to do this from a microservices perspective," which you'll see up at the top. And that involved mentality and training as well as software.
And then they have large contractor teams, so there's lots of contractors and people have to work together. Those pizza lunch and learns, the PLLs, helped because you got to bring together not just the internal team, their external providers as well. And so they're all participating in the lunch and learns.
And in trying to connect some of the toolchains, what we see in DevOps is there's always a bunch of tools. They all have their own logging framework. They all work in a certain way. We get used to all these tools, but then somebody's got to be responsible for bridging the gap and bringing in these technologies. And so Somos was interested in taking advantage of being able to do that as well.
So this is how we did it. This is kind of a part of that journey, architecting the right solution.
Again, first things first was the PLLs, the pizza lunch and learns. Working hand-in-hand with ops teams. It wasn't simple. The lunch and learns made it easy because it was this thing called relationship. It's the one letter that we didn't put in for DevOps.
We talk about CAMS, comms, culture, and all these other aspects. I think there's another one that we need to split out from culture called relationship. We're all in the people business. We work with people, not just machines, and if you don't have those relationships, you can't build trust.
So trust started with building the relationships. The pizza wasn't about the food. It was about building bridges between people, to be honest with you. So the relationships were what was important to us.
Then we started doing the decomposition: decomposing the mainframe legacy stuff into microservices, and merging and re-engineering processes. We started focusing everything in AWS.
All providers, all DevOps technologies had to be set up and work seamlessly in an AWS clustered, load-balanced environment. Everything, whether you're an on-premise provider or cloud-native provider, it had to interact. That was their system. They went to AWS.
Okay, and so you also had to be able to collect and share deployment release metrics. They're big on feedback, so everyone wanted to be able to see those metrics.
They had to gather inventory of what they had at the time, and they had to create what they called the Somosian DevOps Framework. It was making DevOps work for them.
A lot of people read about DevOps, and I see Stuart and his company doing DevOps or someone else doing DevOps, and the sad part about it is that what he might be doing may not work for our team. We may have a different mixture. Maybe we have more mainframe COBOL programmers than you. Maybe we don't have the same Java expertise as you. Maybe we're used to .NET. So there could be a lot of things in the tech stack and in the relationships that we don't have.
So I can't copy and paste your DevOps framework into Somos and say, "Hey, why aren't we getting the results that you're getting, Stuart?" Because it's not the same. It's not a one size fits all. Everyone has to adapt, and this is what Somos learned, not Wesley, not Electric Cloud. Somos learned that they had to adapt DevOps to fit their culture and to fit them.
And so they started slow. They started very slow, but they started to adapt and started getting better. And they fed the DevOps culture through meetings, through lunch and learns, things of that nature.
They started to remove tools that were no longer needed. I don't know why, but companies seem to have some type of appetite for tools, and I see this a lot from the role I play at Electric Cloud. They have hundreds of these tools, and many are really deprecated. They're not really providing value.
Well, Somos came up with a plan to say, "We're removing tools that don't add any more value." And they came up with a statement, which I'll show you at the end: when you start to find that something's not working, stop using it. Just quit.
And that's what they started to do. They started by removing tools. Then they started integrating the remaining tools through ElectricFlow, and this is where they found ElectricFlow to be helpful, and we'll show you a little bit of that.
And then it came up to what we call TTM: train, track, manage. It's the one thing that we started to piece together, train, track, and manage to ensure that the teams are successful.
We have a saying in the States: you inspect what you expect. And so if you don't inspect, if you don't set up a system with which you can track and measure what's going on, then things are going to get lost in the shuffle.
These were the high-level goals that the CEO gave Gary. Gary was the release and development manager, and he's also a Scrum Master, certified Scrum Master, so he had some interesting Agile background and then moving into DevOps release and kind of deployment management.
So they want to abstract away mainframe services, interfaces, which they did. They created a bridge for the mainframe DevOps culture. They did that through the PLLs, the lunch and learns and stuff.
They started providing automation as a service. They used ElectricFlow service catalog capabilities to start onboarding, hey, we want to provision instances, and then we want a mandatory time or clock to deprovision.
Something so simple, I don't know why we don't do it. We did this our own way. We have our CEO sitting in the room, so we'll keep it quiet between us, but what we do in technology, we spin up a million instances and never take them down. No one tears them down. They sit there and collect. And Amazon loves it. The clock is ticking.
Well, with Somos, they have a commission, or provision, and a deprovision. You have to set the clock for how long you need it. That way, after two days or three days (the maximum was like seven or 10 days, because they're doing quick testing), it shuts itself down. You're not being charged any longer.
So everything that gets provisioned has a deprovision clock, which is highly recommended for any self-service capability, so you don't get charged unknowingly or for things you don't want.
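As an editorial illustration of the deprovision clock described here, the following is a minimal Python sketch, not the ElectricFlow service catalog or any Somos code. The names `Instance`, `provision`, `sweep`, and `MAX_LIFETIME_DAYS` are hypothetical, and the 10-day cap is an assumption taken from the "seven or 10 days" mentioned in the talk.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Tuple

# Assumed cap: the talk mentions a maximum of "like seven or 10 days".
MAX_LIFETIME_DAYS = 10


@dataclass
class Instance:
    name: str
    created_at: datetime
    expires_at: datetime


def provision(name: str, days_needed: int, now: datetime) -> Instance:
    """Create an instance record that always carries a deprovision time."""
    if days_needed < 1:
        raise ValueError("a lifetime of at least one day is required")
    # Clamp the requested lifetime to the assumed maximum.
    lifetime = min(days_needed, MAX_LIFETIME_DAYS)
    return Instance(name, now, now + timedelta(days=lifetime))


def sweep(instances: List[Instance], now: datetime) -> Tuple[List[Instance], List[Instance]]:
    """Split instances into (still running, due for teardown)."""
    running = [i for i in instances if i.expires_at > now]
    expired = [i for i in instances if i.expires_at <= now]
    return running, expired
```

The point of the design is that no instance can be created without an expiry, so a periodic call to `sweep` is enough to find everything that should be torn down. The actual teardown call to a cloud provider is deliberately left out of this sketch.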
So providing automation as a self-service, everything is hosted in the cloud, and then unclog the pipeline to accelerate how the releases work.
And so this is kind of what it ended up starting with. You see, and this is VersionOne that they were using for planning, as well as Atlassian. They had several tools there. They used the Java framework. Everything was trying to be hosted in Amazon Web Services. They started with Jenkins in the build phase, used ElectricFlow to tie it together, and they had tons of other tools.
This is a small snapshot for one of what we call the dragon teams, the DevOps teams within Somos. And you see the business product owner from the far left all the way to operations. The goal was the PMO would start to manage and get involved, and we would do project tracking. But this was the initial start. Start somewhere, help us orchestrate this process from build to release.
What ended up happening is more of this. These are the outcomes and the results as they stand today, in June of 2018. I think we can do even better. Releases went from 36 hours to an hour, and we can cut that down further. Most of the stuff is automated, but not everything.
Deploy times went from 12 hours down to 20 minutes. ElectricFlow started to take more control of the pipeline, getting more involved and taking over some things. Transforming what was already in Jenkins into ElectricFlow is something that Somos decided to do. They just started taking on more of the capabilities one step at a time.
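To put the two figures quoted here side by side, this short Python sketch works out the arithmetic. It is an editorial illustration using only the numbers from the talk (36 hours to 1 hour for releases, 12 hours to 20 minutes for deploys). The function names are hypothetical. Both improvements happen to come out to the same factor, 36 times faster.

```python
def speedup(before_minutes: float, after_minutes: float) -> float:
    """How many times faster the 'after' duration is."""
    return before_minutes / after_minutes


def percent_reduction(before_minutes: float, after_minutes: float) -> float:
    """Share of the original duration that was eliminated, in percent."""
    return (before_minutes - after_minutes) / before_minutes * 100


# Figures quoted in the talk, converted to minutes.
RELEASE = (36 * 60, 60)  # 36 hours -> 1 hour
DEPLOY = (12 * 60, 20)   # 12 hours -> 20 minutes

if __name__ == "__main__":
    for label, (before, after) in (("release", RELEASE), ("deploy", DEPLOY)):
        print(f"{label}: {speedup(before, after):.0f}x faster, "
              f"{percent_reduction(before, after):.1f}% less time")
```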
So this is the small process that they started to go through. Test validation, convergence, did it pass or fail? This was their process. They customized and changed their process, and they allowed ElectricFlow to just adapt to whatever they needed.
And they did some cost optimization. They converged their tools and teams and started to use ElectricFlow as self-service. When you want something, you go to this catalog, you click the button, and everything is built for you. You don't build pipelines by modeling workflows or anything. You don't build provisioning tasks. Everything is done. You click the button.
What pipeline do you need? How long do you need the resource? What tech stack do you need on it? And then boom, it gets provisioned for you. And it's something that they started unveiling more and more. It took time. It's not something we started on day one. But after some time, it started to give them the value that they liked.
And then these are the lessons learned. I think one of the things when we started doing this DevOps conference back in 2014 with Gene is that we used to always talk about: what was your journey? What did you learn? How can you help others along the way who may be going through something similar?
The first one is the Somos hybrid environment was not planned. They did not plan on being hybrid. It kind of grew on them. It happened organically. They had mainframe. They couldn't get rid of everything, but they started to traverse it, and so they kind of grew into a hybrid shop.
The mainframe houses all the data services. They're not getting rid of that right now. There's still a place for mainframe, and there's a slow migration to start moving some of the data. There are projects. So this is ongoing.
This is the statement that Somos uses internally: when something no longer works, stop using it. When it no longer works for us, let's stop. Stop allowing things to just filter through and collect dust. We have processes that don't get refactored, don't get automated, and they're slowing us down, and they're a part of our release cadence and release cycle. Let's stop using it and fix it. And so they started doing that, started attacking that.
And I put that as lesson four. The pizza was great, but they've expanded the menu. So it's no longer just pizza lunch and learns. They do more collaborative sessions and bring in catered lunches. It's gotten a lot bigger now. It's no longer small satellite teams. The team has gotten a little larger.
And then finally, again coming from them: Electric Cloud's technology became, over time, key to their strategy. I think in the beginning it was just something nice to help with one particular item, and then it grew over time to be a part of their ecosystem.
These are the results so far. We talked about some of the outcomes, but some of the results so far: reduction of static environments. They use the self-service catalog to provision and deprovision services to AWS.
They used to do 30 deployments a day, and now they're up to 50. Their goal is to be at 100 by the end of the year, 100 deployments a day for the technologies that we're seeing now. For them, that's great. For some companies, 1,000 a day is great. But for Somos, in taking their Somosian DevOps Framework, this was revolutionary. This changed their business.
Shorter release cycles. Fail faster, but track reruns for trends: being able to rerun pipelines and see the comments on how that occurred was something very critical for Somos. And then reducing the time to market.
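As an editorial illustration of what "track reruns for trends" could look like in practice, here is a minimal Python sketch. It is not how ElectricFlow records reruns. The `Run` record, the `rerun_trends` function, and the pipeline names in the usage example are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Run:
    pipeline: str
    succeeded: bool
    comment: str = ""


def rerun_trends(runs: List[Run]) -> Dict[str, dict]:
    """Summarize, per pipeline, how many runs were reruns and the comments left."""
    summary: Dict[str, dict] = defaultdict(
        lambda: {"runs": 0, "reruns": 0, "failures": 0, "comments": []}
    )
    for run in runs:
        entry = summary[run.pipeline]
        # Every run after the first for a given pipeline counts as a rerun.
        if entry["runs"] > 0:
            entry["reruns"] += 1
        entry["runs"] += 1
        if not run.succeeded:
            entry["failures"] += 1
        if run.comment:
            entry["comments"].append(run.comment)
    return dict(summary)
```

For example, feeding in `Run("pipeline-a", False, "flaky test")` followed by `Run("pipeline-a", True, "rerun after fix")` yields one rerun and one failure for `pipeline-a`, with both comments preserved in order. Keeping the comments alongside the counts is what lets a team see why a pipeline was rerun, not just how often.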
And they started pushing the DevOps teams to say, "Okay, these are your pipelines. It does the work for you. You no longer have to do all this work. You no longer have to build complex scripts. You don't have to build elaborate workflows. Use this self-service capability, click the button, fill in your parameters, let it do the work for you."
And better cost optimization.
So all that, I think we have a few minutes. Last thing, last thought. Now, I decided, since Gary allowed me, to share a little bit of what we've been doing behind the scenes that we did not show. I did mask some of the data, unfortunately, because they don't want their private data out, but they were willing to share a little bit.
So about a year ago, we started to do something that we've unveiled today, which is using true predictive analytics.
Thank you. Five minutes.
True predictive analytics: the ability to study and recognize patterns in what people are working on. We kind of liken it to the Fair Isaac Company's FICO score.
I'm going on 47, and my son is 19. He's gotten his first credit card. My credit history, because I've had quite a few years, is significantly greater than his little one year of just getting started. So with Fair Isaac, when you start looking at trends, I have an age, I have an adaptability, my credit score's been around. His credit score's not that great because he's just getting started.
With predictive analytics, when you start using machine learning, we need data. We need to study trends of what people are doing. So we take a successful team, and we start looking at each individual and let them track. We track them for a year.
And this is kind of what emerged, and this is allowing us not only to identify the patterns, but to see where can teams...
I'm picking on Stuart here because he's in the front. Stuart may have been on a great team for a year, but the mainframe teams that we're working with have been together for 15 years. So there are certain things that they were better at, even though he's a better developer, a better coder, and it's amazing how it turned out.
So in comes what we call DevOps Foresight, ElectricFlow's DevOps Foresight. It takes pattern recognition, studying the patterns of what companies are doing, what developers are writing, looking at code, looking at what they're contributing over time. And we need lots of data, obviously, to start predicting where things are going to be, to make life easier.
It's us shifting left significantly, which we had already started in prior releases, but now saying we want to study the data.
So we put Somos through the test. Obviously, I'm not going to share the names. I'm just going to share some of the trends from what they were doing in projects. We took different developers, I'm not showing them all, and what they contributed and what they were doing.
And then what emerged is trying to say, okay, let's look at the various influences, whether we're looking at a developer influence, this is the work that they're contributing, patterns over time, the code base, what is the code telling us over time? And then they're running CI, which they were doing through Jenkins, continuous integration. And so we start looking at the CI influence, and you start to peel back layers.
Now, obviously, as I just shared with you, they're registering information. They don't do trading systems, they don't do wealth management, and they don't do online banking. So I can't share with you which one is TSS, which one is TFN. You just have to guess. So I can't share that, but I can mask the names.
And when you double-click on the one that is the greatest offender, which I won't say what it is, this is what emerges. We can start to see: where are you spending your time? What patterns are coming from developers working, making code commits, making changes, the success of those changes, the defects that come from those changes, and then how we do CI on the very changes that you made over time?
And the time factor, considering I'm just about out of time. Did I show the time factor? Oh, I took it out. Okay. I took out the time factor. It was a year. June to June. June 18th to June 18th was the time factor.
So this is what we did behind the scenes. We wanted to unveil it after we felt it was solid. Somos was a great candidate for us because they were willing to allow developers who are newly starting on microservices, making code changes, and beginning to expand. It allowed us to refine and understand how we can truly add machine learning to what they're doing to predict risk.
Not to say, hey, Stuart has a risk score because he thinks his iPhone is about to die. It's us studying it over time and how he's using it and saying, "No, we know exactly where you're going to be. Here's the trend. Here's what's going on for you."
It's no different than for us. We have Experian, TransUnion, and Equifax. They give you indicators for your FICO score that if you were to pay this credit card down here, this affects your score by this much. They know exactly because they've been studying us and calculating data for 35 years. That's a lot of data to study.
So all that to say, my time is up. Thank you so much for coming to this particular session. I hope this was helpful. If there's anything I, or Electric Cloud as a whole, can help you with, we'd love to do it.
But again, I appreciate you coming by and hope you enjoy the conference.
All right.