Containers and Next Gen Infrastructure Ecosystem
John Willis is Vice President of DevOps and Digital Practices at SJ Technologies. Prior to SJ Technologies he was the Director of Ecosystem Development for Docker, which he joined after the company he co-founded (SocketPlane, which focused on SDN for containers) was acquired by Docker in March 2015.
Previous to founding SocketPlane in Fall 2014, John was the Chief DevOps Evangelist at Dell, which he joined following the Enstratius acquisition in May 2013. He has also held past executive roles at Opscode/Chef and Canonical/Ubuntu.
John is the author of 7 IBM Redbooks and is co-author of the “DevOps Handbook” and the upcoming “Beyond the Phoenix Project.”
The best way to reach John is through his Twitter handle @botchagalupe.
Full transcript
John Willis
So I'm part of what we call the selection committee for figuring out what talks we should present. And this year, some of the feedback we got last year in the U.S. was some variation of, "We don't do enough technical."
And I think we all love the meta conversations and see case studies of companies that are doing incredible stuff, and most of you here are doing incredible stuff. So we decided to do an operations, next-generation future. So we got this kind of sidetrack, if you hadn't noticed. Cornelia, who's going to be giving a presentation tomorrow, which I think is going to be amazing. She's going to get mad at me because I keep saying how awesome it's going to be. Damon Edwards. So it's kind of interesting. John Resig in the back, he's going to be giving a next-generation operation.
So, for me, about four years ago at DevOps Enterprise Summit, I was asked to do Docker for Managers. It was back then, nobody knew what was going on, like, what is Docker? How does it work? And I thought about this, and I probably should have called it Next-Generation Infrastructure for Managers.
It's going to be reasonably technical, so most managers would actually be very annoyed if they heard this, but it is a high-level view of what is the mess right now of a technical infrastructure, and the best I can explain it, the way I think about it. So there.
And the clicker is not... Oh, there we go.
You know what? I mean, just Google me. I've done a lot of shit. The DevOps Handbook, I'm pretty proud of. Today, I work for a company called SJ Technologies. I sold a company to Docker three and a half, four years ago. I was early in at Chef, and so, like I said, I've done a lot of things. And I have a lot of presentations, so if you want to hear the longer version of this, go out to my GitHub project that has all my presentations. I only have 30 minutes, so I got a lot of...
The only shameless plug in this presentation is I actually calculated I've actually authored 10 books. Early in my career, I did a lot of IBM Redbooks. Probably most of you don't know what even that means, but I've had a couple of books within the last three or four years, but only one Audible credit. Only one. So, me and Gene did this. It just was a labor of love.
It's the history of Lean and all the things that converge into. It's about eight hours, audio only, and I can say that's one of the proudest accomplishments, even creating startups. This one rates way up there.
All right. Spoiler alert. All right, sorry. It's Kubernetes and containers. You can leave now. I mean, now's your exit.
So I would say... We're in London, so Shakespeare: what's past is prologue. But yeah, I stole some of this actually from a Google presentation at DockerCon. The other thing, notice I didn't say Docker. I said containers, so put that in your head there.
Do we have any Docker employees here? Good. Awesome.
I own a lot of stock there, so I got to be really careful, but I'm really kind of pissed at those guys.
Kubernetes is a container management system, or Kubernetes is a container management platform, or Kubernetes is a service management platform. So that's the way I kind of grok Kubernetes right now. And this will make a little more sense when I tell you the things I'm not going to cover.
So this is the in scope. I got 30 minutes. There's a lot of subjects, so I'm going to talk a little about the OCI, and then I'm going to try to put in context what is the container ecosystem. There's an incredible amount of confusion right now when people try to describe containers.
I'll tell you this. Every enterprise I go to, and even some of the most mature people who've been running Docker or really started with Docker for four years, I ask two questions. First question I ask is... He's laughing. He's the only one that got the correct answer, and there is no correct answer, by the way.
I say, "What container implementation are you using?" And the fun begins. So always the answer is Docker.
So I'm like, "Which type of Docker?" And then they start looking at me weird like, do I have any credibility? Now I'm scared. He might know something I don't know.
And now you get the real fudge stuff, where it's like, "The open source one."
I'm like, "There is no GitHub docker/docker. So which one are you using?"
"Let me get back to you."
Or, "Oh, I'm sorry. You're right. It's the Docker Community Edition."
And so again, and I think I appreciate the fact that the word Docker has become like Frisbee or Coke, but it's amazing how many times when I ask these questions, like if you run OpenShift, well, then it might be CRI-O, and we'll get into all that, right? But the bottom line is, it's kind of a mess.
Most people who've been using Docker for a while, I will tell you, because Docker rebranded the open source to Moby (and I'll go into this a little deeper), have just gone to the Community Edition, and that's not open source. Now, it's free, and I'm just not sure that's a long-term strategy.
So then I ask the next question, "What's your long-term strategy?" And even the best of the best will say, "We have no freaking idea," on their container strategy. "And we're waiting. We're going to see what Google does. We're going to see how this all covers."
I ask the same set of questions for Kubernetes or orchestration, and it's not as ugly, but it's kind of ugly.
So I'm going to try to break that down in terms of how I think we should be answering those questions. And then I'll talk a little about the service mesh and what's going on there, if you've heard of Istio and Envoy, and kind of put that in perspective. And then I think one of the most interesting things going on right now, and then again, the next slide will tell you a little about what I feel about this whole system in general. But this is, if I'm looking forward, using the hockey metaphor where the puck's going, to me, I think the Kubernetes API extensibility is the single most interesting thing going on in technology-based future.
Out of scope, I'm not going to go into any intros of Kubernetes and containers, right? There's a billion people who have done that. Storage and network are incredibly interesting, important, but I got 30 minutes, so I just can't spend a whole lot of time talking about all the plugin architecture. The network world is ugly. Maybe I'll talk a little bit at the end if I have some time, because there's two camps there and it's a little crazy. Just SDN-ish.
Also, there's a lot of interesting tools. Again, no time. I'm not really going to talk about cloud native computing, and I forget, I'm really not going to cover the PaaS. Just I'll talk about some of the PaaS's that are implementing Kubernetes. And again, serverless, again, not that any of these are not important. I do think serverless is flanking our industry relatively fast.
I'll tell you that I think that if Kubernetes keeps up, it might just all fold under Kubernetes. That's my early guess, but I don't have time to cover serverless here.
All right, so in scope, out of scope.
So like I said, it is the Wild West right now. It really is. So if you feel like you don't know what's going on, you're in a large group of people who are running large infrastructure in really large companies that really can't successfully answer those questions I ask, because it's kind of a mess.
A little bit about the OCI. Basically, Docker originally used Linux containers, LXC, then they wrote something called libcontainer, where they were evolving, right? You got to give Docker incredible credit for opening containers and exposing what only very few companies were using at scale and commodified it for the rest of us.
And at some point when the OCI was created, they basically donated the runtime, which was libcontainer, and that became runC, which is now the predominant runtime. Most of the players that you would be interested in or concerned with are running runC.
There's also a lot of work and a lot of arguing about image specification, and that's owned by the OCI as well. So the OCI is a really good place to keep track of where things are going because, like I said, all the players that I would consider first-tier players in this game pretty much are running runC and are arguing over image spec.
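As a rough illustration of what the OCI runtime side standardizes (this sketch is an editorial addition, not part of the talk), a runtime such as runC is driven by a `config.json` that describes the process to run and its root filesystem. The helper name `minimal_oci_config`, the `rootfs` path, the hostname, and the version string are assumptions chosen for the example, and a real `config.json` carries many more fields (mounts, capabilities, and so on).

```python
import json


def minimal_oci_config(args, rootfs="rootfs"):
    """Build a pared-down dict shaped like an OCI runtime config.json.

    Hypothetical helper for illustration; real configs are much larger.
    """
    return {
        "ociVersion": "1.0.2",  # assumed spec version string
        "process": {
            "terminal": False,
            "user": {"uid": 0, "gid": 0},
            "args": list(args),  # the command the container runs
            "env": ["PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"],
            "cwd": "/",
        },
        # Directory (relative to the bundle) holding the unpacked image filesystem
        "root": {"path": rootfs, "readonly": True},
        "hostname": "demo",  # assumed value
        "linux": {
            # Namespaces the runtime should create for isolation
            "namespaces": [
                {"type": t} for t in ("pid", "network", "ipc", "uts", "mount")
            ]
        },
    }


config = minimal_oci_config(["sh", "-c", "echo hello"])
print(json.dumps(config, indent=2))
```

The point of the sketch is only that the runtime layer is a small, specified contract, which is why most engines can share one runtime.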
So the container ecosystem, right? So instead of us saying Docker all the time, again, I can appreciate from a brand that that's the way you order any type of Coca-Cola, but to have a more honest conversation, I would say that we need to break it down into three distinct questions or categories of how we think about the container ecosystem.
So the question should be, what's your container runtime? Truth is, it's probably going to be runC unless there's one interesting... There's a couple. Again, I'm not all-knowing. By the time I finish the presentation, there'll be four new products out there. But runC is pretty much what most people have settled on.
And then the container engine is the one that I think we're mostly going to talk about, and then the question of container orchestration. So I like to think about if we're going to have conversations about container ecosystem, and I'm totally up to discussion, debate, because this is the Wild West. So if you want to grab me and say, "I think it should be coined a different way," I'd love to have that conversation. But I'm going to stick with this one for now.
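One way to make the three questions concrete (an editorial sketch, not something shown in the talk) is to record the answers as three separate fields instead of the single word "Docker". The class name `ContainerStack` and the two example stacks are illustrative assumptions; the pairings are plausible combinations, not a survey of what anyone actually runs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContainerStack:
    """Answers to the three questions: runtime, engine, orchestration."""
    runtime: str
    engine: str
    orchestrator: str

    def describe(self) -> str:
        return (
            f"runtime={self.runtime}, engine={self.engine}, "
            f"orchestrator={self.orchestrator}"
        )


# Illustrative answers only; the specific pairings are assumptions.
stacks = [
    ContainerStack("runC", "Docker Community Edition", "Kubernetes (upstream)"),
    ContainerStack("runC", "CRI-O", "Kubernetes (OpenShift)"),
]

for stack in stacks:
    print(stack.describe())
```

Both example stacks share a runtime and differ in engine and orchestration, which is the part of the conversation the single word "Docker" hides.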
So runC, really predominant. I wanted to have more than one bullet. So Railcar is this Rust-based OCI, I don't know. But Kata Containers is interesting. Again, there's some nice properties about it being very lightweight. There's a buzz around it.
I still think, unless you just want to experiment and stuff like that, I think just for all intents and purposes, when we talk about container runtime, it's runC.
But we get into the engine now, we get into a little more interesting conversation, which is, okay, now the question is, what engine are you running? And Docker, right? But remember, there's three versions of Docker. There's Moby, which is the open source, which is literally docker/docker renamed to moby/moby. To be honest, I don't think many people are running it.
There's the Community Edition, which is, again, it's not open source, but it's free, and it has all the properties. And then there's the Enterprise Edition that you pay for. So Docker is really three flavors, and again, I'm amazed how... I already gave away the punchline, but I was going to ask the audience how many people could explain the difference between Moby and Docker. And anybody who raised their hand, I was going to ask them to come up here and explain to the rest of the people. And those hands just slowly go.
And there's rkt, "Rocket," right, which came out of CoreOS, which makes it even more confusing, because now Red Hat owns CoreOS. And then CRI-O is interesting because it implements the Container Runtime Interface, which is part of how Kubernetes is designed to run containers. CRI-O, for the most part, I would put in the Red Hat bucket, but Google was involved, and it is the kind of competing...
So one of the things that happened was Docker has had a lot of false starts, and one of the things that, as they were looking at rebranding kind of Moby as the open source and Docker, it was a take-back-the-brand positioning. And what it did really is it forced a lot of vendors who would say, "Oh, we run Docker," now they kind of have to explain that answer.
If you're some provider and you say, just take for example OpenShift, nothing wrong with it. OpenShift: "Oh yeah, we run Docker."
What's Docker? Have you talked to a Docker sales rep?
Well, so it kind of forced this kind of CRI-O, which is a good container runtime. I'm going to be honest with you, container runtimes are not really that interesting.
And then like I said, I already kind of did this. You need to really understand what Docker is. The open source is called Moby. Literally, if you're branding, if you're using the brand, you're really not allowed to use Docker as a brand representation unless you basically attribute that this is a different product than the open source, and we just all just run Docker.
And there's Community Edition, and there's lots of versions of Enterprise Edition. And it includes a lot of things, right? So if you're all in on Docker, you can get all the things.
I just wanted to cover the things. So there's the cloud-based computer engine... I'm sorry, cloud-based container engines, your Amazon ECS. Honestly, ECS is, again, Amazon does great things. They didn't do their initial container thing well. Azure's done a pretty good job, and Google's, they didn't have to. By the time they got around to do it, why would I give you a container without Kubernetes? So GKE is really your kind of distribution there.
So like I said, Kubernetes is really, the dust is settling. It is the orchestration tool. Docker has Swarm, but even Docker kind of gave up because now they're bifurcated in that you get Docker with Kubernetes and Swarm. And why you would ever use both, I have no freaking idea.
Swarm was a good product. They just didn't invest in it the way they should have. Same thing sort of with Mesosphere. Mesosphere has gone all in on Kubernetes. So if I said anyone, Swarm has Kubernetes in the latest release, and so does Mesos, and so does Pivotal. And again, I know less about that instantiation, but it seems like everybody is basically following the Kubernetes lead.
So then the question is, this is that second, harder question, okay, what is your orchestration engine?
"Oh, it's Kubernetes."
Okay, which Kubernetes?
And there's Kelsey Hightower, who is the Kubernetes god. Amazing guy. He has a Git repo called "Kubernetes the Hard Way." So I've heard a couple of customers now telling me they call it the hard way. So it's just the default distribution. Do you want to roll up your sleeves?
And now Heptio is interesting because they call it the un-distribution. So these are two of the... The founders of Heptio are two of the original developers of Kubernetes at Google.
And so here's one of the problems, because when you get below Heptio, and this is not a negative, it's just something to point out, that OpenShift, Docker, and Mesos are going to sell you an enterprise-ready version of Kubernetes. Buyer beware. That means there's some proprietary foldings into that. There's some things that aren't pure open source, and they're trying to make your life better, right? And maybe that's what you want.
But what the Heptio folks say is that we are going to try to give you what we call an un-distribution, and we're going to be open source end to end. So we're going to try to do something that's really difficult to do, and try to give you that promise of enterprise ready, what you need to be an enterprise to use Kubernetes, but there's no proprietary enterprise closed functionality.
And so, like I said, Docker now supports... OpenShift, you got to give OpenShift credit because they've been running Kubernetes for literally early beta, they had put Kubernetes into OpenShift. So they have more burn time if you're looking at the enterprise-ready versions. They have a lot of experience running Kubernetes because they've been running it almost four years, I think. At least three, probably four.
Also, I'm a big SDN fan, and they run Open vSwitch. So they actually have, in my opinion, the best network solution. But there's a lot of opinionation in their implementation. There's some confusion on what their long-term strategy commitments are. What if you want to go hybrid, and you want to use Kubernetes as a service versus OpenShift?
Again, I got 30 minutes, but I'd love to have a longer conversation about this Wild West. Are you going to run OpenShift on all your clouds and/or on-prem?
I talk to a lot of analysts, so I tell the analysts, right now, OpenShift looks really good from an investment point of view because most of our infrastructure is on-prem. And so if you're a Red Hat shop and you already have an enterprise Red Hat license, and you want to go Kubernetes, this is a safe bet, but it's going to get really ugly when you start experimenting with or running AKS, or GKE on Google, in there.
And then the interesting is, how is that... Does that make sense? I'm looking at a lot of faces. Give me a nod if like... Okay, good. Yeah, well, that's a heavy nod over there.
So anyway, that and then Docker. There's some good things on Docker. Solomon did some great things. He was a brilliant young man. So Docker, and the guy who was my chief architect of SocketPlane, of the company that we sold to Docker, is the chief architect of the Kubernetes implementation. So I can guarantee you, Madhu Venugopal did it right.
This is the guy that wrote ASIC programs for the Cat 6K for 10 years, and then was the first committer on OpenDaylight, which was the SDN Nicira killer. He is an amazing architect, designer, developer, and he did all the Docker development for Kubernetes and Istio and the service mesh stuff.
Anyway, the Kubernetes as a service players. Again, Amazon seems to be, I don't know if it's on purpose, but the general vibe of EKS is it's still a little short of what we need. Azure's doing a great job and, of course, if you're going on a Google platform and you're all in on cloud, then right now GKE is probably the best play.
Service mesh. How many people are like, kind of have... Well, I could have them come up here and explain it, but I won't do that. How many people understand what we mean when we talk about containers and service mesh and Istio?
Wow, okay. Wow. I thought we'd have at least half. Good. Well, then this is going to be a good presentation for you.
So the service mesh has been introduced, kind of ongoing. I don't know when they first started talking about Istio, and I'll explain what Istio is in a minute. But the idea is that you're going to have... Service mesh concept's been around forever, but when we start talking about Kubernetes, we not only talk about clustering containers, right, which is the first-order problem. Then the second-order problem is, how do these containers invoke APIs and services? And if we don't think about that, it's going to be a Wild West of this connecting there, this connecting there.
So the service mesh model really is designed to be a layer for service-to-service communication. Remember, in this context, it's thinking about how you would have pods with containers in them. Let's just say clusters of Kubernetes with containers in them, and how they would call other services. And the idea is to have lightweight proxies, and then we then start... It's like layer seven routing and management or data management.
And so the service mesh capabilities, this is where it gets interesting, is it really starts with observability. And although you won't read this in most of the documentation about it, but basically what you're seeing is, and I'll talk about the data plane in a minute, but what you're seeing is all egress and ingress traffic basically being analyzed by some service mesh.
And so that opens up the ability to have traffic control, service discovery, load balancing, resilience, deployment strategies, blue-green, canarying, whatever, security. And then the one thing I left out too is circuit breakers, although the circuit breaker is actually much more specific to the data plane aspect of this. I probably lost about a third of you, but stay with me.
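For readers who have not met the circuit breaker pattern mentioned above, here is a minimal sketch of the idea (an editorial addition, not code from the talk and not how any particular proxy implements it). The class name, the `max_failures` and `reset_after` parameters, and the injectable `clock` are all assumptions made for the example: after a run of failures the breaker "opens" and rejects calls immediately, then allows a single trial call once a cool-down has passed.

```python
import time


class CircuitBreaker:
    """Tiny illustrative circuit breaker; names and defaults are assumptions."""

    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_after = reset_after  # cool-down in seconds
        self.clock = clock              # injectable so the demo is deterministic
        self.failures = 0
        self.opened_at = None           # None means the circuit is closed

    def call(self, fn, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                # Open: fail fast instead of hitting the struggling service
                raise RuntimeError("circuit open: call rejected")
            # Cool-down elapsed: half-open, let one trial call through
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.opened_at is not None or self.failures >= self.max_failures:
                self.opened_at = self.clock()  # (re)open the circuit
            raise
        # Success closes the circuit and clears the failure count
        self.failures = 0
        self.opened_at = None
        return result


# Demo with a fake clock so no real waiting is needed.
now = [0.0]
breaker = CircuitBreaker(max_failures=2, reset_after=10.0, clock=lambda: now[0])


def failing_service():
    raise ValueError("service down")


for _ in range(2):
    try:
        breaker.call(failing_service)
    except ValueError:
        pass

print("open after two failures:", breaker.opened_at is not None)
```

In a service mesh this logic sits in the proxy next to each service rather than in application code, which is why it belongs to the data plane side of the picture.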
This is the only version I didn't have to get permission to use. There's some other good ones out, but it's the Istio architecture. So Istio, right, is Google's implement... So I say Google. I think they collaborate with IBM and others, but I'm just going to call it Google's implementation of a service mesh for Kubernetes.
And so what it is is it's very much like SDN, but remember SDN is more like layer three. This is layer seven type stuff. It's a control plane, data plane architecture. And I'll tell you in a minute, the data plane is pretty clear. But there's a lot of arguments about the control plane and what should be in a control plane or not. So it is still extremely, extremely early in all this stuff. But this is how the cells are starting to form.
And so when we talk about Istio in the context of this being Google's implementation of a service mesh, there's a data plane and a control plane. The data plane, like I said, if you read the second bullet, runs as a sidecar model, which means that in a Kubernetes context, it runs as a container that is a proxy. And it then sees all ingress and egress data and then allows you to do all the magical things that you might have to do: service discovery, all those things, right?
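To picture the sidecar model described above, the following sketch (an editorial addition, not from the talk) builds a Kubernetes Pod manifest with two containers: the application and a proxy next to it. The function name, image names, and port numbers are hypothetical, and in a real mesh the proxy is usually injected automatically and traffic redirection is set up for you; this only shows the shape of "a proxy container in the same pod".

```python
import json


def pod_with_sidecar(name, app_image, proxy_image):
    """Return a Pod manifest dict with an app container plus a proxy sidecar.

    Hypothetical helper; images and ports below are placeholders.
    """
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name, "labels": {"app": name}},
        "spec": {
            "containers": [
                {
                    "name": "app",
                    "image": app_image,
                    "ports": [{"containerPort": 8080}],   # assumed app port
                },
                {
                    # The sidecar shares the pod's network namespace, so it can
                    # sit in front of the app's inbound and outbound traffic.
                    "name": "proxy",
                    "image": proxy_image,
                    "ports": [{"containerPort": 15001}],  # assumed proxy port
                },
            ]
        },
    }


pod = pod_with_sidecar(
    "orders", "example.com/orders:1.0", "example.com/sidecar-proxy:1.0"
)
print(json.dumps(pod, indent=2))
```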
And then the control plane is then the control plane of the separation of the meta services that kind of configure and the data gets sent up to manage. I will say again, data plane is pretty clear right now. Control plane, again, I'm sure somebody at Google would be furious at me right now. Right now, I would very much con... Well, all right, let me wait one more slide to get to that.
This is the control plane for Istio. It's something called Pilot, Mixer, and Auth. Pilot basically does service discovery. There's these things called route rules. So you can see the traffic management stuff, destination policy, a lot of policy stuff that you can implement. Mixer is like telemetry, ACLs, whitelists, rate limits, custom metrics. And then Auth is basically all your security, CA, TLS, and encryption, right? So that's the generic service.
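The route rules mentioned above are what make canarying possible: the control plane tells the proxies what share of traffic each version should get. The sketch below (an editorial addition; it does not use Istio's API or configuration format) shows the underlying idea as a weighted pick between two versions. The `routes` table, the version names, and the 90/10 split are assumptions for illustration.

```python
import random


def pick_destination(routes, rng):
    """Choose a destination version according to its traffic weight."""
    versions = [version for version, _ in routes]
    weights = [weight for _, weight in routes]
    return rng.choices(versions, weights=weights, k=1)[0]


# Hypothetical canary rule: 90% of requests to v1, 10% to v2.
routes = [("v1", 90), ("v2", 10)]
rng = random.Random(42)  # seeded so repeated runs behave the same

counts = {"v1": 0, "v2": 0}
for _ in range(10_000):
    counts[pick_destination(routes, rng)] += 1

print(counts)
```

Shifting the canary forward is then just a change to the weights, pushed from the control plane, with no change to the services themselves.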
But here's the thing, right? The real meat is in the proxy. So this is the proxy that runs as what they call a sidecar model. So it runs as a container basically in the pod where the other container is. And it's layer seven. It's Envoy, and it was actually developed by Lyft.
It's almost like I say this, I say, like in 2000, I don't know when it was originally developed, but imagine in 2017 you said, "Hey, give me all the money in the world and I want to create the perfect proxy." Right? And that's kind of what Lyft tried to do.
I mean, they looked at NGINX, and nothing against NGINX, but they had to design... If you look at the traffic patterns and how they've changed over the years, it used to be 90% north-south and at most 10 or 20% east-west, right? That world is now 90% east-west. I think Facebook posted three years ago, 97% east-west of their traffic, right? So it mandates a different way of thinking about a proxy, and Envoy is that kind of proxy.
And don't quote me, but it sounds like I've been told that Google is not only replacing the north-south, but their east-west old Apache mod_proxy's with Envoy. So in other words, right now, Envoy.
And back to the data plane, I think personally that you should spend more time thinking about Envoy than Istio. But that's an early guess and an early bet. You've got to remember, I'm not an oracle here. I might know just a little more than all of you, because it is pretty confusing.
I will say that NGINX is not going to step out of the game, so they got what they call NGINX Mesh. So this is their version of a competitor to Envoy that fully fits the Istio model. And again, for people who didn't raise their hand on service mesh, you're probably somewhat confused, but at least I've given you a starting point to go research. And that's what this is all about.
So finally, what time are we at? So I think I actually will end early, so maybe we have a Q&A.
2:04.
2:04. What was my I had to end by?
Six minutes.
Six minutes? Oh my God, I better hurry up. No. The most important point in six minutes, get ready. No.
API extensibility. So some of you might have heard of this as operator framework, which actually CoreOS is a great... I was telling somebody this morning, if you want to see the lineage of how you got to here, there was a CoreOS original article about operator frameworks, which really was a way for Core... It was brilliant. CoreOS was trying to address, how do you run stateful apps in Kubernetes clusters?
So they wrote this, I call it a manifesto, they don't, about we need to think about this and this. There was a second generation of that discussion. Google kind of adapted, and you're now seeing really, I think, more of the discussion is less about operators, and some people feel that the operator discussion will get more into kind of Kubernetes API extensibility.
But this is Joseph Jacks. He's my oracle when it comes to Kubernetes. I don't have time to tell you why, but he seems to know everything that's going on to the minute. And this is a tweet he just put out recently. He said, "All complex software delivered as a service or behind a firewall should be implemented as custom Kubernetes API extension controllers. Radical efficiencies abound." I totally agree with this.
So here's my radical hat. I think Kubernetes becomes the next Linux. I don't know when that happens, but I think we're basically just like... And then I think it's like a 10- or 15-year run of a fabric that becomes how we run all our applications. I know this sounds crazy. But if that happens, Google has designed this extensibility API to be at the millisecond level. It's at Google scale.
And basically, I'm probably not using the words right, but it's sort of an event loop that would listen to every egress of the API of a Kubernetes cluster at the millisecond level. And you could create your own custom resource controllers and custom resource definitions. And so that's how all of the stateful, like the Redises, the MySQLs, the Cassandras, whatever, are going to start implementing and already have, and I'll show you a list of ones that are migrating towards the API model, how you run stateful applications.
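The "event loop" described above can be sketched as a toy controller (an editorial addition, not code from the talk and not the Kubernetes client API): it consumes watch-style events for a custom resource and, after each one, reconciles observed state toward desired state. The function names, the `(event, name, replicas)` tuple shape, and the `redis` example are assumptions; `ADDED`, `MODIFIED`, and `DELETED` mirror the event types a Kubernetes watch reports.

```python
from collections import deque


def reconcile(desired, observed):
    """Return the actions that move observed state toward desired state."""
    actions = []
    for name, replicas in desired.items():
        current = observed.get(name, 0)
        if current < replicas:
            actions.append(("scale_up", name, replicas - current))
        elif current > replicas:
            actions.append(("scale_down", name, current - replicas))
    for name, current in observed.items():
        if name not in desired:
            actions.append(("delete", name, current))
    return actions


def run_controller(events, observed):
    """Process events in order, reconciling after each one."""
    desired = {}
    log = []
    queue = deque(events)
    while queue:
        kind, name, replicas = queue.popleft()
        if kind == "DELETED":
            desired.pop(name, None)
        else:  # ADDED or MODIFIED
            desired[name] = replicas
        for verb, target, count in reconcile(desired, observed):
            log.append((verb, target, count))
            # Pretend the action succeeded and update the observed state
            if verb == "scale_up":
                observed[target] = observed.get(target, 0) + count
            elif verb == "scale_down":
                observed[target] -= count
            else:
                del observed[target]
    return log


events = [("ADDED", "redis", 3), ("MODIFIED", "redis", 5), ("DELETED", "redis", 0)]
cluster = {}
for action in run_controller(events, cluster):
    print(action)
```

A real controller for a stateful system would encode that system's operational knowledge (failover, backup, upgrade order) in the reconcile step, which is the part that makes writing one hard.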
But more importantly, it sets the base. Even like Joseph Jacks would say, if you're Workday, you should basically build your whole infrastructure on Kubernetes API extensions. If you believe that this is the foundation, it's like if you could go back in time to when somebody first described Linux kernel modules to you, and you'd be like, "Oh my God," because you knew what was going to happen over the next 20 years. This could be that. And even if we're wrong, I think you should go investigate this and figure out this technology for your organization.
Drop the mic. No. A couple more slides.
So this is a little about the Kubernetes API. I will tweet out some references. I forgot to put the reference list in here. But I would definitely follow the lineage of the CoreOS operator framework discussion. And I think the best thing is to Google Kubernetes custom resource controllers or custom resource definitions or API extensibility. Or anyway, this is just, again, a little banter about the controller.
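As a starting point for that search, here is roughly what a custom resource definition looks like (an editorial sketch, not shown in the talk). It uses the `apiextensions.k8s.io/v1` API, which is the current stable version and postdates the talk; the group `example.com`, the kind `RedisCluster`, and the single `replicas` field are hypothetical. The definition only teaches the API server a new resource type; a controller still has to act on it.

```python
import json


def crd_manifest(group, kind, plural):
    """Build a minimal CustomResourceDefinition manifest as a dict.

    Hypothetical helper; the schema below is deliberately tiny.
    """
    return {
        "apiVersion": "apiextensions.k8s.io/v1",
        "kind": "CustomResourceDefinition",
        # The name must be <plural>.<group>
        "metadata": {"name": f"{plural}.{group}"},
        "spec": {
            "group": group,
            "scope": "Namespaced",
            "names": {
                "plural": plural,
                "singular": kind.lower(),
                "kind": kind,
            },
            "versions": [
                {
                    "name": "v1alpha1",
                    "served": True,
                    "storage": True,
                    "schema": {
                        "openAPIV3Schema": {
                            "type": "object",
                            "properties": {
                                "spec": {
                                    "type": "object",
                                    "properties": {
                                        "replicas": {"type": "integer"}
                                    },
                                }
                            },
                        }
                    },
                }
            ],
        },
    }


crd = crd_manifest("example.com", "RedisCluster", "redisclusters")
print(json.dumps(crd, indent=2))
```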
And the thing is, it is at Google scale, right? My opinion here is Google waited for us all to catch up. We're starting to catch up, and now we may start seeing the benefits of what Google has understood at scale that is uncommon to almost everybody else on the planet. And we might see that starting to manifest around these areas. Some of the numbers I've seen about people running and trapping data at scale with this API extensibility is off the chart. I don't have any numbers to state.
This is actually a longer list, so I cut out all the names that I thought were ones that almost everybody would recognize. Now, this is actually in the operator framework model, but it's the evolution of what's happening. These are companies that are building on top of Kubernetes APIs to allow you to run stateful implementations of their...
And so Joseph Jacks tells me, funny that a 28-year-old person is my oracle to a 59-year-old person, but that's the state of the world. He says that this will be coming out soon. It's out there. It's called Kubebuilder.
So here's the thing. Right now, if you want to write custom resource controllers and what they call CRDs, custom resource definitions, it's really hard. I mean, really ugly hard. This is the first attempt to make it a little less hard, but still really, really hard.
But this Kubebuilder, and the nice thing about this is, if you forgot everything else that I told you to Google, you can start here and Google here, and this will at least set the framework of what really probably is going to be the first-order driver of this development ecosystem that, if Joseph Jacks is right, and I believe him, will be the formation of being able to sit at a millisecond level in an event loop of everything that happens in a Kubernetes cluster. And if everything runs in a Kubernetes cluster, that's probably a pretty nice place to be.
I think I'm done. Yep.