Infrastructure as Code (Yes, it applies to z/OS as well)
Infrastructure as code and dynamic provisioning of environments are not new, but they have not been applied to most mainframe environments. This session will show how z/OS has transformed and how open source tools can be used to provide infrastructure as code for z/OS as well.
Rosalind Radcliffe is an IBM Distinguished Engineer responsible for driving DevOps transformation into our products in support of customers' ability to do DevOps, as well as assisting clients in their transformation, and the internal IBM adoption of DevOps. She is a frequent speaker at conferences, an IBM Master Inventor, and a Member of the IBM Academy of Technology.
Full transcript
Rosalind Radcliffe
Hello, my name is Rosalind Radcliffe, and I'm an IBM Distinguished Engineer responsible for DevOps for enterprise systems. To give you a little bit of background, for those of you who don't know me, I started in IBM 32-ish years ago in ISPF development, so I've been in and around Z my entire IBM career. I've had almost every job there is in IBM other than manager, finance, lawyer, those kinds of roles. I've been a technology person my entire career. I've been in services, I've done all sorts of things.
I like to joke that my claim to fame is what some people say broke ISPF: putting the menu bar across the top of the ISPF system and putting the command line in the wrong place, as everybody says, because I floated it to the bottom. If you don't understand those jokes, then you don't know about ISPF, and we'll explain a little bit more. The other thing I like to say is my claim to fame is arguing across the industry to get Control-C, Control-X, Control-V as the industry standard, because IBM had Shift-Control-Insert. I did all those in my first few years in IBM.
After that, what have I done? I've worked around systems to help clients take advantage of what they have to do better, to do business better, to deliver business value. I've worked in operations, I've worked in development, and all that led to this thing called DevOps. My job has been to make z/OS DevOpsable. Yes, I made up that term recently. I was talking to someone for an article, trying to explain what I did, and they kept not understanding. Finally I said, "Making z/OS DevOpsable," and they got it.
So: taking this platform that has been around for the last forever and making it possible for you to do DevOps. It has been possible for quite a while. We've had a lot of capabilities in the system, but we've had things that made it a little harder. What I've been doing is trying to make those pieces easier. We've done a lot from the development standpoint, but infrastructure as code is something where everybody looked at me kind of funny and said, "That mainframe thing? How do I do infrastructure as code there?" So I thought I'd start talking about that.
Just to make sure we're all on the same page, infrastructure as code means making sure all the configuration, all of the information, is stored so that I can build new systems, so I don't have to worry about configuration drift, and so I can provision quickly. In the distributed space, we have hundreds, thousands, lots of little baby boxes running around, so we need to be able to do this. It is very important to keep systems consistent. It is very important to provision a new system, so infrastructure as code matters a lot in the distributed space. If I have 1,000 boxes and I build them all manually, what's the chance they're going to be the same? Zero.
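As a minimal sketch of the drift idea described above (not something shown in the talk), the comparison between what is declared in the repository and what is found on a box can be expressed in a few lines of Python. The setting names and values here are invented purely for illustration.

```python
from typing import Dict, Optional, Tuple


def find_drift(
    declared: Dict[str, str], actual: Dict[str, str]
) -> Dict[str, Tuple[Optional[str], Optional[str]]]:
    """Return every setting whose actual value differs from the declared one.

    Each entry maps the setting name to (declared_value, actual_value);
    None means the setting is missing on that side.
    """
    drift = {}
    for key in sorted(set(declared) | set(actual)):
        want = declared.get(key)
        have = actual.get(key)
        if want != have:
            drift[key] = (want, have)
    return drift


# Hypothetical settings, invented for illustration only.
declared = {"java_version": "8", "ssh_enabled": "yes", "max_users": "200"}
box_a = {"java_version": "8", "ssh_enabled": "yes", "max_users": "200"}
box_b = {"java_version": "7", "ssh_enabled": "yes"}

print(find_drift(declared, box_a))  # no differences for a box built from code
print(find_drift(declared, box_b))  # a hand-built box that has drifted
```

The point of the sketch is only that once the desired state lives in one stored definition, checking a thousand boxes against it is a loop rather than a manual audit.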
In the z/OS world, we're a little different. We build a SYSRES, the system image. It's the set of volumes that represent the system image. It's the thing. Every system runs off of it, so there is one of them. All the software versions are the same, so the system is the same, so the environment is the same. We don't have the same system drift in the sense of software. We don't have to worry about that one. In many environments, the number of systems can be counted on your two hands. If you have two, three, or four systems, how hard is it to manage those four systems? They are more complicated because they are running a whole lot more than that, but just starting with the operating system, it is not that hard.
I do it with JCL, job control language. I do it with something that, if you don't ever have to touch, congratulations; I know you don't want to. But I've done the configuration that way. It might be sitting in my private libraries, and Joe might have his own set. Maybe it is not the same every time, but it is still code, right? Oh, and it is not stored in the source code manager either, because it is in my library or your library. So what if somebody else has to do it? Not usually there.
And then there is all the middleware. The operating system might be relatively easy and there might be few of them, but there are probably hundreds of CICS regions or lots of database setups. What do I do about all those? All of these things lead to the need for infrastructure as code.
There is one other thing. If we look at the standard pipeline, I have been having fun building pictures and trying to represent the pipeline realistically when it comes to a DevOps process. I've gotten tired of the straight line because it is not a straight line, and it is really not an infinity. It really is a bunch of little circles. I do my coding and my build, and I have this nice little feedback loop. When that works, and only when that works, I move on to the next phase. Then I have this other feedback loop, which is provision, deploy, and test.
There is this little provision word in there. In the z/OS environment, I have a dev environment and a test environment. It is just sitting there. I don't provision anything. Why not? If this is the standard world that we're moving to from a DevOps process, if I have a culture that says I get to provision my test environment, run my tests, and throw it away, why don't I do exactly the same thing in z/OS? If I do, then I better have infrastructure as code, because if I'm going to start provisioning a thousand of these z/OS instances, I better be able to do that with code.
I can do this today. I can provision z/OS systems. I have at least three different ways of doing it. I can provision them, I can provision the middleware, and I can do this same thought process. If I want to do provision, deploy, and test, I really need infrastructure as code in this environment so that I can have these environments and provision in the environment.
Everybody asks about this provision question: "How? Why?" Actually, why is usually the bigger question. If I can have my own test environment, if I can test my own capability without everybody else, I can experiment. I can play. What if I want to try something very different in the environment? Right now, if I try to do that, everyone else in the environment is affected by what I do because it is a shared environment. We don't need to do that.
There are multiple ways to provision systems. You can use a ZD&T system, the Z Development and Test Environment. You can run it on Intel Linux in your cloud environment. You can run it in z/VM if you want to. You can run it using z/OSMF. There are lots of different ways to provision a system, all of which can be done in a scripted manner, the same way you script everything else.
In particular, if we think about ZD&T, z/OS running on Intel Linux, it is being provisioned using your cloud technology, whatever you use. If you are using Ansible to provision your systems, you can use Ansible to provision the system. If you are using any one of the cloud providers, you can use that to provision the system. Then once you've provisioned the system, how do I configure that system? I use the same tools. I do the same kinds of things.
So what's your pipeline? How do you do it today? I picked a set of capability to have some fun with. I picked things that everybody was talking about, that I saw in charts while I was here, and that would work with z/OS. If you look at the picture, you'll see a whole bunch of maybe unusual icons. I'm going to pick on the row in the middle first: the Microsoft Azure DevOps stack. You really think I'd be standing up here talking about that stack? I don't think you expected, when you walked in the room, that I would talk about Microsoft Azure DevOps when talking about infrastructure as code for z/OS. There is your surprise. I've been playing with it. It works just fine.
It is a Git repo. Git's Git. That's easy. The pipeline can run tasks. It can run Groovy. Groovy runs on z/OS. I can do what I need to do. I can store my artifacts in the system. That is one thing I need to call out specially. z/OS traditionally did not allow you to take load modules off the system. Load modules, the program output, whatever you've compiled into a program, could not move off z/OS easily. We had to use special tools, XMIT, to get it off the box. We've provided capability to allow you to copy those off the machine without special processing and without losing any of the attributes. I can take that load module, copy it into the hierarchical file system, tar it up, and put it in Artifactory, Nexus, Azure Artifacts; I don't care what it is.
Now it sits alongside all the rest of the parts of the application. Think about it: infrastructure as code, my application sitting there, I can provision my entire environment the same way. My artifacts are staying in the exact same place. They're stored in the same place. I can do the same kinds of capabilities.
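The generic part of that packaging step, "tar it up and put it in the repository," might look roughly like the Python sketch below. It assumes the build outputs have already been copied into an ordinary directory; the copy off z/OS that preserves load module attributes is not shown, and the directory and file names are hypothetical. The upload itself is left out because it depends on the repository product.

```python
import hashlib
import tarfile
import tempfile
from pathlib import Path
from typing import List


def package_artifacts(source_dir: Path, archive_path: Path) -> str:
    """Tar and gzip every file under source_dir; return the archive's SHA-256."""
    with tarfile.open(archive_path, "w:gz") as tar:
        for path in sorted(source_dir.rglob("*")):
            if path.is_file():
                # Store paths relative to source_dir so the archive is relocatable.
                tar.add(path, arcname=path.relative_to(source_dir).as_posix())
    return hashlib.sha256(archive_path.read_bytes()).hexdigest()


def list_archive(archive_path: Path) -> List[str]:
    """Return the sorted member names of a gzip-compressed tar archive."""
    with tarfile.open(archive_path, "r:gz") as tar:
        return sorted(tar.getnames())


def demo() -> List[str]:
    """Package two placeholder files and return the archive's member names."""
    with tempfile.TemporaryDirectory() as tmp:
        out = Path(tmp) / "build_output"           # hypothetical staging directory
        (out / "load").mkdir(parents=True)
        (out / "load" / "PAYROLL").write_bytes(b"placeholder module bytes")
        (out / "deploy.yaml").write_text("target: test\n")
        archive = Path(tmp) / "app-1.0.0.tar.gz"   # hypothetical artifact name
        checksum = package_artifacts(out, archive)
        print("sha256:", checksum)
        return list_archive(archive)


if __name__ == "__main__":
    print(demo())
```

Recording a checksum alongside the archive is a common way to confirm that what gets deployed later is the same artifact that was built.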
You'll notice a few other things up there. Ansible. I've been working on this Ansible support for a while, so it has absolutely nothing to do with the IBM Red Hat acquisition. Before anybody says I'm talking about the acquisition before it has gone through, I'm not. We started this before that. What we've done is work with Ansible to make sure I can create a playbook that runs to allow me to do configuration on z/OS. Imagine I create a playbook that allows me to configure my environment and do all the steps that I need to do. That's exactly the same. Or Rundeck. We've got that working in the environment. It doesn't matter what your tool of choice is; I can do the automation and the work I need to do in exactly the same way I'm doing it on any other system.
We've been doing a lot of work with this inside IBM, but we also want to make this more open. IBM has always been a contributor to open source. We wanted to make what we're doing more open and more obvious. One thing we have here is a project out on GitHub, that external thing where other people can contribute. We've got an area that lets you contribute automation. This sample was contributed by an external source: "I want to do user ID management on z/OS using Groovy in a way consistent with every other platform." Now I have a user add function. I can do user add like I do anywhere else. On the back end, it happens to be talking to RACF, but it could talk to any other security system. It makes user ID management on z/OS look like any Unix platform.
Now your Unix admin can create user IDs on z/OS. More importantly, your automation can create user IDs on z/OS. The idea of this area is that customers, anybody, can contribute their own automation. We're building up an open source community around this infrastructure as code written in ways that can be shared and easily consumed by the current generation of developers coming out of college, not just those of us who know JCL or REXX. This is all done in Groovy, and it is available externally so you can go play.
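To make the "user add that talks to RACF on the back end" idea concrete, here is a small Python sketch. It is not the contributed Groovy sample; it only shows how Unix-style inputs could be translated into a RACF ADDUSER command string. The function name and the example values are hypothetical, and the sketch builds the command without issuing it.

```python
def racf_adduser_command(
    user_id: str,
    name: str,
    group: str,
    uid: int,
    home: str,
    shell: str = "/bin/sh",
) -> str:
    """Translate useradd-style inputs into a RACF ADDUSER command string.

    Only a handful of operands are covered (NAME, DFLTGRP, and the OMVS
    segment); a real site would need passwords, owners, and local policy.
    """
    if not 1 <= len(user_id) <= 8:
        raise ValueError("RACF user IDs are 1 to 8 characters long")
    if uid < 0:
        raise ValueError("UID must not be negative")
    omvs = f"OMVS(UID({uid}) HOME('{home}') PROGRAM('{shell}'))"
    return (
        f"ADDUSER {user_id.upper()} NAME('{name}') "
        f"DFLTGRP({group.upper()}) {omvs}"
    )


# Hypothetical user, for illustration only.
print(racf_adduser_command("jdoe", "JANE DOE", "devgrp", 1001, "/u/jdoe"))
```

A wrapper like this is what lets automation written for other platforms call the same "add a user" operation everywhere, with only the back end differing.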
This is one example submitted by a customer. We had a customer recently present at a conference. They happen to be a relatively large insurance company in the United States. They brought in two new college hires that they wanted to bring in as new system programmers. Brand-new system programmers. They were going to do z/OS installation and verification. Do you think new people want to do this job? You would think they'd go running away. The statement earlier about people not wanting to do COBOL: it is really not that they don't want to do COBOL, it is that they don't want to see the ISPF panels that I make them see. If you give them an environment that they think is friendly and nice and works just like anything else they're used to, then they don't have a problem.
In this customer case, we gave them Groovy. These two new hires got Groovy and a set of tasks to do. The Groovy they had happened to be GroovyZ, so it was extended. It allowed them to do tasks, MVS commands, TSO commands, ISPF commands. It can submit JCL for all those things they didn't really want to figure out, and they wanted to use that existing JCL on the system. They could still use it. Their job was to do update, installation, and verification.
Before they got there, it was a manual task. It took a while: 27 hours of people time for validation. Actually, that is a little fast; the team doing this originally was very experienced. Once these new hires started automating the tasks, in the first quarter of 2019 they got down to 11 hours, and they are now down to four hours to validate it all. That is four hours of automation, not people. This is an entire z/OS system verification. It includes all the middleware and all the system parts. They did all of those stages in Groovy, with help from Google.
The other interesting thing about this experiment was that they really wanted to avoid having these new people harmed by the existing system programmers. They didn't want them jaded by the experience of the existing system programmers. They were trying to hide the limitations the existing people thought there were in the system. They learned a lot through Google and by talking to other people in zNextGen, rather than from having a system programmer hold their hand. They did all this because they were allowed SSH access to a command line on z/OS and were given Groovy. That is what it took. With modern languages and capabilities and young developers willing to learn and play, you get a lot of capability very quickly.
That's one customer example. Another customer had, as part of his job, to create LPARs. He updates and creates LPARs and manages them on the system. He didn't like the fact that he had to spend all this time doing the work. His real job was a whole bunch of other things, but this was one of them. He had a set of manual steps he had to do, and at the end he could IPL his system.
He got to play with GroovyZ and said, "This is fun." He tried taking GroovyZ and writing a set of configuration files for what he wanted his Z to be, checking them into a code repository, in his case Bitbucket, running a Jenkinsfile, and creating the system. A Jenkins slave running on z/OS pulls the data across, pulls the configuration file, runs each of these steps, checks out the code, does the build, deploys it, talks to the system, and IPLs z/OS. It now IPLs in a few minutes.
So if you want a new z/OS image running on Z hardware, I can run this with a configuration file and have my new z/OS system on Z. I've got a bootable image and can IPL it. That provisioning chart I had a few slides ago? I can provision it on Z hardware. This was my proof example when a customer told me it wasn't possible to create a new z/OS image in less than six months. I showed him the script running, and it was under two minutes.
This was built by a customer over time in spare time. It is quite literally spare time. You will notice at the top there is an external GitHub site. He posted it to the external GitHub site and is going to continue to enhance and evolve it so that other people can contribute and share infrastructure as code. This is z/OS core infrastructure as code. All you're doing is specifying a set of configuration parameters, and the system is up.
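The shape of that customer's flow, a checked-in configuration that drives an ordered set of steps ending in an IPL, can be sketched in Python as below. This is not the customer's GroovyZ code. Every name here (`ImageConfig`, the field names, the step names) is hypothetical, and the steps are stubs that only report success.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class ImageConfig:
    """Hypothetical parameters describing the system image to create."""
    system_name: str
    lpar_name: str
    volumes: Tuple[str, ...]


def validate(config: ImageConfig) -> List[str]:
    """Return a list of problems; an empty list means the config is usable."""
    problems = []
    if not 1 <= len(config.system_name) <= 8:
        problems.append("system_name must be 1 to 8 characters")
    if not config.volumes:
        problems.append("at least one volume is required")
    for volser in config.volumes:
        if not 1 <= len(volser) <= 6:
            problems.append(f"volume serial {volser!r} must be 1 to 6 characters")
    return problems


Step = Tuple[str, Callable[[ImageConfig], bool]]


def run_pipeline(config: ImageConfig, steps: List[Step]) -> List[str]:
    """Run steps in order, stopping at the first failure.

    Returns the names of the steps that completed successfully.
    """
    problems = validate(config)
    if problems:
        raise ValueError("; ".join(problems))
    completed = []
    for name, action in steps:
        if not action(config):
            break
        completed.append(name)
    return completed


# Stub steps standing in for the real work; each just reports success.
steps: List[Step] = [
    ("checkout", lambda cfg: True),
    ("build", lambda cfg: True),
    ("deploy", lambda cfg: True),
    ("ipl", lambda cfg: True),
]
config = ImageConfig("TEST1", "LPAR01", ("RES001", "SYS001"))

print(run_pipeline(config, steps))
```

Validating the configuration before any step runs is a deliberate choice in this sketch: a bad parameter fails fast instead of halfway through creating a system.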
There is another aspect to this process. IBM has been doing a lot of work with users and other vendors in this new project called Zowe. It is part of the Open Mainframe Project, and its goal is to help modernize the Z platform. The idea is that there is a browser-based desktop to z/OS; a set of API capabilities, REST-based services to interact with the system; and a command-line interface. The idea of the command-line interface is to allow you to automate tasks from another machine. If I want to run a task from my desktop or laptop, I can automate it from there. This is another way I can use capability within my business to get access to Z function and make it more familiar.
With the CLI in particular, remember that we have a CLI on z/OS, so you can also do things from z/OS. But there are times from a business perspective when I need access to Z from a different machine. I've worked with a client for years that wrote complex code to pull a few pieces of data down, put them in an Excel spreadsheet to process, and send them back. With the CLI, I can easily access Z files, get the data down, process it, and send it back.
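A download, process, upload round trip like the spreadsheet example could be scripted from a laptop along these lines. This Python sketch assumes the Zowe CLI is installed with a working connection profile, and it uses the `zos-files download data-set` and `zos-files upload file-to-data-set` commands. The data set name and local file name are hypothetical.

```python
import subprocess
from typing import Callable, List


def download_command(data_set: str, local_file: str) -> List[str]:
    """Build the Zowe CLI command that downloads a data set to a local file."""
    return ["zowe", "zos-files", "download", "data-set", data_set,
            "--file", local_file]


def upload_command(local_file: str, data_set: str) -> List[str]:
    """Build the Zowe CLI command that uploads a local file to a data set."""
    return ["zowe", "zos-files", "upload", "file-to-data-set",
            local_file, data_set]


def run(command: List[str]) -> str:
    """Run a command, raising CalledProcessError if it exits non-zero."""
    result = subprocess.run(command, check=True, capture_output=True, text=True)
    return result.stdout


def round_trip(data_set: str, local_file: str,
               process: Callable[[str], None]) -> None:
    """Download the data set, process the local copy in place, upload it back."""
    run(download_command(data_set, local_file))
    process(local_file)
    run(upload_command(local_file, data_set))


# Only build and show the commands here; nothing is executed against a system.
print(download_command("HLQ.REPORT.DATA", "report.txt"))
print(upload_command("report.txt", "HLQ.REPORT.DATA"))
```

Keeping command construction separate from execution makes the script easy to review and test without a z/OS system on hand.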
Zowe is part of the Open Mainframe Project. It is external work so anyone can participate and contribute. If you're interested, get involved. Zowe also has APIs backed by z/OSMF and z/OSMF workflows. Here is another way you can use provisioning on the system: use these workflows to provision the systems you want. We're giving you lots of choices. Pick one, but don't stay where you are. Pick one that allows you to do this as infrastructure as code.
We need to help make sure these systems are manageable and maintainable by more than the people my age and older. I've been in IBM 32 years, so you can do the math. We need other people to keep managing these systems, and by providing these modern interfaces across the industry, we help do that. The other important thing is that as we all work together and contribute samples into the community, we can come up with more defined systems.
Right now, most z/OS systems have been configured over the last 30, 40 plus years. I will bet most of you don't know all the bells and whistles and knobs you've turned where. In most cases, you might not actually care. In many cases, you could probably go to a more standard system and not need all those bells and whistles. Getting to a more standard configuration makes it easier to manage and deal with the system.
Infrastructure as code is perfectly plausible. We do it internally. There are companies doing it. It is something you need to consider and do. Every year I've had the question, "Can you please go tell everybody that it's possible to do DevOps on z/OS?" This year I'm going to ask a different question. I have two simple questions. What other open source tools do you want? What other tools are you using as part of your process that you want to work with z/OS so I can go make it work? What else can we provide to make it easier for you to work on the system?
We want to make z/OS not different. Before you take that sentence wrong, let me clarify. I've been working very hard to remove the differences from z/OS that don't matter. The z box and z/OS provide reliability, security, scalability, encryption, performance, take your choice of abilities. But removing the differences because we just did it differently is what we're trying to do. There is no reason you shouldn't use the same source code manager. There is no reason you shouldn't be able to use the same deployment tool. z/OS doesn't need to be different for those things.
It will be different. It has Workload Manager. You don't want us to break Workload Manager; it works really well and helps you optimize your workload. There will be those kinds of differences. What do we need to do on the system to make it easier for you? What's left in this process of making it easy for you to use the systems the same way you use any other system?
That's my goal. That's my assignment. We're working on making z/OS as equivalent to a cloud-native environment as anything else. For those of you who've dealt with me for years, you might know that I say z/OS was the first cloud. It really was. The only thing it was missing is self-service, and now we're working to provide that. Thank you very much, and I'll be in the speakers' corner.