US 2021

Cloud-native DevSecOps at Supersonic Speeds (well...getting there)

The 309th Software Engineering Group at Hill Air Force Base, Utah traditionally handles cradle-to-grave software updates for the fighter aircraft across the Air Force. The organization has operated with platforms in existence since the '70s and '80s, in addition to more recent aircraft.


However, software delivery has to modernize to keep pace with the changing cyber and kinetic threats.


In 2019 the SWEG began its journey to experiment with cloud-native engineering practices - even going so far as proving out that Kubernetes could be successfully deployed onto a jet and flown at supersonic speeds. However, for this to scale, the SWEG has had to invest in paying down significant technical and organizational debts, working with industry partners who are experts in the new practices in DevSecOps, and building the SkiCAMP community to support modern software delivery across aircraft systems.


This year we have been working to support some Air Force Research Lab efforts and deploy software updates from cloud-to-jet in under 24 hours. We're working on modernizing and automating our release engineering practices, moving to product-centered delivery, fully integrating security into all phases of the software lifecycle, and establishing the foundations for cloud-native development, digital engineering/digital twin testing, test automation in the Systems Integration Lab, and optimized flight test feedback processes. We're also building our internal organizational capacity, working with our vendor partners to deliver enablement, training, dojo-like residency programs, and mentorship across our SkiCAMP/SWEG engineers. We're finding ways to leverage technology and updated practices to help the cybersecurity and operations teams work smarter, not harder.


As we continue to build on this effort, we are going to push the boundaries on digital engineering/digital twin testing and analysis, improve our security practices and security engineering, and scale adoption and experimentation. These are big challenges for the next few years as we push the organizational culture and tech foundations to evolve. Additionally, we are going to keep expanding the support, onboarding, and enablement services that bring the larger enterprise programs into the cloud-native solutions, to help modernize at scale and make the SWEG an opportunity for technologists to build and apply the latest skillsets and technologies against mission-critical efforts.

Full transcript

The complete talk, organized by section.

Derek "Eeyore" Bissinger and Michael Snyder

Welcome, everybody. Thanks for joining us.

We're going to get started. We're going to talk about, within the government, within the Air Force, bringing DevSecOps into an organization at supersonic speed. After three years of progress, we're getting there. We're making steps.

Yeah, and we're going to bring up a lot of acronyms. We're going to have a lot of jet metaphors, so be prepared.

Derek "Eeyore" Bissinger

Okay, I'm Derek Bissinger. I go by Eeyore, and I'm a retired 20-year Air Force pilot with the B-1 and currently living out my second childhood, enjoying life as a software engineer now back in the government, part of the 309th Software Engineering Group.

Michael Snyder

I'm Mike Snyder. I'm a principal heading up our defense business unit at our small company called Oteemo. We do cloud-native DevSecOps and transformation work for both federal and commercial customers. And I am pretty much the most non-technical person ever working in the technical field. So we're going to keep this fun.

Pretty good. You did pretty well.

Derek "Eeyore" Bissinger

All right. So I'll start this out.

Government is huge. If you can imagine the Air Force governed by DoD, it's a large entity. So when we try to talk about DevOps in at least this presentation, we kind of have to keep that in mind.

I was working with industry before I got into the Air Force. I was part of the corporate software development world, and I really wanted to bring all of that DevOps goodness that the industry really has grabbed onto and bring that to government. But then I realized how big the government is and how difficult it is to actually effect change.

So this has been about three years of work for me to get to where we are today at this presentation. And so I decided to break it down into: let's find a small team, two-pizza rule. Let's get them together and work as much as we can with industry to learn best practices, to learn DevOps, to learn all of these different tools and all of these different capabilities that exist.

And really, about a year ago, I realized that what we really need to do at the 309th is bring on best-of-industry engineers on contract to sit side by side with government to be hands-on developing solutions together. And this latest effort that we're presenting here started in April, and it was our first real run-through at the 309th with dedicated contract support with best-of-industry. And we were very well-suited in that setup, with our small problem set, to develop capability very quickly.

And our goal was to bring DevOps to the government, to lower cycle time away from the current months that we experience within the 309th to a matter of hours. And really, this is just the starting point. But it sets the need for what we need to be doing within government, how we need to be working at this, and it's still very much the need today.

Michael Snyder

One of the things we found from the beginning is that traditionally, when contractors were brought in, it was to holistically outsource all the development work or push it off or get it out of the government's ownership.

What we came on board to do as part of this team was: how do we help build the organizational capacity within Eeyore and his team to, one, take more ownership of the path to production so we can continuously improve it, and two, solve problems and work together and be more of a partner organization than just a full outsource? So it's a different dynamic for the 309th, as he was saying, to do this.

We're going to be throwing a ton of acronyms at everybody, so just going to level set a few to get started here.

So a big one here is the SWEG. We mentioned the 309th already. There are three software engineering groups across the Air Force, and each of them is associated with one of the three maintenance depots: the 309th, which is based at and connected with Hill Air Force Base out in Ogden, or out in Layton, Utah, just north of Salt Lake City; the 76th SWEG, which is at Tinker Air Force Base in Oklahoma City; and the 402nd, which is at Robins Air Force Base down in Warner Robins, Georgia. So all three of those groups have very specific mission sets and a customer portfolio of a lot of different customers, weapon systems, aircraft systems, and other software programs that they support, with the mission of doing cradle-to-grave software development and software sustainment for the Air Force.

One of the big pieces of software, and one of the most traditionally architected elements of the entire system, is the OFP, right? The Operational Flight Program. That's the embedded system software that the aircraft will not fly without. OFPs integrate and control the propulsion, the avionics, the navigation, targeting systems, all the data, everything that operates within those aircraft. It's all run through the OFP.

Rapid OFP, so the next thing for that: again, like we were saying before, the OFP cycle to get any updates is typically months, sometimes longer, to get those things deployed out to the whole fleet, from initial concept development through development, testing, all the testing on systems integration labs, testing on the aircraft, and then deployment.

The rapid OFP framework that Eeyore has really been leading with the team at SkiCAMP and the 309th is looking at that software supply chain in a small enough size that it can extend all the way from either a cloud or an on-prem development environment to an embedded system in the field, as it says on here, within 24 hours or less. That is a game-changing vision that we're looking for. How can we build that software supply chain that breaks out of that traditional mold, but really sets a new way forward safely, securely, and at unheard-of speed?

Derek "Eeyore" Bissinger

Yeah. What Michael was talking about there with SkiCAMP is a small team, about 10 engineers that are working off base on commercial internet, and we like to call ourselves SkiCAMP.

And we've been working with the Strategic Development Planning and Experimentation Group based out of Wright-Patt, Dayton, Ohio, who is largely the program lead running the requirements for our work, and they're asking for this capability. And the term that we use to refer to that framework that we're delivering is Lightning.

And we have that metaphor, like I said, of cloud to jet. Lightning strikes at the speed of supersonic if we can get there. So lightning's a lot faster, but if we can get to those lightning speeds, that'd be even better to give us that capability that we've been talking about.

All right. So basically, I want to break down a little bit on the current process just to level set and set the framework for some of the unique challenges that we're still struggling with when it comes to this rapid capability to deliver the Operational Flight Program.

So obviously, we have constraints throughout. I'm just going to draw attention to some of the key constraints that we face.

Having come in to write code for the Air Force, I was immediately brought into the understanding of how software development basically works ahead of hardware, and a lot of times even ahead of the testing. So it won't be that uncommon to have software development teams start writing code from requirements to provide combat capability, and write the algorithms and data crunching to make that combat effect happen without even knowing the specific hardware architecture that would be available, as far as how much memory is available and how fast are the drive access times. And so we're having to make a lot of assumptions based on past architecture, and largely based on what the vendor has provided previously, to make these decisions as far as software goes.

There's also the integration in the systems integration lab that is another whole team outside of the software development team that is your hardware test stands, your software integration. It could be a simulation system or any mix between simulation and hardware. And these are experts of those specific test systems. They may not be familiar with your OFP, but their job is to test it on the prescribed regression testing that has been established for past releases.

Then obviously there are manual steps where we will burn source code to a CD to mail to the systems integration lab, and just the same when they're done, they'll turn around and mail that out to the various test wings, like the ones we have at Edwards, and a flight test will be scheduled to evaluate that capability on the actual jet itself.

Everything ends up being a flight test, and that is costly, as you can imagine, and time intensive. There could be delays with mailing this source code between the developer and the integration lab, and then again delays mailing it out to Edwards for that flight test, and delays for that schedule of that particular jet.

And then the last point I want to mention is every release is a full release. There is no single release of a new page in a UI. It is the entire system, from flight control systems to storage management systems to navigation systems. Like Michael was talking about at the outset, everything that's required for that jet is released in one package, one release.

Michael Snyder

And then again, looking at it from end to end, the total value stream, right? For the organizations that we may be a part of, the lines of communication may be separated. They may not have clear program planning. They're not aligned throughout the whole life cycle of the software development, but they have to work together in order for this whole process to come together, so that updates get pushed to the fleet effectively. So that is a very lengthy process to work through all that, a lot of back and forth.

Derek "Eeyore" Bissinger

Right. So that's the current state, and obviously you can see there's a challenge just with the current process. There are additional challenges within the organization as a whole.

Navigating those challenges can be extremely difficult because these are career positions that have been established for testing, for design, for accreditation, for flight safety, and their expertise is extremely valuable, and we want to make sure that they feel part of this.

But obviously, we can't include the several thousand people that a program office has. We wanted to keep this small and limit it to just 10 people. So we decided that what we could do is find a small piece of this that we could effect a change with. So some aspect of the OFP that we could segregate and isolate and bring away from the rest of the system, the rest of the value stream, so that we could actually demonstrate and prove out and illustrate the Lightning framework in a way that everyone from top to bottom could recognize and say, "I can see what my role would be in this." And that was our goal. And really, that drives back to our need statement of trying to get that DevOps to happen at supersonic speeds.

Michael Snyder

Then there is a ton of work. There is a ton of investment and effort happening across the SWEG to try to bring Agile into it, Agile delivery, rewrite and address some of the traditional waterfall ways of operating. But at this scale, where we're talking almost 2,500 engineers and 90 different programs, a lot of which they don't develop from the beginning -- it's an original equipment manufacturer's or a big integrator's product that is then handed off to the SWEG to own after the fact -- it's a very tough challenge and a worthy challenge, but definitely a deeply embedded one that we have to try to work through.

And as you said earlier, we realize this will not be successful if we try to boil the ocean, if we alienate the people who have a lot of these established roles, and we don't find a way of making it so that they see their part in it going forward when we try to scale. That's one of the biggest challenges that we want to make sure we address.

And then we have to have our meme requirement for all slides that are presented. So wanted to make sure we included this one here because it is a government presentation after all.

Derek "Eeyore" Bissinger

Yeah. So basically, this brings us to the point: in order to show that this could actually work, this is a purposeful, systematic approach that basically came out of the last three years of trying, working on making this happen. And just like Michael was saying, the challenges, the existing processes, we cannot just go around all of that. We have to find a way to work with them, and in a way that meets their schedules.

So we have time, we have energy, and we have passion to move this ball forward. And what we wanted to do was create the isolation necessary to be able to accomplish our goal while still responding to and being responsive to all of the other offices that we know at some point will have to be integrated. So the way we did that is to keep the conversation small.

This was something that became very important early on, like year one, when I was working with this team and we were showing our capabilities, we were showing our progress, we were demonstrating our capabilities to program office management, and they immediately saw that value and wanted to add everything under the sun.

So to keep the conversation focused and to keep the ball moving forward, we rapidly came to these four tenets of keeping the conversation about increasing the use of Agile. And it's very important to set these tenets properly. What we're saying here is nothing new. None of these tenets are new.

What we're saying here is within government, working in the Air Force, working on this project with the rest of the infrastructure in the greater Air Force and DoD enterprise, if we could keep the conversation about what we're doing to add, to make it more agile. Not saying you're not agile. We're just saying, you are you, and we're going to add by focusing down hard on Agile and making sure everything we do is Agile.

And with automation, we're not saying all of those systems integration labs are suddenly going to be out of business. What we're saying is we are creating automation where everything is manual. So it's an add. It's like, if we can do more testing, that's a good thing. If we can show that automation happens faster, it doesn't slow down the OFP, that's even better.

And then digital twins, to show various different hardware configurations digitally to help improve the software solution, going back to those hardware architecture standards that allow us to use modern hardware architectures to help improve security, trusted platform, do zero trust. Also get at some of the more reliable system architectures for memory, for networking.

And then if you roll on top of this the work we really worked hard on the last year of getting best-of-industry engineers on contract to work side by side, what we found out here is they already understand this. This is all like, "Of course you do that." But what they're doing is they're keeping us on point.

So it's very easy for our government employees, our government engineers that have been working for 10 years in the current system to fall back on, "Well, let's just write a manual test. Well, we'll get to automation later," or, "We'll do a waterfall because that's what we know. That works with our leadership." So our corporate partners on this were very good at bringing it back to these tenets. Once we said, "This is our goal," they held us to it, and they were very good about that.

All right. So this is really the summary of where we got to. So if you fast-forward to today, this started in April. We were able to get those contracts built and those best-of-industry to sit side by side, and we created what we know as the Lightning Rapid OFP framework.

It's truly focused on Agile top to bottom. It's using DevOps and automation throughout for integration, for testing, accreditation, security, documentation. It uses the digital engineering to help push those known hardware architectures and do multiple parallel system tests. It allows us to get to container-based applications and show that immutability of containers as they move, and really allows us a lot more, gives us so many benefits that are proven, and ultimately gives us that capability to take a line of code and put it in the jet in 24 hours.
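To illustrate the container immutability mentioned above, here is a minimal sketch, not taken from the talk. It assumes an artifact is built once, pinned by a SHA-256 content digest, and then checked against that digest at every later stage. The stage names, the `Artifact` type, and the `ofp-module` name are hypothetical and do not describe the actual Lightning pipeline.

```python
import hashlib
from dataclasses import dataclass

# Hypothetical stage names; the talk does not list the actual pipeline stages.
STAGES = ("integration-test", "security-scan", "accreditation", "flight-load")


@dataclass(frozen=True)
class Artifact:
    """A built container image, pinned by the digest of its content."""
    name: str
    digest: str


def digest_of(content: bytes) -> str:
    """Return a content-addressed identifier for the given bytes."""
    return "sha256:" + hashlib.sha256(content).hexdigest()


def build(name: str, content: bytes) -> Artifact:
    """Build once and record the digest; later stages never rebuild."""
    return Artifact(name=name, digest=digest_of(content))


def is_unchanged(artifact: Artifact, content_at_stage: bytes) -> bool:
    """True if the bytes seen at a stage still match the pinned digest."""
    return digest_of(content_at_stage) == artifact.digest


def promote(artifact: Artifact, content_at_stage: bytes, stage: str) -> str:
    """Refuse to promote an artifact whose content has drifted."""
    if not is_unchanged(artifact, content_at_stage):
        raise ValueError(f"{stage}: digest mismatch for {artifact.name}")
    return stage


if __name__ == "__main__":
    image = b"example image bytes"  # stand-in for a real image
    artifact = build("ofp-module", image)  # hypothetical artifact name
    for stage in STAGES:
        print("passed", promote(artifact, image, stage))
```

The point of the design is that what gets tested and scanned is byte-for-byte what gets loaded, so no stage has to trust a rebuild.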

Michael Snyder

So this has been, as Eeyore's mentioned, a three-year process, right? This all started back when it was just a small government-led team. And then back in April, we really started. Our team came on board from our company, but then also there were several other vendors that were all working together as well on this effort. So it wasn't just us, it wasn't just the government. It was a very good mixture of integrated teams that came from different experiences and different focus areas.

So before that, though, leading in, Eeyore and his team have been working on their on-prem infrastructure. Literally MacGyvered their data center on-prem with bubblegum and--

Heavy data.

And rubber bands to keep this thing up and running.

So we came on in April. It was a process to get all of our contracts up and active finally to bring us on to deliver the work. And by the time we came on, we found out, "Hey, guess what? You have a June flight test event that you're going to be scheduled against already, so you guys better figure it out and get cracking."

What we realized, though, is Eeyore has this tremendous vision for setting the whole Project Lightning framework for a reusable and tailorable and portable capability that can be used to support any aircraft system out there, right? That's the vision for Lightning: how can we change the way that software is truly delivered across the Air Force to the embedded systems.

But we can't just start with all that. So our team came in and we worked with them really closely, and the way that we like to approach this is to start with a cupcake, where you really identify the core components needed to get something out there as proof in production, proof of value, and show people this new capability is possible.

So we did a first design exercise with the whole team. People who had never worked together before came together, right? And we decided these are the components that are going to make up the cupcake. Have to hit security controls, have to hit certain functionalities, have to make sure that we hit these other sorts of architectural things as well. Documented that out, captured that out.

What is the next thing to get to that next level? What's the vision for then? A cupcake starts small, then you want to take it to the next level up. You take it to the birthday cake. So now we're going to go into the next few iterations and start delivering the birthday cake, and then Eeyore's wedding cake, right, when the big party's happening and everything. That's going to be Lightning, the final Lightning thing that keeps growing, keeps evolving, keeps improving.

But we started at least setting a vision, a true technical roadmap for it, and we proved it out, right? We got something to happen. So we established all the technical baselines for remote access, SSO, test automation configuration, the MANDO and the GROGU components, and we integrated with the OFP development team, right? All of this group came together.

And then in June this year, we executed a full proof of concept during a U.S. Air Force flight test event, right? So we deployed a baseline OFP to an aircraft pod at the beginning of an ATO, an air tasking order day. It flew. It came in, we received post-mission data that fed back to the development team, who updated the software, and then we were able to deploy updates within the same air tasking order 24-hour period. In fact, it was less than 12 hours of difference, with testing, with security validation, and with a complete reboot of the system on the hardware.

The proof of value was demonstrated, and it showed that the Lightning framework is truly something that is viable going forward. It was pretty game changing, considering those updates don't happen at that speed, at that timeline at all.
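The timing claim above comes down to simple arithmetic: the gap between receiving post-mission data and deploying the update, and whether the deployment lands inside the same 24-hour air tasking order window. The sketch below shows that calculation. The timestamps are made up for illustration and are not the actual flight test times, and the function names are hypothetical.

```python
from datetime import datetime, timedelta

ATO_WINDOW = timedelta(hours=24)  # one air tasking order day, per the talk


def cycle_time(data_received: datetime, update_deployed: datetime) -> timedelta:
    """Elapsed time from post-mission data arriving to the update being deployed."""
    if update_deployed < data_received:
        raise ValueError("deployment cannot precede the post-mission data")
    return update_deployed - data_received


def within_window(ato_start: datetime, update_deployed: datetime,
                  window: timedelta = ATO_WINDOW) -> bool:
    """True if the update was deployed inside the window that began at ato_start."""
    return ato_start <= update_deployed <= ato_start + window


if __name__ == "__main__":
    # Illustrative timestamps only; not the real event times.
    ato_start = datetime(2000, 1, 1, 6, 0)
    data_received = datetime(2000, 1, 1, 10, 0)
    update_deployed = datetime(2000, 1, 1, 21, 30)

    print("cycle time:", cycle_time(data_received, update_deployed))
    print("inside ATO window:", within_window(ato_start, update_deployed))
```

Tracking this number per release is one way to see whether cycle time is moving from months toward hours.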

Now, let's say again, this was just an initial proof. We still have lots of challenges that we have ahead of us.

Derek "Eeyore" Bissinger

Yeah, we want to get to regular flight tests. We don't want to be waiting for a flight test. So now that we got a demonstrated capability, let's get it on a calendar. Let's give the developers something to iterate over time. Let's try every quarter to get a flight test.

Keep pushing the SWEG to take a look at what we're doing and see where it can fit in. We're already talking to A-10s to see where we can adapt and morph the Lightning framework for them. We're talking to several other programs. The 309th at Hill has over 90 other projects out there. So we have this capability as a small team to bring them all in and show them, give them one-on-one, have them sit down, spend a day in the life with us, and help spread the knowledge out to the rest of the SWEG's program load.

Michael Snyder

So one of the things that we recognize going forward as well is, for this truly to be successful, we need to make sure that we integrate our team that comes from the DevSecOps world and cloud-native world, and the other contractors, with the OEMs and their software development path. That path to production has to merge with Lightning, has to merge with this effort, has to be a true partnership, not a full outsource.

We need to bring the teams, the efforts together. And we also need to continue to mature what we started, right? Within the SWEG and within other SWEGs, and within other software development organizations, Platform One, SkiCAMP, SpaceCAMP, all these other groups that are working together, right? We need mission partners, organizational support, funding, and engineering team members.

The biggest thing for us, success for us in this, is building the capacity within the Air Force to own the path to production. So we want to make sure that our combined team really works together to help level up the true people who own and will drive this effort into the future. And that's Eeyore and his team, and then the broader SWEG.

Derek "Eeyore" Bissinger and Michael Snyder

So again, as we're looking to the future, this is just a nice shot. SkiCAMP is a unique organization within the Air Force in that it is attached to a unit on base, but it is 15 minutes off of Hill Air Force Base. It's in downtown Ogden. It's helping the economic recovery in Ogden.

And there's a big technical community, DevSecOps, product development, a huge tech community that's growing around this effort, and we can't wait to see where it's going, right? We want to make sure that we open it up and bring other people into the mix too, so all the best of the best can come and help support these problems.

And it's a cool working space. It's off base. It's not a cubicle farm. It is in a revamped old feed and manufacturing warehouse, a distribution warehouse from the 1880s.

Yep. And it's changing the dynamic of what it means to truly build software for the Air Force and deliver true mission impact for the work that people are doing.

So with that, we'll transition to questions. Really appreciate everybody's time. We'll be in Q&A on the Slack channels, and can't wait to hear what people think and what people have to say back.

Awesome. Thank you.