Las Vegas 2023
DevOps Technology in Isolation is Pointless

We are closer to stalling every day. We thought we were well on the way to elite DevOps performance, and then we realized that culture and compliance can stop the dream dead in its tracks.


We can deploy quickly to our staging environments and then the compliance kicks in.


This talk will discuss how Controlant's Governance Engineering project was started, how the FDA regulatory guidelines shape the design, and the pains of manual validation, and it will give an overview of our plans for the future.

Full transcript

The complete talk, organized by section.

Heiðar Eldberg Eiríksson

Welcome everyone to our talk, DevOps Technology in Isolation is Pointless.

My name is Heiðar. I'm the DevOps engineering manager at Controlant. I've been in the tech industry since the mid-2000s, coming up on almost 20 years soon, in various roles: QA, developer, sysadmin, and then the last half decade or so in cloud operations and DevOps, specifically at Controlant since late 2018, developing and deploying infrastructure and applications in the pharmaceutical supply chain.

And as you'll see in part of the talk, I have also become somewhat of a DevOps evangelist and thought leader.

With me today is Jaimie.

Jaimie Fryer

My name's Jaimie. I am a quality assurance engineering manager. I've worked in technology now for 13 years, entirely in quality assurance, in multiple industries: in games, in business intelligence, cloud technology, and for almost the last four years in cold chain logistics.

I lead the quality assurance chapter at Controlant, and so I set the standard of quality. I also lead the automated governance transformation, where essentially we are automating compliance.

Heiðar Eldberg Eiríksson

At Controlant, we have a pretty straightforward mission, although it's a lofty goal.

We want to leverage the power of people and technology and automation to facilitate, for our customers and our partners, zero-waste supply chains, to the benefit of everyone here and the planet.

We have had some success in this area. We monitored, actually, every single Pfizer COVID-19 vaccine globally. So billions of vaccines in the last few years with 99.9% plus success rates at a task that, on average globally, has about a 20 to 30% failure rate, actually.

The reasons for that failure rate in general, I'll go into a little bit. But first I want to talk about the beginning of Controlant. It was founded back in 2007 by some friends from college and university. Some of them knew each other from childhood. In the early days, it was very much like many startups: a company of bright minds trying to find interesting technical solutions, perhaps in search of a problem to solve and a market fit.

The breakthrough came in 2009 with the swine flu, actually. Controlant was uniquely positioned to develop technology to monitor the swine flu vaccine in storage and transit within all of Iceland, which enabled the company to gather a lot of experience and knowledge that ultimately led to the development of the first reusable, real-time IoT data logger validated for pharmaceutical use. You can see that in the top right there.

And the accompanying cloud platform to gather data from hundreds of thousands, ultimately millions, of these devices monitoring shipments all over the world.

And that solves the challenge that pharmaceutical companies are experiencing today with their supply chains. So supply chains in pharma, like probably most other industries, are very fragmented. There are many, many companies, many, many handoffs. So there's a lot of fragmentation in the visibility and in the availability of data to really make smart decisions about optimizing the cold chain.

As well as, in the case of Controlant, we give the companies, our customers, the ability to intervene if there is a temperature deviation ongoing before the medicine becomes unfit for use, ultimately saving a lot of medicine that would otherwise need to be destroyed.

There are a lot of numbers here. Annually, about $35 billion are wasted in pharmaceutical supply chains due to temperature deviations. All of this culminates in the pharmaceutical industry being responsible for more CO2 emissions than the car industry.

And that's in stark contrast with over 90% of pharmaceutical executives agreeing that sustainability is a core requirement for the industry going forward.

But the number that is perhaps of most interest to this talk, to this audience right here, is the digitization score of 27 out of 100. The pharmaceutical industry is only very slightly ahead of the public sector in terms of digital maturity and trails far behind, for example, FinTech, the automotive industry, and aviation.

So that presents a challenge for people like Jaimie and me, and we think that DevOps methodology and tools can definitely help.

A little bit about DevOps at Controlant. Back when we joined the company in 2019, everything was running on VMware virtualization. And so SSHing into a machine, SCPing some files around, or rsyncing some zip files, unzipping them in a specific directory referenced in a systemd unit file, and running a `systemctl restart` on the service was considered a deployment at the time.

And there was a lot of technical debt in the infrastructure, single points of failure, and non-redundancy. So Jaimie and I, when we joined, definitely saw that we had our work cut out for us. We made plans to modernize the infrastructure, estimated it could take at least two years, maybe three years.

But then the World Health Organization declared COVID-19 a pandemic, and the Controlant platform was actually picked as a critical supplier for Operation Warp Speed. So we worked with Pfizer and the United States government in monitoring the COVID-19 vaccines, as I said earlier.

This contract, which we've been working on for the last few years, came with some imposed timelines. We needed to modernize our infrastructure in ultimately months, as opposed to years. And we managed to do so.

We saw massive improvements in DevOps outcomes, much improved security posture, high availability, system resilience and uptime, but also release frequency and so on.

But due to the timeline imposed on us in this endeavor, we ultimately sacrificed on culture. So we did the very traditional thing: we just rebranded the ops team as a DevOps team, and sort of clapped our hands together and thought, "That's that," right?

But no. Not at all.

During this, we saw the core chronic conflict that's described in The DevOps Handbook start moving. I recommend everyone here read the book if they haven't already; it's a fantastic resource.

Jaimie and I had many discussions around the time on how we would actually solve this, and he ultimately was the one to push me to read the book. And so now I'm here pushing you all to read the book if you haven't.

We put our heads together. We wrote a DevOps QA white paper, basically translating the knowledge from the handbook into Controlant-specific context. We came up with processes and projects to follow through with that.

And along the way, we picked up a lot of tools. And here we start to talk about the topic, right? The tools.

So we're using a lot of nice tools. We're a polyglot organization. We have Java code, Python code, and .NET code. But we abstract away the pain of deploying that into production by basically putting it all into Docker, running in ECS or EKS or Lambda. And it's 99% deployed using Terraform run by GitLab pipelines.

So it's actually highly automated. And this toolchain sets us up for elite DevOps performance.

But if you humor me, it's very rare for me to have a room with this sort of experience and knowledge that I have in front of me here. And I would like to perform a little bit of an experiment.

I would like everyone to raise their hand if you think that we are able to release once per year.

Yeah, so that's about everyone, thankfully.

Keep your hand up, and keep it up if you think that we can release every six months. Every quarter. All right. Every month. Every week. All right, starting to see hands go down. Every day. More down. Every hour. All right.

So hours or days, right?

It's actually a quarter. It takes us a quarter to actually release.

And the reason isn't the technology or the tools. It's compliance.

We're still on this journey of DevOps maturity, and we're working on, for example, spreading out the DevOps responsibility in the engineering organization, making the engineering teams capable of self-service deployments while still solving the large problems centrally, like logging or monitoring or stuff like that.

Ultimately trying to make everyone responsible and giving every team the capability to take full ownership over the software development lifecycle, or SDLC. You wrote it, you build it, you deploy it, you run it.

And we've seen improvements in the culture, again, some uptick in our delivery frequency. But we've also definitely seen the core chronic conflict shift again in a very real way.

And to shed more light on that, here's Jaimie.

Jaimie Fryer

Thanks.

So yeah, as Heiðar said, the conflict essentially has shifted, right? And now, in our regulated industry, it is essentially a conflict between our compliance organization, who want robust, diligent, and documented processes, and an engineering organization that Heiðar and I have been moving toward the three ways of DevOps: flow, feedback, and learning.

Engineers, we like to solve problems with tools, right? So we need version control, so we install a version control system.

But our compliance organization come from this industry that we pointed out with low digitization. And their solution to issues essentially is to document and execute procedures. That obviously has a lot of human input, which creates these fragile manual procedures.

And one of the downsides we see is essentially the compliance downward spiral, where despite our best efforts, it is actually possible to end up in the spiral I'm going to describe.

Let's take an example from my part of the business, in quality assurance. At the end of every release, we execute our validation testing, which I'll describe a bit next.

One example would be a tester who starts testing before the official start of compliance testing, right? And we'll make an initial repair. Obviously this is not okay, but people make mistakes. So we would go into our test case management system and maybe set the test case to untested again.

And it's not okay, when you're monitoring pharmaceuticals, to just say, "Well, that was a human error." So we make a procedural evaluation. We ask ourselves, "How did the procedure allow us to make this mistake?"

And then we'll probably update the procedure, maybe change the configuration of the test case management system, which creates a more complex procedure. So there's a higher likelihood of non-compliance the next time we initiate compliance testing.

Controlant is part of a pharma quality management system. Pharmaceuticals are important for the modern way of life. When we get sick, we need pharmaceuticals at times to make us better. And obviously as part of Operation Warp Speed, we saw that pharmaceuticals can also be used to prevent sickness in very meaningful ways.

So we safeguard this pharma supply chain by monitoring it, by making sure that they're stored at the correct temperature or the correct humidity. And also in ways that are less obvious, for example, by preventing stockouts so that important medicines are available to people when they need them.

Regulators also care about safeguarding the pharmaceutical supply chain, right? You would expect your government to try and protect you.

These regulations come most typically from the European Union and from the U.S. government, in FDA 21 CFR Part 11. But they're not really made with DevOps in mind. They're made to keep you safe, and probably not really made for you to go fast with flow, feedback, and learning.

Because the evidence of compliance is what's important, right? Proving that you did the right thing is what counts, using procedures and making sure that the right people are executing those procedures, that they have the training to prove that they're capable of doing it.
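As a rough illustration of what "evidence" means here, the sketch below checks that whoever executed a procedure held valid training for it on the day of execution. The talk does not describe Controlant's actual data model, so every name in it (`TrainingRecord`, `ExecutionEvidence`, `is_backed_by_training`, the `SOP-001` identifier) is a hypothetical stand-in.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class TrainingRecord:
    """Hypothetical record: a person is trained on a procedure until a given date."""
    person: str
    procedure_id: str
    valid_until: date


@dataclass(frozen=True)
class ExecutionEvidence:
    """Hypothetical record: who executed which procedure, and when."""
    procedure_id: str
    executed_by: str
    executed_on: date


def is_backed_by_training(evidence, training_records):
    """Return True if the executor held valid training for this procedure
    on the day it was executed."""
    return any(
        record.person == evidence.executed_by
        and record.procedure_id == evidence.procedure_id
        and record.valid_until >= evidence.executed_on
        for record in training_records
    )


# Usage with made-up data.
records = [TrainingRecord("alice", "SOP-001", date(2023, 12, 31))]
evidence = ExecutionEvidence("SOP-001", "alice", date(2023, 6, 1))
print(is_backed_by_training(evidence, records))
```

The point of the sketch is only that the proof is a queryable record rather than a signature on a document; a real system would also need audit trails and access controls, which are out of scope here.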

So as part of our pharmaceutical quality management system, we execute what is called validation testing in quality assurance. We would probably run something like a traditional release, maybe not as modern as it could be, but essentially run all our automated tests, run our manual tests, and then also we do things like pre-test our compliance.

And then we start this validation procedure where, if you look at the blue boxes in the flow, you can see quality assurance is essentially performing testing activities and documentation activities, and our compliance organization, performing the white boxes, is essentially inspecting all the work of quality assurance.

Then if anything goes wrong, we see in the yellow boxes that that might trigger an investigation, which goes back to our compliance downward spiral, where we don't want to make mistakes, but it's totally possible because these processes are filled with humans, and people can make mistakes.

And that's not very DevOpsy, right? Because there's a lot of interruptions, there's a lot of handover and communication, long Slack threads, long meetings in the transitions, essentially where the arrows are. And that can make very fragile procedures, as I've said, with a high communication overhead and a large number of instructions for each of these boxes. Most of them have some form of procedural document.

So quality assurance supports the DevOps effort at Controlant essentially by executing and automating its compliance testing. Because validation, as I've just shown you, is part of a pharma quality management system. And Controlant cannot sell solutions to pharmaceutical companies if it is not compliant, essentially.

And just testing, as you've seen, doesn't make you compliant. You also have to show that you've actually performed the compliance actions in addition.

Like the three ways of DevOps again, quality assurance are essentially automating testing. In 2019, we used to be 100% manual.

And then, yeah, we lead the charge on shortening feedback times. We coach, and we try not to gatekeep in terms of feedback. So the quality assurance engineers embedded in our teams do their best to teach developers how to make production-quality code from the start, instead of trying to stop poor-quality code from leaving the building.

And then in terms of learning, our tests exhibit the behaviors of customers, right? So when we execute our tests, we should be pretty sure that our customers have what they need when things go to production.

Quality assurance is also heavily involved in removing bottlenecks in our current value stream pipelines. We are automating compliance, and we're leading the organizational change toward shorter feedback times by moving our testing from manual to automated. We're also doing hardware test automation.

So, yeah, that's a lot of the challenges. Let's have a go at fixing it, right?

When I realized that we were really struggling with this, I went off and I learned the compliance language. I read the regulations, things like FDA 21 CFR Part 11 and all these sorts of things. And that was really appreciated by our compliance organization in terms of culture.

They appreciated that an engineer had taken the time to understand their world and their perspective. And then we also used what I had learned as the requirements for our automation.

That really started our journey in September 2022. For example, the FDA published Computer Software Assurance, which is a more forward-thinking, more DevOpsy interpretation of how compliance can actually be executed.

Then I went off and networked with the stakeholders. I found the forward-thinking engineers, my grassroots crowd. I found the sponsors. In a company that is now around 500 people, getting access to the C-suite is probably easier than in larger companies. So, fortunately, I found the sponsors.

And it was important that these were the sponsors who could actually greenlight a compliance automation project. And they had to be multidisciplinary. The engineers can't just go off and engineer a compliance solution and then be like, "Here you go, compliance organization." And they're like, "Yeah, but we didn't ask for this."

And then vice versa, right? The compliance organization are not software engineers.

And then look into your value stream map. See where the bottleneck is. Identify if it is a core compliance conflict. Stand by the water cooler and understand what your engineers are asking for and how you can accelerate the compliance.

And then listen to the negative feedback of your plans so that you can remove any of the impediments on the way.

And then kickstart your project. Right, now that you've got a sponsor, kick off your project formally.

We used a lot of metrics, and actually we maybe don't have as many metrics as we'd like. So we use things like the Accelerate book and IBM white papers to say, when you apply DevOps to your compliance, we should see similar outcomes to when you just apply DevOps to your engineering.

And automate as much as possible. Because taking your battle-hardened manual process and replacing it with another process just increases the risk that you're going to make a process that you understand less. And by automating it, essentially, you make your fixes to your compliance issues into software, which are more repeatable and go faster.
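To make "fixes into software" concrete, here is a minimal sketch based on the earlier example of a tester starting before the official start of compliance testing. Instead of adding a step to a written procedure, the tool refuses to record a result that predates the official start. This is not how Controlant's test case management system works, which the talk does not describe; `ValidationRun`, `record_result`, and `ValidationNotStarted` are hypothetical names.

```python
from dataclasses import dataclass, field
from datetime import datetime


class ValidationNotStarted(Exception):
    """Raised when a result is recorded before the official start."""


@dataclass
class ValidationRun:
    """Hypothetical validation run that only accepts results after its official start."""
    official_start: datetime
    results: dict = field(default_factory=dict)

    def record_result(self, test_case_id, passed, executed_at):
        # The guard replaces a written rule ("do not test early") with a check
        # that cannot be skipped by accident.
        if executed_at < self.official_start:
            raise ValidationNotStarted(
                f"{test_case_id} executed at {executed_at}, "
                f"before official start {self.official_start}"
            )
        self.results[test_case_id] = passed


# Usage with made-up data.
run = ValidationRun(official_start=datetime(2023, 10, 1, 9, 0))
run.record_result("TC-1", True, datetime(2023, 10, 1, 10, 0))
try:
    run.record_result("TC-2", True, datetime(2023, 9, 30, 16, 0))
except ValidationNotStarted as err:
    print(f"rejected: {err}")
```

With a guard like this, the mistake never enters the record, so there is nothing to investigate and no procedure to lengthen afterward.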

And be relentless. The transformation that Heiðar described in terms of us becoming a DevOps organization wasn't always easy and had a lot of meetings with a lot of stakeholders. And it took us three years, right? And I expect that it will be the same when we come to automate our compliance, that we are just at the start of this journey, and we will hit some roadblocks along the way.

Build quality assurance into your standards. I mean, I'm a quality assurance leader. Of course I would say this. But I think there's a virtuous cycle here, because quality assurance exists to mitigate risk in development, right? And regulations exist for very similar purposes. So there is a virtuous cycle, essentially, by building your quality assurance into your compliance automation.

And keep selling your solution internally. Repeat the business case until you're blue in the face, right? So that everyone, all your stakeholders, especially your engineers, understand why you're trying to do this, and message how it will transform you in the market.

I think in many compliance industries, especially ones with low digitization, compliance will be a bottleneck not only in your company, but probably in the companies you compete with. So essentially by automating your compliance, by optimizing your release phase, you will go faster than your competitors and hopefully be able to outcompete them in the market.

It's important to find your internal customers. We're fortunate that there are forward-thinking teams in our company. We prototype with them, and we make sure they're not tied to a hard deadline.

When we selected those teams, we wanted to make sure that it was a team without a harsh fixed deadline, so that when we optimized the compliance automation, we weren't tied to the same deadlines.

And then we share their successes with our tools, essentially amplifying the feedback and showing that it can be done even on a small scale.

Demo from the first feature so that people understand what you're trying to do. Our compliance organization, for example, didn't really understand. They got the abstract, but they didn't understand until they saw it for the first time. And that's really important.

Show everyone who wants to see it, right? Don't just demo to a small room of your core stakeholders, because then you maximize the feedback. And record your demos, because maybe your most important stakeholders can't attend all your demos. They're usually quite busy people.

And when you're on the right path, you go from this. So this is, on the slides, essentially the documentation that we hand to our customers after we've executed compliance testing, which is arduously documented with low flow, lots of communication overhead, low feedback in terms of only knowing that we're validated every quarter, and low learning, because we don't really get to understand how our customer uses our stuff until every quarter.

To the start of our success, which is automated compliance. So you can see at the bottom here, essentially, is the start of our automated governance system reporting into GitLab for a medium risk strategy.
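The talk does not say how the automated governance system decides what to report, so the following is only a guess at the general shape of a risk-based check: each risk level requires a set of evidence types, and the gate lists whatever is still missing. The risk levels, evidence names, and the `missing_evidence` function are all hypothetical, and nothing here calls a real GitLab API.

```python
# Hypothetical mapping from risk level to the evidence that must exist
# before a change counts as compliant. The real rules are not in the talk.
REQUIRED_EVIDENCE = {
    "low": {"automated_tests"},
    "medium": {"automated_tests", "peer_review"},
    "high": {"automated_tests", "peer_review", "manual_validation"},
}


def missing_evidence(risk_level, collected):
    """Return the sorted list of evidence types still missing for this risk level."""
    return sorted(REQUIRED_EVIDENCE[risk_level] - set(collected))


# Usage with made-up data: a medium-risk change that has tests but no review yet.
gaps = missing_evidence("medium", ["automated_tests"])
print("compliant" if not gaps else f"missing: {gaps}")
```

A pipeline job could run a check like this on every change and fail when the list is non-empty, which gives feedback per change instead of per quarter.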

And then the added bonus, essentially, was that our compliance organization, now understanding our intentions, messaged to our customers what we actually want to achieve.

So if you can't change the culture, then the tools are essentially useless.

Heiðar Eldberg Eiríksson

Yeah, exactly.

We've started to see success in this area thanks to Jaimie and his great team and all of the effort involved. But we still have a ways to go, and this is us reaching out to you guys to help us.

We love talking about this stuff at length, and we're kind of open people. So if you want to share a cup of coffee or a beer or other beverage of your choice and have a conversation about this, we would absolutely love to. Don't be shy.

But we really would like your help to bring about the culture shift in these conservative, safety-critical industries, such as pharma, obviously, but also automotive and FinTech and aviation, et cetera.

And spread the DevOps mentality outside of your engineering organization. Spread it out toward your compliance organizations. Spread the word about governance engineering until hopefully, for the benefit of all of us, the core compliance conflict is gone for good.

Thank you very much.