Generative AI Governance Strategy at Scale (Adobe)
In the current software development culture, enterprise companies are constantly balancing engineering flexibility and corporate responsibility. With generative AI’s immaturity and sprawling footprint, there’s pressure on both ends of the scale. In this talk, we’ll share our experiences onboarding hundreds of AI use cases and a process that allowed us to move as fast as a developer with the certainty of a lawyer.
Brian Scott & Daniel Neff
So in my opening remarks this morning, I mentioned that we hosted an event for some of the 50 top technology leaders that I admire most. And it blew me away that a quarter or more were now responsible for some aspect of managing and governing the AI rollout.
And one of the stories that blew me away most came from Brian Scott, who is now principal architect at Adobe within the cloud ops organization, helping solve enterprise-wide issues and helping teams move fast. And so he will be co-presenting with Dan Neff, principal cloud architect, who enjoys improving a team's tempo and flow and tackling unique challenges to scale and evolve.
So what is so remarkable is that they are collectively responsible for creating the generative AI governance strategy for Adobe, which I believe is the third largest software company in the world. They're working with a breathtakingly broad set of stakeholders across the Adobe enterprise. And I love this because on the one hand we want to do things responsibly, but on the other hand you have thousands of developers with a genuine, voracious appetite to have AI help them do the work that they do — and yet they need to balance potentially existential risks to the organization.
So I asked them to present on what they're working on, why they were chosen, what are the risks and opportunities for AI, and what they've done. So with that, I see you, Brian and Dan.
---
Daniel Neff: Thank you for having us. We're from Adobe.
Before we get started with the presentation, just wanted to talk to you a little bit about our company. Adobe's been around since 1982 and is a company of mergers and acquisitions as well as original development. Photoshop we brought on board along with Aldus very early in our journey. But as we've evolved, we've become more and more of a web company, diving into analytics, moving our desktop applications into the Creative Cloud offering, and pioneering some simplified creative technologies around Adobe Express and Firefly. Brian?
Brian Scott: Yeah. So our company's fairly large — dealing with 30,000-plus employees, all working across many different verticals and pushing technology forward. Again, as Dan stated, we've been around for about 41 years, really pushing the industry. In FY23, pulling in about $19.41 billion in revenue. And really trying to align our company and ourselves towards pushing that technology forward — roughly 8,000-plus patents and growing — and always trying to give back to the community.
Daniel Neff: So more about us. Brian and I are coworkers at Adobe. Adobe has vertical business units aligned with customer demand around creative, document, and analytics work. We're part of the cross-functional teams focused on things like finance, security, legal, and in our case technical architecture. We come from a strong operations background going back many years. For those of you that do remember Friendster and MySpace — you're welcome. Or we're sorry, depending.
Brian, you want to talk a little bit about our current role?
Brian Scott: Yeah, sure. We currently lead technology onboarding within what we call the dev and engineering category, as well as open source. This allows us to maintain a portfolio of dev and engineering tools across the enterprise that teams can leverage, but also nurture and foster a great open source community for those folks that either want to use open source internally and/or contribute to open source externally.
And with that, we're really trying to improve the onboarding experience across all verticals, while ensuring that finance, legal, and security keep ownership of their existing responsibilities.
For the past year, we've been asked to focus very heavily on generative AI intake. How do we get generative AI technology adopted across the company, regardless of the business unit, regardless of the role? So that's everybody from someone in finance that wants to use AI to generate summary content, to integrating cutting-edge technology from third-party vendors into our marquee products.
Daniel Neff: So we wanted to come up — with our experience in operations, having lived through Kubernetes and the cloud onboarding and machine learning — we realized there was a place in the future beyond the hype where we wanted gen AI technical onboarding to be as dull and predictable as all the other well-oiled parts of the machine.
So Brian, you want to grab one half and I'll grab the other?
Brian Scott: Sure. I'll go ahead and start off with maximizing responsibility. It's really very much a balancing act. We want to maximize responsibility to promote safe use of generative AI. With all the different technologies coming out, whether they're third-party SaaS tools, large language models, or add-ons to your favorite video solution of choice, there's always this delicate balance: allowing your team to move fast, while still putting the right forms of governance in place. That means compliance, dealing with all the amazing and fun contract language that can exist in those T&Cs, and aligning with data governance policies so that your data is safe, especially when you're dealing with so much of it within the enterprise.
Daniel Neff: And Brian is a well-loved boat anchor on my efforts on the left, which is the traditional Silicon Valley view of solving the customer needs: using the best technology, hacking and iterating quickly in a lean-agile motion, and onboarding new features by bringing new tech into existing technology. For example, people love using IntelliJ, and then want to integrate GitHub Copilot as a third party into those existing standard tools.
We added the Gartner data in the middle for this year, just to point out that this is not an AI problem. This is a challenge that we have had ongoing and will continue to have beyond gen AI.
So with that, let's talk about gen AI a bit more. Brian?
Brian Scott: Definitely. So the hero of our story is what we call the A-through-F framework. This is really focused on a single artifact to allow all of our stakeholders in the entire review process to really be on the same page and understand all the data that's going in and all the data that's going out.
So when we think about this, we leverage what we call Team A. Let's say our customer support engineering team — their objective is to improve customer quality of service. This is very important — to provide better quality of service, to really treat your customers as VIPs and really be attentive to their needs. We then ask for the audience. Well, the audience here is going to be our internal customer service reps. The objective is to make our internal customer service reps more productive.
Now, we ask for the input data type, and here they're defining it as published documentation and internal runbooks. Along with the input data type, we also ask for the data classification. This is very important because it lets all the stakeholders in the review process understand what type of data it is, and when you're dealing with LLMs, the terms can be very different in how certain pieces of data may be treated. This allows, for instance, legal and/or security to understand that the input data is internal data.
Then we ask for the technology. This is also very important. In this case it's Azure OpenAI, LangChain, and a vector database. And so with this, not only do we have our audience, our input data type, the objective, and which team is requesting it, but we also have what type of output they're going to be creating. Here they're going to be enriching their how-to docs with all the context coming from the actual input. We also ask for the data classification for the output. So again, the input data and the output data each have a classification that will come into play down the road. This really just aligns all of our stakeholders and allows them to look at this use case through the same pair of glasses. Dan, anything you want to add here?
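Editor's sketch: the single artifact described above could be modelled as one record per use case. This is a minimal illustration, not Adobe's actual form. The class names, field names, and classification labels are assumptions, and the talk does not state the output classification for this example, so the value used here is a placeholder.

```python
from dataclasses import dataclass
from enum import Enum


class Classification(Enum):
    # Hypothetical labels; substitute your organization's own data classes.
    PUBLIC = "public"
    INTERNAL = "internal"
    CONFIDENTIAL = "confidential"


@dataclass(frozen=True)
class UseCase:
    """One use-case record shared by every stakeholder in the review."""
    team: str
    objective: str
    audience: str
    input_data: str
    input_classification: Classification
    technology: tuple[str, ...]
    output: str
    output_classification: Classification


# The customer support example from the talk, expressed as a record.
support_case = UseCase(
    team="Customer support engineering",
    objective="Improve customer quality of service",
    audience="Internal customer service reps",
    input_data="Published documentation and internal runbooks",
    input_classification=Classification.INTERNAL,
    technology=("Azure OpenAI", "LangChain", "vector database"),
    output="How-to docs enriched with context from the input",
    # Assumed for illustration; the talk does not give this value.
    output_classification=Classification.INTERNAL,
)
```

Keeping the record immutable (`frozen=True`) is one way to make sure legal, security, and privacy all review the same version of the request.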
Daniel Neff: Yeah, I want to just give a little color to the icons on the right. We've reduced this to the critical six data points for an AI use case. But as Brian alluded, there are a few more follow-up details. If you look at the right with the stakeholders, we've represented security with the lock, legal with the scales, and privacy with the box-in-a-box icon. For each of them, they have a few more detailed questions they want us to ask.
So on the left is our minimum viable AI use case. On the right are stakeholders that may have additional details they want to capture, or which elements of the use case are critical to them. I've also shared on the left there the builder and the target customer — which is Team A as the customer support — and the hammer and screwdriver being the engineering team itself. We'll talk about this more as we go forward.
Brian Scott: One thing I'll add real quick, Dan — privacy, legal, and security: obviously every enterprise is different, so you may have other groups or other stakeholders that you just may want to drop in there to be part of that review process.
Daniel Neff: Yes. And understand that in Adobe culture, these use case requests are valid to come from any role inside the company at any level inside the company. So currently we're dealing with hundreds of use cases.
So now that we have a use case filled out in reasonable detail, we have a risk score that various stakeholders will assign. For example, if we look at legal caring about the audience — and we go across to the middle slider of risk — on a scale of zero to one, audiences could be internal or external. In the case we had before, the audience was internal. So the risk score for the audience is on the lower end.
I apologize for my numbers on the right; they don't quite follow the narrative. But you can see similar things with input data being public or private, with the objective being summary or actionable. And so this scoring goes on against every AI use case. Brian, you want to talk about how we actually use the risk score?
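Editor's sketch: per-field scoring on a zero-to-one scale might look something like the following. The numeric values are invented for illustration, and the talk does not say how field scores are combined, so taking the highest field score as the overall score is an assumption.

```python
# Hypothetical per-field scores on the zero-to-one scale from the talk.
FIELD_RISK = {
    "audience": {"internal": 0.2, "external": 0.8},
    "input_data": {"public": 0.1, "private": 0.9},
    "objective": {"summary": 0.3, "actionable": 0.7},
}


def field_scores(use_case: dict[str, str]) -> dict[str, float]:
    """Look up the risk score for each answered field."""
    return {field: FIELD_RISK[field][value] for field, value in use_case.items()}


def overall_risk(scores: dict[str, float]) -> float:
    """Assumption: the riskiest field drives the overall score."""
    return max(scores.values())


scores = field_scores(
    {"audience": "internal", "input_data": "private", "objective": "summary"}
)
```

With these made-up numbers, an internal audience scores low, but private input data still pushes the overall score to the high end.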
Brian Scott: Yeah, definitely. If any of you have ever watched Twister, there's this Dorothy device that they launch into the twister to give folks an early warning system. This is kind of how we treat the risk score. It gives the person who's creating the use case an early understanding of where they sit on the scale, and it also helps them understand: okay, how long is it going to take for my use case to get fully reviewed? If their use case is on the very high end, they know it is going to take some time, there are going to be some additional questions asked, and there's going to be a lot more rigor to the process.
Now, for the folks that are dealing with responsibilities such as legal, security, and privacy, it really allows them to say: okay, we understand this is going to be either a low or a high. And now they can change up their questions to the requester to help speed things along. It really gives everyone an early warning system of the risk level for this use case. So when Dan and I route these use cases over to legal, we provide them with the risk score in the use case so they can easily understand where it sits.
Daniel Neff: Yeah. We have a Christmas buffet of details here, but we are being given by Gene kind of a train-station-kiosk worth of time. So we'd love to go into more detail in the future, but this is kind of the basics.
So we are huge fans of The DevOps Handbook and The Phoenix Project and bringing in the Accelerate metrics to the company. We were also early readers of Wiring the Winning Organization. So we wanted to break down how we went from nothing to something against the solidify, simplify, amplify model. Brian?
Brian Scott: Yes, thanks Dan. With solidify — this whole process is very much an MVP and nothing's perfect right out of the gate. So we end up having to put in a process where we can stop, review, and refine, taking all the feedback not only from the folks that are creating these use cases and going through the entire review process, but also taking feedback from legal, security, and privacy on what they would see within this whole pipeline and how we can actually make it smoother for both sides.
With that, creating some early feedback loops for the folks that are creating these use cases — we put up some upfront quality checks. This is to allow the requester to understand what they may need to adjust or fix before they go through the full review process.
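Editor's sketch: an upfront quality check can be as simple as rejecting a submission with empty fields before it reaches a reviewer. The field names below are assumptions that mirror the use-case data points described earlier in the talk, not the real intake form.

```python
# Hypothetical field names for the intake form.
REQUIRED_FIELDS = (
    "team",
    "objective",
    "audience",
    "input_data",
    "input_classification",
    "technology",
    "output",
    "output_classification",
)


def quality_check(form: dict[str, str]) -> list[str]:
    """Return the problems a requester should fix before full review."""
    problems = []
    for field in REQUIRED_FIELDS:
        # Treat a missing key and a whitespace-only answer the same way.
        if not form.get(field, "").strip():
            problems.append(f"Missing required field: {field}")
    return problems
```

An empty list means the submission is complete enough to route; anything else goes back to the requester immediately.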
Then we try and optimize for approved tech. What this means, if we move into the simplify column in the middle, at the very bottom, is that we really try and optimize for known tech over new tech. This is because we have certain use cases that have to move really fast, and so we try and steer requesters to known approved technology to allow them to move fast through the entire process. It's always going to be faster for them to onboard to existing technology than for us to review and work out onboarding a new vendor or a new technology. Onboarding a new vendor or a new technology is treated as its own separate process.
Then we try and optimize for the shortest job next, meaning we try and route or review the use cases that have a low risk score ahead of those that have a high risk score, just because we know the low ones will get through the pipeline a lot faster.
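Editor's sketch: shortest job next, using the risk score as a stand-in for review effort, amounts to sorting the queue by score. The tuple layout and the case names are assumptions for illustration.

```python
def review_order(queue: list[tuple[str, float]]) -> list[tuple[str, float]]:
    """Order (name, risk_score) pairs so low-risk use cases are reviewed first.

    sorted() is stable, so use cases with equal scores keep submission order.
    """
    return sorted(queue, key=lambda item: item[1])


ordered = review_order(
    [("vendor-chatbot", 0.8), ("doc-summary", 0.2), ("runbook-search", 0.2)]
)
```

A pure lowest-score-first ordering can leave high-risk requests waiting indefinitely, so a real queue would probably need an aging rule as well.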
We also take a file-system-based approach to how we triage use cases — which ones are coming in, which ones are currently in review, and which ones have been completed.
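Editor's sketch: the talk gives no detail on what the file-system-based approach looks like. One literal reading, offered purely as an assumption, is a directory per stage, with each use case stored as a file that moves from one directory to the next. The stage and file names are hypothetical.

```python
import tempfile
from pathlib import Path

# Hypothetical stage names matching the three states named in the talk.
STAGES = ("incoming", "in_review", "completed")


def init_stages(root: Path) -> None:
    """Create one directory per triage stage."""
    for stage in STAGES:
        (root / stage).mkdir(parents=True, exist_ok=True)


def advance(root: Path, name: str) -> str:
    """Move a use-case file to the next stage and return that stage's name."""
    for current, following in zip(STAGES, STAGES[1:]):
        source = root / current / name
        if source.exists():
            source.rename(root / following / name)
            return following
    raise FileNotFoundError(f"{name} is not in a stage that can advance")


# Demo in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    demo_root = Path(tmp)
    init_stages(demo_root)
    (demo_root / "incoming" / "case-001.txt").write_text("demo use case")
    first_move = advance(demo_root, "case-001.txt")
    second_move = advance(demo_root, "case-001.txt")
```

Listing a directory then answers the triage question directly: what is coming in, what is in review, and what is done.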
And then the most important one is really creating a single funnel, a single entry point — meaning we have a single form that everyone goes to to submit their use cases. Prior to this, we had literally about three, four, or maybe even five different forms floating around that were grabbing all this data, and teams were going crazy figuring out which form to fill out, especially if they were maybe working within a specific vertical. So we brought all that together into a single funnel to really enhance their actual experience.
And then to help amplify things — we weave in certain points to really identify high-value use cases, again those that may be going out to market, as well as helping to identify those low-risk use cases to really get those through the pipeline a lot faster. Dan, anything you want to add?
Daniel Neff: No, I think we started off being ticket agents at the airport just trying to move each one faster. And we decided to build a little lounge, and we found that talking with people about their cases upfront — even quickly — allowed us to get them to the technology and solutions they wanted faster. We are doing our best to turn it from a personality-based process to a purely functional process. But a little bit of human interaction doesn't hurt.
With that, if you're going to try and build something like this or do this yourself, here is what we recommend. Capture as much as you can up front from your internal stakeholders, the people that are going to be responsible for sign-off. Create the scenarios that you know would be green, and ask the stakeholders: "Hey, would it be all right to skip the triage and let us tell you that these are green?" And establish a tech radar so that people can self-select known technologies as easily as possible.
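Editor's sketch: a pre-agreed green scenario combined with a tech radar could be checked automatically. The radar contents and the definition of green below are assumptions for illustration; in practice both would come from whatever the sign-off stakeholders have agreed to.

```python
# Hypothetical tech radar: technologies that have already been onboarded.
APPROVED_TECH = {"Azure OpenAI", "LangChain"}


def can_skip_triage(
    audience: str, input_classification: str, technology: list[str]
) -> bool:
    """True when a use case matches a scenario stakeholders pre-approved.

    Assumed green scenario: internal audience, public input data, and
    every requested technology already on the radar.
    """
    uses_known_tech = set(technology) <= APPROVED_TECH
    is_green = audience == "internal" and input_classification == "public"
    return uses_known_tech and is_green
```

Anything that fails the check still goes through the normal review, so the shortcut only ever removes work for cases the stakeholders have already agreed on.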
Brian, I'm giving you the dark side.
Brian Scott: I think there are definitely some things that we've learned. We've stumbled over some rocks, but there are a lot of lessons learned and some things that we feel you should probably avoid.
Don't put more water in the pool than you can drain out. What this basically means is that we were routing a lot of use cases over to our stakeholders, providing them with more at once than they could review. It really goes back to that stop, review, refine process.
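Editor's sketch: one way to avoid overfilling the pool is a work-in-progress limit, where new use cases are released to reviewers only when there is free capacity. The function name and the capacity figure are assumptions; the talk does not describe a specific mechanism.

```python
from collections import deque


def release_batch(backlog: deque, in_review: int, capacity: int) -> list:
    """Pop only as many use cases as the reviewers have room for."""
    free_slots = max(capacity - in_review, 0)
    return [backlog.popleft() for _ in range(min(free_slots, len(backlog)))]


backlog = deque(["case-a", "case-b", "case-c"])
# Reviewers can hold four cases and already have three in flight.
released = release_batch(backlog, in_review=3, capacity=4)
```

Everything not released stays in the backlog in its original order until a review slot frees up.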
Again, taking all those multiple different intakes and just consolidating them into one entry point to allow all teams to go to one form. And really follow the A-through-F framework.
And then perfection over MVP — don't try to perfect your process, that's just going to slow you down. Really focus on your MVP and take in that feedback and iterate, just as if you were building an actual product.
Daniel Neff: Yeah, I want to highlight the multiple intake funnels. There's nothing in your life more dreaded than when the senior vice president of privacy and governance asks who specifically approved this dataset for this team for this use. And the single process is the only thing that's going to save you.
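Editor's sketch: with a single intake process, every sign-off can land in one log, which makes the question of who approved this dataset for this team for this use a simple lookup. The record fields and the sample values are hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Approval:
    """One sign-off, recorded when a use case clears review."""
    dataset: str
    team: str
    use_case: str
    approver: str
    approved_on: date


def who_approved(
    log: list[Approval], dataset: str, team: str, use_case: str
) -> list[Approval]:
    """Return every approval matching the dataset, team, and use."""
    wanted = (dataset, team, use_case)
    return [a for a in log if (a.dataset, a.team, a.use_case) == wanted]


# Hypothetical log entry for illustration only.
approval_log = [
    Approval("internal-runbooks", "support", "how-to enrichment",
             "privacy-reviewer", date(2024, 1, 15)),
]
matches = who_approved(
    approval_log, "internal-runbooks", "support", "how-to enrichment"
)
```

An empty result is itself useful: it shows that the combination was never approved through the single process.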
So hey — we'll shoot up a flare. We would love help if other people have strategies that are similarly aligned in how they're working through what we have. We'd love to improve what we've got going on, or take a look at what you have and give you some tips. So please reach out.
I'd love to share — we have a huge problem with approved technology and vendors versus approved models and datasets and temporary access to those. That becomes an entire effort by itself, just the intersection of what teams are using and who has permission to use it. Vendors aren't really providing a great solution here, but if you've solved this and it's not a headache for you, we'd love to know what you're doing. We'd love to learn enterprise-scale best practices.
So thank you to the previous presenter, and we look forward to seeing what the next group has to share. Brian?
Brian Scott: You hit all the points, Dan. Thanks everyone. Thank you.