Generative AI in the Enterprise: One Year Later

Log in to watch

Las Vegas 2024

Download slides

Generative AI in the Enterprise: One Year Later

John Rauser

Director of Software Engineering · Cisco Cloud Security

Anand Raghavan

Senior Director of Engineering, AI · Cisco

In this talk, we will look at how the deployment of Generative AI-powered products and services has played out in the large enterprise over the last year. What use cases are winning, and which ones are taking longer to realize value? How has last year's toolchain evolved and updated? How has the development of GenAI products matured across the different facets of software delivery like agile, devops, security, and data science. How has the promise of platforms helped enable large groups to accelerate their development cycles? In the context of these questions, we will review the patterns that are making teams successful, and the anti-patterns that are holding them back, based on lessons from deploying GenAI at one of the world's largest software companies.

Chapters

Full transcript

The complete talk, organized by section.

Host Intro (Gene Kim)

So up next is John Rauser. He's Director of Software Engineering at Cisco, which is now one of the world's largest software companies. John owns Cisco's Zero Trust offering in their Cloud Security group.

He's going to present on Cisco's experiences exploring how Gen AI can enable better integrations across their portfolio products, and how they helped create and become internal customer number one of an internal AI platform. I'm so excited that he'll share some of the organizational dynamics, including patterns and unexpected anti-patterns.

He'll be co-presenting with Anand Raghavan, Senior Director of Engineering, who's basically the head of AI for Cisco Security. Here's John and Anand.

John Rauser

Good morning everyone. It is so great to be here day two of the Enterprise Technology Leadership Summit — Gen AI focus day. I'm so glad to be here. I'm having so much fun connecting with you all — conversations in the hallways, at lunch. It is such a great community here.

I just want to take a moment to thank the programming team and Gene and everybody that puts on this production. It is — yeah, that's right. Let's do that. Absolutely.

And I'd like to say as well: this community — because it's not just a conference, it's a community. This community has helped me so much in my career and in my company. It's helped me with accelerating my career. It's helped me with building better teams. It's helping me with working on bigger things. And I hope that you have that same experience getting involved in this community. It's a real reward. And like I said, I'm so glad to be here.

So I'm back one year later. I'm joined by Anand. We are going to talk about what we're doing at Cisco in Gen AI. But just a quick word about Cisco.

Cisco is one of the largest companies in the world. We do about $50 billion in annual revenue, $200 billion market cap. And you know, you might think of Cisco as a hardware company — that's the older Cisco, selling firewalls, switches, routers. But as you can see on the slide here, over half of our revenue now comes from ARR subscriptions and software. And we're now one of the largest software companies in the world. That's part of the transformation that's ongoing at Cisco today.

There's four key product areas at Cisco, which I've listed out here: Networking, Security, Collaboration, Observability. And Anand and I work in the Security group. We're focused on building security products and building services for security products. And I'll just note here some tremendous growth that we've had over the last year — 32% growth year over year for security across Cisco. And that's a big number. We're not talking about — we're talking about billions here. That's quite a bit of growth, and it's quite spectacular to see that happening at Cisco.

I should stop moving around so much. I just realized there's a camera pointing at me.

A little bit about AI at Cisco. AI at Cisco is huge. We're making a very big bet across Cisco on AI. We've got a strategic portfolio of investments in startups. We've got acquisitions — and actually Anand came in through one of those acquisitions to Cisco. We've got industry partnerships, a lot of strategic initiatives going on. We've got a lot of structuring initiatives, bringing AI to our employees, helping people use it, helping people build products with it. And a lot of grassroots activities as well — hackathons, conferences, all kinds of activities that are coming up. An upswelling of just interest that stems from actually a long history of building machine-learning-driven products in various parts of our portfolio. So we're really looking at AI in a lot of different ways and making a lot of big bets here.

And so it's great to be back here. Last year I was here talking about some of the use cases, talking about how we were planning to build AI. And this year I'm back to talk about how that's working out, and some of the things that we're doing, some of the products that we've actually launched in our portfolio.

Now, there's a question that might be on some people's mind. There's a lot of hot takes out there about AI, and people wondering: is it just hype? And my answer to you today, very clear and simple: no, it's not just hype.

And I think we forget about what it means to have an enormous impact. And it's easy to overlook the fact that a 30% gain, a 10% gain, even a 1% gain at scale — in a business like the ones you're working on here — that is a huge, huge impact. So we are finding that that impact is resonating across the business.

There is actually a report that just came out just recently from Google that looks at this — a survey across the enterprise landscape. 74% of companies are seeing ROI from their investments. 86% are seeing revenue growth from their investments. And 84% are deploying AI within six months, which is incredible to see that.

At Cisco, we're focused on two key areas. One is in our products — building AI into our products. And then the second is helping our employees, so for productivity. And we work in engineering. I've listed out a few of our engineering use cases where we're looking at deploying different solutions to accelerate different aspects of our engineering life cycle. We're actively doing that, and we're testing that too.

So some colleagues of mine in my area just published this report where we've deployed Copilot to see if it actually has an impact on our developer productivity. And the impact is huge. It can't be understated. In fact, one engineer said this to me just the other day. He said, "John, working without these tools is like working without Google. I mean, you can try to do development without Google, but it's really hard, and I wouldn't want to do it. And now that I've got these tools, there's no way I'd want to do it. There's no way to live without it." That's a senior engineer on my team having this incredible impact.

So I want to talk a little bit about how we're actually thinking about our approach to deploying AI in our products. And to do that, I'm going to use a model that we all know and love. It's the layered model from Wiring the Winning Framework — or did I get that right? Pretty close. Wiring the Winning Organization.

So you're familiar with this model, I'm going to assume that. What's interesting to me when I look at this and I look at how Gen AI is going to impact our business: a lot changes up in Layer 3. Those things can change more often than we'd like. Changes in Layer 2 — new things, novel innovations happening at Layer 2. Inventing a new tool category, for example, is not something that happens too often. It's a challenge. And some of you people, some of you may have even tried that. But changes in Layer 1 happen very infrequently. The last major change, truly disruptive change in Layer 1, might have been the DevOps engineer — the introduction of a new role in Layer 1, a new person doing the work.

And with AI, we're introducing a truly disruptive force in Layer 1 — something new, something that we have to figure out how to work with. And I think it's interesting because when we use the model to try to understand this, it predicts what we're going to see. We're going to see an explosion of tools and instrumentation at Layer 2. And we're going to also see that complexity that's going to bleed into Layer 3 if we don't manage it.

So that's where we're taking the approach of using platforms — to achieve some of the goals in Wiring the Winning Organization, to achieve modularity, simplicity, figuring out how to do these things, but most importantly, making sure that we don't experience that sprawl that some of you may be worried about in this room. And actually we're going to hear a little bit more about that later today from John Willis and Joseph Enochs. That sprawl that we saw when DevOps came onto the scene — and we even talked about this a bit at the conference yesterday — 20 different Kubernetes clusters, 10 different monitoring tools. We don't want to find ourselves in that same situation when we're deploying Gen AI, having every product team try to figure these hard problems out themselves. How do I do guardrails? How do I do orchestration? How do I implement compliance and governance into my product?

So that's where the platform — it's truly powerful, and is helping us to accelerate our growth with some costs. So there is the concept of platformification happening here too.

In our group, we made the decision to hold back on our first launch. Now, we did launch the first AI assistant in Cisco, but we held that back so we could leverage the platform and invest in the platform, and then accelerate every other product team after that. So that's the power of platforms that we're seeing today, and we're realizing that value.

That's what Anand is going to talk about next, and I'm very excited to hand off the clicker. There we go.

Anand Raghavan

Thanks John. That live feed is so addictive. I'm having a fun time just reading what people are saying as John is speaking.

So first time here, really excited to be in front of you. Wanted to take a few minutes and walk you through the thought process that we went through in the last 12 months. So my company — we got acquired into Cisco, came in a year ago. And what was that first 12 months like for us, building enterprise AI apps? There's five different apps we are building in parallel on a common platform. We are thinking about common infrastructure across all of Cisco, not just security. So what were some of the lessons that came through that?

First: learn from the customer. A lot of people are building Gen AI products because they want to build Gen AI products. Let's not do that. Let's talk to our customers. Let's look at product and growth analysis. What are the areas where you want to invest in to build something that will help fuel growth for that product?

Second: have a lot of customer interviews. Let's really understand what are the pain points the customers have, and make sure that you're building something using GenAI that solves for that pain point. If it doesn't need GenAI, please don't use it. It only helps if it is solving a real customer pain point.

The next set of things: launch fast and iterate. This market is moving at peak speed. If you're taking three years to launch a product, it's already three years too late. Every month there's a new model coming out. So pick the right LLM and the right platform architecture, roll out an alpha early, and then iterate and start getting feedback from your customers. That's the best way to grow in this fast-changing market.

Third: make it easy to make it better. What I mean by that is, make the product itself help the customer give you feedback on what is working, what is not working. So in-product integration of thumbs up and thumbs down. Have your ML retraining pipeline ready. Have your model eval pipeline ready so you know what's working, what's not working. Have multimodality ready to go when your customers are ready to go from just text-based inputs and outputs to images and voice and video and all of that fun stuff. So: make it easy to make it better.

I'll give you one example. So this is a firewall AI system that we launched December of last year. A lot of work continues to happen on this. And in the firewall case, talking to customers, looking at customer interviews, the pain points are very clear. Number one: we have firewalls for 20-plus years. There's documentation that sits across so many different silos. Learning how to do something — for example, how do I create an access control policy? — could mean clicking into five different PDFs and figuring out what exactly to do. A really good case for a retrieval-augmented generation plus GPT kind of an approach to solve for that. So documentation was number one.

Second: if you have had firewalls for 10-plus years, you have hundreds of thousands, sometimes millions of rules. Let's say you want to ask a question — I know there's a network admin somewhere here. If you have a question — okay, what is the rule that got triggered that blocks Anand's access to dropbox.com, so that I can go move, remove that rule? And there might be multiple of them. So natural language to SQL, or natural language to API calling, to actually identify that rule so that a human can actually go and remove that.

The associated one is to optimize your policies. There's probably several shadow rules, there's duplicate rules. Can you automatically identify these and make it easier for the network admins to manage their firewalls and optimize performance?

And the last one — and I'll get a chuckle here — are there any network admins that are ready to automatically deploy their policies using Gen AI? If so, meet me offline, I'd love to talk to you. But at least can you automate it enough so that it gets easy for the network admin to review a policy that's created, so that they can then deploy at a click of a button rather than sitting and creating it themselves. So those were some of the key use cases we launched the firewall assistant with.

In addition to that, there is an AI assistant for the Cisco Secure Access product that John and team work on. We have one for the Cisco XDR platform where we are changing how Tier 1 analysts interface in the SOC and using AI assistance to completely transform the way they operate. And we have a few more coming down the line for all of the different security products in our portfolio.

I want to spend a few minutes on this slide just speaking through the building blocks of what it takes to build something like this. So first you have the data side of the house. And depending upon what kind of application you're building, you probably need a data store for unstructured data, a data store for structured data, a data store for vector data where you want to store your embeddings, and depending upon what you're building, probably a time series data store as well — so that these are all performing for the kinds of data and the kinds of access pattern you have for them.

Then you have your entire model infrastructure and model environment. You need to have a model runtime environment where you can run these models really fast and efficient. Given the cost of GPUs, and NVIDIA being its own economy in the global world, it's a lot of money to actually run it on GPUs as well. So how do you optimize GPU as a CPU for that? You have your model hosting and inference infrastructure, model repo and tracking to make sure that you have version control for your models, and all of that under the model development and sharing environment. So make it easy for different teams to collaborate on building models and share with each other and all of that. So that's the model ecosystem.

And then how do you improve and get better at this? Of course, you have the model retraining and fine-tuning associated with your model hosting as well. Feedback management — cannot emphasize the importance of that. As customers give you thumbs up and thumbs down, just like you do in ChatGPT, you want to make sure that your data analysts and your machine learning team are in a tight loop of improving the models and fine-tuning them based on user feedback. And very tightly coupled with that is your model eval environment, where you have clear test and training data sets and you're actually analyzing with every new version of the model you're putting out there: are you measurably getting better in terms of model performance?

And then the last one is usage analytics. So if I'm at a product organization building an AI assistant, I really want to know — is it getting used? How often is it getting used? One of the challenges you'll see is, if you're building a product where people don't need to log in every day — if you're just looking at absolute numbers of how often are people using the AI assistant, that may not be an ideal metric for you. Because if I'm a network admin and I only log in if I have to go change a policy, I might log in once a week, maybe once in a few days. So a DAU metric for an AI assistant usage may not be the appropriate one for that. You might want to look at a metric of: when they log in, how often do they use the assistant? Can that be as close to 100% as possible? Can that be 500%? Do they ask five questions each time they log in? That tells you the utility of the AI assistant for that particular user and solving pain points for them. So usage analytics, and thinking through what metrics to put in, is another important part of that.

Going one level deeper, just geeking out a little bit on the stack itself. This is a very simplified architectural view of what we have. So the lowest level of the infra layer — you have all of the different models. You can have your own custom LLMs, you can have just plain calls to GPT for a documentation kind of scenario. You can have NLP services for name recognition, things like that. And then model deploy infrastructure. Above that is a data layer. We spoke about all the different kinds of databases you might have. The AI platform — and I'll go into this in the next slide — an orchestration model that orchestrates across different models becomes very vital when you're supporting a heterogeneity of use cases that the customer has. And then the feedback piece of that.

And on top of a platform like this, now you can plug and play different assistants. All the different ones that I mentioned are the ones that you're building. Now you have a common platform where all of these things are tied together and they learn from each other and benefit from each other. And you can start asking cross-product questions. If I want to say, create a policy that provides John access to dropbox.com, a common platform can create that in the firewall, can create that in Meraki, in Secure Access — all of that with just one question.

And then of course you have your guardrails on the input side and output side to make sure that the data is being used properly, proper security is provided in terms of prompt injection attacks and things like that. Safety concerns are handled — toxicity, harm, abuse, all of that. And then: is it a relevant question? If you ask the model, what's the weather in Las Vegas today? Of course it's hot. But that may not be the most appropriate answer you want your assistant to give, because that's not the job of the assistant. This assistant was trained to do one job in a particular vertical. So relevance also is an important thing for you to think about as you launch these assistants, so that you're not bleeding money to OpenAI because you didn't put in the guardrails — and people can ask, it's a free access to OpenAI that your customers get.

And clicking one level deeper into this — the model orchestrator becomes a very, very important thing in this ecosystem. And what I mean by that is, let's say you have five different use cases a user wants out of your AI system. There is a documentation use case, there is a policy creation use case, policy visibility use case, policy optimization, policy automation. When the user asks a question, how do you understand the intent behind that? And based on that intent, you might want to route it to a different model service that handles the question differently. And that's where a model orchestrator comes in, where it parses that, it understands what is the user trying to accomplish here. And it's able to now make calls to one model, maybe more models as you think about agentic workflows.

And one of the things you're going to hear a lot more this year, if you haven't already, is: if 2023 was the year of RAG, 2024 is the year of agentic workflows. So how can you have an Agent orchestrator that can actually connect across multiple agents, understand what those agents are capable of, get responses from those agents, combine that, and provide it back to the user? So that's what the model orchestrator can do for you.

And with that, back to you, John.

John Rauser

Awesome. Thank you. So really appreciate the time that we got to spend today to talk to you about what we're doing.

We have a few asks. We want to learn from you. How are you implementing Gen AI? So by the way, the best way to connect with us is through LinkedIn, or just come and talk to us in the hallway — that's probably even better. But we want to know how you're implementing Gen AI. We want to know what impacts you're seeing — if you're able to quantify those impacts, if you have some data, we'd really like to hear that.

And then finally, strategically — if you're using a platform approach, if you're not, if you're doing something — we'd love to hear that, how you're thinking about implementing Gen AI at scale from a strategic standpoint. So please reach out to us, add us to LinkedIn, and just say hi, shoot us your thoughts. That'd be great.

All right, thanks everybody. Thank you.