Lessons Learned Creating an AI Platform for The Cisco Security Product Suite
John Rauser highlights the development of an AI platform at Cisco, focusing on zero trust security for application access. He discusses the challenges of integrating AI across products and the need for a unified customer experience. The conversation covers managing access policies and how AI can simplify this process. Rauser also shares what it means to be the first customer of a platform: you get to guide what is built first, it takes longer to reach production, and the payoff goes to the whole organization.
Full transcript
John Rauser & Gene Kim
Gene Kim: So last year, John Rauser, Director of Software Engineering for Cisco's Zero Trust offering in the cloud security group, co-presented with Anand Raghavan, now VP of Product for AI across all of Cisco. They gave a fantastic experience report on building an AI platform and using it to support multiple products across Cisco's security product suite.
This was so fantastic because so much of our experience with platform engineering suggests that development teams get value when they can be liberated from needing to know all the details of interacting with an LLM and all the things that it requires — just as we do when we don't want to interact with the details of CI/CD pipelines, container building, environment creation, and so forth.
So I'm so excited to be doing a follow-up session to explore more of their learnings, especially on their very clever use of LLMs to serve as a basis for integrating the customer experience across multiple products. John, I'm so happy that you're here.
John Rauser: Thanks, Gene. Really great to be here. The talks so far have been excellent. I really enjoyed the content, and so happy to be a part of it today.
Gene Kim: Oh, so good. So, John, I introduced you in my words — can you introduce yourself in your own words and describe what you're working on these days?
John Rauser: Sure. So yeah, I'm a Director of Software Engineering at Cisco. I'm responsible for building the Zero Trust security portfolio in our cloud-delivered security solutions. So customers will use us to access their private applications, access internet applications, and do that securely using their always-authenticated access.
It's a really important product in Cisco in terms of the opportunity. You know, Cisco is using security and networking together and trying to bring that as a solution to customers. And I think these two things truly do go hand in hand. And there are so many opportunities in that area to leverage the data we have, the things we know about access and authentication, the places where people are trying to go. So we're looking to leverage AI and LLMs wherever we can to create more value for customers. And that's what I talked about at ETLS in October last year — how we're doing that, how we're executing on that vision.
Gene Kim: Awesome. And, you know, Anand wasn't able to make it, but could you talk about his exciting new role? Hopefully it was a result of great things he did as opposed to punishment for something bad.
John Rauser: He did. Absolutely. So Cisco is really doubling down on the AI-driven initiatives. And Anand and his group have actually moved to run AI products for all of Cisco. So with a lot of success that was delivered in the security area, they've now been sort of elevated to a new position in Cisco to deliver products and help products get delivered across all of Cisco.
And those two things kind of go hand in hand. One is creating products that customers are going to use to build their AI systems, to protect their AI systems, but also to enable teams across Cisco to build AI platforms, to build AI products more quickly and more easily. So those two things go hand in hand. And I think also, if you wanna build a great platform, you also wanna be the first customer of that platform. In my experience, that's a great way to do things. It's the way that Amazon got built, it's the way that lots of the great platforms in the world got built. So doing those things together in the same place is, I think, so important, and is what creates both great products and great platforms.
Gene Kim: Oh, that's awesome. In fact, we're gonna hear about how you are kind of customer number one inside the Cisco security suite. But before we go there, one of the things that blew me away when we talked last year was the novel use of LLMs to create a single pane of glass customer experience. We kind of laugh about single pane of glass, but I mean, really — everyone knows that customers really don't wanna be looking in 10 different places for things, especially when dealing with important things like authentication, firewall rules.
But you talked about just how challenging it is, especially for engineering leaders, to be a part of this exercise — to create a unified customer experience — because it involves so much communication and coordination and cooperation and constraints. These are very constraining things. So can you talk about the challenges of even tackling a roadmap with so many different product teams, and how this led to this very surprising use of LLMs as a way to deliver value to customers?
John Rauser: Yeah, thanks Gene. So I think a lot of people on the call will be able to relate to this problem, which is a lot of different products coming up in a very large company, and customers getting a disconnected experience across those products. So if you've grown very quickly, if you've grown by acquisition as Cisco has in a lot of cases, you come up on this problem where, for a customer to accomplish a job — for their job to be done — it has to cross many different products. And that becomes problematic in a number of different ways: if they're logging in and logging out of those products, but also tracing the workflow and carrying the information that they need to do their job across those different products.
So specifically in the security group at Cisco, we acquired a number of different companies — Umbrella, Duo. We have the firewall suite, we have our XDR suite, we have the different things that get installed on the endpoint. Some of you might be using AnyConnect VPN today. So we have all these different products that get configured in different ways. And really one of the big pieces of feedback we're hearing from customers is, "I don't like this disconnected experience." So how do we solve that?
That becomes a huge organizational problem. It's a management problem, it's a leadership problem. How do you get all these teams aligned to work on the same thing at the same time? We want to build a new dashboard, this single pane of glass, but we're all working on a different cadence. We have our different delivery times, we have our different delivery styles. And so to get everybody aligned to building that new dashboard is very challenging.
Now, I do wanna say one thing though: we are executing on the dashboard. We have actually just recently launched at Cisco Live, just a couple weeks ago, our Security Cloud Control. So it's our new dashboard where you can configure all our security products in the same place — but it took a really long time to get there, right? That organizational, that leadership problem of getting all these teams to work together to do this. So in the meantime, what can we do?
And that's where AI actually comes in. I think one of the opportunities of AI, and one of the things that I look for when I'm looking for what's the next great idea, where should we be focusing in terms of thinking about what problems AI can solve — well, one of the great things that it does is be the bridge between disconnected, disparate data sets, different ways of thinking. So it can connect those things together that are otherwise completely disconnected. It sits in the middle, it can know about all the things. And so we look for those kinds of opportunities: how can we bring these two things together, or in this case these 10 things together, using AI?
How can we create an AI assistant that guides you through the workflow across products? How can we create an AI assistant that allows you to configure multiple things within the same chat window? And so that's what we looked at doing as a stopgap measure, to buy us some time towards that single pane of glass.
Gene Kim: Yeah. And I just wanna sort of punctuate that. What I thought was so novel and amazing and important was sort of your recognition that this is gonna take years just to coordinate, to be able to build something together — and recognizing that LLMs could be a stopgap. So I was studying the transcript from your talk last year, and here's something that riveted me then and still rivets me now. You said, suppose you had firewalls for 10-plus years, or probably 30, 40 years, right? You have hundreds of thousands, sometimes millions of rules, and you have a question like, "What rule got triggered that's preventing John from accessing dropbox.com? And how do I change or remove that rule?"
And I think what I've just found so shocking is — here's clearly a problem that defies easy human analysis. And we all know that LLMs aren't really logic engines, right? The famous example before the reasoning models: it can't even count the number of S's in "strawberry." And yet LLMs can apparently comprehend and parse firewall rules. Can you help me just reconcile those two things? How does one even explain that?
John Rauser: Absolutely, Gene. So it's very interesting. First of all, the context is you've got a very complex access enforcement policy with different user groups, different resource groups, allowing and blocking, and you're trying to take your intention, which is to introduce a new concept into that policy. I wanna block engineering from accessing AI tools. Why would you wanna do that? I don't know; we would never do that, but some people might. So you need to check: is there already a rule in place to do that blocking? And you'll have this question in your head: how can I get that answered?
Well, traditionally it's a very complex analytic experience where you have to go through and check everything. But with AI, obviously you can just pose the question. Now, the point you're making here is: can AI actually answer that question?
And here's the interesting thing — this is where it gets pretty neat — because the AI is not actually looking at this complex policy rule set and making the deduction itself. The AI knows what to do. The AI knows how to solve this problem using tools that it has available to it. So this is where you get into this sort of agentic AI. The AI is an agent that knows what a policy is, it knows where to interact with policy APIs, it knows what tools to invoke to look at that policy and do the reasoning. It doesn't necessarily have to do the reasoning itself.
Where AI can't do the reasoning itself, it can use a tool, the way you would use a calculator. As an AI stepping through the problem, I know that at this point I have to calculate this result. I'm not good at calculating. I'm gonna use this calculator tool, get the answer back, and then continue through the steps. So that's really what's happening when you look at AI solving these problems that involve mathematics or complex logic: it's actually invoking tools that do that work for it, and then piecing the results together.
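Editor's illustration (not from the talk): a minimal Python sketch of the tool-use pattern described above, assuming a simple first-match rule model. The `Rule` fields, the `TOOLS` registry, the tool names, and the sample rules are all hypothetical and do not reflect Cisco's actual APIs. The point is only that the model chooses a tool and its arguments, while ordinary deterministic code does the exact evaluation.

```python
from dataclasses import dataclass
from fnmatch import fnmatch
from typing import Callable, Optional


@dataclass(frozen=True)
class Rule:
    rule_id: str
    group: str        # user group the rule applies to, "*" for everyone
    destination: str  # glob pattern for the destination host
    action: str       # "allow" or "block"


def first_matching_rule(rules: list, groups: set, host: str) -> Optional[Rule]:
    """Deterministic first-match evaluation; no LLM is involved here."""
    for rule in rules:
        group_ok = rule.group == "*" or rule.group in groups
        if group_ok and fnmatch(host, rule.destination):
            return rule
    return None


# Hypothetical tool registry: the model emits a tool name plus arguments,
# and plain code performs the exact work.
TOOLS: dict = {
    "policy_lookup": first_matching_rule,
    "calculator": lambda a, b: a + b,
}


def run_tool_call(call: dict) -> object:
    """Execute one tool call shaped like {"tool": name, "args": {...}}."""
    tool: Callable[..., object] = TOOLS[call["tool"]]
    return tool(**call["args"])


# Made-up policy, evaluated top to bottom.
RULES = [
    Rule("r-101", "engineering", "*.example-ai.test", "block"),
    Rule("r-102", "contractors", "dropbox.com", "block"),
    Rule("r-999", "*", "*", "allow"),
]

# What a model might emit for "What rule stops John from reaching dropbox.com?"
# (assuming John is in the hypothetical "contractors" group).
call = {
    "tool": "policy_lookup",
    "args": {"rules": RULES, "groups": {"contractors"}, "host": "dropbox.com"},
}
hit = run_tool_call(call)
print(hit.rule_id, hit.action)
```

In this sketch the model never scans the rule set itself; it only has to produce the `call` dictionary, and the answer it reports back comes from `first_matching_rule`.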
And here's the interesting thing too: combining that with other things that the user didn't know they wanted when they were asking that question. So, hey, can John access these resources? Yes, they can access them. Do you also wanna see some other information about those resources — the connectivity, their uptime? So these are things that we add other agents to do. We've just added an agent that's the experience management agent — it checks the experience against different resources and can report back, "Hey, this resource is flaky, this resource is unreliable. Did you know that? Did you wanna do something about that too?" And guide them through further steps in the process that they didn't even know they wanted to do. So that's another great thing we can do with AI in this context — suggestions and guiding through these other experiences that are available to people.
Gene Kim: That's super cool and super helpful. In fact, I was rereading the Phoenix Project graphic novel, and it just cracked me up — when they're trying to troubleshoot an issue, of course it's a network change, right? Why isn't it working? Someone changed a firewall rule somewhere. I totally resonate with how hard that is to get right.
John Rauser: That's right. And on that point, Gene — you might ask the question, "Why can't John access this resource?" thinking it's a policy problem, right? You're in the policy context, you've got policy on the mind. We're humans, right? We're thinking about policy. We think it's a policy issue. We spend an hour in policy, you know, grueling through it. It's not a policy issue, it's a connectivity issue — that resource isn't actually connected. AI can do that reasoning and correct that bias in our thinking about how we're approaching the problem, instantly. And so that's a huge opportunity for us as well — to guide users to see those things that they weren't thinking about at first.
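Editor's illustration (not from the talk): one way to picture the bias correction described above is a diagnostic that checks connectivity before it ever looks at policy. The `Resource` fields and the `diagnose_access` function are hypothetical names chosen for this sketch, and the ordering of checks is an assumption about how such an assistant might work.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    name: str
    connected: bool       # is the resource reachable at all?
    blocked_groups: set   # user groups denied by policy


def diagnose_access(user_groups: set, resource: Resource) -> str:
    """Check connectivity first, so a dead resource is not misread as a policy block."""
    if not resource.connected:
        return f"connectivity: {resource.name} is not connected"
    denied = sorted(user_groups & resource.blocked_groups)
    if denied:
        return f"policy: blocked for group {denied[0]}"
    return "ok: access should work"


# The admin suspects a policy problem, but the resource is simply offline.
wiki = Resource("wiki.internal", connected=False, blocked_groups=set())
print(diagnose_access({"engineering"}, wiki))
```

A human with "policy on the mind" would start at the second check; running the cheaper connectivity check first is what saves the hour spent in the rule set.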
Gene Kim: No, that's super cool. I love the way you say it's about giving these tools to the AI so it can use them — just like the parallel of, "Hey, Twilio is a tool you can use to talk to the customer," and it somehow can actually know what to do with it. That is awesome.
So one of the things that you talked about was the dynamics of being first on the AI platform. Anand last year talked about the data stores, the models, the training, the fine tuning, the evals, the usage. Can you talk about what parts of the platform were most useful to you? Maybe any observations you're making about which parts of the AI platform are actually most used by other people? Any surprises, as a consumer — the inside skinny on what was delivered?
John Rauser: Yeah, sure. So when I think about platforms and what they are and what we're doing with these platforms, we're providing infrastructure and tools to teams as a service in this organization so that they can go faster with AI. This is not new — everybody on the call will recognize this pattern. We did it with DevOps, we did it with other things too, which is to provide these platforms that allow teams to sort of unlock value very quickly and then move past that problem.
And one of the things that I've noticed about platforms — when we build them and provide infrastructure to people — there are all these tool opportunities. You can take that platform, turn it into a tool, maybe sell that tool as well. The productization of the various elements of the service chain, of the chain of activities — they can be provided, they can be sold, they can be productized. And so what happens there is that element of the chain of activities becomes its own platform. And so there is a recursive element to this where you have a global platform that's provided to Cisco in general, and then other people are plugging into that.
Well, what do they end up plugging into that? They often end up building their own sort of mini platform to host their specific business-critical needs. The generic platform is not actually supposed to solve specific business problems for security, or for Webex, for calling, for networking — it's to provide that sort of innovation architecture. So, "Here's a way to access LLMs. Here are some guardrails in front of them. Here are some evals. Here's a RAG system. Here are the different pieces that you need — now go quickly and use them." But then we find out we have business-specific needs that need to be implemented at each layer. So we might create our own sort of mini platform to host those specific needs.
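Editor's illustration (not from the talk): a small Python sketch of the layering described above, where a generic platform offers guarded model access and a team plugs in its own business-specific agent. `GenericPlatform`, `fake_llm`, `policy_agent`, and the banned-term guardrail are hypothetical stand-ins, not the actual Cisco platform design.

```python
from typing import Callable, Dict

Agent = Callable[[str], str]


class GenericPlatform:
    """Shared layer: model access behind a guardrail, plus a plug-in registry."""

    def __init__(self, llm: Callable[[str], str], banned_terms: set):
        self._llm = llm
        self._banned = {term.lower() for term in banned_terms}
        self._agents: Dict[str, Agent] = {}

    def complete(self, prompt: str) -> str:
        # The guardrail runs before any team's prompt reaches the model.
        if any(term in prompt.lower() for term in self._banned):
            raise ValueError("prompt rejected by guardrail")
        return self._llm(prompt)

    def register_agent(self, name: str, agent: Agent) -> None:
        self._agents[name] = agent

    def ask(self, agent_name: str, question: str) -> str:
        return self._agents[agent_name](question)


def fake_llm(prompt: str) -> str:
    # Stand-in for a real model call, so the sketch runs offline.
    return f"[model answer to: {prompt}]"


platform = GenericPlatform(fake_llm, banned_terms={"password"})


# Team-specific "mini platform" piece: security context layered on top of
# the generic platform's guarded model access.
def policy_agent(question: str) -> str:
    return platform.complete(f"As a security policy assistant: {question}")


platform.register_agent("policy", policy_agent)
print(platform.ask("policy", "Which rule blocks dropbox.com?"))
```

The generic layer knows nothing about security policy; the team-owned agent carries that context and could later be upleveled into the shared platform if other teams need it.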
I talked about those agents — the policy agent and experience agent — these different things that are specific to our context and our use case. So you get this kind of recursive nature. And that's what I found surprising about this after a year or more of experience with these things — we have to find ways to allow people to use the platform, enable people to use the platform, but then also to plug in with their own needs, which might involve building their own infrastructure and then sometimes upleveling that into the larger platform. So that kind of thing is going on.
Gene Kim: Yeah. In fact, like — Cissy Young, she's a principal engineer at GitLab, and she was saying there are some parts where it's just not known what belongs in the big platform and which ones belong with the teams. And you have senior engineers on each one of these, and you often get into a situation where there are lots of opinions, lots of heat but no light — or sometimes it's really contentious. Does that resonate with you? And what are some rules of thumb you've developed to sort of help teams get to where they need to go with the least amount of wasted energy?
John Rauser: Yeah, and this is the thing — this is this sort of duality, the tension of platforms. It can be an enabler and it can also hold us back. So actually, Gene, you shared with me a great paper, published on your site there — "Building Shared Services That Don't Suck."
Gene Kim: Yeah.
John Rauser: It gives generic advice that is useful to everybody here. If you're looking at building infrastructure and tooling — AI infrastructure and tooling — for your organization, you gotta read that paper. It's just fantastic. And it really captures the sort of zeitgeist of the need for platforms, but also not letting platforms get in the way. So how do you choose what goes where? Well, it's difficult. We have to kind of make that decision as we go along and feel it out. This is all new territory. What's slowing people down? What's helping people? That is our challenge right now, and figuring that out is some of the wisdom that we're going to gain and be able to share with the group as we go forward.
Gene Kim: And then — so you were specifically talking about the benefits of being customer number one. Talk about why you say that.
John Rauser: The benefits of being customer number one — yes — and the challenge of being customer number one. So, you know, we wanna be our own customers, we wanna have our experience with the platform ourselves. The benefit of being customer number one is that we get to kind of guide what's needed in the platform. We get to inform more closely what should be built first. As more people come on the platform, there are competing needs, priorities, all that kind of stuff. So we're no longer the first and only customer — it's not just about us anymore.
So self-centered, right? But the benefit of being customer number one is you get your needs in there. Now the challenge is it does take longer. It does take longer to go to production, to build with a platform. But you get that organizational benefit — you're giving back to the organization, you're benefiting the greater good. It's not faster, but it is better. And in the enterprise, that is what we have to think about. It's not just about us. It's not just about getting our specific product to market quicker. It's about getting all of Cisco, or all of your company, enabled to build their products quicker. That's what platforms are. And that's what we're doing.
Gene Kim: Yeah. And just to clarify — it wasn't faster for you, but it was faster for everyone else at Cisco who benefited from that work, right? That's the idea.
John Rauser: Yeah, exactly. Yeah.
Gene Kim: And then hopefully that counts as some valuable currency — like, next time you need a favor, next time you need a feature, that gets paid back.
John Rauser: Absolutely. That's right. You built some social capital there.
Gene Kim: Hey John, thank you so much for helping connect the dots in the presentation that you and Anand gave last year. This is terrific. Keep up the amazing work, and hopefully we'll see you in September as well.
John Rauser: Fantastic. Thanks Gene. Great to be here.
Gene Kim: Thank you, John. See ya.