The Rise of Agentic Testing: The Old Ways Are No Longer Enough
AI is fundamentally changing software development, including development pace, release frequency, and how quality is ensured. AI use keeps growing as organizations across industries invest more in it, and adopting it is no longer optional. Companies that effectively leverage AI as a path to innovation have a competitive advantage over those that attempt to stick with tried-and-true approaches.
Using AI for software development alone isn’t enough. Users need software that works, is free of bugs, and is high-performing. As organizations increase their use of AI, they must also invest in AI-powered testing to achieve quality and speed.
In this talk, we will look at agentic testing: what it is, how to use it, and why it is essential. You can expect to learn:
What is agentic testing and what are its applications?
Why is agentic testing fundamental to any testing strategy?
How does agentic test automation simplify and accelerate testing?
How was a real organization able to apply these techniques?
Full transcript
The complete talk, organized by section.
Scott Erlanger
Welcome, everybody, and thank you for joining our talk today. The topic of this talk is the rise of agentic testing: the old ways are no longer enough. We are going to talk first about the old ways: what they were, what they are, what people are doing, and why they are not adequate anymore. I am Scott Erlanger, director of product marketing at Tricentis. I am joined by my colleague Ian Flanagan, director of solution architecture at Tricentis. We are going to walk you through the challenges people are facing and what people can do to improve their testing and take advantage of agentic AI.
I want to take a step back and talk about what the software testing story has been so far. Let us start with manual testing. Manual testing is an essential part of testing; people still use it nowadays, even if they are doing test automation. But there are downsides. To test an application manually, you have to click through the application. It is slow, and it is hard to cover every single test case consistently. Even if you test it once, when a change is made you have to cover it again and make sure you are hitting all the cases. Different people often test differently depending on who is doing the testing. It is also high effort, so you need many people to do this level of manual testing.
The first approach to addressing this and achieving more with manual testing is scaling. In order to do more releases, you need more people, more time, and more tests. Eventually this blows up and becomes out of control. Then comes test automation, which we will dig into more. This is when you automate and create tests that you are able to repeat, run again, reuse across the organization, and reuse across applications. This leads to higher quality and consistent repeatability, regardless of who is doing the testing.
When we look at an end-to-end enterprise type of application, it may include web, mobile, a custom application, a payment gateway, Salesforce, SAP, and NetSuite. Many companies have applications that look like this. How do you typically automate testing of a solution like this? Often it involves many different pieces and connecting them together: model-based testing, recorder-based testing, scripting, perhaps Appium, Selenium, unit testing, prebuilt steps, and performance testing. You piece together the overall strategy for this end-to-end application. That makes the automation a fairly complicated process.
Automation has many benefits, which is why it is an essential part of testing. First is speed: you can release faster and have a suite of tests you know exists and can run to achieve the quality you need for your applications. There is productivity: people who did not write the tests can run the tests, and you can leverage them for different projects and applications. There is less risk because you know what your coverage is and what your tests are at that point in time, so there is less potential risk of bugs getting released into production. Finally, higher quality leads to a more positive user experience, higher user satisfaction, and lower churn. Test automation has really upped the game in terms of quality.
There are examples from our customers. One was basically saying they cannot have people spending time on software testing tasks that can be automated. They found they could reuse their team on higher-priority tasks without having to spend time doing manual testing, which had a huge benefit. Another case linked together different SAP solutions, including S/4HANA, Fiori, and Concur, as part of the overall end-to-end test and made it easier. That is what test automation has allowed real businesses to achieve.
Getting back to the point of not having adequate testing, there are real costs associated with this. For example, 63% of people often deploy code without fully testing it. And in a Tricentis study, 81% say that not fully testing code costs them between $500,000 and $5 million every year. These costs come from time spent debugging, time spent developing fixes, system downtime, loss of reputation due to system outages, and other problems. Testing is clearly important, and the applications are fairly complicated. But AI is really changing the process of software development.
Ian Flanagan
Sure is. This quote came out, obviously, about Claude/Anthropic. If anyone has seen this, it seems a little high to me, but I know that is where this is all going.
Scott Erlanger
Absolutely. Six months ago they estimated it would be 90% of the code within 12 months, so you can see this trend is absolutely growing.
Ian Flanagan
Again, really interesting: 61% of CEOs say they are adopting AI agents. To me, the most interesting one is that 75% of software engineers are using AI code assistants of some sort, or will be by 2028. When you look back to 2023, it was less than 10%, so it is a huge change. We know AI is going to contribute a significant amount to the economy, and we see huge growth in the number of organizations using generative AI for code creation specifically. These are all from different sources, so lots of different people are finding similar results in terms of the growing use of AI. Developers are using GitHub Copilot more to be more productive; we are seeing it all the time.
The challenge, and I used to do development myself so I know this very well, especially if anyone has worked with Claude/Anthropic, ChatGPT, or those types of tools, is that there are design flaws and security issues. A lot of your time ends up going to redoing what you did. Who has had that problem? Quick icebreaker: what did the zero say when it started dating? I want to find the one. If anyone wants a dumb bad joke after this, come talk.
Where do teams find the most value? In being able to leverage AI tools to create automation faster and at scale.
Scott Erlanger
Absolutely. The point on this slide is that developer productivity is going up, and testing needs to evolve to keep up with that rate. As Ian said, 60% of DevOps professionals list testing as one of the most valuable areas for AI, one of the top areas where there is benefit and potential.
This slide brings together everything we have talked about so far. AI clearly has many benefits but also challenges. Faster development means you can generate code much faster, but you can also generate more complex code, and more complex code often has more issues. Traditional testing, like manual testing and scaling manual testing, can fall short. It has gaps that need to be addressed. Writing tests is time consuming. If you are using something like scripting, not everyone on the team may have the skills or testing experience to generate these tests. People doing manual testing may not have the skills to create tests, there may be gaps, and one of the biggest challenges is maintenance: constantly needing to fix tests that break.
Ian Flanagan
Absolutely. These all add up to the cost as well.
Scott Erlanger
This is where AI comes in. AI is generating the applications; now we will talk about where AI comes in to help on the testing front. This is a short survey of places where AI is used, not a complete list. AI is used through natural language processing as part of test creation, using a simple description to create things like test cases. It can create complete tests, and we will show that later. Automatic test case generation means that rather than scripting a test, manually testing, or building something with a low-code solution, it can build it from a simple description.
Self-healing tests address the maintenance point Ian mentioned. The tests can fix themselves so you do not have to spend the time on maintenance; the tests keep working. Enhanced test coverage uses AI to analyze the nature of applications, figure out where there are likely to be problems, and figure out where testing should focus. That lets your team focus on the needed testing where it is most needed. The use of AI is about efficiency, ease of use, focusing, and improving application quality through better testing.
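The talk does not describe how self-healing works internally. One common approach is a locator that falls back to alternate selectors when the preferred one stops matching, then remembers which one worked. The sketch below assumes that approach; `HealingLocator`, the selectors, and the dictionary standing in for a page are all hypothetical and are not taken from any Tricentis product.

```python
from dataclasses import dataclass, field


@dataclass
class HealingLocator:
    """A locator that keeps fallback selectors for one UI element."""

    name: str
    selectors: list[str]  # ordered: preferred selector first
    healed_from: list[str] = field(default_factory=list)

    def resolve(self, page: dict[str, str]) -> str:
        """Return the element found by the first selector that matches.

        `page` is a stand-in for a real DOM: a mapping of selector -> element.
        """
        for index, selector in enumerate(self.selectors):
            if selector in page:
                if index > 0:
                    # Record the selectors that failed so a human can review
                    # the change, then promote the working one to the front.
                    self.healed_from.extend(self.selectors[:index])
                    self.selectors.insert(0, self.selectors.pop(index))
                return page[selector]
        raise LookupError(f"no selector matched for {self.name!r}")


# The id changed in a new release, but the data-test attribute survived.
login_button = HealingLocator(
    "login button", ["#login-btn", "button[data-test='login']"]
)
new_page = {"button[data-test='login']": "<button>Log in</button>"}
element = login_button.resolve(new_page)
```

Keeping a record in `healed_from` is a deliberate choice: the test keeps running, but the repair stays visible for review rather than happening silently.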
One customer used Testim Copilot, a generative AI copilot solution and an earlier generation than what we will show in a minute. They found that during releases it was hard to keep up with testing, hard to maintain tests, the solution did not work well for all their global testers, and they had issues viewing reports across teams. By moving toward a copilot, they were able to create tests three times faster, achieve more with 50% more tests running in parallel, and increase tests by 30%. Their overall productivity grew through this first version of AI, Testim Copilot, at a publisher in Europe.
We have walked through the story so far: manual testing, scaling up, and test automation. Now we are moving toward the fourth pillar, agentic testing, where we think testing is going. Agentic testing uses AI agents. These agents work together to create tests autonomously. You do not have to interact with them step by step; you tell them what to do and they do the work for you. They analyze results, continuously get better, and find problems with tests. You can almost view these agents as coworkers. They are not necessarily something you query for a response, then query again for another. You give them a task to do; they do the task, achieve the result, interface with different technologies, bring components together, and deliver an answer, which in this case is a test, with minimal intervention from the user.
Ian is going to walk us through some of the main components of agentic AI and how it all works together.
Ian Flanagan
Let us get into it. Users create different tasks, and those users interact with different agents. Those agents communicate with a large language model, whichever one that is, and that interfaces with MCP (Model Context Protocol) servers. Anyone using MCPs today? Those communicate with different tools or technologies either through APIs or directly with applications.
A user enters a request, it interacts with different agents, and those agents work with a large language model. The agents orchestrate and sequence different tools. They interface with an MCP server. That MCP server sends requests and coordinates responses. The MCP server sends those responses back to the large language model and the agents. This process repeats over and over again.
Scott Erlanger
Finally it sends the response back to the user. Notice this picture is not necessarily related to testing. You could add tools, add a fifth or sixth tool, or take them away. It is adaptable for many different processes. We wanted to make sure everybody understood how agentic AI works.
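The request flow described above (user, agent, large language model, MCP server, tools, and back) can be sketched as a simple loop. This is a minimal illustration under stated assumptions, not an implementation of the Model Context Protocol: `ToolServer` stands in for an MCP server, `scripted_model` stands in for the large language model and follows a fixed script, and the two tools and the URL are hypothetical.

```python
from typing import Callable


# Hypothetical tools; in the talk these sit behind an MCP server.
def open_page(url: str) -> str:
    return f"opened {url}"


def click(target: str) -> str:
    return f"clicked {target}"


class ToolServer:
    """Stand-in for an MCP server: routes a named request to a registered tool."""

    def __init__(self) -> None:
        self._tools: dict[str, Callable[[str], str]] = {}

    def register(self, name: str, tool: Callable[[str], str]) -> None:
        self._tools[name] = tool

    def call(self, name: str, argument: str) -> str:
        return self._tools[name](argument)


def scripted_model(request: str, history: list[str]) -> tuple[str, str]:
    """Stand-in for the large language model: picks the next action.

    Returns (tool_name, argument) or ("final", answer). A real model would
    decide from the request and the history; this one follows a fixed script.
    """
    script = [("open_page", "https://example.test/login"), ("click", "Log in")]
    if len(history) < len(script):
        return script[len(history)]
    return ("final", f"{request}: " + "; ".join(history))


def run_agent(request: str, server: ToolServer, max_steps: int = 10) -> str:
    """Loop: ask the model, call a tool, feed the result back, repeat."""
    history: list[str] = []
    for _ in range(max_steps):
        action, payload = scripted_model(request, history)
        if action == "final":
            return payload  # the response goes back to the user
        history.append(server.call(action, payload))
    raise RuntimeError("agent did not finish within max_steps")


server = ToolServer()
server.register("open_page", open_page)
server.register("click", click)
result = run_agent("test the login flow", server)
```

Adding a fifth or sixth tool is one more `register` call; the loop itself does not change, which is the adaptability the picture is meant to convey.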
We talked a little before about copilots, and I want to highlight how agentic AI is different from copilots, which we will call assistive AI here. With assistive AI, it is more about insights and querying generative AI with a question about very specific, focused tasks: summarizing a test case, asking what a test case does, troubleshooting, finding bugs in a test, providing help, or doing a focused section of code generation for a test. You are doing the work and using the tool to help at each step.
Ian Flanagan
Imagine if you could provide a requirement and have it create an entire end-to-end test case for you within two or three seconds.
Scott Erlanger
Exactly. That is where agentic AI comes in. It is not the restrictive smaller problems. It is larger things where you can give it a requirement and say go. As Ian was saying, it has different agents that can talk to each other. If you need to coordinate performance testing with functional testing, agentic AI has multiple tools it can work with to achieve larger problems. At the end of the day, it is basically a coworker: you give it a task, it does it, gets it done, checks the result, and gives you the response back, rather than you walking it through a series of steps.
There are many opportunities from this new technology, but we want to make clear there are challenges too. It is new technology. You have to apply it correctly to the problems you are looking to solve. For example, you have to make sure you are telling it to generate the right tests in the first place. If you tell it to generate a test that is not particularly important, is not needed, or does not make sense for your application, it will not generate the correct response. It is a garbage-in, garbage-out problem.
There are also quality gaps in judgment. If you are an expert user of an application, you are more aware of certain use cases than the AI is. There is the possibility of accepting without verification, so you need validation of whatever is generated before you put it into production or into a test suite. Testing without domain knowledge is also a challenge.
Ian Flanagan
As you have more complicated systems and applications, it is important to have the right framework surrounding your agentic AI for it to come together and achieve the results you want.
Scott Erlanger
There are steps you can follow to address these challenges and make sure agentic AI is productive and generates quality results. First, determine the AI goals. You need to know what you are using AI for. If you just say, AI, go solve this, you probably will not get the best results. But if you have a plan and say AI should generate these tests in this area, then you have focused where you expect AI to operate and deliver results.
Establish guardrails, which can also be thought of as governance. Give it rules and guidelines for the scope it should consider when solving the problem. Developers have guidelines for what is within requirements for an application; this is giving instructions to the AI about what it should consider within scope for test generation.
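One way to picture guardrails is as an explicit scope that every proposed test plan is checked against before it runs. The sketch below assumes a very simple plan format; `Guardrails`, `violations`, the application names, and the action names are all hypothetical and only illustrate the idea of rules the AI must stay within.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Guardrails:
    """Scope the agent is allowed to work within (hypothetical fields)."""

    allowed_apps: frozenset[str]
    forbidden_actions: frozenset[str]
    max_steps: int


def violations(plan: dict, rules: Guardrails) -> list[str]:
    """Return human-readable reasons why a proposed test plan is out of scope."""
    problems: list[str] = []
    if plan["app"] not in rules.allowed_apps:
        problems.append(f"app {plan['app']!r} is out of scope")
    for step in plan["steps"]:
        if step["action"] in rules.forbidden_actions:
            problems.append(f"action {step['action']!r} is not allowed")
    if len(plan["steps"]) > rules.max_steps:
        problems.append("plan has too many steps")
    return problems


rules = Guardrails(
    allowed_apps=frozenset({"storefront"}),
    forbidden_actions=frozenset({"delete_record"}),
    max_steps=20,
)
plan = {
    "app": "storefront",
    "steps": [{"action": "click"}, {"action": "delete_record"}],
}
found = violations(plan, rules)
```

An empty list means the plan is within scope; anything else is a reason to stop and send the plan back rather than run it.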
The workflows have to be clear. You need to know what processes you are using this for and what processes you are testing. If it is an amorphous huge problem, it may not have a focused solution. Look at how AI is used as part of the process, and make sure the overall process and how AI connects to the rest of your processes works correctly. If AI is used as a subsection of an overall Salesforce testing flow, make sure the rest of the Salesforce flow works before you insert AI in the middle of it.
Over time, track performance. Are the tests it generates getting better or worse? Does it feel like it is no longer hitting the important cases? Do not assume it will continue working forever as your applications change.
Ian Flanagan
Or assume that what it generated was appropriate. The last point is making sure someone reviews what this is generating and that it is valid.
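The review and tracking steps can feed each other: each reviewer verdict on a generated test becomes a data point, and the acceptance rate per release shows whether results are getting better or worse. The sketch below assumes that setup; the function names, the window, the tolerance, and the sample data are hypothetical.

```python
def acceptance_rate(reviews: list[bool]) -> float:
    """Fraction of generated tests a human reviewer accepted in one release."""
    return sum(reviews) / len(reviews)


def is_degrading(
    history: list[list[bool]], window: int = 2, tolerance: float = 0.05
) -> bool:
    """Compare the latest `window` releases against all earlier releases.

    Returns True when the recent average acceptance rate has dropped below
    the earlier average by more than `tolerance`.
    """
    rates = [acceptance_rate(reviews) for reviews in history]
    if len(rates) <= window:
        return False  # not enough history to compare yet
    earlier = rates[:-window]
    recent = rates[-window:]
    return sum(recent) / len(recent) < sum(earlier) / len(earlier) - tolerance


# Made-up reviewer verdicts for four releases (True = accepted).
history = [
    [True] * 9 + [False] * 1,
    [True] * 8 + [False] * 2,
    [True] * 6 + [False] * 4,
    [True] * 5 + [False] * 5,
]
degrading = is_degrading(history)
```

With this sample data the acceptance rate falls from release to release, so the check flags it; with only two releases of history it reports nothing, because there is no earlier baseline to compare against.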
Scott Erlanger
It is important to think through and have a good strategy and overall plan before endeavoring into it, to ensure you get the results you are looking for.
At Tricentis, we have agents as part of our agentic solutions. They are customized for testing and specifically for the types of testing we have. You will see one for Salesforce, for example, customized for testing Salesforce applications. MCPs interface with all of our different tools, AI and otherwise, to connect them into an AI ecosystem. Different technologies are supported: SAP, Salesforce, web, and others. They are built to run with Tricentis tools. If you have the ability to control your agents and ecosystem, they can understand the tools they are built to work with as part of an overall platform that has security and other attributes. Rather than using lots of different AI components, one thing we have been looking into is pulling it together as one single interface you can go to for solving different problems with AI.
Ian is going to give a demo. One thing we will show is what we call agentic test automation. This is test automation generated through an agentic means. It can generate using natural language, use agents to iterate, and create an overall end-to-end test. You can take manual test descriptions and say, create a test for this test case, and it creates the test. It creates complete tests autonomously, with no human intervention involved. It helps users onboard quickly, because it is easier to type something in natural language than to learn a new tool. You can focus on strategy rather than manually building out each individual test step. And, as we discussed, you can reuse people to do other work because the overall testing effort is simplified.
Ian is now going to show demos of some of these capabilities.
Ian Flanagan
This is our agentic testing for Tosca. I do not know if anyone has used Tosca, but it is one of our key products. You can automate end-to-end complete flows: SAP, web, desktop, mobile, really any type of application you want. This is the agentic test creation. On the left-hand side, we provide a specific requirement and it generates an entire test case based on that. Is anyone doing any sort of prompt test creation today?
Prior to something like this, and this is a web app, if I were doing something like this with Playwright or Selenium, it probably would have taken me maybe 20 or 30 minutes to create this test. We provided a prompt, and it is going through and creating this entire flow.
Next is agentic test creation for Salesforce. Does anyone test Salesforce or build Salesforce applications here? Scott mentioned this earlier: one thing different about using something like Claude, Anthropic, ChatGPT, or some of those tools is that in this example I could have it create a test case in Playwright in 15 or 20 seconds. The problem is that it will not have the context. The difference with agents is that this agent was trained to look at the Salesforce cloud that we have configured. It could be community, sales, or marketing. This goes through using our agentic AI and creates the entire test case.
Scott Erlanger
You can see it on the right. It is going through different steps. It was able to create the different steps, and we are rerunning it on the Salesforce environment.
Ian Flanagan
It is learning from what we are doing. If something was not working in the test, it goes back, changes it, fixes it, and reruns the test to make sure it is working. We train this model. It analyzes any custom Salesforce environment you have created and pulls in any custom fields. It is dramatically different than trying to do something like that with Playwright.
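The fix-and-rerun behavior described here can be pictured as a loop: generate a test, run it, and if it fails, hand the failure back to the generator and try again. The sketch below is only an illustration of that loop; `create_passing_test`, the fake generator, the fake runner, and the field names are hypothetical and do not represent how Tosca or the Salesforce agent is implemented.

```python
from typing import Callable, Optional


def create_passing_test(
    requirement: str,
    generate: Callable[[str, Optional[str]], list[str]],
    run: Callable[[list[str]], Optional[str]],
    max_attempts: int = 3,
) -> tuple[list[str], int]:
    """Generate a test, run it, and regenerate with the failure as feedback.

    `run` returns None on a pass, or an error message on a failure.
    Returns the passing steps and the number of attempts it took.
    """
    feedback: Optional[str] = None
    for attempt in range(1, max_attempts + 1):
        steps = generate(requirement, feedback)
        feedback = run(steps)
        if feedback is None:
            return steps, attempt
    raise RuntimeError(f"no passing test after {max_attempts} attempts: {feedback}")


def fake_generate(requirement: str, feedback: Optional[str]) -> list[str]:
    # First guess uses a generic field name; after a failure, use the custom one.
    field_name = "Account Name" if feedback else "Company"
    return ["open new account form", f"fill {field_name}", "save"]


def fake_run(steps: list[str]) -> Optional[str]:
    # Pretend this environment has no field called "Company".
    if "fill Company" in steps:
        return "field 'Company' not found"
    return None


steps, attempts = create_passing_test("create an account", fake_generate, fake_run)
```

The `max_attempts` cap matters: without it, a requirement the generator cannot satisfy would loop forever instead of surfacing the last failure to a person.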
Scott Erlanger
In that amount of time, it created a test, ran it, and it passed. The point we want to get across is that what has worked well so far has worked well, but it will not necessarily get you to the next point. Faster test automation is essential to keep up with the rates of development, especially when developers are using AI to generate applications more quickly. Leveraging AI to solve complete problems rather than only assist with problems offloads the work your team has to do when creating these tests. People can do other work while AI is solving these problems. Finally, agentic AI can coordinate multiple tools using MCPs to scale and add tools that enrich what it has in its toolbox to create tests and ensure best practices. It provides scalability too. The bottom line is that agentic testing is what will help people reach the next level of software quality.
Thank you for joining us today. Stop by our booth, booth 12, in the back in the middle near the coffee and food. We would love to talk more about test automation and any quality issues you may have and help you out. You can also visit us online at the QR code to learn more. I look forward to talking with everybody else throughout the conference. Thank you.