Further Results of Our 500-Person GenAI and Developer Pilot

Connect Feb 2025

Fernando Cornago shares insights on his role in digital and e-commerce technology, focusing on a generative AI pilot designed to boost developer productivity. Initially met with skepticism, the pilot eventually demonstrated significant daily usage and improved efficiency. The discussion highlights the limited time developers spend on core tasks and the varying performance across teams, emphasizing the need for broader AI adoption and collaborative research on its effects.

Full transcript

The complete talk, organized by section.

Host Intro (Gene Kim)

Gene Kim: All right, the next speaker up is Fernando Cornago, who I've been following and studying for eight years. He is now the global SVP of digital and e-commerce technology at adidas, that amazing technology organization. Last year he gave a phenomenal presentation on their 500-person GenAI developer pilot. He described his hopes that it would elevate developer productivity, and he had some absolutely remarkable findings that many of us have been pondering ever since, including last week.

Gene Kim: I'm delighted that he is going to be able to go into some of his conjectures and observations in even more detail. Specifically, the goal is to validate these intuitions and show that they are actually repeatable and can be spotted in the world at large. Thank you so much for being willing to share the continuation of that story.

Fernando Cornago

Fernando Cornago: Thanks a lot, Gene, for inviting me. Always great to talk to you and share some stuff.

Gene Kim: Fantastic. As I mentioned, you gave this amazing talk about the pilot that you did, and I am setting a timer for 20 minutes. Can you talk a little bit about the pilot, specifically your goals for it, and maybe recap some of the stunning statistics?

Fernando Cornago: First of all, it's not 500 people anymore. Half a year later, it is 700 people using Copilot every day. The goals: as a technologist for a company like adidas, I always tell my team that the goals of tech are to try technology that is popping up, really succeed or fail fast, and help our business. We had to eat our own dog food. A couple of years ago, this new thing called GenAI for coding was coming, and of course we had to try it.

Fernando Cornago: How did we measure it? We measured it in three ways. First was adoption. I've been in this industry a long time, and there is no way that you can push a tool onto our engineers. They will find a way not to use it, and it will never pay off. That was the proof, because we started two years ago, and Copilot was not the first tool we tried. The first pilot was a disaster.

Gene Kim: To be clear, it was because no one wanted to use it.

Fernando Cornago: Exactly. Ninety percent of our developers said, I'm wasting my time. I'm firefighting and troubleshooting with the tool, and I'm not getting any value from the recommendations, autocomplete, test-data creation, et cetera.

Fernando Cornago: Second, you know me from different talks: I'm obsessed with metrics. We measured quantitative metrics. For the pilot, just to be clear, I have not measured for the last six months, so I'm going to do it now and I'm going to see new metrics. But we checked usage, and 82% of the 500 users at that time were using it daily. We also took more detailed quantitative metrics, like the number of pull requests or commits. For 60 to 70% of the people, the number of commits and pull requests, our proxy for efficiency, was growing.
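A minimal sketch of how a "share of people whose commits or pull requests grew" figure could be computed from per-developer counts. The talk does not describe how adidas calculated it, so the function names, developer IDs, and counts below are all hypothetical sample data.

```python
def share_improved(before: dict[str, int], after: dict[str, int]) -> float:
    """Fraction of developers whose count rose between two periods.

    Only developers present in both periods are compared.
    """
    common = before.keys() & after.keys()
    if not common:
        raise ValueError("no developers appear in both periods")
    improved = sum(1 for dev in common if after[dev] > before[dev])
    return improved / len(common)


def relative_change(before_count: int, after_count: int) -> float:
    """Relative change of one developer's count, e.g. 0.25 for +25%."""
    if before_count == 0:
        raise ValueError("relative change is undefined for a zero baseline")
    return (after_count - before_count) / before_count


# Invented monthly pull-request counts (not adidas data).
prs_before = {"dev_a": 20, "dev_b": 15, "dev_c": 30, "dev_d": 10, "dev_e": 12}
prs_after = {"dev_a": 25, "dev_b": 18, "dev_c": 29, "dev_d": 13, "dev_e": 12}

print(f"share improved: {share_improved(prs_before, prs_after):.0%}")
print(f"dev_a change:   {relative_change(prs_before['dev_a'], prs_after['dev_a']):+.0%}")
```

Counting commits and pull requests measures activity rather than delivered value, which is presumably why the pilot paired these numbers with the adoption data and the survey.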

Fernando Cornago: The qualitative aspect was also very important. We surveyed all the users, and we got 91%. Remember, the other tool was 90% or 91% saying they would not use it. In this case, 91% said it was very useful and they did not want to work without it anymore. The satisfaction was between 60 and 80% of their time improved. The overall efficiency question was: are you more efficient, do you feel more efficient in your time? They consistently reported between 20 and 25% more efficiency in their craft. They were faster achieving what they wanted to do, whether that was a code change, creation of a new routine, a new algorithm, or a new feature.

Gene Kim: You mentioned when we talked last week that you were seeing double-digit gains in pull requests and commits. Can you share something more specific, or we can just leave it at that?

Fernando Cornago: The last time we measured, it was exactly this: between 20 and 30% improvement for 70% of the people already in the first month of using the tool. That is why now, six months later, we need to engage again in a couple more surveys and check the data, because I assume, and I really like the phrase embrace exponential, that it is going to be exponential. We have all the data.

Gene Kim: We have so much to talk about. You cited the Gartner study that said the average developer spends only 25% of the time doing what they really want to do, as you put it: spending time on their craft, hands on keyboard, advancing the goals for themselves and for the team. I know you care a ton about liberating developers from things they should not be doing. What are your observations? What did you do about it? What have you seen? Where do you want to go?

Fernando Cornago: This was the most curious finding after the pilot. We checked the pilot for using generative AI and seeing how teams got better in their craft. In conversations with our board members and executive staff, they asked: now you are more efficient, does this mean we are cheaper? How much more efficient are we as a tech department thanks to this?

Fernando Cornago: This 20 or 25% raised the question: 20% of how much time? How much time are our developers spending on their craft, in their IDE? We took a look at Gartner, and Gartner was saying that a company like ours would have only between 20 and 25% of engineering time spent on their craft. I could not believe that, because I walk through the floors of our different technology locations and I see them most of the time in the IDE.

Fernando Cornago: I recycled an exercise that we did in 2018, when we were starting to hire engineers and comparing how an internal engineer spent their time with how externals had spent theirs in the past. I found two trends. First, we are a little bit better than what Gartner said. Our engineers spend, on average, 36% of their time coding and testing, which I still think is too low, but it is what it is. Second, compared to 2018, we became much better.
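The board's question ("20% of how much time?") can be made concrete with a little arithmetic. The sketch below is an illustration, not the method adidas used. It assumes the self-reported 20 to 25% gain applies only to coding and testing time and means that the same work takes that much less time. The function name `overall_time_saved` is hypothetical.

```python
def overall_time_saved(craft_share: float, craft_gain: float) -> float:
    """Fraction of total working time freed when only craft time gets faster.

    craft_share: fraction of total time spent coding and testing (0..1)
    craft_gain:  fractional time saving on that craft time (0..1)
    """
    if not (0.0 <= craft_share <= 1.0 and 0.0 <= craft_gain <= 1.0):
        raise ValueError("shares and gains must be between 0 and 1")
    return craft_share * craft_gain


# Coding-time shares quoted in the talk: Gartner's upper estimate and the
# adidas average. The gains are the survey's self-reported range.
for label, share in [("Gartner estimate", 0.25), ("adidas average", 0.36)]:
    for gain in (0.20, 0.25):
        saved = overall_time_saved(share, gain)
        print(f"{label}: {gain:.0%} gain on {share:.0%} of time "
              f"-> {saved:.1%} of total time")
```

Under these assumptions, a 20 to 25% gain on coding time frees roughly 5 to 9% of total working time, which is why the share of time spent in the IDE matters as much as the tool's effect.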

Fernando Cornago: We tested seven different teams for this study, and they were teams at completely different technology maturity levels, and even different states of their project or the value they were delivering. Some were steady-state value, others were in a big project. We split their time between coding time, which is what Gartner was measuring, and productive time. We call it time on keyboard, but you can call it happy time, time on your goal, or craft time. It is not only coding; it is also analysis, design, documentation. It is all the work developers or engineers are hired for. It is what they love. It is when they thrive. The rest is troubleshooting, dealing with misalignments with the business, discussing roles and responsibilities, trying to find the root cause of a problem across different documentation, asking access for a system: all this is what we call waste, or really annoying time. I like happy time and annoying time, or value time and waste time.
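The split described here (coding time, wider craft or "happy" time, and waste) can be sketched as a simple classification over logged hours. This is an assumed reconstruction for illustration only: the activity names, the `CATEGORY` mapping, and the hour figures are invented, with the sample hours chosen so that the shares land on the averages quoted in the talk.

```python
from collections import defaultdict

# Hypothetical mapping of logged activities to the talk's three buckets.
CATEGORY = {
    "coding": "coding",
    "testing": "coding",
    "analysis": "craft",
    "design": "craft",
    "documentation": "craft",
    "troubleshooting": "waste",
    "business misalignment": "waste",
    "access requests": "waste",
}


def time_shares(hours_by_activity: dict[str, float]) -> dict[str, float]:
    """Return coding, value (coding + craft), and waste shares of total time."""
    totals: dict[str, float] = defaultdict(float)
    for activity, hours in hours_by_activity.items():
        totals[CATEGORY[activity]] += hours  # KeyError on unknown activities
    total = sum(totals.values())
    if total <= 0:
        raise ValueError("no hours logged")
    value = (totals["coding"] + totals["craft"]) / total
    return {
        "coding_time": totals["coding"] / total,
        "value_time": value,
        "waste_time": 1.0 - value,
    }


# Invented 40-hour week for one engineer (not adidas data).
sample_week = {
    "coding": 10.0, "testing": 4.4,
    "analysis": 5.0, "design": 4.0, "documentation": 2.6,
    "troubleshooting": 8.0, "business misalignment": 4.0, "access requests": 2.0,
}

for name, share in time_shares(sample_week).items():
    print(f"{name}: {share:.0%}")
```

Keeping coding time separate from value time matters because the Gartner figure covers only the former, while the 2018 and 2024 comparison in the talk uses the latter.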

Fernando Cornago: Coming back to the numbers, in 2018 we had about 47% value time or happy time. In 2024 we had an average of 65% happy time. We increased by about 20 percentage points the time where our engineers are doing what they are hired for and what they like. However, the devil is in the details, because we analyzed seven teams and the results are extremely different from each other.

Fernando Cornago: We were able to extract two groups. High performers had up to 80% value time, or time on keyboard, or happy time. The record team had 70% during a month: 70% of the developers' hours were in the IDE, coding and testing. That is very extreme. We also have teams really struggling, with only 30% valuable, efficient, happy time. The other 70% was waste: trying to find their way into getting into the IDE and creating value.

Fernando Cornago: One last point that I did not mention in Vegas, because it was not aligned with our time. In December we had our employee health check, where we check team health with regard to how they perceive themselves providing value and how happy they are. These teams are, as you can imagine, much happier. They report much happier. They feel they provide more value, and they feel much more connected with the company vision than the other teams.

Gene Kim: I remember one of the observations was that these were teams working with some of the most modern parts of the estate, furthest away from ERP and legacy systems. It strongly correlates to the degree to which they have independence of action. Just to confirm, is that what differentiates those two populations?

Fernando Cornago: Yes. Of course sometimes it comes with seniority, but the main factor when we analyzed the different teams was the architecture and how the architecture enabled the DevOps practices that we all believe in: release cycle times, number of releases, mean time to release, mean time to recover, et cetera. That was extreme. As I said, from the record team with 75% of their time coding or testing in the IDE, to a team with only 15% of their time, only 15% of the time of six or seven people, in the IDE. That is dramatic. The correlation is mainly with how mature the team is in our DevOps key metrics, which is always linked to architecture.

Fernando Cornago: I want to clarify one point. It is not the engineer's fault. Let's be clear. It is our decision as a company, because typically these engineers are working in parts of the code and monolith components of our architecture where we decided, due to economies of scale and efficiencies, not to create fancy microservices.

Fernando Cornago: For example, our Omni Hub order management ecosystem, which is under my responsibility as a system, is separating all our digital experiences, channels, marketing stack, e-commerce stack, apps, et cetera, from the physical world. They need to deal with TCO connectivity with warehouses that change their APIs once every three months and take one month to change. It is amazing. Thanks to them, the rest of the team can fly, but they cannot fly. We touch some of these core functions once every six months. Why would we have a microservice in continuous delivery and continuous deployment for that? We do not need that, or it is not our priority.

Fernando Cornago: However, when in the last three years we discovered parts of this Omni Hub tool with hot topics, areas that were going to change very frequently, like inventory visibility and delivery status, we carved out a microservice from the monolith so we could have our engineering practices on that and go much faster. Again, it is not the fault of the engineers. By the way, the closer you are to the ERP or core processes of the company, the harder it gets.

Gene Kim: That reminds me of this amazing quote from Dr. Alan MacCormack, a student of Dr. Carliss Baldwin studying architecture from an economics background. One of the key findings was that in organizations where people are working in complex codebases with bad architectures, people are fired at a rate nine times higher. It shows that people misattribute the performance of the system to the person instead of the architecture. I love what you said: this is intentionally done to liberate the other teams so they can work at speed and have happy time.

Fernando Cornago: We are not planning to fire any of our ERP engineers. We are not blaming them for the time they spend coding. It is very clear that when they change something, the same line of code can affect key core processes like financial reconciliation, inventory location, or replenishment. It does not make sense. It is complex per se, because we decided to be complex. We could take the decision to build our own ERP. I do not think that is ever going to be the case.

Gene Kim: To treat those two populations, you have the 70% people in happy time, and you have people closer to the ERP systems where everything is harder. You said, speculating, with those teams working close to the ERP, if you try to shove GenAI tools to them, they would probably say: do not give me more AI tools, give me better environments, give me better ways to test my code. Did I capture that correctly?

Fernando Cornago: Absolutely. First of all, they sometimes work in closed ecosystems, and different partners are in different states with regard to adoption of GenAI. Not everyone is opening their APIs, code bases, IDEs, or environments to a natural or smooth use of GenAI. Second, if I tell them, guys, I am going to solve your life by improving the 15% of the time that you spend in the IDE, instead of focusing on the other 85% of the time that you spend firefighting, they will show me the middle finger and say, Fernando, are you crazy? Go home. Fix the environment. Fix the processes of the company. Do not give me GenAI. It is pure Theory of Constraints: stop the micro-improvements and really focus on what matters most.
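The Theory of Constraints point can be illustrated by comparing two interventions for the teams at each extreme. This is a sketch under stated assumptions, not an adidas calculation: the 40-hour week, the 10% reduction in non-IDE time, and the `Team` class are hypothetical. The IDE shares (75% and 15%) and the 25% assistant gain are the figures quoted in the talk.

```python
from dataclasses import dataclass

HOURS_PER_WEEK = 40.0   # assumed working week, not a figure from the talk
ASSISTANT_GAIN = 0.25   # upper end of the self-reported gain on IDE time
WASTE_REDUCTION = 0.10  # hypothetical cut in non-IDE time from a better environment


@dataclass(frozen=True)
class Team:
    name: str
    ide_share: float  # fraction of time spent coding and testing in the IDE

    def hours_freed_by_assistant(self, gain: float) -> float:
        """Weekly hours freed if only IDE time gets faster."""
        return HOURS_PER_WEEK * self.ide_share * gain

    def hours_freed_by_fixing_environment(self, reduction: float) -> float:
        """Weekly hours freed if only non-IDE time shrinks."""
        return HOURS_PER_WEEK * (1.0 - self.ide_share) * reduction


teams = [Team("record team", 0.75), Team("struggling team", 0.15)]

for team in teams:
    assistant = team.hours_freed_by_assistant(ASSISTANT_GAIN)
    environment = team.hours_freed_by_fixing_environment(WASTE_REDUCTION)
    print(f"{team.name}: assistant frees {assistant:.1f} h/week, "
          f"environment fix frees {environment:.1f} h/week")
```

Under these assumptions the struggling team gains about 1.5 hours a week from the assistant but about 3.4 hours from even a modest environment fix, while the record team shows the opposite pattern (about 7.5 versus 1.0 hours). The constraint for each team sits in a different place.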

Gene Kim: That is so good. I think this is one of the core anomalies in last year's DORA finding, that the more GenAI you adopted for developers, the worse stability and throughput got. We know that cannot be true. This is why we assembled this dream team of researchers to find under what conditions you can get these amazing benefits that you are talking about. Fernando, what is next? I am excited that we are going to have some science to back up your intuitions and observations. Tell us where adidas as a technology organization is going next and your own goals.

Fernando Cornago: First, with regard to tooling, we see exponential value provided. Usability is getting better, predictions are getting better, and the tools or IDE integration are becoming much more proactive, telling the developer what to do. As Steve and everyone in this community agrees: you may or may not be replaced by a tool or by an AI, but for sure you are going to be replaced by a person using AI first. This is what everyone needs to land. I think still we have 10 or 15% not adopting.

Fernando Cornago: Second, what is next? As I said, it is six months where I have seen 200 more developers adopting the tool, and six months where I do not have any qualitative or quantitative data. I will continue tracking both kinds of data and sharing them with this community and with you. You mentioned to me research you are starting on how GenAI is changing the day-to-day life of our engineers. I have been focusing on different topics in the last six months, but I would love now, while we still have a community of this 10 to 20% not using it, to understand how the life, ceremonies, and practices of the teams are changing.

Fernando Cornago: Are you still doing pair programming, or is the AI doing it for you? In the same feature teams, do we have one or two people not using it, or is it full teams that do not want it or do not believe in it? There are a lot of metrics I am interested in inside the teams, and I would love to hear from you and Steve and different people what you are finding in the rest of the industry.

Gene Kim: Absolutely. For anyone else out there who is experimenting with GenAI and developers, and who also wants to share your experiences, and would be willing to have us do some data collection, please let us know. Fernando, I cannot tell you how grateful we are that you are giving us an opportunity to potentially do some experiments, clinical trials, and measurements. If you are half as excited as I am, I will be a pretty happy guy.

Fernando Cornago: I really love the takeaway from Steve's session: try it, use it as if it is already working as it will work two years from now. Going back to the goals of why we started using it so early, even so early that the first tool we used was not really the right tool: if you look into my portfolio, most of the things we are doing are AI enabling business for business change, business efficiency, business speed, business scaling. The same thing we are telling our business users, do not be afraid, use it, trust it, check it, we need to apply to ourselves. We need to learn how this is changing our workforce, our practices, and our jobs.

Gene Kim: Very good. The timer is telling us it is time to wrap this up. Fernando, thank you so much. I am looking forward to a whole bunch of collaboration and discussions in the weeks to come.

Fernando Cornago: Absolutely, Gene. See you hopefully in September in Las Vegas to share the continuation of the journey.

Gene Kim: Thank you, Fernando. Bye-bye.