Grappling with the Bitter Lesson
The "bitter lesson" is that at some point, improvements in AI depend less on human knowledge and experience than we expect. At O'Reilly, our business is sharing the knowledge of innovators, so we are at the cutting edge of grappling with the bitter lesson. In this talk, I share some of the organizational issues that surface when AI seems magically able to do things that challenge business processes and strategic moats based on our unique expertise. I talk through a particular product design case study. It highlights the leaps over past practice that are necessary to take full advantage of AI capabilities, but also why the speed of adoption is regulated not just by the willingness of our own employees to change, but by the willingness of our customers.
Full transcript
The complete talk, organized by section.
Host Intro (Gene Kim)
I am so delighted that our last speaker for the day is Tim O'Reilly. It is difficult to overstate how much his work has influenced my life. I have learned so many things from him directly and indirectly. For decades of my career, I learned what I needed to learn from O'Reilly books.
I made so many friends at the Velocity Conference in the early 2010s, and many of those friendships have continued and led to decades of collaboration. Tim is an amazing scholar, and it amazes me how he has shaped our industry.
In the wreckage and gloom that followed the dot-com crash, he was so passionate talking about what had to come next, and that is what eventually became known as Web 2.0. Around 2010, he wrote a post called "Operations: The New Secret Sauce" that was the rallying cry for the DevOps movement because it finally felt like someone cared about operations, and he started to reveal that this was actually one of the core competencies and secret weapons of the tech giants.
It is amazing to see how AI has brought Tim back into the game. He will be sharing some incredible perspectives on what AI is doing to our organizations, how the skills that we value may change, and an incredible case study of what he is trying to do about it at O'Reilly Media. Here is Tim.
Tim O'Reilly
Thank you.
All right. At this point, you may think that the bitter lesson is that there is one more talk before the cocktail party, but it is actually a 2019 paper by Richard Sutton. It is really about the way that, in training AI, we used to think you had to put in lots of bespoke knowledge, but that turns out no longer to be the case: AI can just learn from examples. I want to talk about some of the implications of that as I have been experiencing it at my company.
How many of you read Ethan Mollick? Oh yeah, you really should. He is a professor at Wharton. He has a blog called One Useful Thing. His book, Co-Intelligence, is also great. I highly recommend it. Anyway, he wrote a post recently. I went, oh man, Ethan, you have been reading my mail. Of course, we do talk often, so maybe he has been reading my mail.
This essay is called "The Bitter Lesson versus the Garbage Can." He talks about this organizational theory called the Garbage Can Model, which represents a world in which unwritten rules, bespoke knowledge, and complex, undocumented processes are critical. He says this is one of the reasons why AI adoption in organizations is hard.
Then he says, at least that is how it looks if we assume that AI needs to understand the organizations the way we do. But AI researchers have learned something important about these assumptions. He goes on to talk about the bitter lesson, which is this idea that encoding human understanding into an AI tends to be worse than just letting the AI figure out how to solve the problem.
Let me talk a little bit about my company. Gene mentioned this: in 2000, there was a Publishers Weekly cover that said the internet was built on O'Reilly books. Many of you here may know me still, or think of me still, as a publisher. But right around 2000 we became an online platform, originally just for e-books. Today it is more than 60,000 technical and business books from 50-plus publishers, 5,000 on-demand video courses, 200 live video courses a month, technology sandboxes, all kinds of other things. It is really a skill-building platform.
I want to start with a very early exposure to the bitter lesson, which is November 2018, when Google open-sourced BERT, which is the predecessor to today's LLMs. I immediately recognized that this was a game-changing event. I went around saying to people, oh my God, this is just like when IBM published the specs to the original PC because they did not quite get what it was going to do to their monopoly. BERT was Google's secret sauce. These large language models encode all this knowledge already. We started working immediately on some projects based on BERT and in 2020 released something called Answers, which was basically a search engine of all our content. Suddenly we could actually take people right to the page. We could take people right to the minute of a video. It was way different than the kind of metadata search we were doing before.
But we were still thinking of it very much through the metaphors that Google had given us for search. We would come up with eight to ten results, and we would start with the one that we thought was most likely, but we would leave it for the last mile. It was sort of a search metaphor. We later put it in as a sidebar to a book so you could ask further questions. We put it other places throughout the platform, but it was still very much in that metaphor, and we had not really understood how much the game had changed.
In 2022, when ChatGPT introduced the chat interface, all of a sudden, wow, it opened up so much more with the power of LLMs. Obviously the LLM itself was far beyond BERT as well. But even Google, which had invented the transformer architecture underneath the LLMs, was also limited in its imagination and its ambition because of its existing business model and processes.
I think that is a really important takeaway for you. It is not just that AI is this add-on feature. It is a transformative feature, and we have to understand how it may transform everything about our business, our organization, or it may not, but we have to get beyond the basics.
This is the first lesson. If you know me, you know that I love poetry and I like to bring it in. I will give you a quote from Ezra Pound, the famous 20th-century poet. He said, "Make it new." We did not do that enough with that first implementation of Answers.
Now I want to give you a case study of an ongoing project at O'Reilly where we are going, oh my gosh, we are not making it new deeply enough, and how we are starting to try to grapple with that. That is what I mean by grappling with the bitter lesson.
This is something that we call the O'Reilly Verifiable Skills Initiative. We are a learning platform, and our customers are these large organizations, many of them Fortune 500. They often have a learning department with a set of requirements. They have been asking us for more precisely targeted, competency-based learning so that people can skip what they know, master only what they need, and then prove what they have learned.
That means having skill frameworks and a lot of testing, which, if you think about the days of O'Reilly books, was something we never did. It was all people who were self-motivated. But now a lot of our customers are corporate customers, and they want things that were not what we built originally.
We have a lot of the building blocks for what they have been asking for. We have comprehensive, carefully crafted learning content in the form of video courses, books, and live training. You look at a course like Python Fundamentals with Paul Deitel. Paul has been teaching this course for years. It is a 60-hour course, Python from beginner to expert. He has thought through the pedagogy. He is a really great trainer.
We have new kinds of content that cover new emerging skills. Our bestselling book right now is one called AI Engineering by Chip Huyen. A book like this really is a skill framework. The table of contents is an expression of how Chip thinks about this emerging job. The book is already enhanced with levels and quizzes and all this kind of stuff.
You go, what is missing? As a quick example of a quiz, we have this list of atomic skills as part of the interface. People can say, I want to learn about Kubernetes, and they can navigate to it. It shows related skills and shows the level. We are there a little bit, but it is not good enough. We started working on something much more comprehensive.
We also have live environments, sandboxes where people can practice and demonstrate their skills, but it is all a bit scattered. I mentioned tens of thousands of books, thousands of video courses, hundreds of live trainings each month, and they are each from a different author or trainer, each with a slightly different view of what needs to be taught and how. This is where Ethan's thing hit like a hammer. It is a classic example of what you could think of as the garbage can.
Our customers, while they love the depth of the O'Reilly content on the platform, are also going, wow, we cannot find our way around. We need it to be organized much more. So we set up for this massive reorganization around skill frameworks. What skills are required for a given job? How are these skills segmented by level? What are the specific skills at each step of the career ladder? How do skills overlap and branch? What content do we have for each of these skills? All this mapping work.
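The mapping questions above can be made concrete with a small data-model sketch. Everything in it is a hypothetical illustration: the `SkillEntry` type, the level names, and the placeholder titles are assumptions, not O'Reilly's actual schema or catalog.

```python
from dataclasses import dataclass, field

# Hypothetical proficiency levels; a real framework would define its own.
LEVELS = ("beginner", "intermediate", "advanced")


@dataclass
class SkillEntry:
    """One skill at one level, plus the content mapped to it (illustrative schema)."""
    skill: str
    level: str
    content: list[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Reject levels outside the assumed ladder.
        if self.level not in LEVELS:
            raise ValueError(f"unknown level: {self.level}")


def coverage_gaps(entries: list[SkillEntry]) -> list[tuple[str, str]]:
    """Return (skill, level) pairs that have no content mapped to them."""
    return [(e.skill, e.level) for e in entries if not e.content]


# Placeholder data, invented for illustration only.
framework = [
    SkillEntry("shell scripting", "beginner", ["Example Course A"]),
    SkillEntry("shell scripting", "advanced"),
    SkillEntry("kubernetes", "intermediate", ["Example Book B", "Example Sandbox C"]),
]

print(coverage_gaps(framework))
```

Even in this toy form, the structure shows why the mapping work is heavy: every skill and level pair needs its own content list, and every empty list is a gap someone has to fill.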
We looked at all these taxonomies that people have done for learning. The famous one is Bloom's Taxonomy. It starts with remembering things, then understanding them, then applying them more deeply, and being able to analyze and evaluate and ultimately create. We go, oh yeah, we can build something that is sort of a progression like that.
Then we looked at things like SFIA, the Skills Framework for the Information Age. It is a nonprofit thing. It is an eye chart. I blow it up and it is still an eye chart. Then there are all these things: here is how you think this through. We are going, wow, that is a lot of work.
But we say, we can try to build these skill frameworks with a combination of AI and human expertise. We are going to combine AI, expert input, and editorial judgment effectively. We are going to have all this process so that we can deal with the challenges of AI-generated outputs and apply best practices for QA and scalability. We are going to use AI to help us move fast, but we do not want to break things.
It is a lot of work. It takes a lot of time. We have this step-by-step methodology. We use one model to generate a list of skill competencies grouped by proficiency level, and ask it to integrate any missing skills. Then we review for gaps and accuracy by feeding that output into a different model and asking it to identify missing skills. Then we have one model talk to the other, and we iterate to refine this framework. We have it reviewed by our experts. Then we consolidate the skill lists, basically try to get it all organized, map it to our learning materials, and identify the gaps in coverage. We have human in the loop to make sure everything is right, and we deliver final outputs, which kind of look like: for this skill, this level, here is the spreadsheet that shows the content that goes with this skill at this level, and here is how you evaluate it. Kind of crazy.
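The generate-then-review part of that methodology can be sketched roughly as follows. This is a simplified illustration, not the actual pipeline: the `Model` type, `refine_framework`, and the stub functions are hypothetical names, and real LLM calls are replaced by plain Python callables so the loop can run on its own.

```python
from typing import Callable

# A "model" here is any callable from a prompt string to a list of skill names.
# Real LLM calls are deliberately left out; these are hypothetical stand-ins.
Model = Callable[[str], list[str]]


def refine_framework(topic: str, generator: Model, reviewer: Model,
                     max_rounds: int = 3) -> list[str]:
    """Draft a skill list with one model, then let a second model fill gaps."""
    skills = generator(f"List the skills needed for: {topic}")
    for _ in range(max_rounds):
        missing = reviewer(
            f"Topic: {topic}\nCurrent skills: {', '.join(skills)}\n"
            "List any missing skills."
        )
        new = [s for s in missing if s not in skills]
        if not new:  # reviewer found nothing to add, so stop iterating
            break
        skills.extend(new)
    return skills  # in the process described, expert review comes next


def stub_generator(prompt: str) -> list[str]:
    # Stand-in for the first model: always drafts the same two skills.
    return ["variables", "pipelines"]


def stub_reviewer(prompt: str) -> list[str]:
    # Stand-in for the second model: suggests "quoting" until it is listed.
    return [] if "quoting" in prompt else ["quoting"]


print(refine_framework("shell scripting", stub_generator, stub_reviewer))
```

The sketch covers only the model-to-model iteration. The expert review, consolidation, and content mapping steps described above would all sit downstream of the returned list.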
Then we start thinking about the bitter lesson. Part of this was triggered for me because I was seeing the schedule, and the schedule was super long. I went, wait, there is something wrong here. I started thinking about what comes free with AI.
I just started throwing things at ChatGPT. The first one was when DeepSeek came out. I basically said, hey, ChatGPT, what skills would I need to use this repository? I pasted in the URL of the DeepSeek repository from GitHub, and it gave me a list. The first one really surprised me. It said, oh, you need to know how to build and run something on Azure. I went, oh, that was one I did not quite know. I did not expect this to be a repository that was set up to be on Azure, and I do not know anything about that.
Then I said, okay, tell me what I would need to know to set this up on Azure. It gave me a checklist. I went, wow, this is kind of like real time. It does not have any of that work that we were doing before, but it can do something that is kind of like what I think we want.
Then I tried some more things. I said, okay, here is the GitHub profile of a random developer that I found with an open repository on GitHub. I said, hey, ChatGPT, can you evaluate this guy's skills based on his repo? Tell me his technical skills, but also his organizational writing skills. It went down and gave me a pretty thorough list. All the way down, he was pretty good with everything. It did say that his documentation could be better. Pretty interesting analysis there.
Then I went one step further. I found, randomly, a job description on OpenAI's site for a Sora engineer, Sora being their video generator. I said, okay, how good a fit is this guy for this job? Again, it gave me a pretty good analysis. It said the core technical skills, especially in Python, machine learning, and optimization, would make him a good fit for the role, but the position's heavy focus on distributed systems, supercomputing and training, and kernel optimization might require him to gain more experience in these areas to be fully effective.
I am like, holy cow, this is out of the box. I did not have to do anything for that. I am going, wait a minute. Over here, we have this massive effort to capture all the knowledge of our experts into some framework, and over here there is the bitter lesson. So what do we do?
We have always thought of the depth and accuracy of our content as our moat. We tell ourselves that AI hallucinates, and it does. We tell ourselves that it is inconsistent in its outputs, and it is. It still does have gaps in its knowledge. Matthew Prince of Cloudflare calls it Swiss cheese. That is kind of a nice metaphor, but we can see that AI is catching up. Not only that, the leading platforms appear to have trained on our content anyway, without our permission. We actually did a study and kind of did a statistical analysis that made it pretty clear that ChatGPT had, and Anthropic has already sort of admitted it.
There is also the fact that this conversational UI opens up many possibilities for more engaging kinds of assessment and learning experiences. So why are we spending all the time building infrastructure to support some old-school application when we could be focused on building something new?
This was the mandate I gave the team: do not race the horse, ride it. A lot of what our customers want comes out of the box, so we need to build on top of it, not try to outrun it or compete with it.
We start prototyping directly in ChatGPT and ultimately in MCP, trying to write system prompts so that we can start to build some of these features that we want. We build a simple concept app that, given a skill, builds a skills framework on the fly and then assesses the user's place within it.
The app does work within the existing O'Reilly sandbox environment. There are a lot of skills where we can actually have people try things. It is not just quiz-based like the stuff that we were trying to do before. You can literally have somebody do something, and the AI can evaluate the work product that they create in the sandbox. There are a lot of really interesting opportunities there.
I tried to see if it would correctly assess my shell programming skills. I have not done much programming recently. I may do a little bit, but really I used to do a lot of shell programming. I thought that was a good place to start. Forty years ago I wrote these complex pipelines of sed and awk and shell scripts to manage our documentation business, which was a precursor to our publishing business.
It did a pretty good job assessing me. It says, you demonstrate strong intermediate skills with clear advanced potential. Your theoretical understanding is excellent. You just need more practice to keep these concepts fresh in active memory. I go, that is pretty damn accurate.
I go, okay, team, let us go. We have a long road ahead. We have to get the AI prompting and context right. We have to think through the product and user experience design. This is a complex system where we have to figure out what is the user experience for the actual learners, but also what is the user experience for the people on the corporate side who want to specify their learning goals, their job descriptions. There are a lot of things that we have to do to figure out how to integrate into customer systems and other environments. We also have to adapt our own business model.
What do I mean by that? There is a wonderful quote that I learned from some consultants that we worked with back around 2000, named Dan and Meredith Beam. They said, "A business model is a way that all the parts of a business work together to create customer value and competitive advantage." That means something very different than, oh, we have an advertising business model, or we have a subscription business model.
The heart of the O'Reilly business model is, yes, the bulk of our revenue comes from corporate subscriptions. We are a SaaS subscription business. But on the provider side, the people who actually put the content in, it is a revenue-share model. AI challenges that revenue-share model.
In fact, the whole platform business began because I was trying to find a way to make money for authors who were potentially threatened by online books. Back in 2000, a company called Books 24x7 came to me trying to license O'Reilly books. They told me their model, and I did the math for them over dinner. I said, you are offering me six cents a book, which I am supposed to share with my authors. I do not think that is a model. I said we have to build something like that, but with a model that keeps compensating authors so they will keep producing the content that we sell.
Later on in the company's history, we came up with all kinds of other products, like the live training. It was a new way to make money for our suppliers, and of course it turned out to be loved by our customers. But we have this two-sided marketplace and we have to support both sides.
We have done quite a bit of work on this idea: how, as we use AI, do we manage to keep the payments to suppliers coming? Steve Wilson, who spoke earlier, is one of our authors. Let us say we make an AI summary of his book and people consume that instead of his book. How do we figure out the substitution, and what should the royalty be? It should be quite a bit higher, right? A lot of companies would just say, hey, let us take the money and run, but we are going, no, no. Part of our business model is to keep paying people. That is the challenge.
I ask you to think through: what is your business model, and how does AI change it? How does it interact with it?
There are other internal obstacles. The skills are not evenly distributed. The roles and organizational structures need to change. We are now in the middle of a reorg because of some of this. We have integration of new AI features in a UI that is already crammed with stuff. Then we have existing product roadmaps, with sunk cost of developing features that customers have said they want in the past, competing for development resources with those that are more where the puck is going.
Of course, there are barriers to consumer adoption: general uncertainty about AI, worries about AI playing a role in employee evaluation, reluctance to upload proprietary content into a third-party application, because that is one of the ways we could get their context.
This leads me to a concluding piece. I really recommend very highly a recent paper by Arvind Narayanan and Sayash Kapoor from Princeton. It is called "AI as Normal Technology." They look at the history of technology adoption. I had Arvind on my Live with Tim O'Reilly show, a biweekly podcast I do on the platform. One of the things he said is: look, the key idea here is that the logic behind the pace of advances in technology is different from the logic behind the speed at which it gets adopted. That depends on the rate at which human behavior can change and the rate at which organizations can figure out new business models.
Then he talks about a four-stage process: invention, then product development, then diffusion, and then adaptation. Take an example of a totally transformative technology, electricity. The early inventions came from Michael Faraday. The product phase was Thomas Edison. There was this competition between Edison and Tesla about what would be the distribution network. All of this stuff takes time to work out, and all the products came after that fundamental invention.
I think we are at a stage where, despite the hype cycle, the raw advances in power from LLMs do seem to be slowing down. That means product design and development is now the game. Teaching and learning from others, at an event like this, is now the game. Adapting our workforces, our businesses, and our society to AI is now the game.
That is where I come back to wrestling with the bitter lesson. AI is a very, very powerful thing. There is an image I have used in my career for the better part of 20 years. It is a painting by Eugene Delacroix, which is in Saint-Sulpice in Paris, of Jacob wrestling with the angel. It came into my mental toolbox through a poem by Rainer Maria Rilke, who talked about how these wrestlers of the Old Testament, like Jacob, wrestled with angels not because they thought they could win, but because it made them stronger.
He had this great line. He says, "What we fight with is so small, and when we win, it makes us small. What we want is to be defeated, decisively, by successively greater" -- he said beings, but I say challenges.
That is what wrestling with AI is, with the bitter lesson being this expression of AI as a kind of powerful angel, or, some people might say, a powerful devil. It is this powerful force that we have to figure out how to harness and bring in and work with. I think if we do that, it will make us stronger.
I am super excited about this moment in technology because it is at moments like this that we reinvent everything we do. There again, I will give you another classical reference. Alexander the Great, when he was a young man, was told by one of his friends: how lucky you are that your father has basically conquered all of Greece; you are going to inherit all this.
Think of all of us sitting fat and sassy in our enterprises. We have been successful. Alexander said, what use is it to have everything and accomplish nothing? He wanted to accomplish something. We are at the point where we have the opportunity to wrestle with the angel and to accomplish great things.
Thank you very much.