Expert Discussion with Dr. Matt Beane (Author of The Skill Code)

Connect Oct 2024

Dr. Matt Beane, author of The Skill Code, has spent over a decade studying what he calls the "novice optional" problem — the way advanced tools allow experts to solve complex problems independently, systematically cutting novices out of the learning experiences they need to develop real skill. Drawing on research across 32 contexts, from robotic surgery to bomb disposal, Beane argues that AI and automation are now accelerating this dynamic at unprecedented speed and scale, threatening to hollow out the next generation of talent across nearly every profession.

In this talk, you'll learn how to identify which subtasks within AI-assisted workflows are critical for novice skill development, why most AI deployment decisions are dangerously coarse, and what practical steps technology leaders can take to optimize for both productivity and skill growth simultaneously.

Full transcript

The complete talk, organized by section.

Host Intro (Gene Kim)

Gene Kim opens the session by connecting it to Steve Yegge's "The Death of the Junior Developer" post and to his own writing about LLM adoption in law firms and narrative fiction. He says using LLMs for fiction made him feel like a movie director rather than only a script writer: he could ask the model to rewrite something with more showing and less telling, and that experience changed him. Shortly after, Dr. Matt Beane reached out. Gene introduces Beane as the author of The Skill Code and says Beane has studied the "novice optional" problem for more than a decade: the forces that make life harder for people early in their careers. Gene says Beane's surgical robot case study felt like being hit in the forehead, then asks him to introduce himself and his research journey.

Dr. Matt Beane

Beane starts with a story rather than a conventional research introduction. In 2012, he walked into operating rooms at five top-tier teaching hospitals and watched surgeons dealing with robotic surgical technology. He describes Kristen, a surgical resident trying to learn surgery. In a typical open surgery, Kristen would arrive early, help lead the team through putting the patient under and making the initial eight-inch incision for something like a prostatectomy, then call the attending about 45 minutes after the procedure began. The attending would gown up, and the two would spend the rest of the four-hour case with their hands in the patient. Kristen might hope to do some of the delicate nerve-sparing dissection. After the prostate was out, the attending would leave to do paperwork and Kristen would lead the close with a junior resident watching. By the end of the day, the patient would be fine and Kristen would be a better surgeon than when she walked in.

He contrasts that with robotic surgery. Six months later, Kristen might enter a robotic rotation and bring another prostate patient into the OR. This time, she and the attending attach a thousand-pound, four-arm robot to the patient, remove their scrubs, and sit at control consoles. For four hours, Kristen might get only 15 minutes of operative time if she is lucky. The attending wants to give her control and knows she needs to build skill, but also knows she will be slower and make more mistakes, and the patient comes first. Kristen therefore has no hope of getting near the nerves during the rotation.

Beane says he began collecting this data in 2012 and 2013. At the end of the study, a chief of urology at a top U.S. hospital told him to look at new residents Davis and Jensen: they trained in top programs that teach robotic surgery, but they "suck now" because they watched procedures happen instead of getting experience doing them. Beane summarizes the lesson: watching a movie does not make you an actor.

Beane says that finding blew his hair back. Between then and his 2018 TED talk and Harvard Business Review piece, he studied 31 additional contexts to test whether the problem generalized. What he saw in robotic surgery was that the tool allowed the senior expert to do the work alone. So when is the right time to involve a novice who will certainly be slower and make more mistakes? Operationally, the answer is never. Across organizations, the same logic shows up: people make a short-run productivity tradeoff that sacrifices learning, especially for novices. Experts and novices become separated in the work to an unprecedented degree because the tool lets the expert solve deeper problems at greater scale. That is what experts are born to do, and novices get left behind.

Beane says this is everywhere now. Professions are hollowing themselves out from the inside because the next generation of talent will struggle. He references a recently published Science paper by colleagues at OpenAI and Daniel Rock at Wharton, saying approximately 80% of working adults are at least 15% exposed to these technologies, meaning some part of their job could get a productivity boost from LLMs. That holds more for coding than for some other jobs, but he says it implies that 2.7 billion working adults may be losing their primary path from novice to expert as jobs change and experts extend their reach.

The book, he says, identifies the ingredients of work experience that actually build skill. Formal training alone is not enough; it cannot substitute for learning on the job. The Skill Code gives three basic criteria for assessing whether a workflow will support healthy skill development, for novices or for experts.

Beane then gives the exception he found: bomb disposal. Before robots, an expert might walk 300 meters from a bombproof truck and poke a stick at an IED while the junior bomb-disposal technician stayed 300 meters away in the truck. It was very hard for the junior to learn during the flow of work. With a robot, both people stay in the bombproof truck, the junior drives the robot, and the senior narrates their thought process and diagnosis as the work happens. Beane says this setup happened by accident in the military; nobody designed the workflow that way, but they are happy to have it. He frames the key question he now poses to organizations: how can you handle AI in a way that gets results and builds users' skill? It is an impossible-feeling problem, but failure is not an option.

Q&A

Gene asks Beane about the historical roots of the novice optional problem, including pottery. Beane says the fun part of the book was going back 160,000 years to the first stored-energy weapon, the bow. The archaeological evidence does not show the current problem; it shows that humans have built expertise primarily through the novice-expert relationship for a very long time. In pre-Hellenic Greece, and also in mainland China, evidence shows a shift from hand-pinched pots to wheel-spun pots. Written notes on strips of lead found in old wells show concern about what would happen to apprentices if masters could make pots so fast. The difference now is speed and scope: ancient crafts had five human generations to adjust as the potter's wheel diffused geographically, whereas today diffusion is extremely fast and the scope of problems experts can take on with new tools exceeds the reasonable boundary of junior skill. We have passed an event horizon: the junior can no longer touch or participate in the work in a way the senior needs in order to get ahead.

Gene observes that across many professions, experts seem to prefer doing the work themselves when they can, implying coordination costs must be high. Beane agrees. The intrinsic value of working with a novice is real: humans are motivated by helping people who know less and learning from people who know more. But productivity gains can nuke that relational value. If an expert can use a tool to do more and better work on harder problems, they will take that deal because they spent a lifetime developing the appetite and capability to solve hard problems. Empirically, experts take the deal of enhanced productivity on more complicated problems over the relational benefit of engaging with novices every time.

Gene then asks about a labor-economics framing. In service firms such as law, accounting, and audit, organizations make a large investment in novices with the expectation that it will pay off later in their most profitable years. Beane says he was trained partly as an economist, as well as a sociologist and computer scientist. He compares novice development to a CapEx investment: dollars are put down to build a capability that takes time to mature, and returns come later. In robotic surgery, which was about ten years ahead of the software profession's LLM adoption curve, newly minted surgeons became legally empowered to use robotic tools at hospitals that had bought those tools and needed ROI, but they were not competent or confident enough to use them. Hospitals then used euphemisms such as proctoring or sabbaticals for remedial training, taking senior people offline for six to nine months while losing revenue and serving fewer patients. The receiving organization pays the tax and hides it under a new category. One way or another, he says, organizations will pay for legitimized talent that has the titles and resume signals but is hollow in the relevant skills and must be trued up.

Gene brings up a coding-assistant pilot involving about 50 people. The pilot did not achieve its hoped-for goals: junior developers wrote and committed much more code, but senior reviewers found the code was not fit for production because of small or large errors. Gene says Beane had suspected the senior people might have misallocated tasks: what looked like one assignment might really contain 70 to 100 subtasks, and novices may have been given the wrong subtasks.

Beane says that in the prior fall he turned his research agenda toward work involving LLMs, especially complex and collaborative problem solving. Soon after, OpenAI began paying attention, and he has been collaborating with research colleagues on work he hopes will come out soon. Organizations get gains or losses from LLMs, and affect novice skill development along the way, because they are making coarse and under-informed deployment choices: essentially on or off. A development task has to be decomposed far more finely into work packets and task graphs. Only then can an organization see that, out of 97 subtasks, maybe 11 are the collaborative, skill-enriching parts where senior and junior people should work together, while the rest can be automated away. Within those 11, maybe three will slow the team down if a junior is involved; that slowdown becomes a hyper-targeted tax paid for learning, supported by evidence rather than vague handholding. This precision is needed because many organizations are buying thousands or tens of thousands of licenses, turning them on by function or area, and then hoping for the best.
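
The decomposition Beane describes can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not SkillBench's method: the `Subtask` fields, the `plan_deployment` function, and the way subtasks are labeled are all hypothetical, and the data simply mirrors the numbers mentioned in the talk (97 subtasks, 11 skill-enriching, 3 of which slow the team down).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Subtask:
    name: str
    skill_enriching: bool   # hypothetical label: seniors and juniors should do this together
    junior_slowdown: bool   # hypothetical label: involving a junior slows the team down


def plan_deployment(subtasks):
    """Sort subtasks into three buckets instead of one on/off switch."""
    plan = {"automate": [], "collaborate": [], "collaborate_with_learning_tax": []}
    for task in subtasks:
        if not task.skill_enriching:
            plan["automate"].append(task.name)
        elif task.junior_slowdown:
            # The slowdown is accepted deliberately as a targeted cost of learning.
            plan["collaborate_with_learning_tax"].append(task.name)
        else:
            plan["collaborate"].append(task.name)
    return plan


# Illustrative data only: 97 subtasks, the first 11 skill-enriching,
# and the first 3 of those also slower with a junior involved.
subtasks = [
    Subtask(f"subtask-{i}", skill_enriching=i < 11, junior_slowdown=i < 3)
    for i in range(97)
]
plan = plan_deployment(subtasks)
counts = {bucket: len(names) for bucket, names in plan.items()}
print(counts)
```

The point of the sketch is the shape of the decision: the learning tax is confined to a small, named set of subtasks rather than spread vaguely across the whole assignment.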

Gene asks why such fine-grained decomposition is necessary now, given that juniors have long been coached and mentored by seniors. Beane says it comes down to intensity and the event horizon he described earlier. New technology has always let experts do more and better work, but now the best a junior person can often do is contribute "terrible code that works" in a metaphorical sense. They cannot see the class of errors they might create. Involving them in an upskilling way is now an elective choice; the senior or manager no longer needs the junior involved to get the work done. The system is therefore reliant on willpower from people with power. Novices often do not know enough to complain because the situation is new to them and they are simply trying to do their best.

Gene asks about challenge, one of the three Cs in Beane's book. Beane says humans need to be near the edge of their capacity, totally focused, and significantly uncomfortable in order to learn. He compares it to Gene and Steve's pair-programming session and says surgeons are literally sweating at the control console when in that state. Workplaces often become padded playgrounds because humans dislike discomfort and create processes that reduce it, but those processes can interfere with skill development. The three Cs in The Skill Code are challenge, complexity, and connection. People who succeeded at building robotic surgical skill despite the novice optional barrier manufactured challenge for themselves, sometimes by operating when the senior surgeon left the room. That is not what we want, but it shows they were fighting for challenge. If junior or senior people are not challenged, they are dead in the water, and the same is true for organizations.

Gene then asks about connection and how technology leaders can preserve the next generation of senior leaders. Beane defines connection as a bond of trust and respect between human beings. It is strictly necessary for skill development. If Beane is a junior developer and Gene is the senior developer, Beane wants to earn Gene's trust and respect; that motivates him to work hard. Also, Gene will not give Beane a shot unless Beane has earned that trust and respect, and vice versa. For senior experts, it is deeply satisfying to help a junior person who is really trying, and to see them solve a problem they did not think was possible. But the way LLMs are commonly handled distances humans from each other because people can solve more problems, and solve them better, on their own, reducing the likelihood of warm trust-and-respect bonds. We cannot go back to the prior world, but we need to examine the work that remains, identify interaction opportunities that build those bonds, and preserve them while still getting productivity gains.

Gene asks what advice Beane would give to a fictional leader running the kind of coding-assistant pilot discussed earlier. Beane gives a two-part answer. The generic answer is to read his book: the first third provides three ten-point checklists for examining workflows and asking how to get the best out of LLMs while preserving challenge, complexity, and connection. The second answer is that Beane has started a company, SkillBench, to build software for hyper-precise work decomposition and prediction based on the science he is doing with OpenAI. For now, the company can only take on a few organizations. The goal is a cost-effective metric for the joint optimization problem: when assessing an AI deployment, an organization should get two metrics at the same time, productivity and upskilling/de-skilling, including for whom. Those metrics should be combined so leaders make informed decisions before deployment and dynamically during deployment. The problem is not limited to software; it applies to any work involving junior and senior people plus automation. The developer community has a year or more of extra experience with the issue, and "The Death of the Junior Developer" is the first public signal Beane has seen from any profession saying this looks like a problem.
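
As a rough illustration of reporting the two metrics together, the following Python sketch combines a productivity measure and a skill measure per group. Everything here is an assumption made for the example: the `GroupOutcome` fields, the `assess_deployment` function, the threshold, and the numbers are hypothetical and do not come from the talk or from SkillBench.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GroupOutcome:
    group: str             # e.g. "junior" or "senior"
    productivity_pct: int  # hypothetical measure: % change in output
    skill_pct: int         # hypothetical measure: % change in assessed skill


def assess_deployment(outcomes, max_skill_loss_pct=0):
    """Report productivity and de-skilling side by side, including for whom."""
    deskilled = [o.group for o in outcomes if o.skill_pct < -max_skill_loss_pct]
    mean_productivity = sum(o.productivity_pct for o in outcomes) / len(outcomes)
    return {
        "mean_productivity_pct": mean_productivity,
        "deskilled_groups": deskilled,
        # Proceed only when productivity rises and no group is being de-skilled.
        "proceed": mean_productivity > 0 and not deskilled,
    }


# Illustrative numbers only.
report = assess_deployment([
    GroupOutcome("senior", productivity_pct=30, skill_pct=5),
    GroupOutcome("junior", productivity_pct=10, skill_pct=-20),
])
print(report)
```

In this made-up case average productivity rises, yet the report flags the junior group as de-skilled, which is the kind of signal a productivity-only metric would miss.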

Gene asks for three or four concrete pieces of advice for a pilot leader overseeing a Copilot-style project. Beane says the leader needs a small SWAT team spanning levels of expertise, tenure, and role. That group should ask what rich data it needs for the joint optimization problem: how to know what good productivity looks like, how to know what skill enrichment or de-skilling looks like, and how to update each other and pivot during deployment. Within that group, people must be able to take off their stripes and tell the truth about what is actually happening. Otherwise, it becomes a dog-and-pony show and a waste of time. The junior person must be able to say, "LLMs are rad, but I don't get to play anymore." The group also needs to find shadow learners, people who have hacked their way to learning anyway. The goal is to understand what those people are doing and learn from them, not copy them. Beane says the team may need to pay cash to get excellent data, and he has seen that work.

Gene connects this to incremental experiment design and asks about individual AI users hiding their gains, keeping benefits for themselves rather than the organization. Beane says empirical data shows massive, widespread use of LLMs across most jobs involving digital work, but firm-level productivity gains do not seem to have shown up much yet. People do not have to tell anyone they are using these technologies, and others cannot tell whether they did. If someone writes a memo five times faster, the next choice is between watching YouTube and telling the boss it was five times faster. Individuals can appropriate the gain and hide it perfectly if they want to. The teams and firms that win will cultivate a climate where people feel an excited, nervous obligation to tell colleagues what they can now do and jointly reshape the way work happens. Beane says he is not a fan of the term psychological safety because it is not just safety; it is accountability and pressure combined with confidence that there is no penalty for saying, "I found a new thing." Organizations should offer large bonuses, kudos, and even failure bounties for people who find stupid ways to use the technology. Successes and failures are already happening inside firms and teams; leaders must aggressively find them, make sense of them, reward them, and move forward.

Gene's last question asks what help Beane is looking for. Beane says his company is SkillBench, where he is co-founder and CEO. The company is still in stealth mode, though the website is up, and it has two or three more slots for organizations interested in pointy-end work: decomposing work, testing the research and system, and making fine-grained predictions and decisions about AI deployment. Interested organizations can email matt@skillbench.com. Beane says software is in a strong position to lead the world on this issue because the community has more evidence that the problem exists and should be managed.

Gene thanks Dr. Beane and says more fun adventures await. Beane thanks Gene for the invitation.