How Collective Intelligence Could Soon Reshape Medical Decision-Making

Jul 3

A nurse practitioner working in a rural Georgia primary care clinic can pull up a dynamic summary of the medical literature in seconds. Just a few years ago, this would have been unthinkable.

And yet, she still faces decisions that the literature doesn’t answer. Sometimes, relevant studies have never been done. Other times, the evidence exists but doesn’t quite fit the individual patient in front of her.

This problem is not confined to resource-limited settings. In one study, fewer than 3% of clinical decisions made by Harvard pediatric cardiologists were based on evidence directly applicable to the question at hand.

Much of everyday medicine is therefore practiced using experience and intuition. We draw on patterns we’ve seen before, lessons from prior patients, and judgments formed over years of training and practice.

But our experience is uneven and, as Sir William Osler warned, “apt to be misleading.” We are also subject to cognitive blind spots, including anchoring, availability, and confirmation bias. And our decisions can be noisy, varying randomly.

When answers aren’t obvious, thoughtfully combining multiple perspectives can outperform even the best clinicians working alone. Medicine has always done this in limited ways. What’s new is the ability to do it deliberately, systematically, and at scale. Let me explain.

We Typically Make Better Decisions As Groups

The idea that groups can outperform individuals is not new. In 1906, Francis Galton observed that while visitors to a county fair wildly over- and under-estimated the weight of an ox, the median of the crowd’s guesses came within 1% of the true value.

Galton famously demonstrated the “wisdom of the crowds.”
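
To make the aggregation step concrete, here is a minimal Python sketch of median-based crowd estimation. The guesses and the true weight are invented for illustration and are not Galton's data, and the function names are hypothetical.

```python
from statistics import median


def crowd_estimate(guesses):
    """Aggregate independent guesses with the median, which resists wild outliers."""
    return median(guesses)


def relative_error(estimate, truth):
    """Absolute error expressed as a fraction of the true value."""
    return abs(estimate - truth) / truth


# Hypothetical guesses (in pounds) for an ox assumed to weigh 1,200 lb.
guesses = [850, 990, 1100, 1180, 1210, 1250, 1400, 1900]
truth = 1200

crowd = crowd_estimate(guesses)
crowd_error = relative_error(crowd, truth)

# Average error of the individual guessers, for comparison with the crowd.
mean_individual_error = sum(relative_error(g, truth) for g in guesses) / len(guesses)

print(f"crowd estimate: {crowd}, error: {crowd_error:.1%}")
print(f"mean individual error: {mean_individual_error:.1%}")
```

In this toy data set the individual guesses range widely, yet the median lands much closer to the assumed true weight than the typical guesser does.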

Since then, research on what is now called collective intelligence—the ability of groups to make better estimates and solve problems than individuals—has advanced considerably. Importantly, these gains depend less on how many people are involved than on how networks are designed and how information is gathered and combined.

Collective approaches are now foundational in many high-stakes contexts outside medicine, including complex problem-solving at organizations like NASA; large-scale collaboration on platforms such as Wikipedia and GitHub; forecasting through prediction markets like Polymarket; and even civic decision-making efforts such as vTaiwan.

Collective Healthcare Decisions Are Usually Better

Doctors have long worked together to manage uncertainty through bedside rounds, informal and formal consultations, conferences such as grand rounds, and committees like tumor boards. In each, physicians combine partial perspectives when no single clinician can see the whole picture.

Studies show that combining clinicians’ independent judgments can substantially improve accuracy. In one experiment, pooling ten physicians’ diagnoses on open-ended cases raised accuracy from 46% to 76%. In another, groups of non-specialists outperformed individual subspecialists in their own fields.

And yet, most clinical decisions are still made by one clinician at a time. Logistics limit how many can participate in rounds, conferences, or multidisciplinary committees. As a result, group judgment in medicine has been powerful, but narrow in scope and slow to spread.

That is changing. Three developments—advances in the science of collective intelligence, the rapid rise of artificial intelligence, and the growth of large clinician networks—are making it possible to move beyond small, local groups toward larger, faster, and more diverse networks of clinical judgment.

What Science Teaches Us About Designing For Collective Intelligence

A new science has emerged for designing collective decision-making systems. At its core, performance depends on matching the task—problem-solving versus estimation (“wisdom of the crowds”)—to the right kind of network.

For simple problems with familiar solutions, fast-moving networks perform best because they spread new ideas quickly. For complex problems that require new ways of thinking, slower and more decentralized networks can be superior because they protect unconventional ideas from being suppressed too early.

Fast, centralized networks are best for simple problems. Slow, decentralized networks are better for complex problems.

Estimation problems hinge on a different design choice: how to combine individual judgments.

Ralf Kurvers at the Max Planck Institute for Human Development explained to me that the optimal aggregation rule depends on the distribution of expertise. When track records are available, weighting the judgments of proven high performers improves accuracy. When they are not, early performance can help set those weights. When neither is possible, equal weighting often performs best.
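
The choice between these aggregation rules can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Kurvers's method: the function name, the idea of using a single accuracy score per participant as a weight, and all numbers are hypothetical.

```python
def aggregate(estimates, track_records=None):
    """Combine numeric estimates into one group estimate.

    track_records: optional accuracy scores (higher is better), one per
    participant. These could come from long-run history or from early
    performance. When none are supplied, every participant is weighted equally.
    """
    if track_records is None:
        weights = [1.0] * len(estimates)
    else:
        if len(track_records) != len(estimates):
            raise ValueError("need one track record per estimate")
        weights = list(track_records)

    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")

    # Weighted average: each estimate counts in proportion to its weight.
    return sum(w * e for w, e in zip(weights, estimates)) / total


# Hypothetical risk estimates (in percent) from three clinicians.
estimates = [20, 40, 90]

equal_weighted = aggregate(estimates)
# Assume the third clinician has twice the track record of the other two.
performance_weighted = aggregate(estimates, track_records=[1, 1, 2])

print(equal_weighted, performance_weighted)
```

With these made-up inputs, equal weighting gives 50 and performance weighting pulls the group estimate toward the clinician with the stronger record, giving 60. Whether that shift helps depends on how reliable the track records really are.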

Damon Centola at the University of Pennsylvania, author of the upcoming book Goodthink, focuses instead on network structure. His group has shown that egalitarian information-exchange networks—where participants iteratively revise their independent judgments after seeing anonymized peer responses—promote individual learning and improve collective accuracy, reducing diagnostic errors and care disparities.

Louis Rosenberg draws on swarm behavior in nature—such as flocks of birds or schools of fish—to rethink how groups can reach better decisions. His company, Unanimous AI, has built systems that connect many small conversational groups (of around five people) using AI agents that share insights across the network, allowing large groups to deliberate rather than simply vote. In one study, this swarm-based approach improved radiologists’ diagnostic accuracy by roughly one-third.

Artificial Intelligence Extends What Collective Intelligence Can Do

Large language models—themselves a product of massive collective intelligence—can support better group judgment in ways beyond facilitating discussion. By processing unstructured data, harmonizing individual contributions, surfacing patterns, translating between languages, and summarizing complex inputs, LLMs make large-scale collaboration practical in ways that weren’t feasible before.

AI models can also serve as participants in collective systems. In the NOHARM study, constellations of models made fewer errors than individual models, sometimes substantially so. But the strongest results may come from combining groups of clinicians with groups of AI models rather than relying on either alone.

In one experiment using more than 2,000 open-ended clinical vignettes on the Human Diagnosis Project platform, researchers found that physician groups working with AI groups outperformed individual physicians, physician groups alone, individual AI models, and AI-only ensembles. One reason is that humans and AI make different kinds of mistakes. That “error diversity” is exactly what collective intelligence theory predicts: intelligently combining judgments whose errors are uncorrelated improves overall decision quality.
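
Why uncorrelated errors matter can be shown with a standard statistical identity. The sketch below is an idealization and not the study's analysis: it assumes numeric estimates, each with the same error variance and the same pairwise error correlation, and the function name and inputs are hypothetical.

```python
def variance_of_average(error_variance, n, correlation):
    """Error variance of a simple average of n estimates.

    Assumes every estimate has the same error variance and every pair of
    estimates has the same error correlation (between 0 and 1):
        Var(mean) = error_variance * (1/n + (n - 1)/n * correlation)
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    if not 0.0 <= correlation <= 1.0:
        raise ValueError("correlation must be between 0 and 1")
    return error_variance * (1.0 / n + (n - 1) / n * correlation)


# Ten judges, each with error variance 1.0, at three levels of error correlation.
for rho in (0.0, 0.5, 1.0):
    print(rho, round(variance_of_average(1.0, 10, rho), 3))
```

Under these assumptions, averaging ten judges with fully uncorrelated errors cuts the error variance to one-tenth of an individual's, while averaging ten judges who all make the same mistakes yields no improvement at all.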

Adam Rodman, a Harvard physician and informaticist, is optimistic that human-model constellations will work well. His ARISE Network colleague, Dr. Ethan Goh, emphasized the importance of testing this in the real world.

How Expanding Networks Make Collective Intelligence Possible At Scale

Digital networks now connect more clinicians than ever before. During the pandemic, for example, clinicians and scientists used Twitter to share observations and collaborate in real time. Today, several large healthcare-specific networks are positioned to support collective intelligence more deliberately.

OpenEvidence has rapidly emerged as one of the most widely used platforms in healthcare. Designed to support decisions at the point of care with dynamic, evidence-based responses, the company reports that more than 50% of US doctors across 10,000 hospitals and practices now generate roughly 25 million clinical queries each month. (Disclosure: I am an advisor.)

Founder and CEO Daniel Nadler envisions collective intelligence supplementing medical evidence. “Think of it like mapmaking,” he told me. “We continually compare the literature with the questions clinicians are asking. The map usually fits the terrain quite well. When it doesn’t, we’re building ways to draw on groups of clinicians to fill in the gaps and make the guidance even better.”

Electronic health records can also harness the collective behavior of clinicians and spread insights across networks. Epic, the nation’s leading EHR vendor, is doing this with tools built on its 300-million-plus patient Cosmos database.

For example, Epic Chief Medical Officer Jackie Gerhart showed me how its “Best Care Choices” feature displays how similar patients have fared with different treatments, while the “Look-Alikes” tool helps clinicians connect with others caring for comparable patients.

Health systems themselves are also large clinician networks. Kaiser Permanente includes 25,000-plus affiliated physicians. Mass General Brigham employs more than 350 cardiologists. Can health systems design decision-making systems that enable many of their clinicians—perhaps even augmented by specialty-specific AI models—to contribute to the same patient’s care when it’s needed?

What It Will Take To Make This Work

Even if the technical pieces are now in place, turning collective intelligence into everyday clinical practice will require solving several hard problems, particularly around incentives and participation.

Human Dx founder Jay Komarneni frames this as both a practical and an ethical challenge: how to motivate people to contribute their knowledge, and how to ensure that those who do are rewarded. He is developing a cooperative model that compensates participants according to the value of their contributions. That approach may prove harder to implement in direct clinical care, where incentives remain tied primarily to billing codes rather than shared judgment.

Medical training and culture will also need to evolve. Jonathan Chen, a Stanford physician and data scientist, notes that clinicians trained to act decisively on their own may naturally resist external input. They must learn when to rely on their own judgment, when to enlist broader input, and how to weigh what comes back. Collective intelligence will not replace individual decision-making; it will require new norms about when and how to use it.

As the pre-eminent biologist Michael Levin has argued, intelligence is not confined to single minds but emerges from systems of cooperating parts. Healthcare intelligence today is fragmented across countless individuals and institutions. The opportunity is to design systems that let those fragments recombine, so that clinical decisions reflect not just one clinician’s experience, but the shared intelligence of many.

Acknowledgements: I thank Damon Centola, Jonathan Chen, Eric Elbogen, Ethan Goh, Jackie Gerhart, Jay Komarneni, Ralf Kurvers, Daniel Nadler, Adam Rodman, Louis Rosenberg, Rahul Shah, Sean Sylva, and Jacob Wright for discussing this topic with me.

Spencer Dorn