In 2019, Babylon Health was among the most-funded digital health companies in the world. Their AI symptom checker told millions of people whether to see a doctor. The NHS (UK) partnered with them. Investors poured in hundreds of millions.
By 2023, Babylon Health was bankrupt.
The AI wasn't terrible. The problem was simpler: people believed it without thinking. When it said "low risk," they stayed home. When it said "see a GP," they went to the doctor. The AI became the decision-maker instead of assisting the decision-maker. When the misdiagnoses began surfacing, trust didn't fade; it vanished overnight. Users hadn't just been using Babylon Health; they'd been deferring to it. There was no underlying relationship of trust to absorb the shock.
That story contains a pattern every pharma company building AI-powered health tools needs to understand. Not because the technology will fail, but because trust, the very thing that makes these platforms work, is also the thing that makes them dangerous.
The pattern has a name: cognitive surrender (i). It isn't mental offloading, like using a calculator. That's delegation: you're still forming the question and evaluating the answer. Cognitive surrender is different. It's when we stop forming our own view and simply adopt the AI's output as our own. No interrogation. No override. Just acceptance.
When the AI is right, people perform better. When it's wrong, they perform worse than if they'd had no AI at all, and they're more confident at the same time. They don't know they've surrendered. Worse decisions paired with higher confidence is a dangerous combination.
Surrender is most likely under time pressure, complexity, limited domain knowledge, and, critically, high trust in the AI system. For companies building health platforms, cognitive surrender is no longer a risk to monitor. It's a behavior to design for.
The trust paradox
Health platforms are designed to earn trust at every turn. They learn about you. They become the thing you check for guidance on sleep, nutrition, and movement. That's the whole point, and also a potential problem.
The more trust a platform earns, the less consumers scrutinize what it tells them. For most interactions, that's fine. Nobody needs to deliberate about a stretching routine. But health decisions don't all carry the same weight. Some mornings, you're choosing recipes. Other mornings, you're wondering whether a symptom means something. If the AI is trusted, it's trusted across the board.
Apple understood this when they built Health Trends. It showed you patterns but never told you what to do about them. The decision to leave interpretation to the human was deliberate.
Google made a similar choice with Med-PaLM. Their medical AI outperforms physicians on some benchmarks. But they've restricted it to clinician tools, not consumer products. Internal research showed that when people received AI health advice with high confidence scores, they were less likely to seek a second opinion. Google decided the cognitive surrender risk wasn't worth it, at least not yet.
Two of the most capable technology companies in the world looked at this problem and chose restraint. That's not excessive caution. That's a signal.
Two modes, not one
Woebot is a therapy chatbot that almost followed the Babylon Health story exactly. The first version delivered CBT exercises: the AI led, and the user followed. Engagement looked good, but clinical outcomes didn't.
The redesign flipped the relationship. Instead of giving answers, it started asking questions. The chatbot became a mirror, reflecting the user's thinking back to them. Outcomes improved. Woebot earned FDA Breakthrough Device designation. The AI got less authoritative, and the product got more effective.
Every pharma company building health AI needs to answer the same question: when should the AI think for the user, and when should it make the user think?
Think of it as two distinct modes.
Mode one is for routine, low-stakes guidance. Sleep tips, workout suggestions, content recommendations. Minimize friction. Let people follow without deliberating. This is where habits form and the relationship deepens. Headspace operates here: "you might try" and "some people find," never "you should." Invitation, not directive.
Mode two is for anything that could meaningfully affect health. Symptoms. Medication questions. Significant behavioral changes. Here, instead of making it easy to accept the recommendation, make the user pause. Not with disclaimers, which train people to click "OK" without reading, but with genuine engagement: ask what they think before showing what the AI thinks. Surface uncertainty. Ask a question: "What does your gut tell you?"
This is friction, but the useful kind. Mode one removes friction. Mode two adds it back (deliberately).
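As a rough illustration of the two modes, here is a minimal Python sketch of how a platform might route guidance. Everything in it is an assumption made for illustration: the category names, the Mode enum, and the mode_for and build_interaction helpers are hypothetical, not taken from any product mentioned here. One design choice is also assumed: any category the platform hasn't explicitly classified as routine falls into mode two.

```python
from dataclasses import dataclass
from enum import Enum


class Mode(Enum):
    FRICTIONLESS = "mode_one"  # routine, low-stakes guidance
    DELIBERATE = "mode_two"    # anything that could meaningfully affect health


# Hypothetical category names, used only for this example.
MODE_ONE_CATEGORIES = {"sleep_tip", "workout_suggestion", "content_recommendation"}


def mode_for(category: str) -> Mode:
    """Return the delivery mode for a guidance category.

    Assumption: anything not explicitly listed as routine gets friction.
    """
    if category in MODE_ONE_CATEGORIES:
        return Mode.FRICTIONLESS
    return Mode.DELIBERATE


@dataclass
class Step:
    kind: str  # "ask_user" or "show_ai"
    text: str


def build_interaction(category: str, ai_suggestion: str, uncertainty_note: str) -> list[Step]:
    """Build the ordered steps the interface would walk through."""
    if mode_for(category) is Mode.FRICTIONLESS:
        # Mode one: a single invitation, no added friction.
        return [Step("show_ai", f"You might try: {ai_suggestion}")]
    # Mode two: the user's view comes first, then the AI's, then its uncertainty.
    return [
        Step("ask_user", "Before I share what I'm noticing, what do you think is going on?"),
        Step("show_ai", f"Here's what I'm noticing: {ai_suggestion}"),
        Step("show_ai", f"What I'm unsure about: {uncertainty_note}"),
        Step("ask_user", "What does your gut tell you?"),
    ]
```

In this sketch a sleep tip produces one step, while a symptom question produces four, and the first of those asks for the user's own view before the AI shows anything.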
The ideal user and the vulnerable user are the same person: someone who trusts the platform, lacks deep medical expertise, and prefers cognitive ease. That's not a contradiction. It's a constraint. The architecture, not just the content, has to distinguish between moments where surrender is harmless and moments where it isn't.
Every AI feature needs one question: if this succeeds at building trust, what's the worst decision someone makes by surrendering to it? If the answer is trivial (they try a meditation they don't enjoy), surrender is fine. If it's meaningful (they delay seeing a doctor), the feature needs to add some friction.
Four things worth building in:
Make the two-mode distinction a platform principle. Define which guidance categories are safe for frictionless delivery and which require user judgment. Build it into the entire design system, not individual features.
Position every AI experience as a partner, not an authority. "Here's what I'm noticing" is different from "Here's what you should do." Headspace got this right. Babylon Health got it wrong. The gap wasn't technology; it was tone and framing.
Use questions, not just recommendations. "Here's a pattern I'm seeing in your sleep. What do you think is going on?" Break each interaction into its component parts and ask: where does the user's judgment need to stay in the loop?
Run a "surrender audit" before launch. One question: "If users trust this completely and stop thinking, what's the worst that could happen?" If it involves a clinically significant decision, redesign. This is the behavioral equivalent of safety reviews that pharma already does, applied to product design.
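A surrender audit could be recorded as a simple pre-launch checklist. The Python sketch below is one hypothetical way to structure it. The FeatureAudit fields and the needs_redesign helper are invented for illustration, and whether a decision counts as clinically significant is a judgment made by human reviewers, not something the code computes.

```python
from dataclasses import dataclass


@dataclass
class FeatureAudit:
    name: str
    # Answer to: "If users trust this completely and stop thinking,
    # what's the worst that could happen?"
    worst_case: str
    # Reviewer judgment (hypothetical field): does the worst case
    # involve a clinically significant decision?
    clinically_significant: bool


def needs_redesign(features: list[FeatureAudit]) -> list[str]:
    """Return the names of features that should be redesigned before launch."""
    return [f.name for f in features if f.clinically_significant]


# Example entries, mirroring the trivial and meaningful cases described earlier.
audits = [
    FeatureAudit("meditation_picker", "tries a meditation they don't enjoy", False),
    FeatureAudit("symptom_triage", "delays seeing a doctor", True),
]
flagged = needs_redesign(audits)
```

Here only symptom_triage is flagged. The value of the exercise lies less in the code than in forcing every feature to have a written worst-case answer before it ships.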
Babylon Health and Headspace had access to similar AI. One went bankrupt. The other expanded into health coaching with strong retention and no trust crises. The difference wasn't the technology. It was whether the product replaced the user's thinking or supported it.
Every pharma company building AI-powered health tools will face this tension. Most will default to frictionless, because that's what good product teams do. But in health, frictionless isn't always safe. The companies that figure out where to add friction back, deliberately, in the moments that matter, will define what this category becomes.
The question isn't whether your AI can earn trust. It will. The question is what happens when it does.
The goal isn't an AI that people trust blindly. It's an AI that people trust well.
(i): Shaw, S.D. & Nave, G. (2026). Thinking - Fast, Slow, and Artificial: How AI Is Reshaping Human Reasoning and the Rise of Cognitive Surrender. Working paper, Wharton Behavioral Lab.