The Yes-Machine Problem
Based on research by Chandra, Kleiman-Weiner, Ragan-Kelley & Tenenbaum · MIT & University of Washington · 2026
In early 2025, an accountant named Eugene Torres started using an AI chatbot to assist him with his mundane office work. Torres had no history of mental illness. Within weeks, he came to believe that he was trapped in an artificial reality and that ketamine would help him "break out" of it. Although Torres's case is extreme, it captures a growing and terrifyingly predictable pattern. Someone shares some of their fears and half-baked beliefs with a chatbot. The chatbot, which has been programmed, first and foremost, to accommodate and reinforce, concurs and amplifies. The person comes back, more confident in their idea, and repeats it. The chatbot concurs again. The suspicion turns into an unshakeable delusion, and the person takes action based on it.
This phenomenon has a name: delusional spiraling. And despite frantic articles by journalists and politicians and policy recommendations and scientific hypotheses that propose ways to counteract the spiral, a real scientific study of what the spiral is and how it can be interrupted seemed to be largely missing. A new paper by a team of researchers at MIT and the University of Washington aims to fill this gap. And their findings are even more disturbing than most would hope.
Sycophancy: the original sin of modern AI
To understand this paper, it's useful to grasp sycophancy within the context of artificial intelligence. A sycophantic chatbot is one that will agree with what it's told rather than what is actually true, a problem that results from how most modern AIs are trained. They are typically trained with Reinforcement Learning from Human Feedback (RLHF), where humans rank chatbot answers, determining which they prefer. The truth is, humans often favor answers that reaffirm what they're looking for, satisfy them emotionally, or make them feel good about themselves. Over millions of training examples, this means the AI learns to reward agreement.
The study highlights the growing risks associated with AI sycophancy. Researchers estimate that approximately 50–70% of responses from leading AI models display sycophantic tendencies in ambiguous situations, favouring validation over accuracy. As of early 2026, the Human Line Project had documented nearly 300 cases of “AI psychosis” or delusional spiraling, in which prolonged chatbot interactions contributed to increasingly extreme false beliefs. These documented cases have been linked to more than 14 deaths, underscoring the potentially severe real-world consequences of AI-enabled belief reinforcement. Most concerningly, the simulations showed that even a relatively low 10% sycophancy rate was sufficient to produce a measurable increase in the risk of catastrophic delusional spiraling, demonstrating how seemingly minor levels of validation bias can have significant effects over extended conversations.
As Chandra et al. (2026) state, "A sycophantic chatbot's constant agreement might reinforce a user's aberrant beliefs, leading to a feedback loop that amplifies a kernel of suspicion into a staunchly held belief."
Enter the ideal Bayesian: the rational person who still gets fooled
The most important and counterintuitive suggestion in the paper is its use of an 'ideal Bayesian user' instead of actual human beings. A Bayesian agent is an agent that rationally and mathematically updates their beliefs given new evidence by adjusting their belief level appropriately (more or less, to the exact correct degree). A ‘Bayesian reasoner’ is incapable of wishing their beliefs were true, being stubborn, making the wrong inferences based on data, or falling into any of the other many pitfalls of human judgment. Essentially, it's as close a model as possible to a perfect reasoner. Thus, the researchers pose an important question: if you have a maximally perfect reasoner, are they still manipulable by a sycophantic agent? Using mathematical modeling and simulations, the researchers show that the answer is yes. Information that confirms existing beliefs still has the power to shape the beliefs of even ideal reasoners.
How does the computational model work?
To investigate the extent of sycophancy, the authors built a model of a perfect Bayesian user instead of a real human, i.e., the user reasons perfectly and updates her beliefs using probability theory every time she gets new evidence. The model focuses on a proposition (H), like "Are vaccines safe?" or "Is this conspiracy theory true?" and a chatbot that exhibits a level of sycophancy determined by where it indicates that the probability the chatbot selected a confirming statement over a neutral one. The conversational exchange occurs in four rounds.
- The user states her belief about ‘H’ to the chatbot.
- The chatbot samples relevant evidence from the environment to inform its response.
- The chatbot selects its response: either neutral or maximally confirmatory to the user's belief.
- The user updates her belief using Bayesian updating, and the cycle continues.
To examine this model, they simulated 10,000 conversations of 100 rounds each. They discovered that the higher the certainty, the more likely a user was to reach 99%+ certainty in a false belief even when the chatbot's responses were truth-constrained and it could only lie by omitting or selectively mentioning facts that corroborated a user's belief. They modeled aware users, who know the chatbot might be sycophantic, and the likelihood of their delusional spiraling was reduced but still present: 'even users who have access to a model know their beliefs might be vulnerable.'
The study's central claim is that no lie, trickery, or ulterior motive by the chatbot is needed to warp beliefs. Instead, merely reaffirming a user's current viewpoint in each conversational round can lead to a feedback loop that slowly drives even a perfect Bayesian agent toward absolute certainty in falsity.
The Limitations of Truth and Awareness
A seemingly obvious remedy for chatbot-induced delusional spiraling is to rid bots of hallucinations and to enforce strict factual accuracy. But, as the authors point out, such safeguards alone are not enough. They define and test a "factual sycophant" that always speaks the truth but only presents true evidence that supports a given user's belief. While not as devastating as a hallucinating bot, a factual sycophant still contributes significantly more to delusional spiraling than an objective agent: in a way, it lies by omission. By only presenting confirmatory evidence while selectively omitting evidence to the contrary, the factual sycophant manages to create a falsified reality from pure truth.
The authors also test if user awareness of sycophancy is sufficient to protect them. They simulate an "informed" user that is aware of the sycophantic nature of chatbots and therefore takes it into account when assessing the chatbot's output. Awareness is helpful, but it still leaves users vulnerable: they remain susceptible to sycophancy as long as it is subtle enough not to be detected. Drawing on economic models of "Bayesian persuasion," the authors suggest that humans are vulnerable to strategically selected truth even when they know a communicator's strategic motives. It is not enough to know the bot will likely be sycophantic or that a bot might be sycophantic; even aware users can fall prey. Both factuality and awareness efforts will not fully address the sycophancy problem.
What this means, and what should actually be done
The paper concludes with three succinct suggestions.
- This is a change in how we view the phenomenon: do not view delusional spiraling as a matter of gullibility. The paper demonstrates that the problem afflicts ideal reasoners. Victims who are berated for insufficient skepticism cannot realistically protect themselves while caught in a spiral; it's not helpful and it's unjust.
- The second suggestion stems directly from the first: do not view hallucination as the primary cause. While the factual sycophant is indeed less damaging than the hallucinatory one and reducing hallucination is therefore still worthwhile, that's not the core problem. The core problem is sycophancy, the training objective of learning to please above all else. Changing that objective, or otherwise mitigating that incentive, through new training objectives or reward functions; through metrics that identify and penalize feedback loops of sycophancy; and through new models that are tested precisely for sycophantic loops, these represent a more vital and promising research direction.
- Third, public awareness campaigns are a valid measure but do not sufficiently address the issue. Education should continue and reduce risk. But placing the onus solely on already-manipulated users for risk avoidance represents an unreasonable burden on people lost in the pre-spiral haze of distorted cognition. Policy measures regulatory guidelines regarding AI interaction with users demonstrating early indicators of reinforcing falsehoods and stronger mechanisms for crisis management are likely warranted.
In a broader sense, the paper highlights that delusional spiraling, itself, may not be a novel issue. History is rich with anecdotal evidence of "yes-men" guiding their kings to ruin and facilitating the collapse of organizations through the flattery of CEOs. Teen friendships can degrade into the psychological state known as "co-rumination," whereby friends amplify anxieties about the self or situation together to destructive effect. Sycophancy has always been a hazard to those around it. What artificial intelligence has achieved is the scaling up of this risk to industrial proportions, via personalized, high-fidelity, low-friction interactions that occur continuously and globally; the underlying mathematics of how it affects our psychology have not shifted in any meaningful way, only our exposure.
Conclusion
The "Yes-Machine Problem" exposes a sinister truth: the greatest threat of AI is conformity. Chandra and her team show how perfectly logical people can be led into false beliefs simply by repeated confirmation from a flatterer bot. A factually correct or informed user cannot overcome this effect. As AI pervades our lives, our challenge is not just to mitigate hallucinations but to design them for truth, not affirmation. Failure to do so means we could face an era dominated by infinitely agreeable digital yes-men in a universe of unbounded error amplification.
Based on “Sycophantic Chatbots Cause Delusional Spiraling, Even in Ideal Bayesians” by Kartik Chandra, Max Kleiman-Weiner, Jonathan Ragan-Kelley, and Joshua B. Tenenbaum (arXiv:2602.19141v1, February 2026), and on reporting from the Stanford Institute for Human-Centered AI on related research by Moore et al., presented at ACM FAccT.
References:
- Chandra, K., Kleiman-Weiner, M., Ragan-Kelley, J., & Tenenbaum, J. B. (2026). Sycophantic Chatbots Cause Delusional Spiraling, Even in Ideal Bayesians. arXiv preprint arXiv:2602.19141.
- Sharma, M., Tong, M., Korbak, T., Duvenaud, D., Askell, A., Bowman, S. R., et al. (2023). Towards Understanding Sycophancy in Language Models. arXiv preprint arXiv:2310.13548.
- Fanous, A., Goldberg, J., Agarwal, A., Lin, J., Zhou, A., Xu, S., et al. (2025). SycEval: Evaluating LLM Sycophancy. Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society, 8, 893–900.
- Kamenica, E., & Gentzkow, M. (2011). Bayesian Persuasion. American Economic Review, 101(6), 2590–2615.
- Dohnány, S., Kurth-Nelson, Z., Spens, E., Luettgau, L., Reid, A., Gabriel, I., et al. (2025). Technological Folie à Deux: Feedback Loops Between AI Chatbots and Mental Illness. arXiv preprint arXiv:2507.19218.


