AI Sycophancy: Impacts, Harms & Questions

Previously we posted a tech brief on AI Sycophancy – a pattern where an AI model “single-mindedly pursue[s] human approval.” Below, we outline documented and observed harms, along with key questions that remain open for policymakers, developers, and researchers.

August 11, 2025

•••

Part 1: AI Sycophancy Impacts & Harms

This list reflects documented harms identified in media cover age, academic research, and public reporting. It is not intended to be exhaustive. In many of the cited sources, multiple harms are described.

Examples of Harm:

Exacerbate mental health issues: https://www.psychologytoday.com/us/blog/urban-survival/202505/can-ai-be-your-therapist-new-research-reveals-major-risks, https://www.theguardian.com/australia-news/2025/aug/03/ai-chatbot-as-therapy-alternative-mental-health-crises-ntwnfb
Financial harm: https://arxiv.org/abs/2502.07663, https://venturebeat.com/ai/openai-rolls-back-chatgpts-sycophancy-and-explains-what-went-wrong
Medical Harm: https://news.stanford.edu/stories/2025/06/ai-mental-health-care-tools-dangers-risks, https://www.brookings.edu/articles/breaking-the-ai-mirror/
Emotional dependence and/or harm: https://openai.com/index/affective-use-study/, https://www.axios.com/2025/07/07/ai-sycophancy-chatbots-mental-health, https://arxiv.org/abs/2504.18412
Manipulation and Deception: https://arxiv.org/abs/2411.02306, https://www.cnet.com/tech/services-and-software/these-ai-chatbots-shouldnt-have-given-me-gambling-advice-they-did-anyway/
Harms to Kids and Teens: https://www.silicon.co.uk/e-innovation/artificial-intelligence/ai-committee-harm-626708, https://www.rochesterfirst.com/reviews/br/services-br/technology-br/study-disturbing-findings-chatgpt-encourages-harm-among-teens/
Psychosis, delusional thinking, distorting reality: https://www.wired.com/story/ai-psychosis-is-rarely-psychosis-at-all/, https://www.techpolicy.press/artificial-sweeteners-the-dangers-of-sycophantic-ai/, https://www.psychologytoday.com/us/blog/psych-unseen/202508/why-is-ai-associated-psychosis-happening-and-whos-at-risk, https://www.nytimes.com/2025/08/08/technology/ai-chatbots-delusions-chatgpt.html, https://www.rollingstone.com/culture/culture-features/ai-spiritual-delusions-destroying-human-relationships-1235330175/
Self-harm, substance abuse: https://med.stanford.edu/news/insights/2025/08/ai-chatbots-kids-teens-artificial-intelligence.html
Manipulation via dark patterns: https://techcrunch.com/2025/08/25/ai-sycophancy-isnt-just-a-quirk-experts-consider-it-a-dark-pattern-to-turn-users-into-profit/https://venturebeat.com/ai/darkness-rising-the-hidden-dangers-of-ai-sycophancy-and-dark-patterns
Bias reinforcement: https://arxiv.org/abs/2412.02802, https://www.wsj.com/tech/ai/ai-chatbot-agree-flatter-users-1787e1a7, https://arxiv.org/abs/2508.13743
Fueling anger, urging impulsive actions: https://openai.com/index/expanding-on-sycophancy/

Evaluating AI Sycophancy:

Model Sycophancy Evaluation Data: https://github.com/anthropics/evals/tree/main/sycophancy
Towards Understanding Sycophancy in Language Models: https://github.com/meg-tong/sycophancy-eval/blob/main/README.md
DarkBench: Benchmarking Dark Patterns in Large Language Models: https://openreview.net/pdf?id=odjMSBSWRt
SycEval: Evaluating LLM Sycophancy: https://arxiv.org/html/2502.08177v2
Measuring Sycophancy of Language Models in Multi-turn Dialogues: https://arxiv.org/abs/2505.23840
Social Sycophancy: A Broader Understanding of LLM Sycophancy: https://arxiv.org/html/2505.13995v1
GlazeBench – Sycophancy & Flattery Benchmark: https://www.glazebench.com/v/1.0.0

Part 2: Open Questions

Despite increasing evidence of reported harms, including but not limited to the ones listed above, significant gaps remain in understanding the causes and effects of sycophantic behavior in AI systems, especially given limited transparency by leading developers of AI systems. The following list is intended as a menu of potential inquiries, recognizing that not all will apply to every company and may need to be adapted accordingly.

1. Knowledge of risks

[Risks] What did the company know about sycophancy risks before deploying its most recent models or updates? What internal research or testing documented that knowledge?
[Sensitive content] Does the company audit interactions that involve self-harm, violence, or delusional content? If so, how, and how is that data used? How does the company handle references to sensitive topics including but not limited to: drug use, suicidal ideations, and adult material?
[Kids] What testing or research has the company done on how sycophantic responses may affect children, teenagers, or other vulnerable groups? Is such research or testing reflected in notes, correspondence, presentations, or readouts to internal teams and/or staff?
[Expert engagement] Has the company engaged external experts (child development, psychology, education) in evaluating risks to kids? If so, who, when, and how?
[General population] What assessment has the company conducted of how AI sycophancy (e.g., excessive agreement, flattery, or mirroring of user views) may impact people’s behavior, decision-making, or well-being

2. User Complaints

[Complaints] Has the company received complaints or feedback from users about AI sycophancy? If so, how many and what kinds?[Tracking] How does the company track, categorize, and respond to such complaints?
[High risk users] To what extent does the company take steps to identify high-risk users (e.g., those disclosing mental health struggles) and protect them from reinforcement of harmful ideas?
[Reports] What internal reports exist – including but not limited to from red teamers, alignment researchers, expert testers, or user feedback — that document instances of sycophantic or overly agreeable behavior in the company’s models. What actions (if any) did the company take in response to those reports?
[Informing] How has the company informed individual users who were exposed to dangerous sycophantic outputs, including encouragement of delusions and self-harm? If the company has not, why not?
[Accessibility] What internal tests were conducted to assess whether users could submit complaints when they wanted? What design choices were considered? What other mechanisms or options were considered to allow users to submit complaints?

3. Accountability

[Executives] Who on the company’s executive team has direct accountability for sycophancy-related safety issues? To what extent are these individuals compensated based on user growth, average revenue per user, or daily messages per active user?
[KPIs] Were sycophancy-related behaviors factored into key performance indicators, or their equivalents, used to evaluate employee or team performance?
[Release process] What was the approval process for model updates? Who was/is accountable for authorizing its release?

4. Metrics, Testing and Benchmarks

[Metrics] What specific metrics or benchmarks does the company use to test for sycophancy prior to release? Will the company publish those benchmarks?
[User satisfaction vs accuracy] How does the company separate metrics for accuracy from metrics for user satisfaction during reinforcement learning or fine-tuning?
[Testing] What internal or external testing has the company done to assess how sycophancy might shape user interactions?
[Findings] Has the company published or internally circulated findings from these tests? If so, would the company be willing to provide any associated presentations, reports, emails, or readouts?

5. Data

[Data Types] What types of data does the company collect from users during interactions that might relate to sycophancy (e.g., agreement or disagreement rates, sentiment)?
[Data Use] How does the company use that data (e.g., model improvement, personalization, commercial purposes)? Who does the company share this with?
[Data Sale] Does the company share or sell this data with third parties? If so, under what conditions?
[User controls] What controls, if any, do the company’s users have over data collected in relation to sycophantic outputs
[Training data] Has the company audited or tested the training data for instances where the chatbot is rewarded (implicitly or explicitly) for agreeing with users or providing flattering responses?

6. Memory and “conversation” length

[Impact on frequency/intensity] How does the length of session memory affect the frequency or intensity of sycophantic outputs?
[Impacts on extreme/unsafe outputs] Has the company measured whether longer memory or persistent chat histories correlate with more extreme or unsafe outputs? If so, what did the company find?
[Impacts on multi-session memory accumulation] Has the company analyzed whether multi-session memory accumulation increases the likelihood of outputs that reinforce harmful suggestions in high-risk domains such as mental health, self-harm, or violent, conspiratorial beliefs? If so, what did the company find?
[Short vs. long session impacts] Has the company conducted controlled experiments comparing short-term, single-session interactions with longer-term memory-enabled sessions to quantify changes in the model’s tendency to produce uncritical or sycophantic outputs? If so, what did the company find?

7. Corrective Actions

[Corrections] When the company has detected sycophantic behavior, what concrete changes – not just high-level promises – has it made to training data, fine-tuning processes, or evaluation frameworks to prevent recurrence?
[Incident response] When harmful sycophantic outputs are identified post-deployment, what is the company’s incident response timeline (e.g., hours, days, weeks)? What is the incident response process?

8. Transparency

[Safety testing results] Will the company commit to publicly releasing the results of their safety testing, including sycophancy evaluations, before future rollouts?
[Third party evaluations] What independent third parties (academics, civil society, regulators) have access to evaluate the company’s systems for sycophancy risks prior to release?
[Parental consent] How does the company ensure that parental consent is properly obtained and verified? What evidence does the company have that its parental consent mechanisms are effective in practice?

9. Financial Incentives

[Revenue vs. safety] How does the company separate revenue optimization from safety-critical decisions about model behavior?
[A/B Tests] Has the company A/B tested [or used similar methods] on sycophantic versus non-sycophantic behaviors in order to measure effects on user growth, engagement, retention, time-on-platform, or conversion to paid accounts? If so, what did they find?
[Retaining users] How have financial considerations, such as pressure to acquire or retain paying users, played a role in releasing models or updates with known sycophancy risks?
[Design choices] Has the company analyzed whether longer session memory or multi-session memory persistence affects user engagement, retention, or conversion to paid subscriptions? If so, do metrics demonstrate that extended memory contributes to increased usage or subscription revenue? Have any design choices regarding memory length been influenced by their potential to enhance willingness to pay, rather than by safety or accuracy considerations?

•••

Stephanie T. Nguyen is a Senior Fellow at Georgetown Institute for Technology Law & Policy, Former Chief Technologist at the Federal Trade Commission

Erie Meyer is a Senior Fellow at Georgetown Institute for Technology Law & Policy, Former CFPB Chief Technologist

Samuel A.A. Levine is a Senior Fellow at UC Berkeley Center for Consumer Law & Economic Justice, Former Bureau of Consumer Protection Director at the Federal Trade Commission