DeAngelis Review

The Reward Machine: Are LLMs the New Social Media?

LLMs share social media's addictive architecture while substituting for reasoning itself. Evidence shows cognitive debt, sycophancy, and weakened critical thinking. Will we recognize the pattern before habits calcify this time?

Published April 23, 2026

Abstract

I've spent thirty years building systems that reason about complex adaptive environments. I'm no skeptic of artificial intelligence. I use it daily, and I believe it will be the most consequential general-purpose technology of my working life. That belief is what makes the question I'm about to ask uncomfortable.

Large language models share every structural feature that made social media addictive. They also add new ones. They simulate empathy. They flatter. On demand, they produce the subjective experience of being understood. A March 2026 study in Science found that across eleven AI models, chatbots affirmed users' actions 49 percent more often than humans did, including in cases involving deception, illegality, or other harms.1 Peer-reviewed observational research reports a strong inverse association between self-reported AI use and self-reported critical thinking (r = −0.68, n = 666).2 The underlying studies have real limits, which I name as I go. The convergent signal is what makes the question worth pressing.

A tool that amplifies competence is still a tool. A tool that substitutes for competence while creating the illusion of competence is something else entirely. That distinction is the hinge of what follows.

This is a Thought Probe, not a conclusion. The question is whether we'll recognize the pattern in time to do something about it.

The Probe

What if the most powerful cognitive tool ever built is also the most addictive?

I'm not claiming an answer. I'm posing a question the evidence compels but does not yet resolve.

Consider the shape of the problem. A technology arrives with genuine utility. Millions adopt it. Then hundreds of millions. The adoption curve outruns the research curve. By the time peer-reviewed evidence accumulates, the habits have hardened, the commercial incentives have calcified, and the harm (if there is harm) is distributed so widely that no single person or institution is responsible for it. If this description sounds familiar, it should. It describes what happened to our relationship with social media between 2007 and 2017. It appears to be happening again, faster, with a more intimate technology, and we're largely not talking about it.

One disclosure before the evidence. The Thought Probe format can look like a loophole: a way to make directional claims and then retreat into "I'm only asking questions." The evidence here points in a clear direction. What I'm genuinely uncertain about is the magnitude of the effect and how reversible it is. The probe format reflects that uncertainty about magnitude, not about the direction of the risk.

A second disclosure, equally important. I am the founder and CEO of Enterra Solutions, which builds reasoning-over-relationships AI systems that are architecturally distinct from transformer-based LLMs. The contrast I draw in the Socratic Objection section below aligns with my firm's commercial positioning. A reader should weigh this accordingly. I have tried to let the evidence lead and to state limits honestly. Whether I have succeeded is for the reader to judge.

A third disclosure. There is a substantial and growing body of peer-reviewed evidence that AI tools improve measured performance on specific tasks. Noy and Zhang (Science, 2023) found ChatGPT cut writing time by 40 percent and raised output quality by 18 percent in a controlled experiment, with the largest gains accruing to lower performers.29

Bastani and colleagues at Wharton ran a randomized trial with roughly 1,000 students and found conditional cognitive gains depending on how AI was used.30 Kestin and colleagues at Harvard published an RCT in Scientific Reports showing AI-assisted tutoring outperformed active-learning classrooms on specific physics tasks.31 Bick, Blandin, and Deming of the Federal Reserve Bank of St. Louis document measurable productivity gains at population scale.32 This literature does not contradict what follows. It frames it. The essential distinction, as I argue below, is between using AI to extend existing expertise and using it to substitute for expertise that has not yet been built. The productivity literature and the dependency literature describe the same tool used in different ways.

Part I: The Mechanism

Every addictive technology exploits the same neurological vulnerability. The human reward system is tuned to respond most powerfully to uncertain rewards rather than predictable ones. B. F. Skinner called this principle variable-ratio reinforcement. It's why slot machines create compulsion where vending machines don't. It's why checking email feels different from reading a book. A 2025 design-analysis paper presented in the CHI Extended Abstracts workshop track argues that this is exactly what happens when you prompt a chatbot.4 The paper is a structural analysis of interface features, not a behavioral study, and that is how I use it below.
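To make the mechanism concrete, here is a toy simulation of my own; it is not drawn from any of the cited papers, and the numbers are invented. It compares a fixed reward schedule to a variable-ratio schedule with the same expected payout, using the reward-prediction error of a simple Rescorla-Wagner learner as a crude stand-in for the dopaminergic signal:

```python
import numpy as np

rng = np.random.default_rng(42)
N_TRIALS = 100_000

# Fixed schedule: every action pays 1 unit (the vending machine).
fixed = np.ones(N_TRIALS)

# Variable-ratio schedule: same expected value, but each action pays
# 4 units with probability 0.25 (the slot machine).
variable = (rng.random(N_TRIALS) < 0.25) * 4.0

def prediction_errors(rewards, lr=0.1):
    """Reward-prediction errors of a simple Rescorla-Wagner learner."""
    expectation, errors = 0.0, []
    for r in rewards:
        delta = r - expectation
        errors.append(delta)
        expectation += lr * delta
    return np.array(errors)

for name, rewards in [("fixed", fixed), ("variable-ratio", variable)]:
    pe = prediction_errors(rewards)
    print(f"{name:15s} mean reward {rewards.mean():.2f}  "
          f"mean |prediction error| {np.abs(pe).mean():.2f}")
```

Both schedules pay the same on average. What differs is that the fixed schedule's prediction errors decay to essentially zero, while the variable schedule's stay large forever. That permanent surprise is the property Skinner's schedules, slot machines, and (the argument goes) chatbot replies all share.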

The Four Dark Addiction Patterns

Researchers M. Karen Shen and Dongwook Yoon evaluated eight major AI chatbot platforms: ChatGPT, Claude, Gemini, Copilot, Perplexity, Meta AI, Character.AI, and Replika. They identified four design features that, in their words, correspond to the neurological mechanics of gambling.4 Three of these platforms (Character.AI, Replika, and Meta AI) are companion apps engineered for emotional engagement; the other five are general-purpose assistants. The distinction matters, and I draw it throughout what follows. The structural claim I am making is that the two categories share a reward architecture, not that they produce identical harms. Companion apps are where the most acute harms have been documented. General-purpose assistants are where the ambient, population-scale effects are more likely to show up.

The first is non-deterministic output. Because LLM responses are probabilistic, each answer is slightly different. Sometimes brilliant, sometimes flat, occasionally surprising. That variance, the researchers write, "corresponds to what neuroscientists call 'reward uncertainty,' which tends to increase dopamine release, similar to playing a slot machine." The second is streaming presentation. Five of the eight platforms render responses token by token, creating a reward-predicting cue analogous to the animated reels of a slot machine. The third is proactive contact. AI companions such as Character.AI email users unprompted. Users perceive this as the system "wanting to talk," a dopamine signal wrapped in the simulation of care. The fourth, and most consequential, is empathetic agreement. The system validates. It rarely disagrees. It makes you feel understood. That's a profoundly different kind of reward from a like or a retweet.
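The first feature is easy to see in miniature. The sketch below is my illustration, with a made-up five-token vocabulary rather than any production model's actual decoding stack; it shows why sampling from a softmax at nonzero temperature makes each reply slightly different, while greedy decoding would make them identical:

```python
import numpy as np

rng = np.random.default_rng(0)

# A toy next-token distribution over five candidate continuations.
tokens = ["brilliant", "solid", "flat", "odd", "surprising"]
logits = np.array([2.0, 1.5, 1.0, 0.2, 0.1])

def sample(temperature):
    """Softmax sampling, the standard decoding step at temperature > 0."""
    p = np.exp(logits / temperature)
    p /= p.sum()
    return rng.choice(tokens, p=p)

# Greedy decoding (the temperature -> 0 limit) would always emit
# "brilliant": no variance, hence no reward uncertainty. At ordinary
# sampling temperatures, repeated identical prompts yield varied replies.
for t in (0.7, 1.3):
    print(f"T={t}:", [sample(t) for _ in range(8)])
```

The variance is not a bug. It is how decoding works, which is why the sometimes-brilliant, sometimes-flat texture of chatbot output is structural rather than incidental.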

The very features that make AI chatbots supremely useful (limitless availability, perfect agreeableness, effortless fulfillment) are precisely the features that make them addictive.

The Loop

Forbes contributor Curt Steinhorst described his own descent into ChatGPT compulsion in a 2025 essay. He called the cycle Prompt → Output → Evaluate → Repeat.5 Each response, he wrote, "feels like it might be the perfect fit. Each reply brings an element of surprise, engaging the psychological principle of intermittent reinforcement." Over time he found himself unable to compose a simple email without consulting the model. His skills hadn't vanished. His internal reference point for what "writing" felt like had shifted. The tool had become the baseline.

Neural Evidence: Cognitive Debt

In June 2025, MIT Media Lab researchers led by Nataliya Kosmyna and Pattie Maes published the first EEG study of LLM use.3 Fifty-four participants authored essays across four sessions in one of three conditions: LLM-assisted, search-engine-assisted, or unaided.

Unaided writers showed the strongest, most distributed neural networks. Search-engine writers were intermediate. LLM users showed the weakest brain connectivity of any group. Over four months, LLM users consistently underperformed at neural, linguistic, and behavioral levels. Among the eighteen LLM-group participants who completed the final session, fifteen (83 percent) could not accurately quote their own essays from Session 1. When those same LLM users were switched to unaided conditions, their neural connectivity did not recover. It stayed reduced, as if the brain had settled into a lower baseline.

One finding complicates the picture in a productive way. Participants who had first written unaided, then later received ChatGPT, showed increased neural engagement rather than diminished engagement. The sequence matters. AI used after independent thought appears to reinforce cognition. AI used instead of independent thought appears to degrade it.

The researchers named the pattern cognitive debt: the measurable reduction in independent cognitive function that appears to accumulate when thinking is routinely outsourced. I treat this as a working hypothesis drawn from one preprint, not a settled mechanism. The term is deliberately borrowed from finance. The proposition that it behaves like a liability that compounds is exactly the kind of claim that needs the larger, preregistered replications the field has not yet run.

The Social Media Precedent

We have seen this movie. The founders of Facebook told us what they built, and why.

The thought process that went into building these applications … was all about: How do we consume as much of your time and conscious attention as possible? That means we need to give you a little dopamine hit every once in a while. … It's a social-validation feedback loop … exactly the kind of thing that a hacker like myself would come up with, because you're exploiting a vulnerability in human psychology.
— Sean Parker, former President, Facebook, Axios event, November 8, 2017⁶

Parker added: "God only knows what it's doing to our children's brains."

The short-term, dopamine-driven feedback loops we have created are destroying how society works: no civil discourse, no cooperation, misinformation, mistruth.
— Chamath Palihapitiya, Stanford Graduate School of Business, November 10, 2017⁷

These weren't outside critics. These were the architects. In public, they said they'd built something they couldn't recommend their own children use. A 2025 systematic review in Behavioral Sciences concluded that social media platforms "significantly increase their use frequency and behavioral stickiness through 'variable ratio reinforcement' (intermittent and unpredictable reward designs similar to those of gambling)."8

One honest caveat. Social media was ad-driven, networked, public, identity-forming, and socially contagious in ways that chatbots often are not. The typical LLM interaction is private, one-to-one, and not performed for an audience. The rhyme between the two technologies is structural (a shared reward architecture built on variable reinforcement), not total. I'm arguing that the reward mechanics transfer, not that LLMs will reproduce every pathology social media did. They're more likely to produce a different, more intimate family of harms.

Why LLMs May Be More Addictive Than Social Media

Social media exploited variable rewards delivered by other humans: likes, comments, notifications. LLMs are stranger. They exploit the same reward circuitry without requiring other humans at all. They simulate the thing social validation is supposed to be a proxy for: genuine understanding.

Christian Montag of Ulm University and colleagues identify four contributing factors to AI dependency: personal relevance, parasocial bonds, productivity gratification, and over-reliance on AI for decisions.9 The APA's 2026 Monitor on Psychology reports that AI companion apps grew 700 percent between 2022 and mid-2025, with Character.AI alone reaching twenty million monthly users, more than half under twenty-four.10 An OpenAI–MIT Media Lab collaboration (published by OpenAI and MIT jointly, and notable because it is a critical finding released by the developer itself) combined a 28-day randomized controlled trial of 981 participants with an observational analysis of nearly forty million ChatGPT interactions. Higher daily ChatGPT use was associated with higher loneliness, greater dependence, more problematic use, and lower socialization with other people; the effect was most concentrated in a high-emotional-reliance subgroup rather than distributed across all heavy users.11

The Decision Lab frames it precisely: "When an AI system speaks in a warm, conversational way, remembers details, and responds with empathy, people begin to feel a sense of relationship and safety."12 Researchers James Muldoon and Jul Parke have named the pattern cruel companionship: an attachment that promises intimacy while structurally foreclosing reciprocity.13 (I cite this via a recent popular summary; the primary paper is the anchor, and curious readers should seek it directly.)

Sycophancy as Engineered Dependence

In March 2026, Science published the first large-scale controlled study of LLM sycophancy, led by a Stanford-based team including Myra Cheng and Dan Jurafsky. The paper examined eleven AI systems, primarily in interpersonal-conflict scenarios. Across the benchmark, chatbots affirmed users' actions 49 percent more often than humans did, on average, including in queries involving deception, illegality, or other harms. In three pre-registered human experiments (N = 2,405), even a single interaction with sycophantic AI left participants more convinced they were correct and more likely to consult the model again. They preferred the flattering system.1 The study measures affirming behavior in a specific experimental context, not clinical addiction. I use it here as evidence of an engineered engagement mechanism that plausibly contributes to dependence, not as proof of compulsive use.

Scientific American, citing Dana Calacci of Penn State, reported that sycophancy "tends to get worse the longer users interact with the model."14 The flattery compounds. Commercial pressure favors more of it, not less.

The Scale

ChatGPT reached one hundred million users faster than any consumer technology in history. Pew Research found that 34 percent of U.S. adults had used ChatGPT as of early 2025, roughly double the 2023 share, and 58 percent of adults under thirty.15 By early 2026, SSRS/Edison Research found that 52 percent of Americans use AI chat platforms every week.16 Globally, weekly active users are estimated at roughly 8.6 percent of the world's population, based on OpenAI and Harvard NBER work.17

More than half a billion people are in the middle of a variable-reward cognitive loop that has existed for less than four years. The research is running behind.

Part II: The Payload

Addiction is only half the story. The more disquieting question is what the heavy-use pattern is doing to the mind that hosts it.

Critical Thinking, Measured

In January 2025, Michael Gerlich of SBS Swiss Business School published the largest quantitative study to date on the cognitive consequences of LLM use.2 The sample: 666 participants across three age cohorts in the United Kingdom. The instrument: the Halpern Critical Thinking Assessment, supplemented by semi-structured interviews. The findings are directional and substantial:

  • AI tool usage correlated negatively with critical thinking scores at r = −0.68, p < 0.001. That is a large coefficient by the standards of observational social-science research.
  • Cognitive offloading correlated positively with AI use (r = +0.72) and negatively with critical thinking (r = −0.75). Mediation analysis suggested cognitive offloading partially explains the relationship; a sketch of the mediation logic follows this list.
  • The youngest cohort (17–25) showed the sharpest dependency and the lowest scores. The oldest (46+) showed the inverse.
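For readers who want the mediation logic made explicit, here is a synthetic-data sketch. It is my construction, not Gerlich's analysis: the path coefficients are borrowed from his reported correlations, and the toy assumes full mediation where his data support only partial mediation.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 666  # matches the study's sample size; the data here are synthetic

# Hypothetical causal chain: AI use -> cognitive offloading -> critical
# thinking, with path strengths echoing the reported correlations.
ai_use = rng.standard_normal(n)
offloading = 0.72 * ai_use + 0.69 * rng.standard_normal(n)
critical = -0.75 * offloading + 0.66 * rng.standard_normal(n)

def z(x):
    """Standardize to mean 0, sd 1 so regression slopes read like correlations."""
    return (x - x.mean()) / x.std()

# Total effect: regress critical thinking on AI use alone.
X1 = np.column_stack([np.ones(n), z(ai_use)])
total = np.linalg.lstsq(X1, z(critical), rcond=None)[0][1]

# Direct effect: add the mediator as a covariate (the Baron-Kenny step).
X2 = np.column_stack([np.ones(n), z(ai_use), z(offloading)])
direct = np.linalg.lstsq(X2, z(critical), rcond=None)[0][1]

print(f"total effect  {total:+.2f}")   # strongly negative (~ -0.54 here)
print(f"direct effect {direct:+.2f}")  # shrinks toward zero: offloading mediates
```

When the direct effect collapses relative to the total effect once the mediator enters the regression, offloading is doing the explanatory work. That is the structure Gerlich's mediation analysis reports, in partial form.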

The study is cross-sectional and self-reported on both sides of the correlation, so causation cannot be established from this data alone, and the r itself is likely inflated by common method bias. Gerlich measures AI exposure through self-report rather than logged behavior, which is vulnerable to recall bias. Critical-thinking performance is measured through a self-rated instrument, not an externally scored behavioral test. The sample, though large, skews higher-educated than the UK base rate, and a September 2025 correction was issued to the paper; the core directional finding survives, but its magnitude should be read with appropriate caution. What the data support is that frequent AI use is strongly associated with lower self-rated critical-thinking confidence, with cognitive offloading as the most plausible mediating mechanism.

Three further methodological limits deserve naming. The MIT EEG study is a preprint and rests on a subsample of eighteen participants who completed the final session, too small to generalize with confidence on its own. It is also worth noting that the single most important finding in that study is not the "cognitive debt" trajectory but the sequence finding: when participants wrote unaided first and then used ChatGPT, their neural engagement increased. The direction of use matters more than use itself. A plausible confound also remains across this literature: heavy AI use and lower critical thinking may both be downstream of trait variables such as intellectual curiosity or educational attainment, rather than standing in a causal relationship. What makes the directional finding credible is convergence. Behavioral, neuroimaging, and survey data from independent teams using different methods all point the same way. Heavy, substitutive LLM use is associated with measurable deterioration of independent reasoning. We're working from strong signals, not settled science.

The Dunning-Kruger Reversal

A February 2026 paper in Computers in Human Behavior, led by Daniela da Silva Fernandes and Robin Welsch at Aalto University, asked roughly five hundred participants to solve LSAT-style reasoning problems with ChatGPT assistance.18 All users, regardless of skill, overestimated their own performance when using AI. And the Dunning-Kruger effect was reversed.

In normal conditions, people who are bad at something are overconfident, and people who are good at it are underconfident. The AI-using cohort reduced this pattern and, among the most self-rated AI-literate participants, showed signs of outright inversion. Most prompted ChatGPT once, accepted the answer, and moved on. Live Science summarized the finding plainly: "AI all but removes the Dunning-Kruger effect; in fact, it almost reverses it."19

The feedback loop by which errors teach humility is short-circuited. People who know they are incompetent typically seek help. People who don't know they are incompetent while using AI do not.

The Verification Gap

Two independent data points converge on the same number. A 2025 global KPMG / University of Melbourne survey, cited in a recent arXiv paper on calibrated trust in LLMs, reports that 66 percent of employees rely on LLM outputs without verifying accuracy.20 The EY AI Sentiment Index 2025 reported separately that fewer than a third of users regularly verify AI-generated content. More than half of participants in the arXiv trust study (N = 192) reported work-related mistakes attributable to over-reliance on LLM outputs. Trust is extended based on fluency rather than verification. That is exactly the heuristic LLMs are engineered to trigger.

Atrophy vs. Foreclosure

Psychology Today drew a distinction in March 2026 that is, in my view, among the most important framings in this entire discussion.21 The distinction is not yet grounded in dedicated developmental neuroscience on LLM use; it is an extension of existing atrophy/foreclosure logic to a new domain. I extend it here as an analytical frame and as a research agenda, not as a settled finding.

Atrophy is what happens when an adult offloads a task they previously could perform. The capacity exists but weakens through disuse. In principle, atrophy is reversible.

Foreclosure is what happens when a child grows up offloading a task they never developed in the first place. The capacity was never built. Foreclosure may not be reversible at all. A seventeen-year-old who has never done the work of constructing an argument from scratch is not making the same tradeoff as a forty-year-old professional. The forty-year-old is choosing to delegate a competency. The seventeen-year-old is skipping a developmental step.

Gerlich's data show this asymmetry in action. The cohort most exposed to LLMs during formative years is the cohort with the lowest critical-thinking scores. This is the variable that should most concern us, and about which we have the least time to act.

Organizational Homogenization

A March 2026 opinion paper in Trends in Cognitive Sciences, led by Zhivar Sourati at USC, synthesizes the evidence on what happens when an organization (or an industry) routes its reasoning through the same small set of models.22 The findings:

  • LLM outputs are measurably less varied than human-generated writing and reflect a specific cultural prior (Western, Educated, Industrialized, Rich, Democratic).
  • After interacting with biased LLMs, users' own opinions moved closer to the models'.
  • Groups using LLMs produced fewer and less creative ideas collectively than groups working without AI, even though individuals generated more ideas with AI assistance.

For an enterprise, this points toward the end of strategic differentiation through cognitive diversity. When your analysts, your competitors' analysts, and your regulator's analysts all route through the same model, the outputs converge. Independent verification, the core mechanism of organizational quality control, begins to fail at scale.
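A toy model makes the arithmetic of convergence visible. This is my sketch with an invented "idea space," not data from the Sourati paper: twenty analysts who all sample from one shared, sharply peaked distribution collectively surface far fewer distinct ideas than twenty analysts with differently peaked priors.

```python
import numpy as np

rng = np.random.default_rng(7)
IDEAS = 1000            # size of the hypothetical idea space
ANALYSTS, DRAWS = 20, 10

# Shared-model condition: everyone samples from the same peaked prior.
shared_p = np.exp(-np.arange(IDEAS) / 20.0)
shared_p /= shared_p.sum()

# Independent condition: each analyst's prior peaks somewhere different.
def analyst_prior(k):
    p = np.exp(-np.abs(np.arange(IDEAS) - 50 * k) / 20.0)
    return p / p.sum()

shared = {i for _ in range(ANALYSTS)
          for i in rng.choice(IDEAS, DRAWS, p=shared_p)}
indep = {i for k in range(ANALYSTS)
         for i in rng.choice(IDEAS, DRAWS, p=analyst_prior(k))}

print("unique ideas, shared model:      ", len(shared))
print("unique ideas, independent priors:", len(indep))  # several times larger
```

Twenty analysts consulting one prior produce roughly one analyst's worth of idea space, twenty times over. That is the homogenization finding restated as counting.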

The Junior Talent Problem

Gartner forecasts (not measurements) project that by 2028, 40 percent of employees will be trained by AI rather than by humans, and that half of enterprises may face irreversible skill shortages by 2030.23 A CIPD opinion survey of UK senior HR leaders (N = 2,019) found that 62 percent predict junior, clerical, managerial, and administrative positions are the most likely to be eliminated by AI.24 Both sources are directional rather than empirical. I use them here to indicate that experienced practitioners anticipate what the mechanism above implies, not to establish that it has happened.

This is where the individual and organizational stories collide. Junior roles are the talent pipeline. A junior investment-banking analyst learns to evaluate risk by doing foundational valuation work under senior supervision. When AI takes over that work, the analyst skips the rung of the ladder on which the expertise is built. The knowledge paradox is intuitive to anyone who's watched it unfold in software teams: seniors use AI to accelerate work they already know how to do, while juniors try to use AI to learn what to do. The results differ dramatically. An enterprise that replaces its junior cohort with AI in year one has, by year seven, no senior cohort to promote.

The Explainability Gap

All of this culminates in what may be the true strategic risk for institutions: the decision-maker who cannot explain why the decision is correct.

An Altimetrik / HFS Research survey of five hundred companies found that only 14 percent have a clear AI strategy aligned to accountability structures, and that roughly 80 percent report unclear ownership of AI initiatives.25 The philosophical core of the problem, developed in a 2025 arXiv preprint by Angjelin Hila, is this: LLMs transmit information reliably but do not produce reflective knowledge. They have no access to the grounds for their own outputs. When humans outsource reflective knowledge to LLMs at scale, "reflective standards of justification" may erode, not just individually but collectively.26

Put more plainly: the person who cannot explain their own recommendation has failed at more than communication. They have failed to know. When that failure becomes the default, auditability, accountability, and institutional learning break down together. Article 14 of the EU AI Act does not ask merely for human oversight. It asks organizations to prove that oversight existed at the point of decision. Most enterprises today cannot meet that standard because the reasoning trail does not exist.

Part III: The Uncomfortable Parallel

The pieces are on the table. Here is the shape they make.

The social-media arc had four phases:

  • Utility (2007–2010): the tools were genuinely useful; the early adopters were right about their promise.
  • Habit (2010–2013): use patterns hardened; notification architectures and infinite scroll were engineered in; skepticism was dismissed as technophobic.
  • Dependency (2013–2017): the architects began to speak publicly about what they had built; adolescent mental health data began to shift; political and epistemic harms became impossible to ignore.
  • Regulation and counter-movement (2017–present): Cambridge Analytica hearings, Section 230 debates, Age-Appropriate Design codes, Surgeon General advisories. A cultural consensus arrived too late for the first generation of users: the technology required guardrails.

Ten years, beginning to end. The cultural consensus trailed the harm by roughly a decade.

Where We Are Now

LLMs went from research curiosity to mass consumer product in November 2022.

  • Utility (2022–2024): the tools were genuinely useful; the early adopters were right.
  • Habit (2024–2026): use patterns are hardening; variable-reward architectures, streaming responses, memory features, and companion products have been engineered in; skepticism is being dismissed as technophobic.

Right now, we are in the phase that corresponds to 2012 or 2013 on the social-media timeline. The architecture is locking in. The habits are forming. The youngest users are the most exposed. The research is lagging the adoption by years. And the pattern of early warnings from insiders (the Montag papers, the APA's July 2025 request to the Consumer Product Safety Commission to investigate "the unreasonable risk of injury posed by generative AI chatbots,"27 the lawsuits already in settlement over chatbot-linked teen mental health harms28) is starting to rhyme with the Sean Parker and Chamath Palihapitiya disclosures of 2017.

What's Different This Time

The timeline is compressed. Social media took a decade to move from "promising new platform" to "recognized public-health concern." LLMs appear to be running the same arc in three years.

The interaction is more intimate. Social media fragmented attention. LLMs are increasingly substituting for reasoning itself. A teenager who scrolls Instagram for three hours loses time. A teenager who submits every essay and argument to a chatbot loses the developmental opportunity to build the cognitive capacity the tool is replacing. For adults, heavy LLM use risks atrophy of existing skills. For children who have not yet built those skills, the risk is worse. Atrophy becomes foreclosure. The skill was never constructed. The neural pathway was never laid down. And unlike atrophy, foreclosure may not be reversible.

The economic incentive is the most consequential difference. Social media was monetized through attention. LLMs are monetized through dependence. The more you rely on them, the more indispensable the subscription becomes. The sycophancy study in Science identifies the structural problem explicitly. The features that drive engagement are the same features that cause harm, so commercial pressure drives them in the wrong direction. No market mechanism corrects this on its own. Users prefer the flattering model. They will pay for the flattering model. They will switch away from the one that tells them no.

The Question

So, the question is this. Are we at the same inflection point with LLMs that we were at with social media in roughly 2013? If we are, if this is the last window before habits calcify and institutions adapt to the new baseline and the research catches up too late, will we recognize it this time?

I'm not being rhetorical. I genuinely do not know the answer.

The Socratic Objection

Every generation faces a version of this anxiety. Socrates warned that writing would destroy memory. The printing press was accused of flooding the world with dangerous ideas. Calculators were supposed to make us unable to do arithmetic. And yet here we are, with more literate, more numerate, more connected societies than Socrates could have imagined. The Socratic objection deserves honest engagement.

Here is where the counter-argument has genuine force. There is a large population of experienced professionals who possess deep subject-matter expertise but lack current software and quantitative skills. For these individuals, LLMs are not replacing thought. They are bridging a gap the market has failed to fill. A seasoned executive who understands demand patterns at an intuitive level but cannot write Python can now translate that expertise into analytical output that would previously have required a data science team. The LLM becomes a force multiplier for existing expertise, filling skill gaps rather than replacing critical thinking.

The advantage compounds when that expertise is formalized into structured knowledge representations that capture reasoning patterns, domain relationships, and institutional knowledge in a durable, scalable form. Paired with AI architectures that reason over relationships rather than predicting the next token, the result is institutional memory made operational: a durable asset that reduces organizational risk and enables explainable decision-making at scale. This class of technology prioritizes reasoning over pattern matching and explainability over fluency. It is the architectural direction most likely to deliver the augmentation the Socratic optimists are rightly hoping for, and it is the direction the current generation of general-purpose LLMs does not pursue.

The distinction is not between using LLMs and not using them. The distinction is between using them to extend existing expertise and using them to bypass the development of expertise entirely. That distinction is the hinge on which the entire addiction analogy turns. A tool that amplifies competence is a tool. A tool that substitutes for competence while creating the illusion of competence is something else. The evidence reviewed here suggests the current design direction of LLMs optimizes overwhelmingly for the second use case, because that is the use case that maximizes engagement, retention, and revenue.

A Light Connection to the Polycrisis

The Polycrisis describes the compounding of three interlocking global crises (climatic, geopolitical, cognitive) whose coupled dynamics outpace the analytical capacity of any single institution to track, let alone govern. If that framing is correct, we will need radically augmented cognitive capacity to contend with what is coming. Artificial intelligence is, in that sense, the instrument of response.

The evidence here suggests a paradox worth naming. If the same tools we need to confront the Polycrisis are degrading the very cognitive capacities that make that confrontation possible (critical thinking, independent judgment, explanatory reasoning, metacognitive calibration), then artificial intelligence is simultaneously the instrument of response and a vector of further degradation. Both things can be true. Both are true right now, at the margin.

That does not resolve into a policy recommendation. It resolves into a design problem. How do we build and deploy AI tools that augment reasoning without replacing it, tools that earn the cognitive work they do rather than absorb it? The evidence suggests this is possible. The current direction suggests we are not building for it. I will return to this tension in future work.

The Probe, Restated

What if the most powerful cognitive tool ever built is also the most addictive, and the mechanism by which it delivers its value is the same mechanism by which it degrades our ability to evaluate it? What if, by the time the research catches up, the generation most affected has already passed through the developmental window during which the affected capacities were supposed to be built? What if the tools that are supposed to help us meet a century of compounding crises turn out to be, in their current architecture, one of those crises?

I don't know. That's the point of a Thought Probe. It's a provocation for serious people to sit with a question long enough to earn an answer rather than default to one.

The one cognitive capacity we cannot afford to lose in facing what is coming is the capacity to notice the pattern. To see in real time the thing previous generations only saw in retrospect. Noticing is not something an LLM can do for us. It is the part we must keep.

A Governance Diagnostic

Five questions for leadership teams assessing where their organization sits on the augmentation-to-dependency spectrum. They require no technical expertise. They require honesty.

  1. Pipeline exposure. What percentage of your junior roles now rely on AI for tasks that historically served as the training ground for senior judgment? If the answer is high, what is your plan for developing the next generation of leaders who can evaluate AI output rather than merely accept it?
  2. Verification rate. When your teams use AI-generated analysis, how often do they verify it against independent sources before acting on it? If you do not know, that is itself an answer.
  3. Reasoning trail. Can the person who made the recommendation explain why it is correct without referring to the AI output? If the reasoning trail begins and ends with "the model said so," your organization has an accountability gap that the EU AI Act, among other frameworks, will not forgive.
  4. Cognitive diversity. Are your analysts, your competitors' analysts, and your regulator's analysts all routing through the same two or three models? If so, what remains of your strategic differentiation that is genuinely independent?
  5. Sequence discipline. Does your organization use AI to extend thinking that has already happened, or to replace thinking that would not otherwise happen? The MIT study suggests this distinction determines whether AI strengthens or weakens cognition. It may be the single most important design choice an organization can make about how it deploys these tools.

These questions will not resolve the tensions raised here. They are a place to start. The organizations that ask them first will be the ones best positioned to use artificial intelligence as a force multiplier rather than a cognitive crutch.

Stephen F. DeAngelis
Princeton, NJ · April 2026

Author's note: Research for this Thought Probe was intentionally conducted with the assistance of LLM-based tools. The author reviewed, verified, and rewrote all AI-assisted output. The irony of using LLMs to investigate LLM dependency is not lost on the author; it is, in fact, part of the point.

Notes

1. Myra Cheng, Cinoo Lee, Pranav Khadpe, Sunny Yu, Dyllan Han, and Dan Jurafsky, "Sycophantic AI Decreases Prosocial Intentions and Promotes Dependence," Science (March 26, 2026). DOI: 10.1126/science.aec8352. https://www.science.org/doi/10.1126/science.aec8352. The 49% figure is from the paper's model-benchmarking analysis across eleven AI systems; the N = 2,405 figure is from three preregistered human experiments on the downstream behavioral effects of sycophantic AI.

2. Michael Gerlich, "AI Tools in Society: Impacts on Cognitive Offloading and the Future of Critical Thinking," Societies 15(1): 6 (January 2025). DOI: 10.3390/soc15010006. https://doi.org/10.3390/soc15010006. A correction was published September 2025 (DOI: 10.3390/soc15090252); the core r = −0.68 finding is unaffected.

3. Nataliya Kosmyna, Eugene Hauptmann, Ye Tong Yuan, Jessica Situ, Xian-Hao Liao, Ashly Vivian Beresnitzky, Iris Braunstein, and Pattie Maes, "Your Brain on ChatGPT: Accumulation of Cognitive Debt When Using an AI Assistant for Essay Writing Task," arXiv:2506.08872 (v1 June 10, 2025; v2 December 31, 2025). https://arxiv.org/abs/2506.08872. The 83 percent figure is derived from the LLM-group subsample (15 of 18) who completed the final session.

4. M. Karen Shen and Dongwook Yoon, "The Dark Addiction Patterns of Current AI Chatbot Interfaces," Extended Abstracts of the CHI Conference on Human Factors in Computing Systems (CHI EA '25), April 2025. https://dl.acm.org/doi/full/10.1145/3706599.3720003. Accessible summary at Tech Policy Press: https://techpolicy.press/ai-chatbots-and-addiction-what-does-the-research-say.

5. Curt Steinhorst, "How ChatGPT Broke My Brain (And Why I Still Use It Every Day)," Forbes, June 20, 2025. https://www.forbes.com/sites/curtsteinhorst/2025/06/20/how-chatgpt-broke-my-brain-and-why-i-still-use-it-every-day/.

6. Sean Parker, Axios event, Philadelphia, November 8, 2017. Original reporting: Mike Allen, Axios, November 9, 2017. https://www.axios.com/2017/12/15/sean-parker-unloads-on-facebook-god-only-knows-what-its-doing-to-our-childrens-brains-1513306792. Also covered by CBS News San Francisco: https://www.cbsnews.com/sanfrancisco/news/sean-parker-facebook-exploiting-human-psychology/.

7. Chamath Palihapitiya, talk at Stanford Graduate School of Business, November 10, 2017. Coverage: BBC News, December 12, 2017. https://www.bbc.com/news/blogs-trending-42322746. Full quotation composited from the Stanford GSB video recording and contemporaneous press coverage.

8. Jingsong Wang and Shen Wang, "The Emotional Reinforcement Mechanism of and Phased Intervention Strategies for Social Media Addiction," Behavioral Sciences 15(5): 665 (May 2025). DOI: 10.3390/bs15050665. https://pmc.ncbi.nlm.nih.gov/articles/PMC12108933/.

9. Christian Montag, Haibo Yang, Anise M. S. Wu, Raian Ali, and Jon D. Elhai, "Towards a Research Framework of AI Dependency," Annals of the New York Academy of Sciences 1548(1): 5–11 (June 2025). DOI: 10.1111/nyas.15337. https://pubmed.ncbi.nlm.nih.gov/40302174/.

10. Efua Andoh, "Trends: Digital AI Relationships and Emotional Connection," APA Monitor on Psychology, January–February 2026. https://www.apa.org/monitor/2026/01-02/trends-digital-ai-relationships-emotional-connection.

11. Jason Phang, Michael Lampe, Lama Ahmad, Sandhini Agarwal et al. (OpenAI) with Cathy Mengying Fang, Auren R. Liu, Valdemar Danry, Eunhae Lee, Samantha W. T. Chan, Pat Pataranutaporn, and Pattie Maes (MIT Media Lab), "Investigating Affective Use and Emotional Well-being on ChatGPT," OpenAI/MIT Media Lab, March 21, 2025. https://cdn.openai.com/papers/15987609-5f71-433c-9972-e91131f399a1/openai-affective-use-study.pdf. Study combined a 28-day randomized controlled trial of 981 participants with an observational analysis of nearly forty million platform interactions. Coverage: Fortune, March 24, 2025. https://fortune.com/2025/03/24/chatgpt-making-frequent-users-more-lonely-study-openai-mit-media-lab/.

12. The Decision Lab, "Parasocial Trust in AI," April 2026. https://thedecisionlab.com/biases/parasocial-trust-in-ai.

13. James Muldoon and Jul Jeonghyun Parke, "Cruel Companionship: How AI Companions Exploit Loneliness and Commodify Intimacy," New Media & Society (2025). DOI: 10.1177/14614448251395192. https://journals.sagepub.com/doi/abs/10.1177/14614448251395192. Popular summary: Futura-Sciences, March 2026. https://www.futura-sciences.com/en/the-more-people-use-chatgpt-the-more-this-hidden-psychological-risk-grows_26958/.

14. Allison Parshall, "AI Chatbots Are Suck-Ups, and That May Be Affecting Your Relationships," Scientific American, March 26, 2026. The "gets worse the longer users interact with the model" observation is attributed in the article to Dana Calacci (Pennsylvania State University), who was not involved in the Cheng et al. study. https://www.scientificamerican.com/article/ai-chatbots-are-sucking-up-to-you-with-consequences-for-your-relationships/.

15. Olivia Sidoti, Colleen McClain, et al., "34% of U.S. Adults Have Used ChatGPT, About Double the Share in 2023," Pew Research Center, June 25, 2025. Survey conducted February 24–March 2, 2025. https://www.pewresearch.org/short-reads/2025/06/25/34-of-us-adults-have-used-chatgpt-about-double-the-share-in-2023/.

16. SSRS/Edison Research, "Half of Americans Using AI Chat on a Weekly Basis," March 2, 2026. https://ssrs.com/news/half-of-americans-using-ai-chat-on-weekly-basis/.

17. Straight Arrow News, "Eight Percent of the World's Population Uses ChatGPT Weekly," September 18, 2025, reporting on OpenAI/Harvard NBER research. https://san.com/cc/eight-percent-of-the-worlds-population-uses-chatgpt-weekly-chatgpt/.

18. Daniela da Silva Fernandes, Steeven Villa, Robin Welsch, et al., "AI Makes You Smarter But None the Wiser: The Disconnect Between Performance and Metacognition," Computers in Human Behavior 175 (February 2026), Article 108779. DOI: 10.1016/j.chb.2025.108779. Research team affiliated with Aalto University (Finland). https://doi.org/10.1016/j.chb.2025.108779.

19. Drew Turney, "The More People Use AI, the More Likely They Are to Overestimate Their Own Abilities," Live Science, November 17, 2025. https://www.livescience.com/technology/artificial-intelligence/the-more-that-people-use-ai-the-more-likely-they-are-to-overestimate-their-own-abilities.

20. Wang et al., "Calibrated Trust in Dealing with LLM Hallucinations," arXiv:2512.09088, citing KPMG / University of Melbourne, Trust, Attitudes and Use of Artificial Intelligence: A Global Study 2025. https://arxiv.org/pdf/2512.09088. The N = 192 figure is from the arXiv paper's own survey.

21. "Adults Lose Skills to AI. Children Never Build Them," Psychology Today, "The Algorithmic Mind" blog, March 22, 2026. https://www.psychologyto-day.com/us/blog/the-algorithmic-mind/202603/adults-lose-skills-to-ai-children-never-build-them.

22. Zhivar Sourati et al., "The Homogenizing Effect of Large Language Models on Human Expression and Thought" (opinion paper), Trends in Cognitive Sciences, March 11, 2026. DOI: 10.1016/j.tics.2026.01.003. https://www.cell.com/trends/cognitive-sciences/fulltext/S1364-6613(26)00003-3.

23. Gartner research summarized in Jedox, "AI Lock-In: The Hidden Threat Undermining Human Expertise," 2026. https://www.jedox.com/en/blog/ai-lock-in-undermining-human-expertise/.

24. Sawdah Bhaimiya, "Why Replacing Junior Staff with AI Will Backfire," CNBC, November 16, 2025, reporting on a Chartered Institute of Personnel and Development (CIPD) survey of 2,019 UK senior HR professionals. https://www.cnbc.com/2025/11/16/why-replacing-junior-staff-with-ai-will-backfire-.html.

25. Altimetrik in partnership with HFS Research, "AI Transformation Gap: New Study Reveals Lack of Ownership Across Enterprises" (N = 500), April 10, 2026. https://www.altimetrik.com/news/ai-governance-accountability-enterprise-study/.

26. Angjelin Hila, "The Epistemological Consequences of Large Language Models: Rethinking Collective Intelligence and Institutional Knowledge," arXiv:2512.19570 (December 2025). https://arxiv.org/abs/2512.19570.

27. APA Services, "APA Calls for Investigation into Unreasonable Risk of Injury Posed by Generative Artificial Intelligence Chatbots as Consumer Products," July 30, 2025 (submission date to the Consumer Product Safety Commission). https://updates.apaservices.org/apa-calls-for-investigation-into-risk-of-injury-posed-by-generative-AI.

28. Clare Duffy, "Character.AI and Google Agree to Settle Lawsuits Over Teen Mental Health Harms and Suicides," CNN Business, January 13, 2026 (court document filed January 7, 2026). https://www.cnn.com/2026/01/07/business/character-ai-google-settle-teen-suicide-lawsuit.

29. Shakked Noy and Whitney Zhang, "Experimental Evidence on the Productivity Effects of Generative Artificial Intelligence," Science 381, no. 6654 (July 13, 2023): 187–192. DOI: 10.1126/science.adh2586. https://www.science.org/doi/10.1126/science.adh2586. A 453-participant controlled experiment found ChatGPT reduced average writing time by roughly 40 percent and raised output quality by roughly 18 percent, with the largest gains accruing to lower-performing participants.

30. Hamsa Bastani, Osbert Bastani, Alp Sungu, Haosen Ge, Özge Kabakcı, and Rei Mariman, "Generative AI Can Harm Learning," University of Pennsylvania Wharton School working paper, July 15, 2024. https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4895486. Randomized controlled trial with approximately 1,000 high school students in Turkey comparing tutored GPT-4, GPT-4 base, and no-AI conditions; found conditional cognitive gains contingent on how AI was used (and performance harms when it was used as a substitute rather than a tutor).

31. Gregory Kestin, Kelly Miller, Anna Klales, Timothy Milbourne, and Gregorio Ponti, "AI Tutoring Outperforms In-Class Active Learning: An RCT Introducing a Novel Research-Based Design in an Authentic Educational Setting," Scientific Reports 15, Article 17458 (May 2025). DOI: 10.1038/s41598-025-97652-6. https://www.nature.com/articles/s41598-025-97652-6. Randomized experiment at Harvard comparing AI-tutored and active-learning physics instruction on specific tasks.

32. Alexander Bick, Adam Blandin, and David J. Deming, "The Rapid Adoption of Generative AI," Federal Reserve Bank of St. Louis Working Paper 2024-027 (September 2024; revised February 2025). DOI: 10.20955/wp.2024.027. https://research.stlouisfed.org/wp/more/2024-027. Nationally representative survey evidence on generative-AI adoption and self-reported productivity effects at population scale.
