ArXiv Draws a Line: One-Year Bans for AI-Generated Papers Signal a Turning Point for Open Science

On May 14, 2026, Thomas Dietterich—chair of arXiv's computer science section—posted a brief announcement that sent a chill through the academic publishing world. Researchers who submit papers containing "incontrovertible evidence" of unchecked AI-generated content will face a one-year ban from the platform, followed by a permanent requirement that all future submissions first be accepted at a peer-reviewed venue [1][2].

The policy is not technically new. ArXiv's existing code of conduct already holds authors fully responsible for their submissions regardless of how content was produced. But the explicit threat of year-long exclusion—and the functional lifetime ban that follows—represents a dramatic escalation in enforcement at a platform that hosts over 28,000 new submissions per month [3][4].

The Scale of the Problem

ArXiv's moderators have watched the problem grow in real time. Until roughly a year ago, only 2–3% of submissions were ultimately rejected. That figure has now risen to 10%, with approximately 20% of all incoming papers flagged by one of 240 volunteer moderators before being posted [5].

[Chart: ArXiv moderator rejection rate. Source: Science/AAAS; data as of May 15, 2026.]

The trigger points are often absurd. Dietterich cited examples including papers that contain meta-comments left in by language models—phrases like "here is a 200 word summary; would you like me to make any changes?" or "the data in this table is illustrative, fill it in with the real numbers from your experiments" [1]. Other red flags include hallucinated citations to papers that do not exist, fabricated data, and plagiarized text.

Scientific Director Steinn Sigurðsson framed the crackdown in blunter terms, stating the consequences aim to "throttle that stuff so n00bs and bad actors don't trash us trying repeatedly" [6].

How Deep Does AI Infiltration Go?

The scale of AI-assisted writing in academic research has grown rapidly since ChatGPT's release in late 2022. Analysis of more than 1.1 million scientific articles and preprints from 2020–2024 found that up to 22.5% of computer science abstracts showed statistical signs of large language model involvement [7][8].

[Chart: Estimated share of CS papers with AI-generated text. Sources: Science/AAAS and PNAS; data as of Dec 1, 2025.]

This does not mean one-fifth of CS papers are fraudulent. The spectrum ranges from full AI generation—where a model writes an entire paper with fabricated data—to researchers using LLMs for grammar polishing, literature review assistance, or code generation. The challenge for arXiv lies precisely in distinguishing these uses.

The broader publication landscape mirrors this trend. AI-related research output has exploded, with OpenAlex data showing nearly 600,000 papers published on artificial intelligence in 2025 alone, up from roughly 119,000 in 2020 [9].

[Chart: Research publications on "artificial intelligence". Source: OpenAlex; data as of Jan 1, 2026.]

Enforcement: What Counts and What Doesn't

ArXiv's policy targets what Dietterich calls "incontrovertible evidence"—meaning the enforcement threshold is set deliberately high. A paper will not be flagged for merely sounding like it was written with AI assistance. The triggers are mechanical failures: leftover prompts, citations to nonexistent papers, placeholder data, and other artifacts that indicate the author never read their own submission [1][2].

The internal process requires a moderator to first document the problem, then a Section Chair must confirm the evidence before any penalty is imposed. Authors can appeal the decision [2][10].

Yet the policy's downstream effects extend beyond the initial ban. As economist Joy Buchanan observed, requiring peer-reviewed acceptance for all future submissions amounts to "essentially a ban for life to use it at a pre-print venue" [6]. For researchers in fast-moving fields like machine learning—where preprints often precede and sometimes replace journal publication—this could effectively exile offenders from the primary distribution channel for their work.

The False Positive Problem

ArXiv has stated it will not rely on automated AI detection tools, instead depending on human moderators to identify clear-cut violations. This sidesteps one of the most contentious issues in the AI-detection space: false positive rates.

Independent testing of tools like GPTZero and Turnitin has produced wildly inconsistent accuracy claims. Vendor-reported false positive rates hover around 1–4%, but broader academic studies paint a different picture. Research evaluating over 10,000 text samples found false positive rates ranging from 15% to 45% depending on the platform and text type [11][12].
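The stakes of those error rates become concrete at arXiv's volume. A minimal back-of-envelope sketch, using the false positive rates quoted above and the platform's roughly 28,000 monthly submissions, and assuming purely for illustration that a detector were run on every submission (which arXiv explicitly does not do):

```python
# Illustrative only: arXiv does NOT run automated AI detectors.
# This sketch shows how many papers would be wrongly flagged per month
# at various reported false positive rates, assuming every submission
# were screened and (worst case) every submission were honest.

MONTHLY_SUBMISSIONS = 28_000  # approximate arXiv volume, per the article

# Vendor-reported (1-4%) vs. independent-study (15-45%) false positive rates
for fpr in (0.01, 0.04, 0.15, 0.45):
    wrongly_flagged = round(MONTHLY_SUBMISSIONS * fpr)
    print(f"FPR {fpr:.0%}: ~{wrongly_flagged:,} honest papers flagged per month")
```

Even the most optimistic vendor figure implies hundreds of wrongly flagged papers per month; the rates found in independent studies imply thousands.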

The disparity is most acute for non-native English speakers. A Stanford study testing seven ChatGPT detectors against 91 Test of English as a Foreign Language (TOEFL) essays found detectors incorrectly labeled more than half as AI-generated, with an average false-positive rate of 61.3%—compared to just 5.1% for essays by native English-speaking U.S. students [13][14].

ArXiv's choice to avoid automated detection and focus on "incontrovertible evidence" partially mitigates this risk. But it raises its own question: can 240 volunteer moderators, already stretched thin by a rejection rate that has roughly quadrupled, reliably vet the thousands of papers that require scrutiny each month?
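The moderator-load question can be made concrete with the figures reported above (roughly 28,000 monthly submissions, 20% flagged, 240 moderators). A rough sketch, assuming for simplicity that flagged papers are split evenly across moderators:

```python
# Back-of-envelope moderator workload from figures cited in the article.
# Assumes flagged papers are distributed evenly across moderators,
# which is a simplification (moderation load varies by field).

MONTHLY_SUBMISSIONS = 28_000  # approximate monthly arXiv submissions
FLAGGED_SHARE = 0.20          # ~20% of submissions flagged for review
MODERATORS = 240              # volunteer moderators

flagged_per_month = round(MONTHLY_SUBMISSIONS * FLAGGED_SHARE)
per_moderator = flagged_per_month / MODERATORS

print(f"Flagged papers per month: {flagged_per_month:,}")
print(f"Per moderator per month:  {per_moderator:.1f}")
```

That works out to roughly 23 flagged papers per moderator per month, on top of routine screening, with each decision now carrying the weight of a possible year-long ban.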

The Language Equity Question

The non-English speaker problem extends beyond detection tools. Research from Stanford's Graduate School of Education examined nearly 80,000 peer reviews at a major computer science conference and found persistent bias against authors from countries where English is less widely spoken. This bias did not meaningfully decrease after ChatGPT became widely available [15].

The concern is structural: researchers in China, South Korea, Japan, and much of the Global South routinely use AI tools as translation and grammar aids—a use that arXiv explicitly permits. But peer reviewers and moderators who encounter polished prose from authors at non-English-speaking institutions may be more inclined to flag it, consciously or not [15][16].

China's prominence in AI research makes this especially significant. Chinese institutions account for a large share of arXiv's computer science submissions, and studies of retracted AI-related papers found that Chinese first authors accounted for 72.2% of retracted articles [17]. Whether this reflects greater rates of misconduct or greater scrutiny—or both—remains debated.

How Other Platforms Compare

ArXiv's approach sits between two poles in the publishing world.

Science (AAAS) maintains the strictest stance, completely banning AI-generated text and treating violations as scientific misconduct. Full prompt disclosure is required in acknowledgments [18].

Nature and Springer Nature take a middle path: AI cannot be listed as an author and AI-generated images are generally prohibited, but AI-assisted copyediting does not require disclosure. Critically, Nature does not use AI detection software, with editors publicly acknowledging current tools aren't reliable enough for editorial decisions [18][19].

Elsevier requires disclosure of AI use but does not ban AI assistance outright, placing responsibility on authors to ensure accuracy [20].

bioRxiv and medRxiv have implemented endorsement requirements similar to arXiv's January 2026 changes but have not announced comparable ban policies [5].

The key distinction: arXiv is the only major platform imposing author-level bans rather than paper-level rejections. Journals like Nature and Science reject individual manuscripts; arXiv's policy punishes the person, not just the paper.

Earlier Measures and the Escalation Timeline

The May 2026 ban announcement caps a series of escalating responses:

  • November 2025: ArXiv stopped accepting computer science review articles and position papers unless they had already passed peer review—a direct response to a "flood" of AI-generated survey papers [21].
  • January 2026: First-time submitters began requiring endorsement from an established arXiv author in their field, aimed at filtering out bad-faith actors [5].
  • May 2026: The one-year ban policy with permanent post-ban restrictions [1].

Each step represents a tightening of access that moves arXiv further from its founding mission as an open, minimally gatekept repository.

The Case That Bans Are Counterproductive

Not everyone agrees that punitive enforcement is the right approach. Critics argue that AI assistance genuinely accelerates legitimate research—particularly for literature reviews, code generation, and mathematical proof-checking—and that blanket bans risk suppressing productive uses alongside fraudulent ones.

The strongest version of this argument points to the emergence of alternative platforms. In 2026, a consortium including researchers from the University of Toronto, Oxford, and Tsinghua University launched aiXiv—a preprint server where papers are submitted, reviewed, and iteratively refined by both human and AI "scientists." The platform uses five AI agents to assess novelty and technical soundness, typically generating reviews in one to two minutes [22].

Nathan Lambert, a researcher who publicly criticized arXiv's policy, described it as "a threat to open access research," arguing it would push legitimate researchers toward less regulated venues [23].

Second-Order Effects: Fragmentation and Predatory Publishing

The concern about platform migration is not hypothetical. An AI system that screened approximately 15,200 open-access journals flagged more than 1,000 as exhibiting hallmarks of predatory publishing [24]. These journals impose minimal quality controls and charge authors fees to publish—a model that becomes more attractive to researchers locked out of legitimate preprint servers.

The fragmentation risk is real: if arXiv bans push researchers to post on personal websites, institutional repositories, or platforms like aiXiv, the centralized citation graph that makes arXiv valuable begins to fracture. Papers become harder to find, cite, and replicate. The cure for AI slop could, paradoxically, weaken the infrastructure that enables scientific self-correction.

At the same time, rising retractions across scientific publishing suggest the status quo is unsustainable. In 2023 alone, over 14,000 retraction notices were issued globally, with thousands more in 2024 and 2025 [17]. Papers with fabricated AI-generated citations—where "mysterious citations" to nonexistent papers appeared in 2–6% of 2025 conference proceedings—represent a direct threat to the reliability of the scientific record [25].

Where the Line Falls

ArXiv's policy explicitly permits AI use for grammar correction, translation assistance, and code generation in appendices. The ban applies only when authors submit work containing unchecked AI output that they clearly never reviewed [1][2].

But the boundary between "permitted AI assistance" and "banned AI-generated content" is inherently fuzzy. A researcher who uses GPT-4 to draft a literature review section, then edits it heavily but misses one hallucinated citation, could find themselves banned for a year. The policy's reliance on "incontrovertible evidence" is meant to prevent such cases, but moderator judgment inevitably varies.

The appeals process—while confirmed to exist—has not been publicly detailed. For researchers whose careers depend on preprint visibility, the asymmetry of risk is stark: the cost of a false positive (career disruption) far exceeds the cost of a false negative (one more sloppy paper staying online temporarily).

What Comes Next

ArXiv processes roughly 28,000 submissions monthly across physics, mathematics, computer science, biology, and other fields [3]. Its 240 volunteer moderators now face the task of not just maintaining quality standards but enforcing a punitive regime with career-altering consequences.

The platform sits at a crossroads familiar from other moderation challenges: too lenient, and it becomes a dumping ground that researchers stop trusting; too strict, and it becomes an exclusionary gatekeeping mechanism that contradicts its open-access mission.

The next twelve months will determine whether arXiv's bet—that targeted enforcement against the most egregious offenders can stem the tide without collateral damage—holds up against the realities of a research ecosystem where AI assistance is becoming ubiquitous and the line between tool and author grows harder to draw.

Sources (25)

  [1] ArXiv to Ban Researchers for a Year if They Submit AI Slop (404media.co)
      Thomas Dietterich announced that authors submitting papers with incontrovertible AI-generated evidence face a one-year ban plus permanent peer-review requirements.

  [2] Research repository ArXiv will ban authors for a year if they let AI do all the work (techcrunch.com)
      ArXiv's enforcement policy penalizes authors who submit unchecked AI content with year-long bans, with appeals available through section chair review.

  [3] arXiv Monthly Submissions Statistics (arxiv.org)
      ArXiv monthly submissions reached approximately 28,000 by late 2025, showing super-exponential growth from 8,500 in 2015.

  [4] arXiv sets new record for monthly submissions (again)! (blog.arxiv.org)
      ArXiv set new monthly submission records in 2024 with continued growth across all subject areas.

  [5] ArXiv preprint server clamps down on AI slop (science.org)
      Rejection rates rose from 2-3% to 10%, with 20% of submissions flagged by 240 volunteer moderators. First-time submitters now require endorsement.

  [6] arXiv will ban authors who submit papers with LLM mistakes (economistwritingeveryday.com)
      Scientific Director Steinn Sigurðsson stated consequences aim to 'throttle that stuff so n00bs and bad actors don't trash us trying repeatedly.' Joy Buchanan called it 'essentially a ban for life.'

  [7] One-fifth of computer science papers may include AI content (science.org)
      Analysis found up to 22.5% of computer science paper abstracts showed statistical signs of LLM involvement by 2024.

  [8] Academic journals' AI policies fail to curb the surge in AI-assisted academic writing (pnas.org)
      PNAS study analyzing over 1.1 million papers found AI-assisted writing markers accelerating sharply after ChatGPT release despite journal policies.

  [9] OpenAlex Publication Data: Artificial Intelligence (openalex.org)
      Nearly 600,000 AI-related papers published in 2025, up from 119,000 in 2020, reflecting explosive growth in the field.

  [10] Arxiv cracks down on unchecked AI-generated content in research papers (the-decoder.com)
      ArXiv's internal process requires moderator documentation followed by Section Chair confirmation before bans are imposed, with appeals available.

  [11] Can we trust academic AI detective? Accuracy and limitations of AI-output detectors (pmc.ncbi.nlm.nih.gov)
      Study of over 10,000 text samples found AI detection false positive rates ranging from 15% to 45% depending on platform and genre.

  [12] Study: False Positives in AI Detectors Exposed (hastewire.com)
      GPTZero and Originality.ai reported average false positive rates of 22% and 18% respectively when analyzing diverse academic texts.

  [13] AI-Detectors Biased Against Non-Native English Writers (hai.stanford.edu)
      Seven ChatGPT detectors incorrectly labeled over half of TOEFL essays as AI-generated (61.3% false positive rate) versus 5.1% for native English speakers.

  [14] The Creation of Bad Students: AI Detection for Non-Native English Speakers (dlab.berkeley.edu)
      AI detection models show significantly higher false positive rates for non-native English speakers and lack transparency in multilingual contexts.

  [15] How language bias persists in scientific publishing despite AI tools (ed.stanford.edu)
      Study of 80,000 peer reviews found persistent bias against authors from non-English-speaking countries, with only muted improvement after ChatGPT availability.

  [16] How AI is leaving non-English speakers behind (news.stanford.edu)
      LLMs work well for 1.52 billion English speakers but underperform for non-English languages due to training data imbalances.

  [17] Analysis of Retracted Publications on Artificial Intelligence: Trends, Ethical Concerns, and Scientific Integrity (pmc.ncbi.nlm.nih.gov)
      China accounted for 72.2% of first authors in retracted AI articles. Over 14,000 retractions issued in 2023, with thousands more in 2024-2025.

  [18] Artificial Intelligence (AI) | Nature Portfolio (nature.com)
      Nature does not use AI detection software, prohibits AI authorship and AI-generated images, but allows AI-assisted copyediting without disclosure.

  [19] Nature AI Policy: What's Allowed and Required (2026) (manusights.com)
      Nature adopts a middle ground: no AI authorship, no AI images, but AI copy editing permitted. No automated detection tools used.

  [20] Generative AI policies for journals - Elsevier (elsevier.com)
      Elsevier requires AI use disclosure but does not ban AI assistance outright, placing accuracy responsibility on authors.

  [21] ArXiv Blocks AI-Generated Survey Papers After 'Flood' of Trashy Submissions (decrypt.co)
      In November 2025, arXiv stopped accepting CS review and position papers unless already peer-reviewed, responding to flood of AI-generated surveys.

  [22] A new preprint server welcomes papers written and reviewed by AI (science.org)
      aiXiv launched as AI-friendly preprint server from U of Toronto, Oxford, and Tsinghua researchers, using five AI agents to review submissions in minutes.

  [23] Arxiv's new policy: a threat to open access research (linkedin.com)
      Nathan Lambert criticized arXiv's enforcement policy as threatening open access principles and potentially pushing researchers to less regulated platforms.

  [24] AI in Academic Publishing: 22% of Papers Use AI & 1000+ Predatory Journals Exposed (enago.com)
      AI screening of 15,200 open-access journals confirmed over 1,000 exhibit predatory publishing hallmarks. 22% of papers show AI usage markers.

  [25] The Case of the Mysterious Citations (arxiv.org)
      Study found 2-6% of 2025 conference proceedings contained citations to papers with no evidence of existence, absent from 2021 proceedings.