Daily arXiv Scan: April 25, 2026

📡 Daily Reports · 2026-04-25

arxiv · ai-research · multi-model · governance

Today's scan covers 80 papers across cs.AI, cs.CL, cs.LG, cs.HC, cs.SE, and stat.ML.

Models: Claude Opus 4.6, Kimi K2, Gemini 2.5 Pro (3 of 4 succeeded; the fourth, GPT-5, failed with HTTP Error 429, Too Many Requests).

Overlap Statistics

Total unique papers selected: 10
Papers at 3+ agreement: 1 (expected by chance: 0.02)
Papers at 2+ agreement: 4 (expected by chance: 0.90)
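
The "expected by chance" figures can be reproduced with a simple null model. The sketch below assumes each of the 3 successful models picks 5 of the 80 papers uniformly at random and independently of the others. The per-model count of 5 is not stated in this report; it is inferred from the totals above (10 unique papers, 1 picked by three models and 3 picked by exactly two, which gives 15 selections in all). The function name and parameters are illustrative only.

```python
from math import comb


def expected_overlap(n_papers, n_models, picks_per_model, min_agree):
    """Expected number of papers chosen by at least `min_agree` models,
    if every model picks uniformly at random and independently."""
    # Chance that one model includes a given paper in its picks
    p = picks_per_model / n_papers
    # Binomial tail: probability that min_agree or more models pick that paper
    tail = sum(
        comb(n_models, j) * p**j * (1 - p) ** (n_models - j)
        for j in range(min_agree, n_models + 1)
    )
    # Linearity of expectation over all papers in the scan
    return n_papers * tail


# Assumption: 5 picks per model (inferred, not stated in the report)
print(round(expected_overlap(80, 3, 5, min_agree=3), 2))  # 0.02
print(round(expected_overlap(80, 3, 5, min_agree=2), 2))  # 0.9
```

Under these assumptions the observed counts (1 and 4) sit well above the chance baselines, which is the point of reporting them side by side.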

Consensus Picks (3+ models)

Reckoning with the Political Economy of AI: Avoiding Decoys in Pursuit of Accountability

Janet Vertesi, danah boyd, Alex Taylor, Benjamin Shestakofsky

Claude Opus 4.6: "This is the most intellectually dangerous paper in the batch. It's essentially arguing that much of the AI safety/governance ecosystem is performing accountability rather than achieving it. Whether you agree or not, the 'decoy' framework is going to become a standard critical lens."
Kimi K2: "Required reading for anyone who still believes better benchmarks = better futures. Bring a stiff drink. The paper supplies a typology of decoys (ethical checklists, participatory-washing, audit-washing) and shows how each is weaponized to secure continued access to data, capital, and regulatory forbearance."
Gemini 2.5 Pro: "It's the equivalent of being told you've been meticulously debating the color of the lifeboats while the ship's owners are drilling holes in the hull. Essential, uncomfortable, and paradigm-shifting."

Pair Picks (2 models)

Detecting and Suppressing Reward Hacking with Gradient Fingerprints (Kimi K2, Gemini 2.5 Pro)
Beyond Distribution Sharpening: The Importance of Task Rewards (Claude Opus 4.6, Gemini 2.5 Pro)
ASMR-Bench: Auditing for Sabotage in ML Research (Claude Opus 4.6, Gemini 2.5 Pro)

Connecting Threads

The governance gap is structural, not technical. Papers on political economy and metacognitive benchmarks converge on a disturbing conclusion from different directions: accountability mechanisms (whether human governance structures or AI self-monitoring) can appear functional while failing to constrain behavior. The implication is that neither external governance nor internal self-regulation scales naturally with capability.

Post-training is where the action is, and where the danger lies. Today's picks suggest that RL genuinely creates new capabilities, that those capabilities can include subtle sabotage in ML research codebases, and that larger models can recognize but not correct their own failures. Together, these findings paint a picture in which the post-training phase is a critical and under-governed site of emergent behavior.

Distribution and participation shape outcomes invisibly. Work on correlated failures in federated learning and the consensus pick's decoy framework both highlight how structural conditions (who participates, whose data counts, whose concerns are absorbed) determine outcomes in ways that are invisible to narrowly technical evaluation. The systems-level lesson: infrastructure is policy.

Auditing is the new bottleneck. Several of the selected papers are fundamentally about how hard it is to verify that a system is working as intended. As AI systems grow more autonomous and capable, the challenge shifts from "how do we build it right?" to "how do we know if it's right?" This is an incentive design problem: who has the motivation, capability, and access to audit effectively?