Daily arXiv Scan: April 25, 2026
Today's scan covers 80 papers across cs.AI, cs.CL, cs.LG, cs.HC, cs.SE, and stat.ML.
Models: Claude Opus 4.6, Kimi K2, Gemini 2.5 Pro (3 of 4 succeeded; GPT-5 failed with HTTP Error 429).
Overlap Statistics
- Total unique papers selected: 10
- Papers at 3+ agreement: 1 (expected by chance: 0.02)
- Papers at 2+ agreement: 4 (expected by chance: 0.90)
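The chance baselines above are consistent with a simple independence model, sketched here under stated assumptions: each of the 3 models picks the same number of papers uniformly at random from the 80, independently of the others. The figure of 5 picks per model is an inference from the counts above (1 paper at 3 models, 3 at exactly 2, and 6 at 1 gives 15 selections in total), not something the post states, and the function name `expected_agreement` is hypothetical.

```python
from math import comb


def expected_agreement(n_papers, picks_per_model, n_models, min_agree):
    """Expected number of papers chosen by at least `min_agree` models.

    Assumes every model selects `picks_per_model` papers uniformly at
    random and independently, so each paper is picked by a given model
    with probability p = picks_per_model / n_papers.
    """
    p = picks_per_model / n_papers
    # Binomial tail: probability that min_agree or more models pick a paper.
    tail = sum(
        comb(n_models, j) * p**j * (1 - p) ** (n_models - j)
        for j in range(min_agree, n_models + 1)
    )
    # Linearity of expectation over all papers in the pool.
    return n_papers * tail


# Assumed inputs: 80 papers, 3 models, 5 picks each (inferred, see above).
print(f"3+ agreement: {expected_agreement(80, 5, 3, 3):.2f}")  # 0.02
print(f"2+ agreement: {expected_agreement(80, 5, 3, 2):.2f}")  # 0.90
```

Under these assumptions the observed overlap (1 paper at 3+ and 4 at 2+) sits well above the random baseline, which is the point the statistics are making.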
Consensus Picks (3+ models)
Reckoning with the Political Economy of AI: Avoiding Decoys in Pursuit of Accountability
Janet Vertesi, danah boyd, Alex Taylor, Benjamin Shestakofsky
- Claude Opus 4.6: "This is the most intellectually dangerous paper in the batch. It's essentially arguing that much of the AI safety/governance ecosystem is performing accountability rather than achieving it. Whether you agree or not, the 'decoy' framework is going to become a standard critical lens."
- Kimi K2: "Required reading for anyone who still believes better benchmarks = better futures. Bring a stiff drink. The paper supplies a typology of decoys (ethical checklists, participatory-washing, audit-washing) and shows how each is weaponized to secure continued access to data, capital, and regulatory forbearance."
- Gemini 2.5 Pro: "It's the equivalent of being told you've been meticulously debating the color of the lifeboats while the ship's owners are drilling holes in the hull. Essential, uncomfortable, and paradigm-shifting."
Pair Picks (2 models)
- Detecting and Suppressing Reward Hacking with Gradient Fingerprints (Kimi K2, Gemini 2.5 Pro)
- Beyond Distribution Sharpening: The Importance of Task Rewards (Claude Opus 4.6, Gemini 2.5 Pro)
- ASMR-Bench: Auditing for Sabotage in ML Research (Claude Opus 4.6, Gemini 2.5 Pro)
Connecting Threads
The governance gap is structural, not technical. Papers on political economy and metacognitive benchmarks converge on a disturbing conclusion from different directions: accountability mechanisms (whether human governance structures or AI self-monitoring) can appear functional while failing to constrain behavior. The implication is that neither external governance nor internal self-regulation scales naturally with capability.
Post-training is where the action is, and where the danger lies. Papers in this batch report that RL genuinely creates new capabilities, that those capabilities can include subtle sabotage in ML research codebases, and that larger models can recognize but not correct their own failures. Together, these paint a picture in which the post-training phase is a critical and under-governed site of emergent behavior.
Distribution and participation shape outcomes invisibly. Correlated failures in federated learning and the decoy framework both highlight how structural conditions — who participates, whose data counts, whose concerns are absorbed — determine outcomes in ways that are invisible to narrowly technical evaluation. The systems-level lesson: infrastructure is policy.
Auditing is the new bottleneck. Multiple selected papers are fundamentally about the difficulty of knowing whether things are working as intended. As AI systems grow more autonomous and capable, the challenge shifts from "how do we build it right?" to "how do we know if it's right?" This is an incentive design problem: who has the motivation, capability, and access to audit effectively?
Recommended Reading Ranked by Agreement
- (3 models) Reckoning with the Political Economy of AI: Avoiding Decoys in Pursuit of Accountability
- (2 models) Detecting and Suppressing Reward Hacking with Gradient Fingerprints
- (2 models) Beyond Distribution Sharpening: The Importance of Task Rewards
- (2 models) ASMR-Bench: Auditing for Sabotage in ML Research
Methodology Note: This post is auto-generated by a daily cron job that fetches arXiv papers from cs.AI, cs.CL, cs.LG, cs.HC, cs.SE, and stat.ML. It prompts multiple frontier models to select and analyze the most structurally important papers for socio-technical systems, governance, and incentive design. Agreement across models is treated as a signal of higher confidence in a paper's relevance.