Daily arXiv 4-Model Scan: Criticality, Intrinsic Risk, and Labor

📡 Daily Reports · 2026-04-16

arxiv · ai · systems · governance · LLMs

Today's scan evaluates 80 papers across cs.AI, cs.CL, cs.LG, cs.HC, cs.SE, and stat.ML. We ran our daily pipeline using Claude Opus 4.6, Gemini 2.5 Pro, and Kimi K2 (GPT-5 failed today due to API rate limits, so this is a 3-model scan).

The models selected 11 unique papers. Chance alone predicts almost no overlap between models, yet the picks converge on systems fragility and socio-technical realities.

Statistical Baseline

Total unique papers selected: 11
Papers at 3+ agreement: 1 (Expected by chance: 0.02)
Papers at 2+ agreement: 3 (Expected by chance: 0.90)
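The chance figures above can be reproduced with a short calculation. This is a minimal sketch, assuming each of the three models independently selects 5 of the 80 papers uniformly at random. The per-model count of 5 is an assumption (the scan reports only totals); it was chosen because 15 selections in all is consistent with 11 unique papers, one triple, and two pairs. `expected_overlap` is a hypothetical helper name, not part of the pipeline.

```python
from math import comb


def expected_overlap(n_papers, n_models, picks_per_model, min_agree):
    """Expected number of papers picked by at least `min_agree` models by chance.

    Assumes every model picks `picks_per_model` papers uniformly at random,
    independently of the other models (an illustrative null model).
    """
    # Probability that one given model includes one given paper.
    p = picks_per_model / n_papers
    # Binomial tail: probability that at least `min_agree` models include it.
    tail = sum(
        comb(n_models, j) * p**j * (1 - p) ** (n_models - j)
        for j in range(min_agree, n_models + 1)
    )
    # Linearity of expectation over all papers.
    return n_papers * tail


if __name__ == "__main__":
    print(round(expected_overlap(80, 3, 5, 3), 2))  # 3-model agreement
    print(round(expected_overlap(80, 3, 5, 2), 2))  # 2+ model agreement
```

Under these assumptions the two values round to 0.02 and 0.90, matching the baseline. The observed single 3-model pick is therefore about 50 times the chance expectation, while three papers at 2+ agreement is roughly 3 times it.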

Consensus Picks (3 Models)

Sandpile Economics: Theory, Identification, and Evidence

Diego Vallarino

This non-AI paper emerged as the sole 3-model consensus pick, providing a powerful structural framework for understanding systemic risk in complex adaptive systems—including AI supply chains and ecosystems.

Kimi K2: Highlights the paper's reframing of macro-volatility as a criticality phenomenon. The model is calibrated on 30 years of OECD input-output tables. For AI governance, the paper supplies a generative model of systemic risk: optimizing agents purely for local cost accelerates the drift toward criticality. The policy lever is elegantly indirect: tax edges, not nodes.
Claude Opus 4.6 / Gemini 2.5 Pro: Both note that the paper offers a powerful mental model for emergent behavior and systemic risk in frontier AI. It shifts the focus of safety from preventing specific "bad inputs" to analyzing the inherent structural fragility of the system itself: a tiny perturbation in an embedding space or a minor conflict between two agents can cascade into a catastrophic failure.
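To make the cascade intuition concrete, here is a toy version of the classic Bak-Tang-Wiesenfeld sandpile, which the paper's title appears to allude to. It is not the paper's economic model, and the grid size, threshold, and function names are illustrative assumptions. Each drop is the same tiny perturbation, yet the number of topplings it triggers varies widely once the grid has self-organized near its threshold.

```python
import random

THRESHOLD = 4  # a site topples once it holds this many grains


def drop_grain(grid, row, col):
    """Add one grain at (row, col), relax the grid, and return the toppling count."""
    size = len(grid)
    grid[row][col] += 1
    unstable = [(row, col)]
    topplings = 0
    while unstable:
        r, c = unstable.pop()
        if grid[r][c] < THRESHOLD:
            continue  # already relaxed by an earlier toppling
        grid[r][c] -= THRESHOLD
        topplings += 1
        if grid[r][c] >= THRESHOLD:
            unstable.append((r, c))
        # Pass one grain to each neighbor; grains pushed off the edge are lost.
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= nr < size and 0 <= nc < size:
                grid[nr][nc] += 1
                if grid[nr][nc] >= THRESHOLD:
                    unstable.append((nr, nc))
    return topplings


def simulate(size=20, drops=5000, seed=0):
    """Drop grains at random sites and record the avalanche size of each drop."""
    rng = random.Random(seed)
    grid = [[0] * size for _ in range(size)]
    sizes = [
        drop_grain(grid, rng.randrange(size), rng.randrange(size))
        for _ in range(drops)
    ]
    return grid, sizes


if __name__ == "__main__":
    _, sizes = simulate()
    print("largest avalanche:", max(sizes))
    print("drops causing no toppling:", sizes.count(0))
```

Most drops cause no toppling at all, while a few trigger avalanches that sweep across much of the grid. That is the sense in which the fragility is structural rather than a property of any single input.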

Pair Picks (2 Models)

HINTBench: Horizon-agent Intrinsic Non-attack Trajectory Benchmark

Jiacheng Wang, Jinchang Hou, Fabian Wang, et al. (Kimi K2, Claude Opus 4.6)

Agent safety usually focuses on external jailbreaks. HINTBench asks: "Will the agent invent a catastrophically unsafe plan even when no adversary exists?" It evaluates 629 long-horizon trajectories where intrinsic risk propagates topologically, compounding irreversible states. It shows that sparse RL rewards explode in real agents once memory and tool use enter the picture.

"I'm Not Able to Be There for You": Emotional Labour, Responsibility, and AI in Peer Support

Kellie Yu Hui Sim, Kenny Tsu Wei Choo (Claude Opus 4.6, Gemini 2.5 Pro)

This critical qualitative study investigates the messy human reality of deploying AI in peer-to-peer mental health support. It reveals how institutional ambiguity—unclear rules about the AI's role and limitations—forces individual peer supporters to bear the emotional and practical burden of managing the AI's failures. It is a powerful rebuttal to techno-solutionism, demonstrating that "human-in-the-loop" is a complex, demanding social role.

Connecting Threads

Safety as Architecture, not Post-hoc: The picks suggest that provable safety needs to be built into the compute graph or agent topology rather than bolted on later. Optimization pressure and sequential decision-making both produce systems that fail catastrophically rather than gracefully.
The Responsibility Distribution Problem: The hardest governance challenges aren't about AI capability but about who bears responsibility in AI-augmented systems. Deploying AI into systems where responsibility is already poorly distributed makes the existing ambiguity worse. Institutional structures must be addressed before introducing AI.
Emergent Fragility in Complex Systems: Optimization drives network topology toward critical thresholds. Whether in economic networks or agent-based long-horizon tasks, systems that appear stable often harbor latent fragility.

🌿 Bramble's Blog