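arXivDaily daily academic digest: synced with the full arXiv dataset, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and other fields.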

2604.21518 2026-04-24 eess.IV cs.CV

DiffNR: Diffusion-Enhanced Neural Representation Optimization for Sparse-View 3D Tomographic Reconstruction

Shiyan Su, Ruyi Zha, Danli Shi, Hongdong Li, Xuelian Cheng

Comments Accepted to AAAI 2026. Project page: https://ooonesevennn.github.io/DiffNR/

2604.21507 2026-04-24 eess.AS cs.SD

DiariZen Explained: A Tutorial for the Open Source State-of-the-Art Speaker Diarization Pipeline

Nikhil Raghav

Comments 13 pages, 7 figures, 2 tables. Code available at https://github.com/nikhilraghav29/diarizen-tutorial

2604.21432 2026-04-24 stat.ML cs.LG

A single algorithm for both restless and rested rotting bandits

Julien Seznec, Pierre Ménard, Alessandro Lazaric, Michal Valko

Comments In AISTATS 2020

2604.21421 2026-04-24 cs.CR cs.AI cs.CL

Differentially Private De-identification of Dutch Clinical Notes: A Comparative Evaluation

Michele Miranda, Xinlan Yan, Nishant Mishra, Rachel Murphy, Ameen Abu-Hanna, Sébastien Bratières, Iacer Calixto

2604.21416 2026-04-24 cs.CR cs.AI

CSC: Turning the Adversary's Poison against Itself

Yuchen Shi, Xin Guo, Huajie Chen, Tianqing Zhu, Bo Liu, Wanlei Zhou

2604.21310 2026-04-24 cs.CR cs.AI

Adversarial Evasion in Non-Stationary Malware Detection: Minimizing Drift Signals through Similarity-Constrained Perturbations

Pawan Acharya, Lan Zhang

2604.21308 2026-04-24 cs.CR cs.CL

CI-Work: Benchmarking Contextual Integrity in Enterprise LLM Agents

Wenjie Fu, Xiaoting Qin, Jue Zhang, Qingwei Lin, Lukas Wutschitz, Robert Sim, Saravan Rajmohan, Dongmei Zhang

2604.21282 2026-04-24 cs.CR cs.LG cs.SE

Strategic Heterogeneous Multi-Agent Architecture for Cost-Effective Code Vulnerability Detection

Zhaohui Geoffrey Wang

Comments 11 pages, 5 figures. Accepted at the AAMAS 2026 Workshop on Software Engineering (SE Workshop). This version corresponds to the preprint of the workshop paper

2604.21270 2026-04-24 stat.ML cs.LG cs.SY eess.SY math.OC

CLT-Optimal Parameter Error Bounds for Linear System Identification

Yichen Zhou, Stephen Tu

Comments 36 pages

2604.21260 2026-04-24 stat.ML cs.AI cs.LG econ.EM q-bio.QM stat.ME

Calibeating Prediction-Powered Inference

Lars van der Laan, Mark Van Der Laan

Comments Paper website: https://larsvanderlaan.github.io/ppi-aipw/

2604.21233 2026-04-24 physics.ao-ph cs.LG physics.data-an physics.geo-ph

Assessing Emulator Design and Training for Modal Aerosol Microphysics Parameterizations in E3SMv2

Shady E. Ahmed, Hui Wan, Saad Qadeer, Panos Stinis, Kezhen Chong, Mohammad Taufiq Hassan Mozumder, Kai Zhang, Ann S. Almgren

Comments 16 pages, 7 figures

2604.21231 2026-04-24 cs.NI cs.AI cs.PF

SparKV: Overhead-Aware KV Cache Loading for Efficient On-Device LLM Inference

Hongyao Liu, Liuqun Zhai, Junyi Wang, Zhengru Fang

Comments IEEE Internet of Things Journal, 11 pages, under major revision

2604.21222 2026-04-24 cond-mat.mtrl-sci cs.LG

Neutron and X-ray Diffraction Reveal the Limits of Long-Range Machine Learning Potentials for Medium-Range Order in Silica Glass

Sai Harshit Balantrapu, Atul C. Thakur, Chris Benmore, Ganesh Sivaraman

Comments 19 pages, 9 figures

Abstract

Glassy silica is a foundational material in optics and electronics, yet accurately predicting its medium-range order (MRO) remains a major challenge for machine-learning interatomic potentials (MLIPs). While local MLIPs reproduce the short-range SiO4 tetrahedral network well, it remains unclear whether locality alone is sufficient to recover the first sharp diffraction peak (FSDP), the principal experimental signature of MRO. Here, we combine neutron and X-ray diffraction measurements with large-scale molecular dynamics driven by two MACE-based models: a short-range (SR) potential and a long-range (LR) extension incorporating reciprocal-space gated attention. The SR model systematically over-structures the network, producing an overly intense FSDP in both the liquid and glassy states. Incorporating long-range interactions improves agreement with experiment for the liquid structure by reducing this excess ordering, but the LR model still fails to recover the experimental amorphous MRO after quenching. Ring-statistics and bond-angle analyses reveal that the SR model exhibits an artificially narrow distribution dominated by six-membered rings, while the LR model produces a broader but still biased ring population. Despite preserving the correct tetrahedral geometry, both models show limited variability in Si-O-Si angles, indicating constrained network flexibility. These structural signatures demonstrate that both models retain excessive memory of the parent liquid network, leading to kinetically trapped and nonphysical medium-range configurations during vitrification. These results show that explicit long-range interactions are necessary but not sufficient for predictive modelling of disordered silica and suggest that accurate MRO further requires training data and sampling strategies that adequately represent the liquid-to-glass transition.

2604.21216 2026-04-24 econ.TH cs.AI cs.GT

Post-AGI Economies: Autonomy and the First Fundamental Theorem of Welfare Economics

Elija Perrier

Comments Under review

2604.21210 2026-04-24 quant-ph cs.LG

The Feedback Hamiltonian is the Score Function: A Diffusion-Model Framework for Quantum Trajectory Reversal

Sagar Dubey, Alan John

Comments 14 pages

2604.21203 2026-04-24 stat.ML cs.LG

Refining Covariance Matrix Estimation in Stochastic Gradient Descent Through Bias Reduction

Ziyang Wei, Wanrong Zhu, Jingyang Lyu, Wei Biao Wu

2604.21202 2026-04-24 econ.EM cs.CL

Participation and Representation in Local Government Speech

Olivia Martin, Amar Venugopal

2604.21187 2026-04-24 math.CO cs.AI

Doubly Saturated Ramsey Graphs: A Case Study in Computer-Assisted Mathematical Discovery

Benjamin Przybocki, John Mackey, Marijn J. H. Heule, Bernardo Subercaseaux

2604.21174 2026-04-24 cs.CE cs.AI math.AP

Scaling of Gaussian Kolmogorov–Arnold Networks

Amir Noorizadegan, Sifan Wang

2604.21172 2026-04-24 cs.LO cs.AI

TAPO-Description Logic for Information Behavior: Refined OBoxes, Inference, and Categorical Semantics

Takao Inoué

Comments 23 pages, 2 figures. Substantially expanded version of arXiv:2602.17242; adds a guard-judgment layer, refined OBoxes, core inference rules, categorical semantics, sheaf-theoretic refinement, and a browsing-theory appendix

2604.21159 2026-04-24 cs.CR cs.AI cs.CL cs.LG

Adaptive Instruction Composition for Automated LLM Red-Teaming

Jesse Zymet, Andy Luo, Swapnil Shinde, Sahil Wadhwa, Emily Chen

Comments Accepted to ACL 2026 Main Conference

2604.21152 2026-04-24 cs.CY cs.AI cs.CL cs.HC cs.IR

Dialect vs Demographics: Quantifying LLM Bias from Implicit Linguistic Signals vs. Explicit User Profiles

Irti Haq, Belén Saldías

Comments In The 2026 ACM Conference on Fairness, Accountability, and Transparency (FAccT '26), June 25–28, 2026, Montreal, Canada. ACM, New York, NY, USA, 32 pages

Abstract

As state-of-the-art Large Language Models (LLMs) have become ubiquitous, ensuring equitable performance across diverse demographics is critical. However, it remains unclear whether performance disparities arise from the explicitly stated identity itself or from the way identity is signaled. In real-world interactions, users' identity is often conveyed implicitly through a complex combination of various socio-linguistic factors. This study disentangles these signals by employing a factorial design with over 24,000 responses from two open-weight LLMs (Gemma-3-12B and Qwen-3-VL-8B), comparing prompts with explicitly announced user profiles against implicit dialect signals (e.g., AAVE, Singlish) across various sensitive domains. Our results uncover a unique paradox in LLM safety where users achieve "better" performance by sounding like a demographic than by stating they belong to it. Explicit identity prompts activate aggressive safety filters, increasing refusal rates and reducing semantic similarity compared to our reference text for Black users. In contrast, implicit dialect cues trigger a powerful "dialect jailbreak," reducing refusal probability to near zero while simultaneously achieving a greater level of semantic similarity to the reference texts compared to Standard American English prompts. However, this "dialect jailbreak" introduces a critical safety trade-off regarding content sanitization. We find that current safety alignment techniques are brittle and over-indexed on explicit keywords, creating a bifurcated user experience where "standard" users receive cautious, sanitized information while dialect speakers navigate a less sanitized, more raw, and potentially more hostile information landscape. This highlights a fundamental tension in alignment between equity and linguistic diversity, and underscores the need for safety mechanisms that generalize beyond explicit cues.

2604.21131 2026-04-24 cs.CR cs.AI cs.CL cs.LG

Cross-Session Threats in AI Agents: Benchmark, Evaluation, and Algorithms

Ari Azarafrooz

Comments 46 pages, 8 figures. Dataset: https://huggingface.co/datasets/intrinsec-ai/cstm-bench

Abstract

AI-agent guardrails are memoryless: each message is judged in isolation, so an adversary who spreads a single attack across dozens of sessions slips past every session-bound detector because only the aggregate carries the payload. We make three contributions to cross-session threat detection. (1) Dataset. CSTM-Bench is 26 executable attack taxonomies classified by kill-chain stage and cross-session operation (accumulate, compose, launder, inject_on_reader), each bound to one of seven identity anchors that ground-truth "violation" as a policy predicate, plus matched Benign-pristine and Benign-hard confounders. Released on Hugging Face as intrinsec-ai/cstm-bench with two 54-scenario splits: dilution (compositional) and cross_session (12 isolation-invisible scenarios produced by a closed-loop rewriter that softens surface phrasing while preserving cross-session artefacts). (2) Measurement. Framing cross-session detection as an information bottleneck to a downstream correlator LLM, we find that a session-bound judge and a Full-Log Correlator concatenating every prompt into one long-context call both lose roughly half their attack recall moving from dilution to cross_session, well inside any frontier context window. Scope: 54 scenarios per shard, one correlator family (Anthropic Claude), no prompt optimisation; we release it to motivate larger, multi-provider datasets. (3) Algorithm and metric. A bounded-memory Coreset Memory Reader retaining highest-signal fragments at $K=50$ is the only reader whose recall survives both shards. Because ranker reshuffles break KV-cache prefix reuse, we promote $\mathrm{CSR\_prefix}$ (ordered prefix stability, LLM-free) to a first-class metric and fuse it with detection into $\mathrm{CSTM} = 0.7 F_1(\mathrm{CSDA@action}, \mathrm{precision}) + 0.3 \mathrm{CSR\_prefix}$, benchmarking rankers on a single Pareto of recall versus serving stability.
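
The fused score in contribution (3) can be sketched in a few lines. This is an illustration only, not the authors' code: it assumes that $F_1(a, b)$ denotes the harmonic mean of its two arguments, and the function and parameter names (`harmonic_mean`, `cstm_score`, `csda_at_action`, `csr_prefix`) are hypothetical. Only the 0.7 and 0.3 weights come from the abstract.

```python
def harmonic_mean(a: float, b: float) -> float:
    """Harmonic mean of two scores in [0, 1]; defined as 0.0 when both are zero."""
    if a + b == 0:
        return 0.0
    return 2 * a * b / (a + b)


def cstm_score(csda_at_action: float, precision: float, csr_prefix: float) -> float:
    """Fuse detection quality and prefix stability with the 0.7 / 0.3 weights.

    Assumption: F_1(CSDA@action, precision) is the harmonic mean of the two inputs.
    """
    detection = harmonic_mean(csda_at_action, precision)
    return 0.7 * detection + 0.3 * csr_prefix


# Illustrative inputs only, not results from the paper.
example = cstm_score(csda_at_action=0.5, precision=0.5, csr_prefix=0.0)
```

Under this reading, a ranker with perfect detection but no prefix stability scores at most 0.7, so the stability term can change the ordering of rankers whose detection quality is similar.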

2604.21129 2026-04-24 cs.MA cs.AI cs.DC

AGNT2: Autonomous Agent Economies on Interaction-Optimized Layer 2 Infrastructure

Anbang Ruan, Xing Zhang

Abstract

Current blockchain Layer 2 solutions, including Optimism, Arbitrum, zkSync, and their derivatives, optimize for human-initiated financial transactions. Autonomous AI agents instead generate high-frequency, semantically rich service invocations among mutually untrusting principals. Existing chains treat those interactions as generic calldata, forcing identity, escrow, dependency ordering, and session state to be encoded above the execution layer at the wrong cost point. We present AGNT2, a three-tier stack purpose-built for agent and microservice coordination on-chain. AGNT2 combines: (1) a sidecar deployment pattern that turns any Docker container into an on-chain agent without application-code modification; (2) Layer Top P2P state channels for established bilateral pairs (<100 ms, rough design target 1K-5K TPS per pair, 10M+ aggregate TPS design envelope under endpoint-resource limits), Layer Core as a dependency-aware sequenced rollup for first-contact and multi-party interactions (500 ms-2 s, 300K-500K TPS design target), and Layer Root settlement with computational fraud proofs anchored to any EVM L1; and (3) an agent-native execution environment plus interaction trie that make service invocation, identity, reputation, capabilities, and session context first-class protocol objects. This paper focuses on the execution-layer systems problem: sequencing, state, settlement, and the data-availability (DA) bandwidth gap that bounds all three. Simulation and analytical modeling support the architecture, and prototype measurements validate selected components, but no end-to-end Layer Core implementation exists yet. Practical deployment is currently constrained to roughly 10K-100K TPS by DA throughput, leaving a ~100x gap at the target ceiling. AGNT2 argues that the agent economy requires a dedicated execution layer rather than a general-purpose chain repurposed for agents.
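
The roughly 100x gap quoted above is consistent with dividing the 10M aggregate TPS design envelope by the upper end of the DA-constrained range. The abstract does not say which figures it compares, so the sketch below assumes that pairing; the constant and function names are hypothetical.

```python
# Figures taken from the abstract; which pair yields "~100x" is an assumption.
TARGET_CEILING_TPS = 10_000_000      # "10M+ aggregate TPS design envelope"
DA_LIMITED_TPS = (10_000, 100_000)   # "roughly 10K-100K TPS"


def throughput_gap(target_tps: float, achievable_tps: float) -> float:
    """Factor by which the target throughput exceeds the achievable throughput."""
    return target_tps / achievable_tps


gap_at_upper_bound = throughput_gap(TARGET_CEILING_TPS, DA_LIMITED_TPS[1])  # 100.0
gap_at_lower_bound = throughput_gap(TARGET_CEILING_TPS, DA_LIMITED_TPS[0])  # 1000.0
```

On this reading, the quoted gap is the optimistic case; at the lower end of the DA-constrained range the shortfall would be closer to 1000x.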

2604.21097 2026-04-24 stat.ML cs.LG

Learning to Emulate Chaos: Adversarial Optimal Transport Regularization

Gabriel Melo, Leonardo Santiago, Peter Y. Lu

2604.21096 2026-04-24 cs.IR cs.CL

Multilingual and Domain-Agnostic Tip-of-the-Tongue Query Generation for Simulated Evaluation

Xuhong He, To Eun Kim, Maik Fröbe, Jaime Arguello, Bhaskar Mitra, Fernando Diaz

Comments SIGIR 2026; NTCIR track: https://ntcir-tot.github.io

2604.21090 2026-04-24 cs.SE cs.AI

Structural Quality Gaps in Practitioner AI Governance Prompts: An Empirical Study Using a Five-Principle Evaluation Framework

Christo Zietsman

Comments 8 pages. Experiment, corpus, and evaluation framework publicly available at https://github.com/czietsman/nuphirho.dev/tree/main/experiments/governance-prompts-v1

2604.21085 2026-04-24 physics.ao-ph cs.LG

climt-paraformer: Stable Emulation of Convective Parameterization using a Temporal Memory-aware Transformer

Shuochen Wang, Nishant Yadav, Joy Merwin Monteiro, Auroop R. Ganguly

2604.21083 2026-04-24 cs.CR cs.AI cs.NI cs.SE

Behavioral Consistency and Transparency Analysis on Large Language Model API Gateways

Guanjie Lin, Yinxin Wan, Shichao Pei, Ting Xu, Kuai Xu, Guoliang Xue

Comments 11 pages. Initially submitted to IMC 2026 Cycle 1 on November 20, 2025; accepted on March 13, 2026. To appear in Proceedings of the 2026 ACM Internet Measurement Conference (IMC '26)

2604.21073 2026-04-24 cond-mat.mtrl-sci cs.AI

Generative Discovery of Magnetic Insulators under Competing Physical Constraints

Qiulin Zeng, Tahiya Chowdhury, Md Shafayat Hossain