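arXivDaily: a daily academic digest synced with the full arXiv dataset, with AI-generated summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering and systems science, and other fields.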

2510.06002 2026-04-30 cs.AI cs.CL cs.IR

Deterministic Legal Agents: A Canonical Primitive API for Auditable Reasoning over Temporal Knowledge Graphs

Hudson de Martim

Comments Substantially revised version consolidating the paper as a formal SAT-Graph API specification: clarifies Probability Isolation and post-anchoring determinism, broadens semantic anchoring to open and thematic legal queries, refines the data models and temporal primitives, and strengthens the use cases, limitations, and bibliography

Abstract

In high-stakes legal domains, retrieval must preserve not only semantic relevance, but also the hierarchy, temporality, and causal provenance of legal norms. Standard Retrieval-Augmented Generation (RAG), based mainly on semantic similarity over text fragments, cannot reliably provide this level of control. Prior work on SAT-Graph RAG addressed the representation problem by modeling legal materials as structure-aware temporal knowledge graphs. This paper addresses the next problem: how an LLM-based reasoning agent can interact with such a graph without reintroducing unreliable retrieval behavior. We specify the SAT-Graph API, a canonical primitive interface for auditable reasoning over temporal knowledge graphs, developed and illustrated in the legal domain. The API exposes typed, atomic, and composable primitives that mediate between a probabilistic language model and a deterministic symbolic substrate. Its design follows Probability Isolation: uncertainty is confined to intent translation, semantic anchoring, and final narrative synthesis, while structural, temporal, and causal graph traversals are executed through deterministic operations over canonical anchors. The interface shifts legal RAG from single-shot Retrieve-then-Generate to active Reason-Act-Observe. An agent decomposes a legal question into an explicit execution plan, invokes primitives for point-in-time retrieval, context reconstruction, provenance tracing, and impact analysis, and produces an answer grounded in an auditable log of graph operations. The result is a formal architectural specification, not an empirical benchmark: a secure interaction protocol that decouples legal knowledge representation from agentic reasoning. Although illustrated in law, the primitive model is domain-portable to other temporally versioned, provenance-sensitive, and authority-governed knowledge bases.
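
As a rough illustration of the point-in-time retrieval idea mentioned in the abstract above, the following Python sketch resolves which version of a norm was in force on a given date and records each call in a log. It is not the SAT-Graph API: the names `NormVersion` and `get_valid_version`, the anchor string format, and the sample data are all hypothetical, chosen only to show how a lookup over canonical anchors can be deterministic and auditable.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class NormVersion:
    """One temporal version of a norm (hypothetical data model)."""
    anchor: str                # canonical identifier, format assumed for this sketch
    valid_from: date           # first day this wording is in force
    valid_to: Optional[date]   # first day it is no longer in force; None = open-ended
    text: str


def get_valid_version(
    versions: List[NormVersion],
    anchor: str,
    on: date,
    audit_log: List[Tuple[str, str, str, bool]],
) -> Optional[NormVersion]:
    """Return the version of `anchor` in force on `on`, or None if there is none.

    The lookup uses exact anchor matching and half-open date intervals
    [valid_from, valid_to), so the same inputs always give the same output.
    """
    matches = [
        v for v in versions
        if v.anchor == anchor
        and v.valid_from <= on
        and (v.valid_to is None or on < v.valid_to)
    ]
    if len(matches) > 1:
        raise ValueError(f"overlapping validity intervals for {anchor}")
    result = matches[0] if matches else None
    # Record the operation so the final answer can be traced back to it.
    audit_log.append(("get_valid_version", anchor, on.isoformat(), result is not None))
    return result


# Hypothetical sample data: one article amended on 2010-06-01.
ANCHOR = "example-act/art-5"
VERSIONS = [
    NormVersion(ANCHOR, date(2000, 1, 1), date(2010, 6, 1), "original wording"),
    NormVersion(ANCHOR, date(2010, 6, 1), None, "amended wording"),
]

log: List[Tuple[str, str, str, bool]] = []
before = get_valid_version(VERSIONS, ANCHOR, date(2005, 3, 1), log)
after = get_valid_version(VERSIONS, ANCHOR, date(2010, 6, 1), log)
missing = get_valid_version(VERSIONS, ANCHOR, date(1999, 12, 31), log)
```

In this sketch any probabilistic step (for example, mapping a user's question to the anchor string) would happen before the call; the lookup itself involves no similarity scoring, which is one plausible reading of what the abstract calls Probability Isolation.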

2510.03093 2026-04-30 cs.CL cs.SD

Revisiting Direct Speech-to-Text Translation with Speech LLMs: Better Scaling than CoT Prompting?

Oriol Pareras, Gerard I. Gállego, Federico Costa, Cristina España-Bonet, Javier Hernando

Comments To appear in Proc. ICASSP 2026, May 04-08, 2026, Barcelona, Spain

2509.25236 2026-04-30 cs.AI cs.LG eess.SP

Networks of Causal Abstractions: A Sheaf-theoretic Framework

Gabriele D'Acunto, Paolo Di Lorenzo, Sergio Barbarossa

Comments Major changes: added convergence proof for CK diffusion dynamics; clarified relation with our prior work arXiv:2602.02623, removing substantial overlap; shortened background

2509.23980 2026-04-30 cs.CV

Towards Redundancy Reduction in Diffusion Models for Efficient Video Super-Resolution

Jinpei Guo, Yifei Ji, Shengwei Wang, Zheng Chen, Yufei Wang, Sizhuo Ma, Yong Guo, Baiang Li, Jusheng Zhang, Yulun Zhang, Jian Wang

2509.23410 2026-04-30 cs.LG cs.AI cs.PF

PATCH: Learnable Tile-level Hybrid Sparsity for LLMs

Younes Hourri, Mohammad Mozaffari, Maryam Mehri Dehnavi

2509.21983 2026-04-30 cs.RO cs.AI

Hybrid Diffusion for Simultaneous Symbolic and Continuous Planning

Sigmund Hennum Høeg, Aksel Vaaler, Chaoqi Liu, Olav Egeland, Yilun Du

Comments 10 pages, 11 figures. This work has been submitted to the IEEE for possible publication. See https://sigmundhh.com/hybrid_diffusion/ for the project website

2509.20265 2026-04-30 cs.LG cs.CL

Failure Modes of Maximum Entropy RLHF

Ömer Veysel Çağatan, Barış Akgün

Comments 23 pages, 12 figures

2509.17625 2026-04-30 cs.LG cs.CY physics.soc-ph stat.ME

Comparing Data Assimilation and Likelihood-Based Inference on Latent State Estimation in Agent-Based Models

Blas Kolic, Corrado Monti, Gianmarco De Francisci Morales, Marco Pangallo

2509.16591 2026-04-30 cs.CL

Heterogeneous Adaptive Policy Optimization: Tailoring Optimization to Every Token's Nature

Zheng Liu, Mengjie Liu, Siwei Wen, Mengzhang Cai, Bin Cui, Conghui He, Wentao Zhang

2509.07523 2026-04-30 cs.LG

RoseCDL: Robust and Scalable Convolutional Dictionary Learning for Rare event and Anomaly Detection

Jad Yehya, Mansour Benbakoura, Cédric Allain, Benoît Malezieux, Matthieu Kowalski, Thomas Moreau

Comments Accepted to the 29th International Conference on Artificial Intelligence and Statistics (AISTATS) 2026

2508.19900 2026-04-30 cs.LG

Adaptive Scaling of Policy Constraints for Offline Reinforcement Learning

Tan Jing, Xiaorui Li, Chao Yao, Xiaojuan Ban, Yuetong Fang, Renjing Xu, Zhaolin Yuan

2508.12672 2026-04-30 cs.LG cs.AI

Robust Federated Learning under Adversarial Attacks via Loss-Based Client Clustering

Emmanouil Kritharakis, Dusan Jakovetic, Antonios Makris, Konstantinos Tserpes

Comments Accepted at the 3rd Workshop on Advancements in Federated Learning (WAFL@ECML-PKDD 2025)

2508.09547 2026-04-30 cs.CV cs.AI

GoViG: Goal-Conditioned Visual Navigation Instruction Generation via Multimodal Reasoning

Fengyi Wu, Yifei Dong, Yilong Dai, Guangyu Chen, Qifeng Wu, Huiting Huang, Hang Wang, Qi Dai, Alexander G. Hauptmann, Zhi-Qi Cheng

Comments Accepted to ACL 2026 Findings. 22 pages, 12 figures, Code: https://github.com/F1y1113/GoViG

2508.07220 2026-04-30 cs.LG cs.AI

Neural Bridge Processes

Jian Xu, Yican Liu, Delu Zeng, John Paisley, Qibin Zhao

2508.04325 2026-04-30 cs.CL cs.AI cs.CV cs.LG cs.MM

Beyond the Leaderboard: Rethinking Medical Benchmarks for Large Language Models

Wenting Chen, Guo Yu, Yiu-Fai Cheung, Meidan Ding, Jie Liu, Zizhan Ma, Wenxuan Wang, Linlin Shen

Comments Accepted by ACL 2026

2507.21420 2026-04-30 cs.CV cs.CL

ReGATE: Learning Faster and Better with Fewer Tokens in MLLMs

Chaoyu Li, Yogesh Kulkarni, Pooyan Fazli

Comments ACL 2026. Project page: https://people-robots.github.io/regate

2507.12549 2026-04-30 cs.LG cs.CC stat.ML

The Serial Scaling Hypothesis

Yuxi Liu, Konpat Preechakul, Kananart Kuwaranancharoen, Yutong Bai

Comments ICLR 2026. Equal contribution by the first two authors. Project page: https://serial-scaling-hypothesis.github.io

2507.01544 2026-04-30 cs.LG

MARVIS: Modality Adaptive Reasoning over VISualizations

Benjamin Feuer, Lennart Purucker, Oussama Elachqar, Chinmay Hegde

2507.01449 2026-04-30 cs.CL

LogitSpec: Accelerating Retrieval-based Speculative Decoding via Next Next Token Speculation

Tianyu Liu, Qitan Lv, Hao Li, Xing Gao, Xiao Sun, Xiaoyan Sun

2506.23323 2026-04-30 cs.CV

FA-Seg: A Fast and Accurate Diffusion-Based Method for Open-Vocabulary Segmentation

Huy Che, Vinh-Tiep Nguyen

Details

DOI: 10.1016/j.neucom.2025.131844
Journal ref: Neurocomputing 660 (2026) 131844

Abstract

Open-vocabulary semantic segmentation (OVSS) aims to segment objects from arbitrary text categories without requiring densely annotated datasets. Although contrastive learning based models enable zero-shot segmentation, they often lose fine spatial precision at pixel level, due to global representation bias. In contrast, diffusion-based models naturally encode fine-grained spatial features via attention mechanisms that capture both global context and local details. However, they often face challenges in balancing the computation costs and the quality of the segmentation mask. In this work, we present FA-Seg, a Fast and Accurate training-free framework for open-vocabulary segmentation based on diffusion models. FA-Seg performs segmentation using only a (1+1)-step from a pretrained diffusion model. Moreover, instead of running multiple times for different classes, FA-Seg performs segmentation for all classes at once. To further enhance the segmentation quality, FA-Seg introduces three key components: (i) a dual-prompt mechanism for discriminative, class-aware attention extraction, (ii) a Hierarchical Attention Refinement Method (HARD) that enhances semantic precision via multi-resolution attention fusion, and (iii) a Test-Time Flipping (TTF) scheme designed to improve spatial consistency. Extensive experiments show that FA-Seg achieves state-of-the-art training-free performance, obtaining 43.8% average mIoU across PASCAL VOC, PASCAL Context, and COCO Object benchmarks while maintaining superior inference efficiency. Our results demonstrate that FA-Seg provides a strong foundation for extendability, bridging the gap between segmentation quality and inference efficiency. The source code is available at https://github.com/chequanghuy/FA-Seg.

2506.20876 2026-04-30 cs.CL

Decide less, communicate more: On the construct validity of end-to-end fact-checking in medicine

Sebastian Joseph, Lily Chen, Barry Wei, Michael Mackert, Iain J. Marshall, Paul Pu Liang, Ramez Kouzy, Byron C. Wallace, Junyi Jessy Li

Comments ACL 2026 Findings camera-ready

2506.18499 2026-04-30 cs.LG cs.AI cs.DB

PuckTrick: A Library for Making Synthetic Data More Realistic

Alessandra Agostini, Andrea Maurino, Blerina Spahiu

Comments 17 pages, 3 figures

2506.16742 2026-04-30 cs.CV

Uncertainty-Aware Information Pursuit for Interpretable and Reliable Medical Image Analysis

Md Nahiduzzaman, Steven Korevaar, Zongyuan Ge, Feng Xia, Alireza Bab-Hadiashar, Ruwan Tennakoon

Comments Accepted to IEEE Transactions on Medical Imaging (IEEE TMI 2025)

2506.13116 2026-04-30 cs.LG cs.CL

Crime Hotspot Prediction Using Deep Graph Convolutional Networks

Tehreem Zubair, Syeda Kisaa Fatima, Noman Ahmed, Asifullah Khan

2506.02494 2026-04-30 cs.CL cs.AI cs.CV

MINOS: A Multimodal Evaluation Model for Bidirectional Generation Between Image and Text

Junzhe Zhang, Huixuan Zhang, Xinyu Hu, Li Lin, Mingqi Gao, Shi Qiu, Xiaojun Wan

Comments Accepted to the Findings of ACL 2026

2505.24867 2026-04-30 cs.CV cs.AI

Time Blindness: Why Video-Language Models Can't See What Humans Can?

Ujjwal Upadhyay, Mukul Ranjan, Zhiqiang Shen, Mohamed Elhoseiny

Comments Accepted at the IEEE/CVF Conference on Computer Vision and Pattern Recognition 2026. Project page: https://timeblindness.github.io

2505.22910 2026-04-30 cs.CL

Talent or Luck? Evaluating Attribution Bias in Large Language Models

Chahat Raj, Mahika Banerjee, Jinhao Pan, Aylin Caliskan, Antonios Anastasopoulos, Ziwei Zhu

Comments Accepted to ACL Findings 2026

2505.22897 2026-04-30 cs.CL

VIGNETTE: Socially Grounded Bias Evaluation for Vision-Language Models

Chahat Raj, Bowen Wei, Aylin Caliskan, Antonios Anastasopoulos, Ziwei Zhu

Comments Accepted to ACL 2026

2505.21190 2026-04-30 cs.CL cs.AI

Lunguage: A Benchmark for Structured and Sequential Chest X-ray Interpretation

Jong Hak Moon, Geon Choi, Paloma Rabaey, Min Gwan Kim, Jung-Oh Lee, Hyuk Gi Hong, Eun Woo Doe, Hangyul Yoon, Jiyoun Kim, Harshita Sharma, Daniel C. Castro, Javier Alvarez-Valle, Edward Choi

Comments CHIL (Conference on Health, Inference, and Learning) 2026

2505.20414 2026-04-30 cs.CV cs.AI cs.RO

RetroMotion: Retrocausal Motion Forecasting Models are Instructable

Royden Wagner, Omer Sahin Tas, Felix Hauser, Marlon Steiner, Dominik Strutz, Abhishek Vivekanandan, Jaime Villa, Yinzhe Shen, Carlos Fernandez, Christoph Stiller

Comments CVPRW26