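arXivDaily daily academic digest: synced with the full arXiv dataset, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and other fields.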

2603.06922 2026-03-17 cs.LG

NerVE: Nonlinear Eigenspectrum Dynamics in LLM Feed-Forward Networks

Nandan Kumar Jha, Brandon Reagen

Comments Accepted to ICLR 2026. Project page: https://nerve-eigenspectrum.github.io

Details

English abstract

We introduce NerVE, a unified eigenspectral framework for understanding how feed-forward networks (FFNs) in large language models (LLMs) organize and regulate information flow in high-dimensional latent space. Despite FFNs dominating the parameter budget, their high-dimensional dynamics remain poorly understood. NerVE addresses this gap through lightweight, memory-efficient tracking of eigenspectrum dynamics via four complementary metrics: Spectral Entropy (dispersion), Participation Ratio (effective dimensionality), Eigenvalue Early Enrichment (top-heaviness), and Jensen-Shannon divergence (distributional shifts). Our key insight is that FFN nonlinearities reinject variance across eigenmodes, fundamentally governing latent dimension utilization, and that optimizer geometry strongly modulates the extent of this variance reinjection. We validate NerVE across model scales and diverse architectural and optimizer configurations, each uniquely shaping FFN dynamics: normalization schemes controlling variance flow; FFN weight geometries constraining latent space; positional encoding and activation functions regulating information flow; and optimizer choices redistributing effective capacity across depth. Across these settings, NerVE consistently recovers stable spectral signatures that correlate with the model's generalization ability and respond predictably to design choices, generalizing beyond transformers to MLP-Mixer architectures and providing actionable insights for architectural and optimizer choices beyond trial and error.
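As a rough illustration of three of the metrics named in the abstract, the sketch below computes spectral entropy, participation ratio, and Jensen-Shannon divergence from a list of eigenvalues. It uses the standard textbook definitions, which may differ in normalization or detail from the ones in the NerVE paper; Eigenvalue Early Enrichment is omitted because its definition is not given here. The function names and the assumption that the eigenvalues are already available (for example, from a covariance matrix of FFN activations) are illustrative only.

```python
import math


def normalize(eigenvalues):
    """Turn non-negative eigenvalues into a probability distribution."""
    total = sum(eigenvalues)
    if total <= 0:
        raise ValueError("eigenvalues must have a positive sum")
    return [v / total for v in eigenvalues]


def spectral_entropy(eigenvalues):
    """Shannon entropy (in nats) of the normalized eigenspectrum.

    Higher values mean variance is spread over more eigenmodes.
    """
    p = normalize(eigenvalues)
    return -sum(x * math.log(x) for x in p if x > 0)


def participation_ratio(eigenvalues):
    """(sum of eigenvalues)^2 / (sum of squared eigenvalues).

    Ranges from 1 (one dominant mode) to len(eigenvalues) (flat spectrum).
    """
    sum_sq = sum(v * v for v in eigenvalues)
    if sum_sq <= 0:
        raise ValueError("eigenvalues must not all be zero")
    return sum(eigenvalues) ** 2 / sum_sq


def js_divergence(eigs_a, eigs_b):
    """Jensen-Shannon divergence (in nats) between two normalized spectra.

    Assumption: both spectra have the same length and the same ordering.
    """
    p, q = normalize(eigs_a), normalize(eigs_b)
    if len(p) != len(q):
        raise ValueError("spectra must have the same length")
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(a, b):
        # Terms with a == 0 contribute nothing to the KL sum.
        return sum(x * math.log(x / y) for x, y in zip(a, b) if x > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)
```

Under these definitions a flat spectrum of four equal eigenvalues has participation ratio 4 and entropy log 4, while a spectrum with a single nonzero eigenvalue has participation ratio 1 and entropy 0.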

2603.05345 2026-03-17 cs.CL

A Multilingual Human Annotated Corpus of Original and Easy-to-Read Texts to Support Access to Democratic Participatory Processes

Stefan Bott, Verena Riegler, Horacio Saggion, Almudena Rascón Alcaina, Nouran Khallaf

Comments Will be published in LREC26

2603.05310 2026-03-17 cs.SD cs.AI

Latent-Mark: An Audio Watermark Robust to Neural Resynthesis

Yen-Shan Chen, Shih-Yu Lai, Ying-Jung Tsou, Yi-Cheng Lin, Bing-Yu Chen, Yun-Nung Chen, Hung-yi Lee, Shang-Tse Chen

2603.05110 2026-03-17 cs.CV cs.LG

BLINK: Behavioral Latent Modeling of NK Cell Cytotoxicity

Iman Nematollahi, Jose Francisco Villena-Ossa, Alina Moter, Kiana Farhadyar, Gabriel Kalweit, Abhinav Valada, Toni Cathomen, Evelyn Ullrich, Maria Kalweit

2603.03726 2026-03-17 cs.CV

QD-PCQA: Quality-Aware Domain Adaptation for Point Cloud Quality Assessment

Guohua Zhang, Jian Jin, Meiqin Liu, Chao Yao, Weisi Lin

Comments Accepted by CVPR 2026

2603.01893 2026-03-17 cs.CV

Generative Visual Chain-of-Thought for Image Editing

Zijin Yin, Tiankai Hang, Yiji Cheng, Shiyi Zhang, Runze He, Yu Xu, Chunyu Wang, Bing Li, Zheng Chang, Kongming Liang, Qinglin Lu, Zhanyu Ma

Comments Project page: https://pris-cv.github.io/GVCoT/

2603.01535 2026-03-17 cs.CV

Benchmarking Semantic Segmentation Models via Appearance and Geometry Attribute Editing

Zijin Yin, Bing Li, Kongming Liang, Hao Sun, Zhongjiang He, Zhanyu Ma, Jun Guo

Comments Accepted to IEEE TPAMI 2026

Details

English abstract

Semantic segmentation plays a pivotal role in various applications such as autonomous driving and medical image analysis. When deploying segmentation models in practice, it is critical to test their behavior in varied and complex scenes in advance. In this paper, we construct an automatic data generation pipeline, Gen4Seg, to stress-test semantic segmentation models by generating various challenging samples with different attribute changes. Beyond previous evaluation paradigms focusing solely on global weather and style transfer, we investigate variations in both appearance and geometry attributes at the object and image level. These include object color, material, size, and position, as well as image-level variations such as weather and style. To achieve this, we propose to edit visual attributes of existing real images with precise control of structural information, empowered by diffusion models. In this way, the existing segmentation labels can be reused for the edited images, which greatly reduces labor costs. Using our pipeline, we construct two new benchmarks, Pascal-EA and COCO-EA. We benchmark a wide variety of semantic segmentation models, spanning from closed-set models to open-vocabulary large models. We have several key findings: 1) advanced open-vocabulary models do not exhibit greater robustness compared to closed-set methods under geometric variations; 2) data augmentation techniques, such as CutOut and CutMix, are limited in enhancing robustness against appearance variations; 3) our pipeline can also be employed as a data augmentation tool and improve both in-distribution and out-of-distribution performance. Our work suggests the potential of generative models as effective tools for automatically analyzing segmentation models, and we hope our findings will assist practitioners and researchers in developing more robust and reliable segmentation models.

2603.00552 2026-03-17 cs.AI

EMPA: Evaluating Persona-Aligned Empathy as a Process

Shiya Zhang, Yuhan Zhan, Ruixi Su, Ruihan Sun, Ziyi Song, Zhaohan Chen, Xiaofan Zhang

2602.23413 2026-03-17 cs.LG cs.CL cs.NE

EvoX: Meta-Evolution for Automated Discovery

Shu Liu, Shubham Agarwal, Monishwaran Maheswaran, Mert Cemri, Zhifei Li, Qiuyang Mang, Ashwin Naren, Ethan Boneh, Audrey Cheng, Melissa Z. Pan, Alexander Du, Kurt Keutzer, Alvin Cheung, Alexandros G. Dimakis, Koushik Sen, Matei Zaharia, Ion Stoica

2602.22680 2026-03-17 cs.AI

Toward Personalized LLM-Powered Agents: Foundations, Evaluation, and Future Directions

Yue Xu, Qian Chen, Zizhan Ma, Dongrui Liu, Wenxuan Wang, Xiting Wang, Li Xiong, Wenjie Wang

2602.21268 2026-03-17 cs.AI math.CO

A Dynamic Survey of Soft Set Theory and Its Extensions

Takaaki Fujita, Florentin Smarandache

Comments Book. 143 pages. Publisher: Neutrosophic Science International Association (NSIA) Publishing House. ISBN: 978-1-59973-859-8

2602.21262 2026-03-17 cs.CL cs.LG cs.MA

Under the Influence: Quantifying Persuasion and Vigilance in Large Language Models

Sasha Robinson, Katherine M. Collins, Ilia Sucholutsky, Kelsey R. Allen

2602.20729 2026-03-17 cs.LG

Fuz-RL: A Fuzzy-Guided Robust Framework for Safe Reinforcement Learning under Uncertainty

Xu Wan, Chao Yang, Cheng Yang, Jie Song, Mingyang Sun

2602.20722 2026-03-17 cs.AI

Buffer Matters: Unleashing the Power of Off-Policy Reinforcement Learning in Large Language Model Reasoning

Xu Wan, Yansheng Wang, Wenqi Huang, Mingyang Sun

2602.20501 2026-03-17 cs.CV

Probing and Bridging Geometry-Interaction Cues for Affordance Reasoning in Vision Foundation Models

Qing Zhang, Xuesong Li, Jing Zhang

Comments 11 pages, 12 figures, Accepted to CVPR 2026

2602.09823 2026-03-17 cs.SD cs.CL eess.AS

Covo-Audio Technical Report

Wenfu Wang, Chenxing Li, Liqiang Zhang, Yiyang Zhao, Yuxiang Zou, Hanzhao Li, Mingyu Cui, Hao Zhang, Kun Wei, Le Xu, Zikang Huang, Jiajun Xu, Jiliang Hu, Xiang He, Zeyu Xie, Jiawen Kang, Youjun Chen, Meng Yu, Dong Yu, Rilin Chen, Linlin Di, Shulin Feng, Na Hu, Yang Liu, Bang Wang, Shan Yang

Comments Technical Report

2602.05192 2026-03-17 cs.AI math.AG math.CO math.GT math.HO math.RA

First Proof

Mohammed Abouzaid, Andrew J. Blumberg, Martin Hairer, Joe Kileel, Tamara G. Kolda, Paul D. Nelson, Daniel Spielman, Nikhil Srivastava, Rachel Ward, Shmuel Weinberger, Lauren Williams

Comments 9 pages, including the statements of the ten questions

2602.01842 2026-03-17 cs.LG

Prism: Efficient Test-Time Scaling via Hierarchical Search and Self-Verification for Discrete Diffusion Language Models

Jinbin Bai, Yixuan Li, Yuchen Zhu, Yi Xin, Qingyu Shi, Aosong Feng, Xiaohong Liu, Molei Tao, Jianru Xue, Xiangtai Li, Ming-Hsuan Yang

2602.00009 2026-03-17 cs.CL cs.AI cs.IR

Unlocking Electronic Health Records: A Hybrid Graph RAG Approach to Safe Clinical AI for Patient QA

Samuel Thio, Matthew Lewis, Spiros Denaxas, Richard JB Dobson

Comments 26 pages, 5 figures, 2 tables

Details

DOI: 10.3389/fdgth.2026.1780700
Journal ref: Frontiers in Digital Health, vol. 8, 2026

English abstract

Electronic health record (EHR) systems present clinicians with vast repositories of clinical information, creating a significant cognitive burden where critical details are easily overlooked. While Large Language Models (LLMs) offer transformative potential for data processing, they face significant limitations in clinical settings, particularly regarding context grounding and hallucinations. Current solutions typically isolate retrieval methods, focusing either on structured data (SQL/Cypher) or on unstructured semantic search, but fail to integrate both simultaneously. This work presents MediGRAF (Medical Graph Retrieval Augmented Framework), a novel hybrid Graph RAG system that bridges this gap. By uniquely combining Neo4j Text2Cypher capabilities for structured relationship traversal with vector embeddings for unstructured narrative retrieval, MediGRAF enables natural language querying of the complete patient journey. Using 10 patients from the MIMIC-IV dataset (yielding 5,973 nodes and 5,963 relationships), we built a graph with enough data for patient-level question answering (QA) and evaluated this architecture across varying query complexities. The system demonstrated 100% recall for factual queries, meaning that all relevant information was retrieved and present in the output, while complex inference tasks achieved a mean expert quality score of 4.25/5 with zero safety violations. These results demonstrate that hybrid graph-grounding significantly advances clinical information retrieval, offering a safer, more comprehensive alternative to standard LLM deployments.
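To make the recall figure in the abstract concrete, here is a minimal sketch of the usual set-based definition: the fraction of relevant items that appear in the system output. The function name and the fact identifiers are hypothetical; the abstract does not describe how MediGRAF's evaluation was actually implemented.

```python
def recall(retrieved, relevant):
    """Fraction of relevant items that were retrieved (standard recall).

    Both arguments are iterables of hashable item identifiers.
    """
    relevant_set = set(relevant)
    if not relevant_set:
        raise ValueError("recall is undefined when there are no relevant items")
    return len(relevant_set & set(retrieved)) / len(relevant_set)


# Hypothetical example: three relevant facts, all present in the output.
full = recall(["dx_1", "med_2", "lab_3", "note_9"], ["dx_1", "med_2", "lab_3"])

# Hypothetical example: one of the three relevant facts is missing.
partial = recall(["dx_1", "med_2"], ["dx_1", "med_2", "lab_3"])
```

A recall of 1.0 (100%) only says that nothing relevant was missed; extra, irrelevant items in the output do not lower it.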

2601.18188 2026-03-17 cs.CV cs.AI

NaVIDA: Vision-Language Navigation with Inverse Dynamics Augmentation

Weiye Zhu, Zekai Zhang, Xiangchen Wang, Hewei Pan, Teng Wang, Tiantian Geng, Rongtao Xu, Feng Zheng

Comments 27 pages, 11 figures

2601.18077 2026-03-17 cs.CL

Sparks of Cooperative Reasoning: LLMs as Strategic Hanabi Agents

Mahesh Ramesh, Kaousheik Jayakumar, Aswinkumar Ramkumar, Pavan Thodima, Aniket Rege, Emmanouil-Vasileios Vlatakis-Gkaragkounis

Details

English abstract

Cooperative reasoning under incomplete information remains challenging for both humans and multi-agent systems. The card game Hanabi embodies this challenge, requiring theory-of-mind reasoning and strategic communication. We benchmark 17 state-of-the-art LLM agents in 2-5 player games and study the impact of context engineering across model scales (4B to 600B+) to understand persistent coordination failures and robustness to scaffolding: from a minimal prompt with only explicit card details (Watson setting), to scaffolding with programmatic, Bayesian-motivated deductions (Sherlock setting), to multi-turn state tracking via working memory (Mycroft setting). We show that (1) agents can maintain an internal working memory for state tracking and (2) cross-play performance between different LLMs smoothly interpolates with model strength. In the Sherlock setting, the strongest reasoning models exceed 15 points on average across player counts, yet still trail experienced humans and specialist Hanabi agents, both consistently scoring above 20. We release the first public Hanabi datasets with annotated trajectories and move utilities: (1) HanabiLogs, containing 1,520 full game logs for instruction tuning, and (2) HanabiRewards, containing 560 games with dense move-level value annotations for all candidate moves. Supervised and RL finetuning of a 4B open-weight model (Qwen3-Instruct) on our datasets improves cooperative Hanabi play by 21% and 156% respectively, bringing performance to within ~3 points of a strong proprietary reasoning model (o4-mini) and surpassing the best non-reasoning model (GPT-4.1) by 52%. The HanabiRewards RL-finetuned model further generalizes beyond Hanabi, improving performance on a cooperative group-guessing benchmark by 11%, temporal reasoning on EventQA by 6.4%, instruction-following on IFBench-800K by 1.7 Pass@10, and matching AIME 2025 mathematical reasoning Pass@10.

2601.17336 2026-03-17 cs.CV

AGE-Net: Spectral-Spatial Fusion and Anatomical Graph Reasoning with Evidential Ordinal Regression for Knee Osteoarthritis Grading

Xiaoyang Li, Runni Zhou, Xinghao Yan, Liehao Yan, Zhaochen Li, Chenjie Zhu, Rongrong Fu, Yuan Chai

2601.15668 2026-03-17 cs.SD

EmotionThinker: Prosody-Aware Reinforcement Learning for Explainable Speech Emotion Reasoning

Dingdong Wang, Shujie Liu, Tianhua Zhang, Youjun Chen, Jinyu Li, Helen Meng

Comments ICLR 2026 (Oral). Project page: https://github.com/dingdongwang/EmotionThinker

2601.10168 2026-03-17 cs.CV cs.AI cs.RO

RAG-3DSG: Enhancing 3D Scene Graphs with Re-Shot Guided Retrieval-Augmented Generation

Yue Chang, Rufeng Chen, Zhaofan Zhang, Yi Chen, Yifan Tian, Sihong Xie

2601.08955 2026-03-17 cs.CL cs.AI cs.LG

Imagine-then-Plan: Agent Learning from Adaptive Lookahead with World Models

Youwei Liu, Jian Wang, Hanlin Wang, Beichen Guo, Wenjie Li

2601.05823 2026-03-17 cs.CV

Boosting Latent Diffusion Models via Disentangled Representation Alignment

John Page, Xuesong Niu, Kai Wu, Kun Gai

2601.04658 2026-03-17 cs.SD cs.AI

LAMB: LLM-based Audio Captioning with Modality Gap Bridging via Cauchy-Schwarz Divergence

Hyeongkeun Lee, Jongmin Choi, KiHyun Nam, Joon Son Chung

Comments 5 pages, 2 figures; Accepted to ICASSP 2026

2601.03220 2026-03-17 cs.LG stat.ML

From Entropy to Epiplexity: Rethinking Information for Computationally Bounded Intelligence

Marc Finzi, Shikai Qiu, Yiding Jiang, Pavel Izmailov, J. Zico Kolter, Andrew Gordon Wilson

Comments Code available at https://github.com/shikaiqiu/epiplexity

Details

English abstract

Can we learn more from data than existed in the generating process itself? Can new and useful information be constructed from merely applying deterministic transformations to existing data? Can the learnable content in data be evaluated without considering a downstream task? On these questions, Shannon information and Kolmogorov complexity come up nearly empty-handed, in part because they assume observers with unlimited computational capacity and do not target the useful information content. In this work, we identify and exemplify three seeming paradoxes in information theory: (1) information cannot be increased by deterministic transformations; (2) information is independent of the order of data; (3) likelihood modeling is merely distribution matching. To shed light on the tension between these results and modern practice, and to quantify the value of data, we introduce epiplexity, a formalization of information capturing what computationally bounded observers can learn from data. Epiplexity captures the structural content in data while excluding time-bounded entropy, the random unpredictable content exemplified by pseudorandom number generators and chaotic dynamical systems. With these concepts, we demonstrate how information can be created with computation, how it depends on the ordering of the data, and how likelihood modeling can produce more complex programs than present in the data generating process itself. We also present practical procedures to estimate epiplexity which we show capture differences across data sources, track with downstream performance, and highlight dataset interventions that improve out-of-distribution generalization. In contrast to principles of model selection, epiplexity provides a theoretical foundation for data selection, guiding how to select, generate, or transform data for learning systems.

2601.02046 2026-03-17 cs.CV cs.AI

Agentic Retoucher for Text-To-Image Generation

Shaocheng Shen, Jianfeng Liang, Chunlei Cai, Cong Geng, Huiyu Duan, Xiaoyun Zhang, Qiang Hu, Guangtao Zhai

Comments Accepted by CVPR 2026

2601.00557 2026-03-17 cs.CL cs.SD eess.AS

A Language-Agnostic Hierarchical LoRA-MoE Architecture for CTC-based Multilingual ASR

Yuang Zheng, Dongxu Chen, Yuxiang Mei, Dongxing Xu, Jie Chen, Yanhua Long

Comments 5 pages, submitted to IEEE Communications Letters