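arXivDaily: a daily academic digest synced with the full arXiv dataset, with AI-generated summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering and systems, and other fields.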

2510.13851 2026-04-07 cs.CL cs.LG

EvoEdit: Evolving Null-space Alignment for Robust and Efficient Knowledge Editing

Sicheng Lyu, Yu Gu, Xinyu Wang, Jerry Huang, Sitao Luan, Yufei Cui, Xiao-Wen Chang, Peng Lu

Comments Accepted to Findings of ACL 2026

Details

Abstract

Large language models (LLMs) require continual updates to rectify outdated or erroneous knowledge. Model editing has emerged as a compelling paradigm for introducing targeted modifications without the computational burden of full retraining. Existing approaches are mainly based on a locate-then-edit framework. However, in sequential editing contexts, where multiple updates are applied over time, they exhibit significant limitations and suffer from catastrophic interference, i.e., new edits compromise previously integrated updates and degrade preserved knowledge. To address these challenges, we introduce EvoEdit, a novel editing strategy that mitigates catastrophic interference through sequential null-space alignment, enabling stable and efficient model editing. By performing sequential null-space alignment for each incoming edit, EvoEdit preserves both original and previously modified knowledge representations and maintains output invariance on preserved knowledge even across long edit sequences. Evaluations on real-world sequential knowledge-editing benchmarks show that EvoEdit matches or outperforms prior state-of-the-art locate-then-edit techniques, with a speedup of up to 3.53 times. Overall, these results underscore the necessity of developing more principled approaches for designing LLMs in dynamically evolving information settings, while providing a simple yet effective solution with strong theoretical guarantees.
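
The idea of "output invariance on preserved knowledge" can be illustrated with a generic null-space projection. The sketch below is not EvoEdit's algorithm, which the abstract does not spell out. It only assumes a linear layer `W`, a matrix `K0` whose columns are key vectors for knowledge to preserve, and a raw update `raw_update`; all of these names and the random data are hypothetical.

```python
import numpy as np

def null_space_projector(K0, tol=1e-8):
    """Return P such that P @ K0 == 0 (projector onto the left null space of K0)."""
    # Full SVD: the first `rank` columns of U span the column space of K0,
    # the remaining columns span its orthogonal complement.
    U, s, _ = np.linalg.svd(K0, full_matrices=True)
    rank = int(np.sum(s > tol * s.max()))
    U_null = U[:, rank:]
    return U_null @ U_null.T

rng = np.random.default_rng(0)
d, n = 8, 3                           # hidden size, number of preserved keys (toy values)
K0 = rng.normal(size=(d, n))          # hypothetical preserved keys, one per column
W = rng.normal(size=(4, d))           # hypothetical layer weights
raw_update = rng.normal(size=(4, d))  # hypothetical unconstrained edit

P = null_space_projector(K0)
W_new = W + raw_update @ P            # projected edit cannot act on preserved keys

# Outputs on the preserved keys are unchanged by the projected edit.
print(np.allclose(W_new @ K0, W @ K0))
```

Projecting the update before applying it is what keeps the layer's outputs on `K0` fixed; how the projector is maintained across a sequence of edits is the part specific to each method.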

2510.12866 2026-04-07 cs.RO cs.CV

Learning to Grasp Anything by Playing with Random Toys

Dantong Niu, Yuvan Sharma, Baifeng Shi, Rachel Ding, Matteo Gioia, Haoru Xue, Henry Tsai, Konstantinos Kallidromitis, Anirudh Pai, Caitlin Regan, Shankar Sastry, Trevor Darrell, Jitendra Malik, Roei Herzig

2510.11604 2026-04-07 cs.AI

Explainability, risk modeling, and segmentation based customer churn analytics for personalized retention in e-commerce

Indrajith Ekanayake, Sanjula De Alwis

2510.09901 2026-04-07 cs.AI

Autonomous Agents for Scientific Discovery: Orchestrating Scientists, Language, Code, and Physics

Lianhao Zhou, Hongyi Ling, Cong Fu, Yepeng Huang, Michael Sun, Wendi Yu, Xiaoxuan Wang, Xiner Li, Xingyu Su, Junkai Zhang, Xiusi Chen, Chenxing Liang, Xiaofeng Qian, Heng Ji, Wei Wang, Marinka Zitnik, Shuiwang Ji

2510.09228 2026-04-07 cs.CV cs.AI

Clear Roads, Clear Vision: Advancements in Multi-Weather Restoration for Smart Transportation

Vijay M. Galshetwar, Praful Hambarde, Prashant W. Patil, Akshay Dudhane, Sachin Chaudhary

Comments Accepted for publication in IEEE Transactions on Intelligent Transportation Systems (2026)

2510.07985 2026-04-07 cs.LG cs.AI cs.CR

Fewer Weights, More Problems: A Practical Attack on LLM Pruning

Kazuki Egashira, Robin Staab, Thibaud Gloaguen, Mark Vero, Martin Vechev

Comments ICLR 2026. Code: https://github.com/eth-sri/llm-pruning-attack

2510.06800 2026-04-07 cs.CL cs.AI cs.HC cs.MA

FURINA: A Fully Customizable Role-Playing Benchmark via Scalable Multi-Agent Collaboration Pipeline

Haotian Wu, Shufan Jiang, Chios Chen, Yiyang Feng, Hehai Lin, Heqing Zou, Yao Shu, Chengwei Qin

Details

Abstract

As large language models (LLMs) advance in role-playing (RP) tasks, existing benchmarks quickly become obsolete due to their narrow scope, outdated interaction paradigms, and limited adaptability across diverse application scenarios. To address this gap, we introduce FURINA-Builder, a novel multi-agent collaboration pipeline that automatically constructs fully customizable RP benchmarks at any scale. It enables evaluation of arbitrary characters across diverse scenarios and prompt formats, making it the first benchmark builder in the RP area for adaptable assessment. FURINA-Builder simulates dialogues between a test character and other characters drawn from a well-constructed character-scene pool, while an LLM judge selects fine-grained evaluation dimensions and adjusts the test character's responses into final test utterances. Using this pipeline, we build FURINA-Bench, a new comprehensive role-playing benchmark featuring both established and synthesized test characters, each assessed with dimension-specific evaluation criteria. Human evaluation and preliminary separability analysis justify our pipeline and benchmark design. We conduct extensive evaluations of cutting-edge LLMs and find that o3 and DeepSeek-R1 achieve the best performance on English and Chinese RP tasks, respectively. Across all models, established characters consistently outperform synthesized ones, with reasoning capabilities further amplifying this disparity. Interestingly, we observe that model scale does not monotonically reduce hallucinations. More critically, for reasoning LLMs, we uncover a novel trade-off: reasoning improves RP performance but simultaneously increases RP hallucinations. This trade-off extends to a broader Pareto frontier between RP performance and reliability for all LLMs. These findings demonstrate the effectiveness of FURINA-Builder and the challenge posed by FURINA-Bench.

2510.03152 2026-04-07 cs.CV cs.CE cs.LG cs.SI

Markovian Reeb Graphs for Simulating Spatiotemporal Patterns of Life

Anantajit Subrahmanya, Chandrakanth Gudavalli, Connor Levenson, B. S. Manjunath

Comments 17 pages, 4 figures

2509.26433 2026-04-07 cs.LG cs.AI

ACT: Agentic Classification Tree

Vincent Grari, Tim Arni, Thibault Laugel, Sylvain Lamprier, James Zou, Marcin Detyniecki

Comments 25 pages, 8 figures

2509.25348 2026-04-07 cs.CV cs.HC cs.MM

Editing Physiological Signals in Videos Using Latent Representations

Tianwen Zhou, Akshay Paruchuri, Josef Spjut, Kaan Akşit

Comments Accepted to CVPR 2026 Subtle Visual Computing Workshop, 13 pages, 8 figures, 7 tables

2509.24702 2026-04-07 cs.CV

Enhancing Physical Plausibility in Video Generation by Reasoning the Implausibility

Yutong Hao, Chen Chen, Ajmal Saeed Mian, Chang Xu, Daochang Liu

2509.24186 2026-04-07 cs.CL cs.AI

Measuring Competency, Not Performance: Item-Aware Evaluation Across Medical Benchmarks

Zhimeng Luo, Lixin Wu, Adam Frisch, Daqing He

2509.18218 2026-04-07 cs.AI

Similarity Field Theory: A Mathematical Framework for Intelligence

Kei-Sing Ng

Details

Abstract

We posit that transforming similarity relations form the structural basis of comprehensible dynamic systems. This paper introduces Similarity Field Theory, a mathematical framework that formalizes the principles governing similarity values among entities and their evolution. We define: (1) a similarity field $S: U \times U \to [0,1]$ over a universe of entities $U$, satisfying reflexivity $S(E,E)=1$ and treated as a directed relational field (asymmetry and non-transitivity are allowed); (2) the evolution of a system through a sequence $Z_p=(X_p,S^{(p)})$ indexed by $p=0,1,2,\ldots$; (3) concepts $K$ as entities that induce fibers $F_\alpha(K)=\{E\in U \mid S(E,K)\ge \alpha\}$, i.e., superlevel sets of the unary map $S_K(E):=S(E,K)$; and (4) a generative operator $G$ that produces new entities. Within this framework, we formalize a generative definition of intelligence: an operator $G$ is intelligent with respect to a concept $K$ if, given a system containing entities belonging to the fiber of $K$, it generates new entities that also belong to that fiber. Similarity Field Theory thus offers a foundational language for characterizing, comparing, and constructing intelligent systems. At a high level, this framework reframes intelligence and interpretability as geometric problems on similarity fields--preserving and composing level-set fibers--rather than statistical ones. We prove two theorems: (i) asymmetry blocks mutual inclusion; and (ii) stability implies either an anchor coordinate or asymptotic confinement to the target level (up to arbitrarily small tolerance). Together, these results constrain similarity-field evolution and motivate an interpretive lens applicable to large language models. AI systems may be aligned less to safety as such than to human-observable and human-interpretable conceptions of safety, which may not fully determine the underlying safety concept.
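
The fiber definition and the generative notion of intelligence in the abstract can be made concrete with a toy example. The following is only an illustrative sketch: the integer universe, the particular directed similarity `S`, the threshold, and the two operators `step_toward` and `step_away` are all hypothetical choices, not taken from the paper.

```python
def S(a, b):
    """Toy directed similarity on integers: reflexive, values in [0, 1], asymmetric."""
    if a == b:
        return 1.0
    base = 1.0 / (1.0 + abs(a - b))
    return base if a < b else 0.5 * base  # approaching from below counts as more similar

def fiber(universe, S, K, alpha):
    """Superlevel set F_alpha(K) = {E in U : S(E, K) >= alpha}."""
    return {E for E in universe if S(E, K) >= alpha}

def stays_in_fiber(G, system, S, K, alpha):
    """Check whether every entity generated by G from `system` lies in F_alpha(K)."""
    return all(S(E, K) >= alpha for E in G(system))

universe = range(0, 11)
K, alpha = 5, 0.3
F = fiber(universe, S, K, alpha)

# Two hypothetical generative operators acting on a system inside the fiber.
step_toward = lambda system: {min(E + 1, K) for E in system}
step_away = lambda system: {E - 2 for E in system}

system = {3, 4}
print(sorted(F))
print(stays_in_fiber(step_toward, system, S, K, alpha))
print(stays_in_fiber(step_away, system, S, K, alpha))
```

Under these assumptions, an operator that keeps its outputs inside the fiber of `K` passes the check, while one that drifts out of it does not. Note that `S(4, 5)` and `S(6, 5)` differ, which reflects the asymmetry the framework allows.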

2509.18052 2026-04-07 cs.CL cs.CY

The PIMMUR Principles: Ensuring Validity in Collective Behavior of LLM Societies

Jiaxu Zhou, Jen-tse Huang, Xuhui Zhou, Man Ho Lam, Xintao Wang, Hao Zhu, Wenxuan Wang, Maarten Sap

Comments 13 pages, 9 figures, 3 tables; add more papers in our systematic audit (39 in total)

2509.11481 2026-04-07 cs.RO cs.AI cs.LG

RAPTOR: A Foundation Policy for Quadrotor Control

Jonas Eschmann, Dario Albani, Giuseppe Loianno

2509.07435 2026-04-07 cs.CV

DreamLifting: A Plug-in Module Lifting MV Diffusion Models for 3D Asset Generation

Ze-Xin Yin, Jiaxiong Qiu, Liu Liu, Xinjie Wang, Wei Sui, Zhizhong Su, Jian Yang, Jin Xie

Comments 16 pages, 9 figures, TVCG 2026, project page: https://zx-yin.github.io/dreamlifting/

Details

DOI: 10.1109/TVCG.2026.3679384
Journal ref: IEEE Transactions on Visualization and Computer Graphics 2026

Abstract

The labor- and experience-intensive creation of 3D assets with physically based rendering (PBR) materials demands an autonomous 3D asset creation pipeline. However, most existing 3D generation methods focus on geometry modeling, either baking textures into simple vertex colors or leaving texture synthesis to post-processing with image diffusion models. To achieve end-to-end PBR-ready 3D asset generation, we present Lightweight Gaussian Asset Adapter (LGAA), a novel framework that unifies the modeling of geometry and PBR materials by exploiting multi-view (MV) diffusion priors from a novel perspective. The LGAA features a modular design with three components. Specifically, the LGAA Wrapper reuses and adapts network layers from MV diffusion models, which encapsulate knowledge acquired from billions of images, enabling better convergence in a data-efficient manner. To incorporate multiple diffusion priors for geometry and PBR synthesis, the LGAA Switcher aligns multiple LGAA Wrapper layers encapsulating different knowledge. Then, a tamed variational autoencoder (VAE), termed LGAA Decoder, is designed to predict 2D Gaussian Splatting (2DGS) with PBR channels. Finally, we introduce a dedicated post-processing procedure to effectively extract high-quality, relightable mesh assets from the resulting 2DGS. Extensive quantitative and qualitative experiments demonstrate the superior performance of LGAA with both text- and image-conditioned MV diffusion models. Additionally, the modular design enables flexible incorporation of multiple diffusion priors, and the knowledge-preserving scheme effectively preserves the 2D priors learned on massive image datasets, which leads to data-efficient fine-tuning that lifts the MV diffusion models for 3D generation with merely 69k multi-view instances.

2509.04434 2026-04-07 cs.CV

Durian: Dual Reference Image-Guided Portrait Animation with Attribute Transfer

Hyunsoo Cha, Byungjun Kim, Hanbyul Joo

Comments Accepted to ICLR 2026, Project Page: https://hyunsoocha.github.io/durian

2509.04016 2026-04-07 cs.RO

Odometry Calibration and Pose Estimation of a 4WIS4WID Mobile Wall Climbing Robot

Branimir Ćaran, Vladimir Milić, Marko Švaco, Bojan Jerbić

Comments Accepted for the IEEE European Conference on Mobile Robots 2025. Preprint version. Accepted June 2025 and presented September 2025

Details

DOI: 10.1109/ECMR65884.2025.11163129

Abstract

This paper presents the design of a pose estimator for a four-wheel independent steer, four-wheel independent drive (4WIS4WID) wall-climbing mobile robot, based on the fusion of multimodal measurements, including wheel odometry, visual odometry, and inertial measurement unit (IMU) data, using an Extended Kalman Filter (EKF) and an Unscented Kalman Filter (UKF). The pose estimator is a critical component of wall-climbing mobile robots, as their operation involves carrying precise measurement equipment and maintenance tools in construction, which requires information about the robot's pose on the building at the time of measurement. Due to the complex geometry and material properties of building facades, the use of traditional localization sensors such as laser, ultrasonic, or radar is often infeasible for wall-climbing robots. Moreover, GPS-based localization is generally unreliable in these environments because of signal degradation caused by reinforced concrete and electromagnetic interference. Consequently, robot odometry remains the primary source of velocity and position information, despite being susceptible to drift caused by both systematic and non-systematic errors. The robot's systematic parameters were calibrated using nonlinear optimization and Levenberg-Marquardt methods as Gauss-Newton and gradient-based model-fitting methods, while a genetic algorithm and particle swarm optimization were used as stochastic methods for kinematic parameter calibration. The performance of the calibration methods and pose estimators was validated in detail through experiments on the experimental wall-climbing mobile robot.

2509.00203 2026-04-07 cs.LG cs.CE

Estimating Parameter Fields in Multi-Physics PDEs from Scarce Measurements

Xuyang Li, Mahdi Masmoudi, Rami Gharbi, Nizar Lajnef, Vishnu Naresh Boddeti

2508.13998 2026-04-07 cs.RO cs.AI cs.LG

Embodied-R1: Reinforced Embodied Reasoning for General Robotic Manipulation

Yifu Yuan, Haiqin Cui, Yaoting Huang, Yibin Chen, Fei Ni, Zibin Dong, Pengyi Li, Yan Zheng, Hongyao Tang, Jianye Hao

Comments Embodied-R1 technical report v2; Published as a conference paper at ICLR 2026

2508.10053 2026-04-07 cs.LG stat.ML

xRFM: Accurate, scalable, and interpretable feature learning models for tabular data

Daniel Beaglehole, David Holzmüller, Adityanarayanan Radhakrishnan, Mikhail Belkin

2507.13920 2026-04-07 cs.LG

Causal Process Models: Reframing Dynamic Causal Graph Discovery as a Reinforcement Learning Problem

Turan Orujlu, Christian Gumbsch, Martin V. Butz, Charley M Wu

2507.12165 2026-04-07 cs.LG

Multi-Component VAE with Gaussian Markov Random Field

Fouad Oubari, Mohamed El-Baha, Raphael Meunier, Rodrigue Décatoire, Mathilde Mougeot

2507.06367 2026-04-07 cs.LG math.AG

The Riemannian Geometry Associated to Gradient Flows of Linear Convolutional Networks

El Mehdi Achour, Kathlén Kohn, Holger Rauhut

2507.04701 2026-04-07 cs.CL

XiYan-SQL: A Novel Multi-Generator Framework For Text-to-SQL

Yifu Liu, Yin Zhu, Yingqi Gao, Zhiling Luo, Xiaoxia Li, Xiaorong Shi, Yuntao Hong, Jinyang Gao, Yu Li, Bolin Ding, Jingren Zhou

Comments Published in IEEE TKDE

2507.02212 2026-04-07 cs.CV cs.CL cs.LG

SciGA: A Comprehensive Dataset for Designing Graphical Abstracts in Academic Papers

Takuro Kawada, Shunsuke Kitada, Sota Nemoto, Hitoshi Iyatomi

Comments 28 pages, 21 figures, 9 tables. Accepted to CVPR Findings 2026. Project page: https://iyatomilab.github.io/SciGA/

2506.19591 2026-04-07 cs.CV cs.AI cs.LG eess.IV

Vision Transformer-Based Time-Series Image Reconstruction for Cloud-Filling Applications

Lujun Li, Yiqun Wang, Radu State

Comments This paper has been accepted as a conference paper at the 2025 IEEE International Geoscience and Remote Sensing Symposium (IGARSS)

2506.13130 2026-04-07 cs.CV cs.AI cs.CL

ZINA: Multimodal Fine-grained Hallucination Detection and Editing

Yuiga Wada, Kazuki Matsuda, Komei Sugiura, Graham Neubig

Comments CVPR 2026 Main Conference

2506.07002 2026-04-07 cs.CV

BePo: Dual Representation for 3D Occupancy Prediction

Yunxiao Shi, Hong Cai, Jisoo Jeong, Yinhao Zhu, Shizhong Han, Amin Ansari, Fatih Porikli

Comments CVPR 2026 Workshop on Autonomous Driving

2506.04869 2026-04-07 cs.CV

Geological Field Restoration through the Lens of Image Inpainting

Vladislav Trifonov, Ivan Oseledets, Ekaterina Muravleva