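arXivDaily daily academic digest: synced with the full arXiv data, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering and systems science, and other areas.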

2505.17209 2026-04-10 cs.RO cs.AI

LiloDriver: A Lifelong Learning Framework for Closed-loop Motion Planning in Long-tail Autonomous Driving Scenarios

Huaiyuan Yao, Pengfei Li, Bu Jin, Yupeng Zheng, An Liu, Lisen Mu, Qing Su, Qian Zhang, Yilun Chen, Peng Li

Comments 7 pages, 3 figures

Details

Abstract

Recent advances in autonomous driving research point towards motion planners that are robust, safe, and adaptive. However, existing rule-based and data-driven planners lack adaptability to long-tail scenarios, while knowledge-driven methods offer strong reasoning but face challenges in representation, control, and real-world evaluation. To address these challenges, we present LiloDriver, a lifelong learning framework for closed-loop motion planning in long-tail autonomous driving scenarios. By integrating large language models (LLMs) with a memory-augmented planner generation system, LiloDriver continuously adapts to new scenarios without retraining. It features a four-stage architecture including perception, scene encoding, memory-based strategy refinement, and LLM-guided reasoning. Evaluated on the nuPlan benchmark, LiloDriver achieves superior performance in both common and rare driving scenarios, outperforming static rule-based and learning-based planners. Our results highlight the effectiveness of combining structured memory and LLM reasoning to enable scalable, human-like motion planning in real-world autonomous driving. Our code is available at https://github.com/Hyan-Yao/LiloDriver.

2505.15960 2026-04-10 cs.CL

Efficient PRM Training Data Synthesis via Formal Verification

Ryo Kamoi, Yusen Zhang, Nan Zhang, Sarkar Snigdha Sarathi Das, Ranran Haoran Zhang, Wenpeng Yin, Rui Zhang

Comments ACL 2026 Findings. Datasets, models, and code are provided at https://github.com/psunlpgroup/FoVer. Please also refer to our project website at https://fover-prm.github.io/

2505.13126 2026-04-10 cs.AI cs.CL

Iterative Formalization and Planning in Partially Observable Environments

Liancheng Gong, Wang Zhu, Jesse Thomason, Li Zhang

Comments In Findings of ACL 2026

2505.07315 2026-04-10 cs.AI cs.LG

FedIFL: A federated cross-domain diagnostic framework for motor-driven systems with inconsistent fault modes

Zexiao Wang, Yankai Wang, Xiaoqiang Liao, Xinguo Ming, Weiming Shen

Comments The paper is being withdrawn as we found that it did not fully articulate the representation of deep implicit features, which is the core focus of our work. Additionally, the experiments were incomplete and lacked sufficient analysis. We plan to revise the paper, clarify these aspects, and enhance the experimental validation before resubmitting

Details

Abstract

Due to the scarcity of industrial data, individual equipment users, particularly start-ups, struggle to independently train a comprehensive fault diagnosis model; federated learning enables collaborative training while ensuring data privacy, making it an ideal solution. However, the diversity of working conditions leads to variations in fault modes, resulting in inconsistent label spaces across different clients. In federated diagnostic scenarios, label space inconsistency leads local models to focus on client-specific fault modes and causes local models from different clients to map different failure modes to similar feature representations, which weakens the aggregated global model's generalization. To tackle this issue, this article proposes a federated cross-domain diagnostic framework termed Federated Invariant Features Learning (FedIFL). In intra-client training, prototype contrastive learning mitigates intra-client domain shifts; subsequently, feature generation ensures that local models can access the distributions of other clients in a privacy-friendly manner. In cross-client training, a feature disentanglement mechanism is introduced to mitigate cross-client domain shifts. Specifically, an instance-level federated instance consistency loss is designed to ensure the instance-level consistency of invariant features between different clients. Furthermore, a federated instance personalization loss and an orthogonal loss are constructed to distinguish specific features from the invariant features. Eventually, the aggregated model achieves promising generalization among global label spaces, enabling accurate fault diagnosis for target clients' Motor Driven Systems (MDSs) with inconsistent label spaces. Experiments on real-world MDSs validate the effectiveness and superiority of FedIFL in federated cross-domain diagnosis with inconsistent fault modes.

2505.05020 2026-04-10 cs.LG

Approximately Equivariant Recurrent Generative Models for Quasi-Periodic Time Series with a Progressive Training Scheme

Ruwen Fulek, Markus Lange-Hegermann

2505.00017 2026-04-10 cs.CL cs.AI cs.DB cs.LG

ReCellTy: Domain-Specific Knowledge Graph Retrieval-Augmented LLMs Reasoning Workflow for Single-Cell Annotation

Dezheng Han, Yibin Jia, Ruxiao Chen, Wenjie Han, Shuaishuai Guo, Jianbo Wang

2504.17069 2026-04-10 cs.CV cs.AI

Distilling Specialized Orders for Visual Generation

Rishav Pramanik, Amin Sghaier, Masih Aminbeidokhti, Juan A. Rodriguez, Antoine Poupon, David Vazquez, Christopher Pal, Zhaozheng Yin, Marco Pedersoli

2504.13102 2026-04-10 cs.SD cs.AI eess.AS

A Multi-task Learning Balanced Attention Convolutional Neural Network Model for Few-shot Underwater Acoustic Target Recognition

Wei Huang, Shumeng Sun, Junpeng Lu, Zhenpeng Xu, Zhengyang Xiu, Hao Zhang

2504.13015 2026-04-10 cs.CV

Hierarchical Feature Learning for Medical Point Clouds via State Space Model

Guoqing Zhang, Jingyun Yang, Yang Li

Comments 10 pages, 3 figures

2503.23078 2026-04-10 cs.CL

EventWeave: A Dynamic Framework for Capturing Core and Supporting Events in Dialogue Systems

Zhengyi Zhao, Shubo Zhang, Yiming Du, Bin Liang, Baojun Wang, Zhongyang Li, Binyang Li, Kam-Fai Wong

Comments Accepted by ACL'26

2503.10183 2026-04-10 cs.CV cs.AI

Through the Magnifying Glass: Adaptive Perception Magnification for Hallucination-Free VLM Decoding

Shunqi Mao, Chaoyi Zhang, Weidong Cai

Comments ACL 2026 Main Conference

2503.02537 2026-04-10 cs.CV cs.AI

RectifiedHR: Enable Efficient High-Resolution Synthesis via Energy Rectification

Zhen Yang, Guibao Shen, Minyang Li, Liang Hou, Mushui Liu, Luozhou Wang, Xin Tao, Ying-Cong Chen

Comments Project Page: https://zhenyangcs.github.io/RectifiedHR-Diffusion/

2503.01870 2026-04-10 cs.CL cs.AI cs.HC econ.GN q-fin.EC

Transforming the Voice of the Customer: Large Language Models for Identifying Customer Needs

Artem Timoshenko, Chengfeng Mao, John R. Hauser

2502.19559 2026-04-10 cs.CL

Stay Focused: Problem Drift in Multi-Agent Debate

Jonas Becker, Lars Benedikt Kaesberg, Andreas Stephan, Jan Philip Wahle, Terry Ruas, Bela Gipp

Comments accepted at EACL 2026

2502.15512 2026-04-10 cs.LG

SALSA-RL: Stability Analysis in the Latent Space of Actions for Reinforcement Learning

Xuyang Li, Romit Maulik

2412.20718 2026-04-10 cs.CV cs.AI

MM-MoralBench: A MultiModal Moral Evaluation Benchmark for Large Vision-Language Models

Bei Yan, Jie Zhang, Zhiyuan Chen, Shiguang Shan, Xilin Chen

Comments Accepted by Pattern Recognition

2412.15922 2026-04-10 cs.LG cs.SD eess.AS

RiTTA: Modeling Event Relations in Text-to-Audio Generation

Yuhang He, Yash Jain, Xubo Liu, Andrew Markham, Vibhav Vineet

Comments EMNLP25, Project Site: https://yuhanghe01.github.io/RiTTA-Proj/. Code: https://github.com/yuhanghe01/RiTTA

2412.10437 2026-04-10 cs.CV cs.GR cs.LG

SVGFusion: A VAE-Diffusion Transformer for Vector Graphic Generation

Ximing Xing, Juncheng Hu, Ziteng Xue, Jing Zhang, Buyu Li, Sheng Wang, Dong Xu, Qian Yu

Comments project page: https://ximinng.github.io/SVGFusionProject/

2412.08637 2026-04-10 cs.CV cs.AI cs.LG

DMin: Scalable Training Data Influence Estimation for Diffusion Models

Huawei Lin, Yingjie Lao, Weijie Zhao

Comments Accepted to CVPR 2026 (Findings)

2412.03884 2026-04-10 cs.AI

A Unified Framework for Evaluating and Enhancing the Transparency of Explainable AI Methods via Perturbation-Gradient Consensus Attribution

Md. Ariful Islam, Md Abrar Jahin, M. F. Mridha, Nilanjan Dey

2411.07799 2026-04-10 cs.CV cs.RO

Horticultural Temporal Fruit Monitoring via 3D Instance Segmentation and Re-Identification using Colored Point Clouds

Daniel Fusaro, Federico Magistri, Jens Behley, Alberto Pretto, Cyrill Stachniss

Details

DOI: 10.1016/j.compag.2026.111723
Journal ref: Computers and Electronics in Agriculture, Volume 247, Pages 111723, 2026

Abstract

Accurate and consistent fruit monitoring over time is a key step toward automated agricultural production systems. However, this task is inherently difficult due to variations in fruit size, shape, occlusion, orientation, and the dynamic nature of orchards where fruits may appear or disappear between observations. In this article, we propose a novel method for fruit instance segmentation and re-identification on 3D terrestrial point clouds collected over time. Our approach directly operates on dense colored point clouds, capturing fine-grained 3D spatial detail. We segment individual fruits using a learning-based instance segmentation method applied directly to the point cloud. For each segmented fruit, we extract a compact and discriminative descriptor using a 3D sparse convolutional neural network. To track fruits across different times, we introduce an attention-based matching network that associates fruits with their counterparts from previous sessions. Matching is performed using a probabilistic assignment scheme, selecting the most likely associations across time. We evaluate our approach on real-world datasets of strawberries and apples, demonstrating that it outperforms existing methods in both instance segmentation and temporal re-identification, enabling robust and precise fruit monitoring across complex and dynamic orchard environments. Keywords: Agricultural Robotics, 3D Fruit Tracking, Instance Segmentation, Deep Learning, Point Clouds, Sparse Convolutional Networks, Temporal Monitoring

2411.02622 2026-04-10 cs.LG cs.AI

AdaProb: Efficient Machine Unlearning via Adaptive Probability

Zihao Zhao, Yuchen Yang, Anjalie Field, Yinzhi Cao

2410.22258 2026-04-10 cs.LG cs.SY eess.IV eess.SY stat.ML

LipKernel: Lipschitz-Bounded Convolutional Neural Networks via Dissipative Layers

Patricia Pauli, Ruigang Wang, Ian Manchester, Frank Allgöwer

Details

DOI: 10.1016/j.automatica.2026.112959
Journal ref: Automatica 188 (2026): 112959

Abstract

We propose a novel layer-wise parameterization for convolutional neural networks (CNNs) that includes built-in robustness guarantees by enforcing a prescribed Lipschitz bound. Each layer in our parameterization is designed to satisfy a linear matrix inequality (LMI), which in turn implies dissipativity with respect to a specific supply rate. Collectively, these layer-wise LMIs ensure Lipschitz boundedness for the input-output mapping of the neural network, yielding a more expressive parameterization than through spectral bounds or orthogonal layers. Our new method LipKernel directly parameterizes dissipative convolution kernels using a 2-D Roesser-type state space model. This means that the convolutional layers are given in standard form after training and can be evaluated without computational overhead. In numerical experiments, we show that the run-time using our method is orders of magnitude faster than state-of-the-art Lipschitz-bounded networks that parameterize convolutions in the Fourier domain, making our approach particularly attractive for improving the robustness of learning-based real-time perception or control in robotics, autonomous vehicles, or automation systems. We focus on CNNs, and in contrast to previous works, our approach accommodates a wide variety of layers typically used in CNNs, including 1-D and 2-D convolutional layers, maximum and average pooling layers, as well as strided and dilated convolutions and zero padding. However, our approach naturally extends beyond CNNs as we can incorporate any layer that is incrementally dissipative.
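
For context on the spectral bounds that the abstract above contrasts against, here is a minimal sketch of that baseline idea, not of the LipKernel parameterization itself. It assumes a toy two-layer ReLU network with random weights; the names `W1`, `W2`, `net`, and `empirical_ratio` are hypothetical. Because ReLU is 1-Lipschitz, the product of the layers' spectral norms upper-bounds the network's Lipschitz constant.

```python
import numpy as np

rng = np.random.default_rng(0)
W1 = rng.standard_normal((16, 8))  # hypothetical first-layer weights
W2 = rng.standard_normal((4, 16))  # hypothetical second-layer weights


def net(x):
    """Toy two-layer ReLU network (illustrative only)."""
    return W2 @ np.maximum(W1 @ x, 0.0)


# Product of spectral norms (largest singular values): an upper bound on
# the Lipschitz constant, since ReLU itself is 1-Lipschitz.
bound = np.linalg.norm(W1, 2) * np.linalg.norm(W2, 2)


def empirical_ratio(n_pairs=1000):
    """Largest observed ||net(a) - net(b)|| / ||a - b|| over random pairs."""
    worst = 0.0
    for _ in range(n_pairs):
        a = rng.standard_normal(8)
        b = rng.standard_normal(8)
        ratio = np.linalg.norm(net(a) - net(b)) / np.linalg.norm(a - b)
        worst = max(worst, ratio)
    return worst


print("spectral bound:", bound)
print("largest observed ratio:", empirical_ratio())
```

The observed ratio never exceeds the bound, but the bound is typically loose, which is one reason less conservative layer-wise conditions are of interest.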

2408.05086 2026-04-10 cs.CL cs.AI

A systematic framework for generating novel experimental hypotheses from language models

Kanishka Misra, Najoung Kim

Comments Revised version

2407.19426 2026-04-10 cs.LG cs.AI stat.ML

Causal Discovery in Linear Models with Unobserved Variables and Measurement Error

Yuqin Yang, Mohamed Nafea, Negar Kiyavash, Kun Zhang, AmirEmad Ghassami

2407.01563 2026-04-10 cs.RO cs.AI cs.LG

NaviSlim: Adaptive Context-Aware Navigation and Sensing via Dynamic Slimmable Networks

Tim Johnsen, Marco Levorato

Comments 13 pages, 12 figures

Details

DOI: 10.1109/IoTDI61053.2024.00014
Journal ref: 2024 IEEE/ACM Ninth International Conference on Internet-of-Things Design and Implementation (IoTDI)

Abstract

Small-scale autonomous airborne vehicles, such as micro-drones, are expected to be a central component of a broad spectrum of applications ranging from exploration to surveillance and delivery. This class of vehicles is characterized by severe constraints in computing power and energy reservoir, which impair their ability to support the complex state-of-the-art neural models needed for autonomous operations. The main contribution of this paper is a new class of neural navigation models, NaviSlim, capable of adapting the amount of resources spent on computing and sensing in response to the current context (i.e., difficulty of the environment, current trajectory, and navigation goals). Specifically, NaviSlim is designed as a gated slimmable neural network architecture that, unlike existing slimmable networks, can dynamically select a slimming factor to autonomously scale model complexity, which consequently optimizes execution time and energy consumption. Moreover, unlike existing sensor fusion approaches, NaviSlim can dynamically select power levels of onboard sensors to autonomously reduce power and time spent during sensor acquisition, without the need to switch between different neural networks. By means of extensive training and testing on the robust simulation environment Microsoft AirSim, we evaluate our NaviSlim models on scenarios with varying difficulty and on a test set, which showed a dynamic reduction in model complexity of 57-92% on average and sensor utilization between 61% and 80%, compared with static neural networks designed to match the computing and sensing required by the most difficult scenario.
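
To make the notion of a slimming factor concrete, the following is a minimal sketch of the general slimmable-network idea, assuming a toy two-layer network in which only the leading fraction of hidden units is evaluated. It is not the NaviSlim architecture: the weights, the helper names `slim_forward` and `multiply_adds`, and the hand-picked factor `rho` are all hypothetical, whereas the abstract describes a learned gate that selects the factor from context.

```python
import numpy as np

rng = np.random.default_rng(0)
W1 = rng.standard_normal((32, 8))  # hypothetical hidden-layer weights (hidden x input)
b1 = np.zeros(32)                  # hypothetical hidden-layer bias
W2 = rng.standard_normal((2, 32))  # hypothetical output weights (output x hidden)


def slim_forward(x, rho):
    """Evaluate the network using only the leading fraction rho of hidden units."""
    k = max(1, int(round(rho * W1.shape[0])))
    hidden = np.maximum(W1[:k] @ x + b1[:k], 0.0)
    return W2[:, :k] @ hidden, k


def multiply_adds(k):
    """Multiply-accumulate count of the two matrix products with k hidden units."""
    return k * W1.shape[1] + W2.shape[0] * k


x = np.ones(8)
for rho in (0.25, 0.5, 1.0):
    y, k = slim_forward(x, rho)
    print(rho, k, multiply_adds(k), y)
```

In this toy setting the compute cost scales linearly with the number of active hidden units, so a smaller factor trades accuracy for execution time and energy.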

2406.13086 2026-04-10 cs.RO cs.AI

NaviSplit: Dynamic Multi-Branch Split DNNs for Efficient Distributed Autonomous Navigation

Timothy K Johnsen, Ian Harshbarger, Zixia Xia, Marco Levorato

Comments 6 pages, 3 figures

Details

DOI: 10.1109/WoWMoM60985.2024.00041
Journal ref: 2024 IEEE 25th International Symposium on a World of Wireless, Mobile and Multimedia Networks (WoWMoM)

Abstract

Lightweight autonomous unmanned aerial vehicles (UAVs) are emerging as a central component of a broad range of applications. However, autonomous navigation necessitates the implementation of perception algorithms, often deep neural networks (DNNs), that process sensor observations, such as those from cameras and LiDARs, for control logic. The complexity of such algorithms clashes with the severe constraints of these devices in terms of computing power, energy, memory, and execution time. In this paper, we propose NaviSplit, the first instance of a lightweight navigation framework embedding a distributed and dynamic multi-branched neural model. At its core is a DNN split at a compression point, resulting in two model parts: (1) the head model, executed at the vehicle, which partially processes and compacts perception from sensors; and (2) the tail model, executed at an interconnected compute-capable device, which processes the remainder of the compacted perception and infers navigation commands. Different from prior work, the NaviSplit framework includes a neural gate that dynamically selects a specific head model to minimize channel usage while efficiently supporting the navigation network. In our implementation, the perception model extracts a 2D depth map from a monocular RGB image captured by the drone using the robust simulator Microsoft AirSim. Our results demonstrate that the NaviSplit depth model achieves an extraction accuracy of 72-81% while transmitting an extremely small amount of data (1.2-18 KB) to the edge server. When using the neural gate, as utilized by NaviSplit, we obtain a slightly higher navigation accuracy (by 0.3%) than a larger static network while significantly reducing the data rate by 95%. To the best of our knowledge, this is the first exemplar of a dynamic multi-branched model based on split DNNs for autonomous navigation.

2406.12009 2026-04-10 cs.CL

FinTruthQA: A Benchmark for AI-Driven Financial Disclosure Quality Assessment in Investor-Firm Interactions

Peilin Zhou, Ziyue Xu, Xinyu Shi, Jiageng Wu, Yikang Jiang, Dading Chong, Wang Dong, Jun Chen, Bin Ke, Jie Yang

2406.10521 2026-04-10 cs.LG cs.AI

MALLM-GAN: Multi-Agent Large Language Model as Generative Adversarial Network for Synthesizing Tabular Data

Yaobin Ling, Xiaoqian Jiang, Yejin Kim

2406.01857 2026-04-10 cs.LG cs.NA math.NA

Neural Green's Operators for Parametric Partial Differential Equations

Hugo Melchers, Joost Prins, Michael Abdelmalik

Details

DOI: 10.1016/j.cma.2026.118893
Journal ref: Computer Methods in Applied Mechanics and Engineering 455 (2026) 118893

Abstract

This work introduces a paradigm for constructing parametric neural operators that are derived from finite-dimensional representations of Green's operators for linear partial differential equations (PDEs). We refer to such neural operators as Neural Green's Operators (NGOs). Our construction of NGOs preserves the linear action of Green's operators on the inhomogeneity fields, while approximating the nonlinear dependence of the Green's function on the coefficients of the PDE using neural networks. This construction reduces the complexity of the problem from learning the entire solution operator and its dependence on all parameters to only learning the Green's function and its dependence on the PDE coefficients. Furthermore, we show that our explicit representation of Green's functions enables the embedding of desirable mathematical attributes in our NGO architectures, such as symmetry, spectral, and conservation properties. Through numerical benchmarks on canonical PDEs, we demonstrate that NGOs achieve comparable or superior accuracy to Deep Operator Networks, Variationally Mimetic Operator Networks, and Fourier Neural Operators with similar parameter counts, while generalizing significantly better when tested on out-of-distribution data. For parametric time-dependent PDEs, we show that NGOs that are trained on a single time step can produce pointwise-accurate dynamics in an auto-regressive manner over arbitrarily large numbers of time steps. For parametric nonlinear PDEs, we demonstrate that NGOs trained exclusively on solutions of corresponding linear problems can be embedded within iterative solvers to yield accurate solutions, provided a suitable initial guess is available. Finally, we show that we can leverage the explicit representation of Green's functions returned by NGOs to construct effective matrix preconditioners that accelerate iterative solvers for PDEs.
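
The linear action of a Green's operator on the inhomogeneity field, which the abstract above says NGOs preserve, can be illustrated with a textbook case. The sketch below is not the NGO architecture and involves no neural network: it assumes the 1D Poisson problem -u'' = f on [0, 1] with u(0) = u(1) = 0, whose Green's function is known in closed form, and applies it by trapezoidal quadrature. The helper names `greens_matrix` and `apply_green` are hypothetical.

```python
import numpy as np


def greens_matrix(x):
    """Sample G(x, y) for -u'' = f on [0, 1] with u(0) = u(1) = 0 on the grid x."""
    X, Y = np.meshgrid(x, x, indexing="ij")
    return np.where(X <= Y, X * (1.0 - Y), Y * (1.0 - X))


def apply_green(G, f, x):
    """Approximate u(x_i) = integral of G(x_i, y) f(y) dy with trapezoidal weights."""
    h = x[1] - x[0]
    weights = np.full_like(x, h)
    weights[0] = weights[-1] = h / 2.0
    return G @ (weights * f)


x = np.linspace(0.0, 1.0, 101)
G = greens_matrix(x)

# For f = 1 the exact solution is u(x) = x (1 - x) / 2.
u = apply_green(G, np.ones_like(x), x)
print(np.max(np.abs(u - x * (1.0 - x) / 2.0)))
```

Two properties mentioned in the abstract are visible even in this toy case: the map from f to u is linear (a matrix-vector product once discretized), and the sampled Green's function is symmetric.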
