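arXivDaily: a daily academic digest synced with the full arXiv dataset, with AI-generated summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering and systems science, and other fields.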

2510.06687 2026-04-16 cs.CV cs.AI

Geometry-Aware Cross Modal Alignment for Light Field-LiDAR Semantic Segmentation

Jie Luo, Yuxuan Jiang, Xin Jin, Mingyu Liu, Yihui Fan

Details

English abstract

Semantic segmentation serves as a cornerstone of scene understanding in autonomous driving but continues to face significant challenges under complex conditions such as occlusion. Light field and LiDAR modalities provide complementary visual and spatial cues that are beneficial for robust perception; however, their effective integration is hindered by limited viewpoint diversity and inherent modality discrepancies. To address these challenges, the first multimodal semantic segmentation dataset integrating light field data and point cloud data is proposed. Based on this dataset, we propose a multi-modal light field point-cloud fusion segmentation network (Mlpfseg), incorporating feature completion and depth perception to segment both camera images and LiDAR point clouds simultaneously. The feature completion module addresses the density mismatch between point clouds and image pixels by performing differential reconstruction of point-cloud feature maps, enhancing the fusion of these modalities. The depth perception module improves the segmentation of occluded objects by reinforcing attention scores for better occlusion awareness. Our method outperforms image-only segmentation by 1.71 mean Intersection over Union (mIoU) and point cloud-only segmentation by 2.38 mIoU, demonstrating its effectiveness.
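
For readers unfamiliar with the metric quoted in this abstract, mean Intersection over Union (mIoU) is conventionally computed per class from a confusion matrix and then averaged over classes. The sketch below is a generic NumPy illustration, not the authors' evaluation code; the function name `mean_iou` and the toy labels are assumptions made for this example.

```python
import numpy as np

def mean_iou(y_true, y_pred, num_classes):
    """Hypothetical helper: mean IoU over classes present in labels or predictions."""
    y_true = np.asarray(y_true).ravel()
    y_pred = np.asarray(y_pred).ravel()
    # Confusion matrix: rows are ground-truth classes, columns are predicted classes.
    conf = np.bincount(y_true * num_classes + y_pred, minlength=num_classes ** 2)
    conf = conf.reshape(num_classes, num_classes)
    intersection = np.diag(conf)
    union = conf.sum(axis=0) + conf.sum(axis=1) - intersection
    present = union > 0  # skip classes that never occur, to avoid dividing by zero
    return float((intersection[present] / union[present]).mean())

# Toy labels (assumed): six points, three classes.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
score = mean_iou(y_true, y_pred, num_classes=3)
```

The same function applies to per-pixel image labels and per-point LiDAR labels alike, since both reduce to flat arrays of class indices.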

2510.05056 2026-04-16 cs.LG

Modeling Student Learning with 3.8 Million Program Traces

Alexis Ross, Megha Srivastava, Jeremiah Blanchard, Jacob Andreas

Comments Accepted to 27th International Conference on AI in Education (AIED 2026)

2510.04995 2026-04-16 cs.LG cs.NA math.NA

Power Transform Revisited: Numerically Stable, and Federated

Xuefeng Xu, Graham Cormode

Comments AISTATS 2026. 24 pages, 17 figures, 4 tables. Project page: https://xuefeng-xu.github.io/powertf.html

2510.03988 2026-04-16 cs.LG cs.AI

The Signal is in the Steps: Local Scoring for Reasoning Data Selection

Hoang Anh Just, Myeongseob Ko, Ruoxi Jia

Comments Preprint

2510.01608 2026-04-16 cs.CV eess.SP math.OC

NPN: Non-Linear Projections of the Null-Space for Imaging Inverse Problems

Roman Jacome, Romario Gualdrón-Hurtado, Leon Suarez, Henry Arguello

Comments 25 pages, 12 tables, 10 figures. Accepted to NeurIPS 2025

Details

Journal ref: Proceedings of the Thirty-ninth Annual Conference on Neural Information Processing Systems (NeurIPS 2025)

English abstract

Imaging inverse problems aim to recover high-dimensional signals from undersampled, noisy measurements, a fundamentally ill-posed task with infinite solutions in the null-space of the sensing operator. To resolve this ambiguity, prior information is typically incorporated through handcrafted regularizers or learned models that constrain the solution space. However, these priors typically ignore the task-specific structure of that null-space. In this work, we propose Non-Linear Projections of the Null-Space (NPN), a novel class of regularization that, instead of enforcing structural constraints in the image domain, promotes solutions that lie in a low-dimensional projection of the sensing matrix's null-space with a neural network. Our approach has two key advantages: (1) Interpretability: by focusing on the structure of the null-space, we design sensing-matrix-specific priors that capture information orthogonal to the signal components that are fundamentally blind to the sensing process. (2) Flexibility: NPN is adaptable to various inverse problems, compatible with existing reconstruction frameworks, and complementary to conventional image-domain priors. We provide theoretical guarantees on convergence and reconstruction accuracy when used within plug-and-play methods. Empirical results across diverse sensing matrices demonstrate that NPN priors consistently enhance reconstruction fidelity in various imaging inverse problems, such as compressive sensing, deblurring, super-resolution, computed tomography, and magnetic resonance imaging, with plug-and-play methods, unrolling networks, deep image prior, and diffusion models.
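
The null-space ambiguity described in this abstract can be illustrated with a small linear example: for an underdetermined sensing matrix, any signal splits into a component fixed by the measurements and a component the measurements cannot see. The sketch below uses NumPy with a random matrix `A` and sizes chosen purely for illustration; it shows only the classical linear decomposition, not the learned non-linear projection that NPN proposes.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 3, 8                       # fewer measurements than unknowns (assumed sizes)
A = rng.standard_normal((m, n))   # illustrative sensing matrix, not from the paper
x = rng.standard_normal(n)        # ground-truth signal
y = A @ x                         # noiseless measurements

A_pinv = np.linalg.pinv(A)
P_null = np.eye(n) - A_pinv @ A   # orthogonal projector onto the null-space of A

x_range = A_pinv @ y              # minimum-norm solution, recoverable from y
x_null = P_null @ x               # component invisible to the sensing operator

# x_null produces no measurement, and the two parts add back up to x.
residual_null = np.linalg.norm(A @ x_null)
residual_sum = np.linalg.norm(x_range + x_null - x)
```

Any vector of the form `x_range + P_null @ z` matches the measurements equally well, which is why some prior on the null-space component is needed to pick a single reconstruction.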

2510.00573 2026-04-16 cs.RO

GRITS: A Spillage-Aware Guided Diffusion Policy for Robot Food Scooping Tasks

Yen-Ling Tai, Yi-Ru Yang, Kuan-Ting Yu, Yu-Wei Chao, Yi-Ting Chen

2509.25549 2026-04-16 cs.CV cs.AI cs.LG

Hybrid Approach for Enhancing Lesion Segmentation in Fundus Images

Mohammadmahdi Eshragh, Emad A. Mohammed, Behrouz Far, Ezekiel Weis, Carol L Shields, Sandor R Ferenczy, Trafford Crump

2509.22750 2026-04-16 cs.CL cs.AI

MARCH: Evaluating the Intersection of Ambiguity Interpretation and Multi-hop Inference

Jeonghyun Park, Ingeol Baek, Seunghyun Yoon, Haeun Jang, Aparna Garimella, Akriti Jain, Nedim Lipka, Hwanhee Lee

Comments ACL 2026 Findings

2509.21912 2026-04-16 cs.LG stat.ML

Discrete Guidance Matching: Exact Guidance for Discrete Flow Matching

Zhengyan Wan, Yidong Ouyang, Liyan Xie, Fang Fang, Hongyuan Zha, Guang Cheng

Comments Published as a conference paper at ICLR 2026

2509.21823 2026-04-16 cs.AI

ProRe: A Proactive Reward System for GUI Agents via Reasoner-Actor Collaboration

Gaole Dai, Shiqi Jiang, Ting Cao, Yuqing Yang, Yuanchun Li, Rui Tan, Mo Li, Lili Qiu

Comments 23 pages, 12 figures, ICLR'2026

2509.18847 2026-04-16 cs.CV cs.AI cs.CL

Failure Makes the Agent Stronger: Enhancing Accuracy through Structured Reflection for Reliable Tool Interactions

Junhao Su, Yuanliang Wan, Junwei Yang, Hengyu Shi, Tianyang Han, Junfeng Luo, Yurui Qiu

Comments ACL

2509.16445 2026-04-16 cs.RO

FiLM-Nav: Efficient and Generalizable Navigation via VLM Fine-tuning

Naoki Yokoyama, Sehoon Ha

2509.14566 2026-04-16 cs.CV

DICE: Diffusion Consensus Equilibrium for Sparse-view CT Reconstruction

Leon Suarez-Rodriguez, Roman Jacome, Romario Gualdron-Hurtado, Ana Mantilla-Dulcey, Henry Arguello

Comments 8 pages, 4 figures, conference

2509.07464 2026-04-16 cs.RO cs.SY eess.SY

Safe and Nonconservative Contingency Planning for Autonomous Vehicles via Online Learning-Based Reachable Set Barriers

Rui Yang, Lei Zheng, Shuzhi Sam Ge, Jun Ma

Comments 16 pages, 13 figures

Details

DOI: 10.1109/TCST.2026.3675339
Journal ref: IEEE Trans. Control Syst. Technol., 2026, pp.1-16

English abstract

Autonomous vehicles must navigate dynamically uncertain environments while balancing safety and efficiency. This challenge is exacerbated by unpredictable human-driven vehicle (HV) behaviors and perception inaccuracies, necessitating planners that adapt to evolving uncertainties while maintaining safe trajectories. Overly conservative planning degrades driving efficiency, while deterministic methods risk failure in unexpected scenarios. To address these issues, we propose a real-time contingency trajectory optimization framework. Our method employs event-triggered online learning of HV control-intent sets to dynamically quantify multimodal HV uncertainties and incrementally refine their forward reachable sets (FRSs). Crucially, we enforce invariant safety through FRS-based barrier constraints that ensure safety without reliance on accurate trajectory prediction. These constraints are seamlessly embedded in contingency trajectory optimization and solved efficiently through consensus alternating direction method of multipliers (ADMM). The system continuously adapts to HV behavioral uncertainties, preserving feasibility and safety without excessive conservatism. High-fidelity simulations on highway and urban scenarios, along with a series of real-world experiments, demonstrate significant improvements in driving efficiency and passenger comfort while maintaining safety under uncertainty. The project page is available at https://pathetiue.github.io/frscp.github.io/.

2509.06477 2026-04-16 cs.AI

MAS-Bench: A Unified Benchmark for Shortcut-Augmented Hybrid Mobile GUI Agents

Pengxiang Zhao, Guangyi Liu, YaoZhen Liang, Weiqing He, Zhengxi Lu, WenHao Wang, Yuehao Huang, Yuxiang Chai, Zhaolu Kang, Yaxuan Guo, Hao Wang, Kexin Zhang, Liang Liu, Yong Liu

2508.09532 2026-04-16 cs.LG cs.AI cs.NI

Decentralized Rank Scheduling for Energy-Constrained Multi-Task Federated Fine-Tuning in Edge-Assisted IoV Networks

Bokeng Zheng, Jianqiang Zhong, Jiayi Liu, Lei Xue, Xu Chen, Xiaoxi Zhang

2508.08791 2026-04-16 cs.CL cs.AI

Feedback-Driven Tool-Use Improvements in Large Language Models via Automated Build Environments

Junjie Ye, Changhao Jiang, Zhengyin Du, Yufei Xu, Xuesong Yao, Zhiheng Xi, Xiaoran Fan, Qi Zhang, Tao Gui, Xuanjing Huang, Jiecao Chen

Comments Accepted by ACL 2026

2508.00222 2026-04-16 cs.AI cs.CL cs.LG

RL-PLUS: Countering Capability Boundary Collapse of LLMs in Reinforcement Learning with Hybrid-policy Optimization

Yihong Dong, Xue Jiang, Yongding Tao, Huanyu Liu, Kechi Zhang, Lili Mou, Rongyu Cao, Yingwei Ma, Jue Chen, Binhua Li, Zhi Jin, Fei Huang, Yongbin Li, Ge Li

Comments Accepted to ACL 2026 (main)

2507.18558 2026-04-16 cs.CV eess.IV

Synthetic Data Augmentation for Enhanced Chicken Carcass Instance Segmentation

Yihong Feng, Chaitanya Pallerla, Xiaomin Lin, Pouya Sohrabipour, Philip Crandall, Wan Shou, Yu She, Dongyi Wang

Comments Submitted for journal reviewing

2506.20083 2026-04-16 cs.CL

Bridging Compositional and Distributional Semantics: A Survey on Latent Semantic Geometry via AutoEncoder

Yingji Zhang, Danilo S. Carvalho, André Freitas

Comments In progress

2506.09207 2026-04-16 cs.LG cs.NA math.NA

mLaSDI: Multi-stage latent space dynamics identification

William Anderson, Seung Whan Chung, Robert Stephany, Youngsoo Choi

2506.06558 2026-04-16 cs.LG cs.NE

Rapid training of Hamiltonian graph networks using random features

Atamert Rahma, Chinmay Datar, Ana Cukarska, Felix Dietrich

Comments Accepted to ICLR 2026

2506.03610 2026-04-16 cs.AI

Orak: A Foundational Benchmark for Training and Evaluating LLM Agents on Diverse Video Games

Dongmin Park, Minkyu Kim, Beongjun Choi, Junhyuck Kim, Keon Lee, Jonghyun Lee, Inkyu Park, Byeong-Uk Lee, Jaeyoung Hwang, Jaewoo Ahn, Ameya S. Mahabaleshwarkar, Bilal Kartal, Pritam Biswas, Yoshi Suhara, Kangwook Lee, Jaewoong Cho

2505.19054 2026-04-16 cs.LG

RANDPOL: Parameter-Efficient End-to-End Quadruped Locomotion via Randomized Policy Learning

Zhuochen Liu, Rahul Jain, Quan Nguyen

Comments 6 pages main, 7 pages total, 10 figures

2505.10101 2026-04-16 cs.SD cs.AI cs.GR cs.MM eess.AS

LAV: Audio-Driven Dynamic Visual Generation with Neural Compression and StyleGAN2

Jongmin Jung, Dasaem Jeong

Comments Paper accepted at ISEA 2025, The 30th International Symposium on Electronic/Emerging Art, Seoul, Republic of Korea, 23 - 29 May 2025

2505.07591 2026-04-16 cs.CL cs.AI

MulDimIF: A Multi-Dimensional Constraint Framework for Evaluating and Improving Instruction Following in Large Language Models

Junjie Ye, Caishuang Huang, Zhuohan Chen, Wenjie Fu, Chenyuan Yang, Leyi Yang, Yilong Wu, Peng Wang, Meng Zhou, Xiaolong Yang, Tao Gui, Qi Zhang, Zhongchao Shi, Jianping Fan, Xuanjing Huang

Comments Accepted by ACL 2026

2505.03280 2026-04-16 cs.LG

MDPs with a State Sensing Cost

Vansh Kapoor, Jayakrishnan Nair

Comments Accepted at AISTATS 2026

2505.00598 2026-04-16 cs.LG cs.AI

Fast and Low-Cost Genomic Foundation Models via Outlier Removal

Haozheng Luo, Chenghao Qiu, Maojiang Su, Zhihan Zhou, Zoe Mehta, Guo Ye, Jerry Yao-Chieh Hu, Han Liu

Comments International Conference on Machine Learning (ICML) 2025

2504.15801 2026-04-16 cs.CL cs.AI cs.CY

A closer look at how large language models trust humans: patterns and biases

Valeria Lerman, Yaniv Dover

Details

DOI: 10.1098/rspa.2025.1113
Journal ref: Proceedings of the Royal Society A 482 2335 20251113 (2026)

English abstract

As large language models (LLMs) and LLM-based agents increasingly interact with humans in decision-making contexts, understanding the trust dynamics between humans and AI agents becomes a central concern. While considerable literature studies how humans trust AI agents, it is much less understood how LLM-based agents develop effective trust in humans. LLM-based agents likely rely on some sort of implicit effective trust in trust-related contexts (e.g., evaluating individual loan applications) to assist and affect decision making. Using established behavioral theories, we develop an approach that studies whether LLM trust depends on the three major trustworthiness dimensions: competence, benevolence, and integrity of the human subject. We also study how demographic variables affect effective trust. Across 43,200 simulated experiments, for five popular language models and five different scenarios, we find that LLM trust development shows an overall similarity to human trust development. We find that in most, but not all, cases LLM trust is strongly predicted by trustworthiness, and in some cases it is also biased by age, religion, and gender, especially in financial scenarios. This is particularly true for scenarios common in the literature and for newer models. While the overall patterns align with human-like mechanisms of effective trust formation, different models exhibit variation in how they estimate trust; in some cases, trustworthiness and demographic factors are weak predictors of effective trust. These findings call for a better understanding of AI-to-human trust dynamics and monitoring of biases and trust development patterns to prevent unintended and potentially harmful outcomes in trust-sensitive applications of AI.

2503.23137 2026-04-16 cs.CV cs.CL

When 'YES' Meets 'BUT': Can Large Models Comprehend Contradictory Humor Through Comparative Reasoning?

Tuo Liang, Zhe Hu, Jing Li, Hao Zhang, Yiren Lu, Yunlai Zhou, Yiran Qiao, Disheng Liu, Jeirui Peng, Jing Ma, Yu Yin