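arXivDaily: a daily academic digest synced with the full arXiv dataset, with AI-generated summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering and systems science, and more.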

2603.05623 2026-03-09 cs.CV cs.AI

Post Fusion Bird's Eye View Feature Stabilization for Robust Multimodal 3D Detection

Trung Tien Dong, Dev Thakkar, Arman Sargolzaei, Xiaomin Lin

Abstract

Camera-LiDAR fusion is widely used in autonomous driving to enable accurate 3D object detection. However, bird's-eye view (BEV) fusion detectors can degrade significantly under domain shift and sensor failures, limiting reliability in real-world deployment. Existing robustness approaches often require modifying the fusion architecture or retraining specialized models, making them difficult to integrate into already deployed systems. We propose a Post Fusion Stabilizer (PFS), a lightweight module that operates on intermediate BEV representations of existing detectors and produces a refined feature map for the original detection head. The design stabilizes feature statistics under domain shift, suppresses spatial regions affected by sensor degradation, and adaptively restores weakened cues through residual correction. Designed as a near-identity transformation, PFS preserves performance while improving robustness under diverse camera and LiDAR corruptions. Evaluations on the nuScenes benchmark demonstrate that PFS achieves state-of-the-art results in several failure modes, notably improving camera dropout robustness by +1.2% and low-light performance by +4.4% mAP while maintaining a lightweight footprint of only 3.3 M parameters.
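
The "near-identity transformation" with "residual correction" mentioned in the abstract can be illustrated with a generic gated residual update. This is only a minimal sketch under assumptions: the function `refine`, the scalar `gate`, and the toy 2x2 feature map are hypothetical and are not taken from the paper, whose actual PFS architecture is not described here.

```python
# Illustrative gated residual refinement on a tiny BEV-like feature map.
# All names (refine, gate, delta) are hypothetical, not from the paper.

def refine(features, correction, gate=0.0):
    """Return features + gate * correction, element-wise.

    With gate == 0.0 the output equals the input, so a frozen
    detection head would see unchanged features.
    """
    return [
        [f + gate * c for f, c in zip(f_row, c_row)]
        for f_row, c_row in zip(features, correction)
    ]

bev = [[1.0, 2.0], [3.0, 4.0]]      # toy feature map
delta = [[0.5, -0.5], [1.0, 0.0]]   # toy correction term

identity_out = refine(bev, delta, gate=0.0)  # same values as bev
nudged_out = refine(bev, delta, gate=0.1)    # small residual correction
```

Starting such a gate at zero is one common way to keep a newly attached module from disturbing a pretrained model before it has been trained; whether PFS does this is an assumption.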

2603.05618 2026-03-09 cs.CL

Safer Reasoning Traces: Measuring and Mitigating Chain-of-Thought Leakage in LLMs

Patrick Ahrend, Tobias Eder, Xiyang Yang, Zhiyi Pan, Georg Groh

2603.05617 2026-03-09 cs.CL

NOTAI.AI: Explainable Detection of Machine-Generated Text via Curvature and Feature Attribution

Oleksandr Marchenko Breneur, Adelaide Danilov, Aria Nourbakhsh, Salima Lamsiyah

Comments 8 pages, 7 figures

2603.05614 2026-03-09 cs.AI

Real-Time AI Service Economy: A Framework for Agentic Computing Across the Continuum

Lauri Lovén, Alaa Saleh, Reza Farahani, Ilir Murturi, Miguel Bordallo López, Praveen Kumar Donta, Schahram Dustdar

Abstract

Real-time AI services increasingly operate across the device-edge-cloud continuum, where autonomous AI agents generate latency-sensitive workloads, orchestrate multi-stage processing pipelines, and compete for shared resources under policy and governance constraints. This article shows that the structure of service-dependency graphs, modelled as DAGs whose nodes represent compute stages and whose edges encode execution ordering, is a primary determinant of whether decentralised, price-based resource allocation can work reliably at scale. When dependency graphs are hierarchical (tree or series-parallel), prices converge to stable equilibria, optimal allocations can be computed efficiently, and under appropriate mechanism design (with quasilinear utilities and discrete slice items), agents have no incentive to misreport their valuations within each decision epoch. When dependencies are more complex, with cross-cutting ties between pipeline stages, prices oscillate, allocation quality degrades, and the system becomes difficult to manage. To bridge this gap, we propose a hybrid management architecture in which cross-domain integrators encapsulate complex sub-graphs into resource slices that present a simpler, well-structured interface to the rest of the market. A systematic ablation study across six experiments (1,620 runs, 10 seeds each) confirms that (i) dependency-graph topology is a first-order determinant of price stability and scalability, (ii) the hybrid architecture reduces price volatility by up to 70-75% without sacrificing throughput, (iii) governance constraints create quantifiable efficiency-compliance trade-offs that depend jointly on topology and load, and (iv) under truthful bidding the decentralised market matches a centralised value-optimal baseline, confirming that decentralised coordination can replicate centralised allocation quality.

2603.05607 2026-03-09 cs.CV cs.AI

DreamCAD: Scaling Multi-modal CAD Generation using Differentiable Parametric Surfaces

Mohammad Sadil Khan, Muhammad Usama, Rolandos Alexandros Potamias, Didier Stricker, Muhammad Zeshan Afzal, Jiankang Deng, Ismail Elezi

Comments For Caption Dataset: https://huggingface.co/datasets/SadilKhan/CADCap-1M

2603.05604 2026-03-09 cs.CV cs.LG cs.RO

From Decoupled to Coupled: Robustness Verification for Learning-based Keypoint Detection with Joint Specifications

Xusheng Luo, Changliu Liu

Comments 21 pages, 4 figures, 9 tables. arXiv admin note: text overlap with arXiv:2408.00117

2603.05591 2026-03-09 cs.CV

Thinking with Spatial Code for Physical-World Video Reasoning

Jieneng Chen, Wenxin Ma, Ruisheng Yuan, Yunzhi Zhang, Jiajun Wu, Alan Yuille

Comments Code at https://github.com/Beckschen/spatialcode

2603.05581 2026-03-09 cs.LG cs.AI eess.SP

Spatiotemporal Heterogeneity of AI-Driven Traffic Flow Patterns and Land Use Interaction: A GeoAI-Based Analysis of Multimodal Urban Mobility

Olaf Yunus Laitinen Imanov

Comments 13 pages, 7 figures, 9 tables. Submitted to Computers, Environment and Urban Systems (Elsevier)

2603.05579 2026-03-09 cs.LG math.OC

A Novel Hybrid Heuristic-Reinforcement Learning Optimization Approach for a Class of Railcar Shunting Problems

Ruonan Zhao, Joseph Geunes

2603.05577 2026-03-09 cs.SD cs.LG

Koopman Regularized Deep Speech Disentanglement for Speaker Verification

Nikos Chazaridis, Mohammad Belal, Rafael Mestre, Timothy J. Norman, Christine Evers

Comments This work has been submitted to the IEEE for possible publication

2603.05574 2026-03-09 cs.RO cs.AI

PRISM: Personalized Refinement of Imitation Skills for Manipulation via Human Instructions

Arnau Boix-Granell, Alberto San-Miguel-Tello, Magí Dalmau-Moreno, Néstor García

Comments 10 pages, 3 figures, Accepted for publication at European Robotics Forum 2026

2603.05567 2026-03-09 cs.LG

FuseDiff: Symmetry-Preserving Joint Diffusion for Dual-Target Structure-Based Drug Design

Jianliang Wu, Anjie Qiao, Zhen Wang, Zhewei Wei, Sheng Chen

2603.05566 2026-03-09 cs.LG cs.CL

Aligning the True Semantics: Constrained Decoupling and Distribution Sampling for Cross-Modal Alignment

Xiang Ma, Lexin Fang, Litian Xu, Caiming Zhang

Comments AAAI 2026 poster

2603.05559 2026-03-09 cs.LG cs.ET math.PR physics.optics

Autocorrelation effects in a stochastic-process model for decision making via time series

Tomoki Yamagami, Mikio Hasegawa, Takatomo Mihana, Ryoichi Horisaki, Atsushi Uchida

Comments 21 pages, 10 figures

2603.05552 2026-03-09 cs.RO

TEGA: A Tactile-Enhanced Grasping Assistant for Assistive Robotics via Sensor Fusion and Closed-Loop Haptic Feedback

Hengxu You, Tianyu Zhou, Fang Xu, Kaleb Smith, Eric Jing Du

Comments Accepted to include in ICRA 2026

2603.05546 2026-03-09 cs.RO cs.CV

Digital-Twin Losses for Lane-Compliant Trajectory Prediction at Urban Intersections

Kuo-Yi Chao, Erik Leo Haß, Melina Gegg, Jiajie Zhang, Ralph Raßhofer, Alois Christian Knoll

Comments 7 pages, 2 figures, conference

2603.05540 2026-03-09 cs.CL cs.FL cs.LG

Attention Meets Reachability: Structural Equivalence and Efficiency in Grammar-Constrained LLM Decoding

Faruk Alpay, Bilge Senturk

Comments 20 pages

2603.05519 2026-03-09 cs.CL cs.HC cs.IR

Verify as You Go: An LLM-Powered Browser Extension for Fake News Detection

Dorsaf Sallami, Esma Aïmeur

2603.05517 2026-03-09 cs.LG cs.AI cs.CR cs.SE

Traversal-as-Policy: Log-Distilled Gated Behavior Trees as Externalized, Verifiable Policies for Safe, Robust, and Efficient Agents

Peiran Li, Jiashuo Sun, Fangzhou Lin, Shuo Xing, Tianfu Fu, Suofei Feng, Chaoqun Ni, Zhengzhong Tu

Comments 30 pages, 1 figure, 23 tables

2603.05504 2026-03-09 cs.RO cs.AI cs.LG

RoboPocket: Improve Robot Policies Instantly with Your Phone

Junjie Fang, Wendi Chen, Han Xue, Fangyuan Zhou, Tian Le, Yi Wang, Yuting Zhang, Jun Lv, Chuan Wen, Cewu Lu

Comments Project page: https://robo-pocket.github.io

2603.05497 2026-03-09 cs.RO

Safe-SAGE: Social-Semantic Adaptive Guidance for Safe Engagement through Laplace-Modulated Poisson Safety Functions

Lizhi Yang, Ryan M. Bena, Meg Wilkinson, Gilbert Bahati, Andy Navarro Brenes, Ryan K. Cosner, Aaron D. Ames

Comments 8 pages

2603.05355 2026-03-09 cs.RO

OmniDP: Beyond-FOV Large-Workspace Humanoid Manipulation with Omnidirectional 3D Perception

Pei Qu, Zheng Li, Yufei Jia, Ziyun Liu, Liang Zhu, Haoang Li, Jinni Zhou, Jun Ma

Comments 8 pages, 6 figures

2603.05272 2026-03-09 cs.CL cs.HC

Oral to Web: Digitizing 'Zero Resource' Languages of Bangladesh

Mohammad Mamun Or Rashid

Abstract

We present the Multilingual Cloud Corpus, the first national-scale, parallel, multimodal linguistic dataset of Bangladesh's ethnic and indigenous languages. Despite being home to approximately 40 minority languages spanning four language families, Bangladesh has lacked a systematic, cross-family digital corpus for these predominantly oral, computationally "zero resource" varieties, 14 of which are classified as endangered. Our corpus comprises 85792 structured textual entries, each containing a Bengali stimulus text, an English translation, and an IPA transcription, together with approximately 107 hours of transcribed audio recordings, covering 42 language varieties from the Tibeto-Burman, Indo-European, Austro-Asiatic, and Dravidian families, plus two genetically unclassified languages. The data were collected through systematic fieldwork over 90 days across nine districts of Bangladesh, involving 16 data collectors, 77 speakers, and 43 validators, following a predefined elicitation template of 2224 unique items organized at three levels of linguistic granularity: isolated lexical items (475 words across 22 semantic domains), grammatical constructions (887 sentences across 21 categories including verbal conjugation paradigms), and directed speech (862 prompts across 46 conversational scenarios). Post-field processing included IPA transcription by 10 linguists with independent adjudication by 6 reviewers. The complete dataset is publicly accessible through the Multilingual Cloud platform (multiling.cloud), providing searchable access to annotated audio and textual data for all documented varieties. We describe the corpus design, fieldwork methodology, dataset structure, and per-language coverage, and discuss implications for endangered language documentation, low-resource NLP, and digital preservation in linguistically diverse developing countries.

2603.05193 2026-03-09 cs.CL cs.FL

Transducing Language Models

Vésteinn Snæbjarnarson, Samuel Kiegeland, Tianyu Liu, Reda Boumasmoud, Ryan Cotterell, Tim Vieira

2603.05078 2026-03-09 cs.CV

MoRe: Motion-aware Feed-forward 4D Reconstruction Transformer

Juntong Fang, Zequn Chen, Weiqi Zhang, Donglin Di, Xuancheng Zhang, Chengmin Yang, Yu-Shen Liu

Comments Accepted by CVPR 2026. Project page: https://hellexf.github.io/MoRe/

2603.04992 2026-03-09 cs.CL

ThaiSafetyBench: Assessing Language Model Safety in Thai Cultural Contexts

Trapoom Ukarapol, Nut Chukamphaeng, Kunat Pipatanakul, Pakhapoom Sarapat

Comments ICLR 2026 Workshop on Principled Design for Trustworthy AI

Abstract

The safety evaluation of large language models (LLMs) remains largely centered on English, leaving non-English languages and culturally grounded risks underexplored. In this work, we investigate LLM safety in the context of the Thai language and culture and introduce ThaiSafetyBench, an open-source benchmark comprising 1,954 malicious prompts written in Thai. The dataset covers both general harmful prompts and attacks that are explicitly grounded in Thai cultural, social, and contextual nuances. Using ThaiSafetyBench, we evaluate 24 LLMs, with GPT-4.1 and Gemini-2.5-Pro serving as LLM-as-a-judge evaluators. Our results show that closed-source models generally demonstrate stronger safety performance than open-source counterparts, raising important concerns regarding the robustness of openly available models. Moreover, we observe a consistently higher Attack Success Rate (ASR) for Thai-specific, culturally contextualized attacks compared to general Thai-language attacks, highlighting a critical vulnerability in current safety alignment methods. To improve reproducibility and cost efficiency, we further fine-tune a DeBERTa-based harmful response classifier, which we name ThaiSafetyClassifier. The model achieves a weighted F1 score of 84.4%, matching GPT-4.1 judgments. We publicly release the fine-tuning weights and training scripts to support reproducibility. Finally, we introduce the ThaiSafetyBench leaderboard to provide continuously updated safety evaluations and encourage community participation.

- ThaiSafetyBench HuggingFace Dataset: https://huggingface.co/datasets/typhoon-ai/ThaiSafetyBench
- ThaiSafetyBench Github: https://github.com/trapoom555/ThaiSafetyBench
- ThaiSafetyClassifier HuggingFace Model: https://huggingface.co/typhoon-ai/ThaiSafetyClassifier
- ThaiSafetyBench Leaderboard: https://huggingface.co/spaces/typhoon-ai/ThaiSafetyBench-Leaderboard
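
For readers unfamiliar with the two metrics named in the abstract, the sketch below shows how an attack success rate and a support-weighted F1 score are commonly computed. It is an illustration under assumptions, not the authors' evaluation code: the function names and the toy labels are hypothetical, and ASR is assumed here to be the fraction of prompts whose response a judge marks as harmful.

```python
from collections import Counter

def attack_success_rate(judged_harmful):
    """Fraction of prompts whose response was judged harmful (assumed definition)."""
    return sum(judged_harmful) / len(judged_harmful)

def weighted_f1(y_true, y_pred):
    """Per-class F1, averaged with weights equal to each class's support."""
    support = Counter(y_true)
    pairs = list(zip(y_true, y_pred))
    total = 0.0
    for label, count in support.items():
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = count - tp
        # F1 = 2TP / (2TP + FP + FN); the denominator is positive because count > 0.
        total += count * (2 * tp / (2 * tp + fp + fn))
    return total / len(y_true)

# Toy data: 1 = harmful, 0 = safe (made up for illustration).
asr = attack_success_rate([True, False, False, True])
score = weighted_f1([1, 1, 1, 0], [1, 1, 0, 0])
```

In the toy example, class 1 has F1 of 0.8 with support 3 and class 0 has F1 of 2/3 with support 1, so the weighted score is (3 × 0.8 + 1 × 2/3) / 4.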

2603.04897 2026-03-09 cs.CL

Can LLMs Capture Expert Uncertainty? A Comparative Analysis of Value Alignment in Ethnographic Qualitative Research

Arina Kostina, Marios Dikaiakos, Alejandro Porcel, Tassos Stassopoulos

Comments Accepted for a poster session at BIG.AI at MIT 2026

2603.04873 2026-03-09 cs.AI

SEA-TS: Self-Evolving Agent for Autonomous Code Generation of Time Series Forecasting Algorithms

Longkun Xu, Xiaochun Zhang, Qiantu Tuo, Rui Li

2603.04756 2026-03-09 cs.AI cs.CE cs.SE

MOOSEnger -- a Domain-Specific AI Agent for the MOOSE Ecosystem

Mengnan Li, Jason Miller, Zachary Prince, Alexander Lindsay, Cody Permann

2603.04413 2026-03-09 cs.CL cs.AI

Simulating Meaning, Nevermore! Introducing ICR: A Semiotic-Hermeneutic Metric for Evaluating Meaning in LLM Text Summaries

Natalie Perez, Sreyoshi Bhaduri, Aman Chadha