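arXivDaily: a daily academic digest synced with the full arXiv dataset, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering and systems, and more.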

2604.26553 2026-04-30 cs.CL cs.AI cs.LG

TLPO: Token-Level Policy Optimization for Mitigating Language Confusion in Large Language Models

Jinho Choo, JunSeung Lee, Jimyeong Kim, Yeeho Song, S. K. Hong, Yeong-Dae Kwon

Comments Accepted to the main conference of ACL 2026

Abstract

Large language models (LLMs) demonstrate strong multilingual capabilities, yet often fail to consistently generate responses in the intended language, exhibiting a phenomenon known as language confusion. Prior mitigation approaches based on sequence-level fine-tuning, such as DPO, ORPO, and GRPO, operate at the level of entire responses and can lead to unintended degradation of general model capabilities, motivating the need for more fine-grained alternatives. To address this, we introduce Token-Level Policy Optimization (TLPO), a fine-tuning framework designed to mitigate language confusion through localized, token-level updates. TLPO identifies error-prone positions, explores alternative candidate tokens, and updates the policy using a tailored objective to suppress error-inducing outputs at a granular level. This selective intervention enables effective mitigation of language confusion without compromising the model's general abilities. Experiments on multiple multilingual LLMs across diverse languages demonstrate that TLPO significantly outperforms baselines in improving language consistency while preserving downstream task accuracy.

2604.26550 2026-04-30 cs.LG

Large-scale semi-supervised learning with online spectral graph sparsification

Daniele Calandriello, Alessandro Lazaric, Michal Valko

Comments Workshop on Resource-Efficient Machine Learning (REML), ICML 2015

2604.26522 2026-04-30 cs.AI cs.LG cs.LO cs.MA cs.SC

AGEL-Comp: A Neuro-Symbolic Framework for Compositional Generalization in Interactive Agents

Mahnoor Shahid, Hannes Rothe

Comments Accepted at IntelliSys 2026

2604.26521 2026-04-30 cs.AI cs.CV cs.LG cs.LO

Grounding vs. Compositionality: On the Non-Complementarity of Reasoning in Neuro-Symbolic Systems

Mahnoor Shahid, Hannes Rothe

Comments Accepted at AAAI MAKE 2026

2604.26520 2026-04-30 cs.CV

3D-LENS: A 3D Lifting-based Elevated Novel-view Synthesis method for Single-View Aerial-Ground Re-Identification

William Grolleau, Astrid Sabourin, Guillaume Lapouge, Catherine Achard

2604.26519 2026-04-30 cs.CV

GIFGuard: Proactive Forensics against Deepfakes in Facial GIFs via Spatiotemporal Watermarking

Shupeng Che, Zhiqing Guo, Changtao Miao, Dan Ma, Gaobo Yang

2604.26517 2026-04-30 cs.CV q-bio.CB

MTCurv: Deep learning for direct microtubule curvature mapping in noisy fluorescence microscopy images

Achraf Ait Laydi, Sidi Mohamed Sid'El Moctar, Yousef El Mourabit, Hélène Bouvrais

Comments Accepted for presentation at the International Conference on Pattern Recognition (ICPR) 2026

Abstract

Accurate quantification of the geometry of curvilinear biological structures is essential for understanding cellular mechanics and disease-related morphological alterations. Microtubule curvature is a key descriptor of filament rigidity and mechanical perturbations. However, reliable curvature extraction from fluorescence microscopy images remains challenging due to noise, low contrast, and partial filament visibility. Existing approaches rely on segmentation pipelines with pre- or post-processing, which are highly sensitive to segmentation errors and often fail under adverse imaging conditions. In this work, we propose MTCurv, a deep learning framework for direct, segmentation-free regression of microtubule curvature maps from noisy microscopy images. Leveraging a synthetic dataset with pixel-wise curvature annotations, we reformulated curvature estimation as a regression problem and adapted an attention-based residual U-Net. To reduce hallucinations and enforce spatial coherence, we introduced a gradient-aware loss combining Mean Squared Error with a gradient consistency term. Beyond model and loss design, we evaluated commonly used regression and image quality metrics, revealing that many perceptual and blind metrics are poorly suited for curvature estimation. Correlation-based metrics, particularly Spearman correlation, emerged as more reliable indicators of curvature prediction quality. Experiments on two datasets of increasing difficulty demonstrated that MTCurv accurately recovers local microtubule curvatures, even in the presence of background fluorescence. Ablation studies highlighted the contribution of both residual encoding and attention-based decoding. Overall, this work provides a practical tool for filament curvature analysis and methodological insights for geometry-aware regression in biomedical imaging. Datasets and code are made available.
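
The abstract describes the loss only as Mean Squared Error combined with a gradient consistency term. One plausible reading is sketched below in plain Python. The finite-difference gradients, the `weight` parameter, and all function names are assumptions made for illustration, not the authors' formulation. The sketch also assumes rectangular maps of at least 2x2 pixels.

```python
from typing import List

Image = List[List[float]]  # row-major 2D map, assumed rectangular and at least 2x2


def _mse(a: Image, b: Image) -> float:
    """Mean squared error over all entries of two equally shaped maps."""
    total, count = 0.0, 0
    for row_a, row_b in zip(a, b):
        for x, y in zip(row_a, row_b):
            total += (x - y) ** 2
            count += 1
    return total / count


def _grad_x(img: Image) -> Image:
    """Forward finite difference along each row (horizontal gradient)."""
    return [[row[j + 1] - row[j] for j in range(len(row) - 1)] for row in img]


def _grad_y(img: Image) -> Image:
    """Forward finite difference along each column (vertical gradient)."""
    return [[img[i + 1][j] - img[i][j] for j in range(len(img[0]))]
            for i in range(len(img) - 1)]


def gradient_aware_loss(pred: Image, target: Image, weight: float = 0.5) -> float:
    """MSE plus a weighted penalty on mismatched spatial gradients.

    `weight` is a hypothetical hyperparameter; the abstract does not state one.
    """
    data_term = _mse(pred, target)
    grad_term = (_mse(_grad_x(pred), _grad_x(target))
                 + _mse(_grad_y(pred), _grad_y(target)))
    return data_term + weight * grad_term
```

Under this reading, a prediction that differs from the target by a constant offset pays only the MSE term, while an isolated spurious pixel is also penalized through its gradients.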

2604.26516 2026-04-30 cs.LG cs.AI

Lyapunov-Guided Self-Alignment: Test-Time Adaptation for Offline Safe Reinforcement Learning

Seungyub Han, Hyungjin Kim, Jungwoo Lee

Comments Accepted at AISTATS 2026. First two authors contributed equally. Project page: https://seungyubhan.github.io/sas/. Code: https://github.com/seungyubhan/sas

2604.26514 2026-04-30 cs.CL cs.AI cs.NE

Text-Utilization for Encoder-dominated Speech Recognition Models

Albert Zeyer, Tim Posielek, Ralf Schlüter, Hermann Ney

2604.26508 2026-04-30 cs.LG cs.AI cs.CV cs.DC cs.NI

Progressive Semantic Communication for Efficient Edge-Cloud Vision-Language Models

Cyril Shih-Huan Hsu, Wig Yuan-Cheng Cheng, Chrysa Papagianni

Comments Under review. Extended version with additional figures and appendices

2604.26507 2026-04-30 cs.AI

Auto-Relational Reasoning

Ioannis Konstantoulas, Dimosthenis Tsimas, Pavlos Peppas, Kyriakos Sgarbas

Comments Submitted to JAIR

2604.26504 2026-04-30 cs.RO

HiPAN: Hierarchical Posture-Adaptive Navigation for Quadruped Robots in Unstructured 3D Environments

Jeil Jeong, Minsung Yoon, Seokryun Choi, Heechan Shin, Taegeun Yang, Sung-eui Yoon

Comments Accepted to RA-L 2026 | Project page: https://sgvr.kaist.ac.kr/~Jeil/project_page_HiPAN/

2604.26503 2026-04-30 cs.CV

Delta Score Matters! Spatial Adaptive Multi Guidance in Diffusion Models

Haosen Li, Wenshuo Chen, Lei Wang, Shaofeng Liang, Bowen Tian, Soning Lai, Yutao Yue

2604.26501 2026-04-30 cs.CL cs.AI cs.HC

Tree-of-Text: A Tree-based Prompting Framework for Table-to-Text Generation in the Sports Domain

Shang-Hsuan Chiang, Tsan-Tsung Yang, An-Zi Yen, Wen-Chih Peng

Comments Accepted by ACL SRW 2025: Long Paper (Oral)

2604.26500 2026-04-30 cs.CL

StarDrinks: An English and Korean Test Set for SLU Evaluation in a Drink Ordering Scenario

Marcely Zanon Boito, Caroline Brun, Inyoung Kim, Denys Proux, Salah Ait-Mokhtar, Nikolaos Lagos, Jean-Luc Meunier, Ioan Calapodescu

Comments Accepted at LREC 2026

2604.26496 2026-04-30 cs.CV

Robust Alignment: Harmonizing Clean Accuracy and Adversarial Robustness in Adversarial Training

Yanyun Wang, Qingqing Ye, Li Liu, Zi Liang, Haibo Hu

Comments 2026 IEEE/CVF Conference on Computer Vision and Pattern Recognition - Findings Track (CVPR'26 Findings)

2604.26489 2026-04-30 cs.LG cs.IR

Understanding DNNs in Feature Interaction Models: A Dimensional Collapse Perspective

Jiancheng Wang, Mingjia Yin, Hao Wang, Enhong Chen

Comments 6 pages

2604.26488 2026-04-30 cs.CV cs.LG

Featurising Pixels from Dynamic 3D Scenes with Linear In-Context Learners

Nikita Araslanov, Martin Sundermeyer, Hidenobu Matsuki, David Joseph Tan, Federico Tombari

Comments To appear at CVPR 2026 (oral). Project website: https://lila-pixels.github.io

2604.26478 2026-04-30 cs.CV

Cross-Domain Transfer of Hyperspectral Foundation Models

Nick Theisen, Peer Neubert

Comments Accepted for publication at ICPR 2026

2604.26473 2026-04-30 cs.RO

Alter-Art: Exploring Embodied Artistic Creation through a Robot Avatar

Do Won Park, Samuele Bordini, Giorgio Grioli, Manuel G. Catalano, Antonio Bicchi

Comments 12 pages, 6 figures

2604.26470 2026-04-30 cs.LG

Hierarchical adaptive control for real-time dynamic inference at the edge

Francesco Daghero, Mahyar Tourchi Moghaddam, Mikkel Baun Kjærgaard

Comments Accepted as paper at 5th Real-time And intelliGent Edge computing (RAGE 2026) workshop

Abstract

Industrial systems increasingly depend on Machine Learning (ML), and operate on heterogeneous nodes that must satisfy tight latency, energy, and memory constraints. Dynamic ML models, which reconfigure their computational footprint at runtime, promise high energy efficiency and lower average latency for modest accuracy tradeoffs; however, their deployment is complex due to the additional hyperparameters they rely on. These hyperparameters, controlling the accuracy versus average latency tradeoff, are often tuned on a calibration dataset that must match the test time distribution, an assumption that rarely holds in real-world scenarios, leading to suboptimal operational conditions, possibly below static models. We propose a two-tier adaptive architecture that co-optimizes model and system decisions. At the global level, a scheduler configures and deploys, for each edge node, a cascade of classifiers composed of lightweight specialized models and a generalist fallback, satisfying latency and memory constraints. At the node level, a local controller tracks data drifts and hardware resources, enabling or disabling specialized predictors (SP) to preserve high energy efficiency and avoid latency-constraint violations under varying conditions. This design allows longer operating times without forcing a global redeployment step, and enables efficient execution in case of an unreachable remote global controller. We evaluate the approach on two datasets under controlled distribution mismatch scenarios, showing average per-inference reductions of latency up to 2.45x and energy up to 2.86x, with less than 4% accuracy drop compared to static baselines. Our contributions are: (1) a budgeted SP-cascade formulation that preserves worst-case latency constraints; (2) a hierarchical controller that maintains efficiency under data and resource changes; and (3) an experimental evaluation on embedded hardware.
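
The cascade idea in the abstract (lightweight specialized predictors tried first, with a generalist fallback, and a per-predictor switch that a local controller can toggle) can be illustrated with a minimal sketch. Everything here is hypothetical: the confidence-threshold acceptance rule, the class and function names, and the toy models are assumptions for illustration, and the paper's actual scheduling and drift-tracking logic is not shown.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Prediction = Tuple[str, float]  # (label, confidence in [0, 1])


@dataclass
class SpecializedPredictor:
    """Hypothetical lightweight model that only answers when confident."""
    name: str
    predict: Callable[[float], Prediction]
    threshold: float      # minimum confidence needed to accept its answer
    enabled: bool = True  # a local controller could flip this flag at runtime


def cascade_predict(
    x: float,
    specialists: List[SpecializedPredictor],
    fallback: Callable[[float], Prediction],
) -> Tuple[str, str]:
    """Return (label, name of the model that produced it)."""
    for sp in specialists:
        if not sp.enabled:
            continue  # disabled predictors add no latency or energy cost
        label, confidence = sp.predict(x)
        if confidence >= sp.threshold:
            return label, sp.name  # early exit: the generalist is never run
    label, _ = fallback(x)
    return label, "generalist"


# Toy models, assumed purely for demonstration.
def small_value_sp(x: float) -> Prediction:
    return ("small", 0.9) if x < 1.0 else ("small", 0.1)


def generalist(x: float) -> Prediction:
    return ("small" if x < 1.0 else "large", 1.0)


specialists = [SpecializedPredictor("small_sp", small_value_sp, threshold=0.8)]
```

With these toy models, an input the specialist is confident about exits early, while any other input, or any input seen while the specialist is disabled, falls through to the generalist.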

2604.26465 2026-04-30 cs.SD

Diffusion Reconstruction towards Generalizable Audio Deepfake Detection

Bo Cheng, Songjun Cao, Xiaoming Zhang, Jie Chen, Long Ma, Fei Chen

Comments 5 pages; this paper was submitted to Interspeech 2026 for review

2604.26462 2026-04-30 cs.CV

A Multistage Extraction Pipeline for Long Scanned Financial Documents: An Empirical Study in Industrial KYC Workflows

Yuxuan Han, Yuanxing Zhang, Yushuo Wang, Yichao Jin, Kenneth Zhu Ke, Jingyuan Zhao

2604.26461 2026-04-30 cs.CV

$\text{PKS}^4$: Parallel Kinematic Selective State Space Scanners for Efficient Video Understanding

Lingjie Zeng, Hailun Zhang, Xiwen Wang, Qijun Zhao

2604.26460 2026-04-30 cs.CL

Theory-Grounded Evaluation Exposes the Authorship Gap in LLM Personalization

Yash Ganpat Sawant

Comments 6 pages, 2 figures, 2 tables

2604.26456 2026-04-30 cs.CL cs.AI

Naamah: A Large Scale Synthetic Sanskrit NER Corpus via DBpedia Seeding and LLM Generation

Akhil Rajeev P, Annarao Kulkarni

2604.26454 2026-04-30 cs.CV

Last-Layer-Centric Feature Recombination: Unleashing 3D Geometric Knowledge in DINOv3 for Monocular Depth Estimation

Gongshu Wang, Zhirui Wang, Kan Yang

Comments 18 pages, 6 figures, 6 tables

2604.26453 2026-04-30 cs.CV

Attribution-Guided Multimodal Deepfake Detection via Cross-Modal Forensic Fingerprints

Wasim Ahmad, Wei Zhang, Xuerui Mao

2604.26446 2026-04-30 cs.LG

Near-Optimal Cryptographic Hardness of Learning With Homogeneous Halfspaces Under Gaussian Marginals

Jizhou Huang, Brendan Juba

2604.26444 2026-04-30 cs.LG

Layer-wise Lipschitz-Product Control for Deep Kolmogorov-Arnold Network Representations of Compositionally Structured Functions

Aleksander Tankman

Comments 15 pages, theoretical note on layer-wise Lipschitz control for deep KANs