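arXivDaily: a daily academic digest synced with the full arXiv feed, with AI-generated summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering and systems, and related fields.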

2605.06323 2026-05-08 cs.RO

AssistDLO: Assistive Teleoperation for Deformable Linear Object Manipulation

Berk Guler, Simon Manschitz, Kay Pompetzki, Jan Peters

Comments 20 pages, 14 figures. Submitted to a peer-reviewed journal

Abstract

Manipulating Deformable Linear Objects (DLOs) is challenging in robotics due to their infinite-dimensional configuration space and complex nonlinear dynamics. In teleoperation, depth uncertainty hinders state perception and reaction. AssistDLO addresses this challenge as an assistive teleoperation framework for DLO manipulation that combines real-time multi-view state estimation, visual assistance (VA), and a geometry-aware shared-autonomy controller based on Control Barrier Functions (SA-CBF). While traditional shared autonomy methods often rely on simple geometric attractors and may fail to preserve DLO geometry, SA-CBF acts as a geometry-aware funnel, facilitating precise grasping while preserving the operator's high-level authority. The framework is evaluated in a bimanual knot-untangling user study (N = 22) using ropes with varying length and rigidity. Results show that the effectiveness of the assistance depends strongly on operator expertise and DLO properties. SA-CBF provides the strongest gains for naive users, acting as a skill equalizer that increases task success from 71% to 88%, and is effective for stiffer ropes. Conversely, expert users prefer VA, and highly compliant, long ropes benefit more from visual support than localized action assistance. Ultimately, these findings demonstrate that effective DLO teleoperation cannot rely on a fixed strategy, highlighting the critical need for adaptive, user-aware, and material-aware shared autonomy.
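The abstract does not spell out the SA-CBF controller, so the following is only a generic Control Barrier Function filter, not the authors' method. It assumes a single-integrator end effector (x_dot = u), a hypothetical circular keep-out region, and a hypothetical gain `alpha`; the names `cbf_filter`, `center`, and `radius` are illustrative. It shows the basic shared-autonomy pattern: the operator command is changed as little as possible, and only when it would violate the barrier condition.

```python
def cbf_filter(x, u_ref, center, radius, alpha=1.0):
    """Minimally adjust u_ref so that d/dt h(x) + alpha * h(x) >= 0.

    Assumes single-integrator dynamics x_dot = u and the barrier
    h(x) = ||x - center||^2 - radius^2 (h >= 0 means outside the disc).
    """
    dx = [xi - ci for xi, ci in zip(x, center)]
    h = sum(d * d for d in dx) - radius ** 2
    grad = [2.0 * d for d in dx]             # gradient of h with respect to x
    lhs = sum(g * u for g, u in zip(grad, u_ref))
    rhs = -alpha * h                         # constraint: grad . u >= -alpha * h
    if lhs >= rhs:
        return list(u_ref)                   # operator command already satisfies the constraint
    norm_sq = sum(g * g for g in grad)
    if norm_sq == 0.0:
        return [0.0 for _ in u_ref]          # hypothetical fallback at the disc center: stop
    scale = (rhs - lhs) / norm_sq            # closed-form solution of the one-constraint QP
    return [u + scale * g for u, g in zip(u_ref, grad)]


# Command pointing straight at the disc is slowed; a tangential command passes through.
print(cbf_filter([2.0, 0.0], [-1.0, 0.0], [0.0, 0.0], 1.0))
print(cbf_filter([2.0, 0.0], [0.0, 1.0], [0.0, 0.0], 1.0))
```

With a single affine constraint the quadratic program has the closed-form projection used above, which keeps the sketch free of solver dependencies. A grasp-guiding funnel as described in the abstract would need a different barrier function than this keep-out disc.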

2605.06318 2026-05-08 cs.CL cs.CY

Who and What? Using Linguistic Features and Annotator Characteristics to Analyze Annotation Variation

Maximilian Maurer, Maximilian Linde, Gabriella Lapesa

2605.06316 2026-05-08 cs.LG cs.AI

Pro-KLShampoo: Projected KL-Shampoo with Whitening Recovered by Orthogonalization

Ruotong Sun, Ermin Wei

2605.06311 2026-05-08 cs.RO

Toward Visually Realistic Simulation: A Benchmark for Evaluating Robot Manipulation in Simulation

Yixin Zhu, Zixiong Wang, Jian Yang, Jin Xie, Jingyi Yu, Jiayuan Gu, Beibei Wang

2605.06310 2026-05-08 cs.LG

Perceive, Route and Modulate: Dynamic Pattern Recalibration for Time Series Forecasting

Siru Zhong, Zhao Meng, Haohuan Fu, Haoyang Li, Qingsong Wen, Yuxuan Liang

Comments 22 pages, 6 figures. Preprint

2605.06308 2026-05-08 cs.AI

Measuring Black-Box Confidence via Reasoning Trajectories: Geometry, Coverage, and Verbalization

Marc Boubnovski Martell, Josefa Lia Stoisser, Kaspar Märtens, Jialin Yu, Robert Kitchen, Philip Torr, Jesper Ferkinghoff-Borg

2605.06305 2026-05-08 cs.AI cs.IR

Addressing Labelled Data Scarcity: Taxonomy-Agnostic Annotation of PII Values in HTTP Traffic using LLMs

Thomas Cory, Axel Küpper

Comments Accepted to 2026 IEEE European Symposium on Security and Privacy Workshops (EuroS&PW)

2605.06303 2026-05-08 cs.LG

Molecules Meet Language: Confound-Aware Representation Learning and Chemical Property Steering in Transformer-VAE Latent Spaces

Zakaria Elabid, Jan Andrzejewski, Bartosz Brzoza, Attila Cangi

2605.06295 2026-05-08 cs.LG cs.AI stat.ML

Attributions All the Way Down? The Metagame of Interpretability

Hubert Baniecki, Przemyslaw Biecek, Fabian Fumagalli

2605.06294 2026-05-08 cs.CL cs.AI cs.LG

Log-Likelihood, Simpson's Paradox, and the Detection of Machine-Generated Text

Tom Kempton, Viktor Drobnyi, Maeve Madigan, Stuart Burrell

Comments 10 pages, 3 figures, 2 tables, 11 appendices

Abstract

The ability to reliably distinguish human-written text from that generated by large language models is of profound societal importance. The dominant approach to this problem exploits the likelihood hypothesis: that machine-generated text should appear more probable to a detector language model than human-written text. However, we demonstrate that the token-level signal distinguishing human and machine text is non-uniform across the hidden space of the detector model, and naively averaging likelihood-based token scores across regions with fundamentally different statistical structure, as most detectors do, causes a form of Simpson's paradox: a strong local signal is destroyed by inappropriate aggregation. To correct for this, we introduce a learned local calibration step grounded in Bayesian decision theory. Rather than aggregating raw token scores, we first learn lightweight predictors of the score distributions conditioned on position in hidden space, and aggregate calibrated log-likelihood ratios instead. This single intervention dramatically and consistently improves detection performance across all baseline detectors and all datasets we consider. For example, our calibrated variant of Fast-DetectGPT improves AUROC from $0.63$ to $0.85$ on GPT-5.4 text, and a locally-calibrated DMAP detector we introduce achieves state-of-the-art performance across the board. That said, our central contribution is not a new detector, but a precise diagnosis of a significant cause of under-performance of existing detectors and a principled, modular remedy compatible with any token-averaging pipeline. This will serve as a foundation for the community to build upon, with natural avenues including richer distributional models, improved calibration strategies, and principled ensembling with hidden-space geometry signals via the full Bayes-optimal decision rule.
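The aggregation failure described in the abstract can be reproduced in a toy setting. The sketch below is an illustration under stated assumptions, not the paper's detector: the two discrete regions "A" and "B", the unit-variance Gaussian score models, and the means in `REGION_MEANS` are all hypothetical stand-ins for the learned predictors conditioned on hidden-space position.

```python
# Hypothetical per-region score models: (mean under human text, mean under machine text).
# Unit variance is assumed for both classes in every region.
REGION_MEANS = {"A": (0.0, 1.0), "B": (0.0, -1.0)}


def raw_score(tokens):
    """Naive detector: average the raw token scores, ignoring the region."""
    return sum(score for _, score in tokens) / len(tokens)


def calibrated_llr(tokens):
    """Sum of per-token log-likelihood ratios log p(s | machine) - log p(s | human)."""
    total = 0.0
    for region, score in tokens:
        mu_h, mu_m = REGION_MEANS[region]
        # For unit-variance Gaussians the normalizing constants cancel.
        total += 0.5 * (score - mu_h) ** 2 - 0.5 * (score - mu_m) ** 2
    return total


machine_text = [("A", 1.0), ("B", -1.0)]
human_text = [("A", 0.0), ("B", 0.0)]

print(raw_score(machine_text), raw_score(human_text))            # identical raw averages
print(calibrated_llr(machine_text), calibrated_llr(human_text))  # opposite signs
```

In this toy case the machine signal has opposite sign in the two regions, so the raw average cancels to the same value for both texts, while the summed per-region log-likelihood ratios separate them.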

2605.06290 2026-05-08 cs.AI

Data Language Models: A New Foundation Model Class for Tabular Data

Eda Erol, Giuliano Pezzoli, Ozer Cem Kelahmet

2605.06285 2026-05-08 cs.CL cs.LG

LatentRAG: Latent Reasoning and Retrieval for Efficient Agentic RAG

Yijia Zheng, Marcel Worring

2605.06283 2026-05-08 cs.CL

Quantifying the Statistical Effect of Rubric Modifications on Human-Autorater Agreement

Jessica Huynh, Alfredo Gomez, Athiya Deviyani, Renee Shelby, Jeffrey P. Bigham, Fernando Diaz

2605.06281 2026-05-08 cs.LG cs.NA math.NA q-fin.CP

INEUS: Iterative Neural Solver for High-Dimensional PIDEs

Jean-Loup Dupret, Davide Gallon, Patrick Cheridito

2605.06278 2026-05-08 cs.LG math.OC

PACE: Prune-And-Compress Ensemble Models

Fabian Akkerman, Julien Ferry, Théo Guyard, Thibaut Vidal

2605.06276 2026-05-08 cs.CL cs.AI

Linear Semantic Segmentation for Low-Resource Spoken Dialects

Kirill Chirkunov, Younes Samih, Abed Alhakim Freihat, Hanan Aldarmaki

Comments ACL Findings 2026

2605.06274 2026-05-08 cs.LG cs.CV

When Labels Have Structure: Improving Image Classification with Hierarchy-Aware Cross-Entropy

April Chan, Davide D'Ascenzo, Sebastiano Cultrera di Montesano

2605.06273 2026-05-08 cs.CV cs.AR

On-Orbit Real-Time Wildfire Detection Under On-Board Constraints

Matthias Rötzer, Veronika Pörtge, Martin Ickerott, Jayendra Praveen Kumar Chorapalli, Dimitri Scheftelowitsch, Max Bereczky, Dmitry Rashkovetsky, Sai Manoj Appalla, Julia Gottfriedsen

2605.06272 2026-05-08 cs.LG

A Flow Matching Algorithm for Many-Shot Adaptation to Unseen Distributions

Tyler Ingebrand, Ruihan Zhao, Kushagra Gupta, David Fridovich-Keil, Sandeep P. Chinchali, Ufuk Topcu

2605.06266 2026-05-08 cs.CV

ZScribbleSeg: A comprehensive segmentation framework with modeling of efficient annotation and maximization of scribble supervision

Ke Zhang, Bomin Wang, Hangqi Zhou, Xiahai Zhuang

Comments Accepted by Medical Image Analysis

2605.06264 2026-05-08 cs.LG

Can Attribution Predict Risk? From Multi-View Attribution to Planning Risk Signals in End-to-End Autonomous Driving

Le Yang, Ruoyu Chen, Haijun Liu, Jiawei Liang, ShangQuan Sun, Xiaochun Cao

Abstract

End-to-end autonomous driving models generate future trajectories from multi-view inputs, improving system integration but introducing opaque decisions and hard-to-localize risks. Existing methods either rely on auxiliary monitoring models or generate textual explanations, but are decoupled from the planning process and fail to reveal the visual evidence underlying trajectory generation. While attribution offers a direct alternative, planning differs from image classification by taking six-view camera images as input and predicting continuous multi-step trajectories, requiring attribution to capture both critical views and regions and their influence on outputs. Moreover, whether attribution maps can support risk identification remains underexplored. To address this, we propose a hierarchical attribution framework for end-to-end planning. Specifically, using L2 consistency with the original trajectory as the objective, we design a coarse-to-fine region attribution strategy that searches candidate regions across the full six-view input and refines attribution within them. We further extract three attribution statistics as predictive signals for planning risk, including attribution entropy to measure how concentrated the planner's reliance is over the joint visual space, within-camera spatial variance to characterize how spread out the attribution is within each view, and cross-camera Gini coefficient to quantify how unevenly attribution is distributed across the six cameras. Experiments on BridgeAD, UniAD, and GenAD show that these statistics correlate with planning risk, achieving Spearman correlations of $0.30 \pm 0.07$ with trajectory error and AUROC of $0.77 \pm 0.04$ for collision detection. The signal generalizes to held-out scenes with negligible degradation and remains stable under an alternative attribution baseline.
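The three attribution statistics named in the abstract can be sketched with their textbook definitions. The abstract does not give the exact formulas, so the choices below are assumptions: Shannon entropy over all pixels of all cameras, attribution-weighted variance of pixel coordinates within each camera, and the standard Gini coefficient over per-camera attribution mass. The function name `attribution_stats` and the nested-list input format are hypothetical.

```python
import math


def attribution_stats(maps):
    """maps: list of per-camera 2D lists of non-negative attribution values."""
    total = sum(v for cam in maps for row in cam for v in row)
    if total <= 0:
        raise ValueError("attribution maps must contain positive mass")

    # Entropy over the joint visual space (all pixels of all cameras).
    entropy = 0.0
    for cam in maps:
        for row in cam:
            for v in row:
                if v > 0:
                    p = v / total
                    entropy -= p * math.log(p)

    # Within-camera spatial variance: attribution-weighted spread of pixel coordinates.
    variances = []
    for cam in maps:
        mass = sum(v for row in cam for v in row)
        if mass == 0:
            variances.append(0.0)
            continue
        mean_r = sum(r * v for r, row in enumerate(cam) for v in row) / mass
        mean_c = sum(c * v for row in cam for c, v in enumerate(row)) / mass
        var = sum(((r - mean_r) ** 2 + (c - mean_c) ** 2) * v
                  for r, row in enumerate(cam) for c, v in enumerate(row)) / mass
        variances.append(var)

    # Cross-camera Gini coefficient over per-camera attribution mass (sorted ascending).
    masses = sorted(sum(v for row in cam for v in row) for cam in maps)
    n = len(masses)
    gini = sum((2 * (i + 1) - n - 1) * m for i, m in enumerate(masses)) / (n * total)
    return entropy, variances, gini
```

As a sanity check on these definitions, six uniform maps give a Gini coefficient of 0, while attribution concentrated on a single pixel of one camera gives zero entropy and a Gini coefficient of 5/6, the maximum for six cameras.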

2605.06261 2026-05-08 cs.LG cs.AI

Inference-Time Refinement Closes the Synthetic-Real Gap in Tabular Diffusion

Eugenio Lomurno, Filippo Balzarini, Francesco Benelle, Francesca Pia Panaccione, Matteo Matteucci

Abstract

Diffusion-based generators set the current state of the art for synthetic tabular data. These methods approach but rarely exceed real-data utility, and closing this synthetic-real gap has so far been pursued exclusively at training time, via architectural advances, scaling, and retraining of monolithic generators. The inference-time alternative, i.e., refining the outputs of a pre-trained backbone with parameters left untouched, has remained largely unexplored for tabular synthesis. We introduce TARDIS (Tabular generation through Refinement, Distillation, and Inference-time Sampling), an inference-time refinement framework that operates on a frozen pre-trained backbone, configured per dataset by a Tree-structured Parzen Estimator search over score-level guidance during reverse diffusion, with each trial's objective set by an inner grid search over post-hoc sample selectors and an optional soft-label distillation step. The search space encodes a single mathematical pattern we name Bidirectional Chamfer Refinement (BCR): the symmetric Chamfer functional between synthetic and real samples is minimized both continuously, via a score-level gradient, and discretely, via batch-ranking post-generation. The per-dataset search recovers BCR-aligned configurations on most datasets, evidence for BCR as the dominant refinement pattern. Across 15 binary, multiclass, and regression benchmarks TARDIS achieves a median +8.6% downstream-task improvement over models trained on real data (95% CI [+3.3, +16.4], Wilcoxon p=0.016, 11/15 strict wins) and improves over the TabDiff backbone on all 15 datasets (mean +12.9%, p<10^-4), matching the backbone on manifold fidelity, diversity, and sample-level privacy. Inference-time refinement of a pre-trained tabular diffusion backbone reaches and exceeds real-data utility in 1 to 80 minutes on a single consumer-grade GPU.
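The symmetric Chamfer functional and its discrete use for ranking can be sketched as follows. This is a minimal illustration, not the TARDIS implementation: it assumes purely numeric, already scaled features with squared Euclidean distance, and it reads "batch-ranking" as choosing the candidate batch with the smallest Chamfer distance to the real data. The names `chamfer` and `best_batch` are hypothetical, and the score-level gradient guidance is not shown.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two rows."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def chamfer(synth, real):
    """Symmetric Chamfer distance: mean nearest-neighbor squared distance in both directions."""
    s_to_r = sum(min(sq_dist(s, r) for r in real) for s in synth) / len(synth)
    r_to_s = sum(min(sq_dist(r, s) for s in synth) for r in real) / len(real)
    return s_to_r + r_to_s


def best_batch(batches, real):
    """Discrete refinement sketch: keep the candidate batch closest to the real data."""
    return min(batches, key=lambda batch: chamfer(batch, real))


real = [[0.0, 0.0], [1.0, 1.0]]
batch_a = [[0.0, 0.0], [1.0, 1.0]]
batch_b = [[0.0, 1.0], [3.0, 3.0]]
print(chamfer(batch_a, real), chamfer(batch_b, real))
print(best_batch([batch_b, batch_a], real))
```

The two directions penalize different failures: the synthetic-to-real term penalizes synthetic rows far from any real row, and the real-to-synthetic term penalizes real regions that no synthetic row covers. Mixed-type tabular data would need a distance that also handles categorical columns.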

2605.06260 2026-05-08 cs.LG

Beyond Rigid Alignment: Graph Federated Learning via Dual Manifold Calibration

Wentao Yu, Bo Han, Jie Yang, Chen Gong

Comments 30 pages

Abstract

Graph Federated Learning (GFL) enables collaborative representation learning across distributed subgraphs while preserving privacy. However, heterogeneity remains a critical challenge, as subgraphs across clients typically differ significantly in both semantics and structures. Existing methods address heterogeneity by enforcing the rigid alignment of model parameters or prototypes between clients and the server. However, these alignments implicitly rely on a restrictive global linearity assumption that summarizes local data distributions using a single and globally consistent representation space. This severely compresses the personalized representation space of clients and fails to preserve diverse local graph distributions. To overcome these limitations, we propose Federated Graph Manifold Calibration (FedGMC), a novel paradigm that tackles semantic heterogeneity and structural heterogeneity from a unified manifold perspective. Instead of enforcing rigid alignment, FedGMC introduces a dual manifold calibration mechanism that preserves global commonalities while maximizing the personalized representation space of local clients. Specifically, for semantic heterogeneity, the server constructs a geometrically optimal semantic manifold via equidistant semantic anchors, so as to guide the calibration of local semantic manifolds. For structural heterogeneity, the server constructs a global structural manifold by building global structural templates, so as to guide the calibration of local structural manifolds. Finally, the server dynamically refines both global semantic manifolds and structural manifolds by aggregating local manifolds. Extensive experiments on eleven homophilic and heterophilic graphs demonstrate that FedGMC effectively balances global commonality and local personalization, thereby significantly outperforming state-of-the-art baseline methods.
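The abstract does not say how the equidistant semantic anchors are built. One standard construction with that property is the regular simplex, sketched below purely as an assumption: `simplex_anchors` is a hypothetical helper that returns one unit vector per class, with the same cosine similarity between every pair.

```python
import math


def simplex_anchors(k):
    """Return k unit vectors in R^k with identical pairwise cosine -1/(k-1)."""
    if k < 2:
        raise ValueError("need at least two classes")
    scale = math.sqrt(k / (k - 1))
    # Rows of scale * (I - (1/k) * ones): centered one-hot vectors rescaled to unit norm.
    return [[scale * ((1.0 if i == j else 0.0) - 1.0 / k) for j in range(k)]
            for i in range(k)]


anchors = simplex_anchors(4)
dot = lambda a, b: sum(x * y for x, y in zip(a, b))
print(dot(anchors[0], anchors[0]), dot(anchors[0], anchors[1]))
```

Because every pair of anchors has the same inner product, no class is favored geometrically, which is the sense in which such anchors are equidistant. Whether FedGMC uses this exact construction is not stated in the abstract.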

2605.06258 2026-05-08 cs.LG cs.AI

The Weight Gram Matrix Captures Sequential Feature Linearization in Deep Networks

Taehun Cha, Daniel Beaglehole, Adityanarayanan Radhakrishnan, Donghun Lee

Comments 29 pages including appendix

2605.06250 2026-05-08 cs.LG

The Role of Node Features in Graph Pooling

Jan von Pichowski, Alžbeta Hrabošová, Ingo Scholtes, Christopher Blöcker

2605.06247 2026-05-08 cs.RO

CKT-WAM: Parameter-Efficient Context Knowledge Transfer Between World Action Models

Yuhua Jiang, Yijun Guo, Hongbing Yang, Guojun Lei, Nuo Chen, Yinuo Zhang, Shaoqiang Yan, Bo Lin, Feifei Gao, Biqing Qi

2605.06246 2026-05-08 cs.LG cs.RO

Structure-Preserving Gaussian Processes Via Discrete Euler-Lagrange Equations

Jan-Hendrik Ewering, Kathrin Flaßkamp, Niklas Wahlström, Thomas B. Schön, Thomas Seel

Comments 30 pages

2605.06240 2026-05-08 cs.LG cs.AI

Cumulative-Goodness Free-Riding in Forward-Forward Networks: Real, Repairable, but Not Accuracy-Dominant

Amirhossein Yousefiramandi

2605.06239 2026-05-08 cs.LG

When Graph Language Models Go Beyond Memorization

Masatsugu Yamada, Mahito Sugiyama

Comments Under review

2605.06238 2026-05-08 cs.LG cs.AI

Band Together: Untargeted Adversarial Training with Multimodal Coordination against Evasion-based Promotion Attacks

Guanmeng Xian, Ning Yang, Philip S. Yu