arXivDaily: arXiv daily academic digest, updated Monday to Friday
2601.19644 2026-01-28 cs.LO cs.AI

Robustness of Constraint Automata for Description Logics with Concrete Domains

Stéphane Demri, Tianwen Gu

Comments Extended version of a paper accepted at CSL'26, Paris

Abstract

Decidability or complexity issues about the consistency problem for description logics with concrete domains have already been analysed with tableaux-based or type elimination methods. Concrete domains in ontologies are essential to consider concrete objects and predefined relations. In this work, we present an automata-based approach leading to the optimal upper bound EXPTIME, that is designed by enriching the transitions with symbolic constraints. We show that the nonemptiness problem for such automata belongs to EXPTIME if the concrete domains satisfy a few simple properties. Then, we provide a reduction from the consistency problem for ontologies, yielding EXPTIME-membership. Thanks to the expressivity of constraint automata, the results are extended to additional ingredients such as inverse roles, functional role names and constraint assertions, while maintaining EXPTIME-membership, which illustrates the robustness of the approach.

2601.19062 2026-01-28 cs.CY cs.AI cs.CL cs.HC

Who's in Charge? Disempowerment Patterns in Real-World LLM Usage

Mrinank Sharma, Miles McCain, Raymond Douglas, David Duvenaud

Abstract

Although AI assistants are now deeply embedded in society, there has been limited empirical study of how their usage affects human empowerment. We present the first large-scale empirical analysis of disempowerment patterns in real-world AI assistant interactions, analyzing 1.5 million consumer Claude.ai conversations using a privacy-preserving approach. We focus on situational disempowerment potential, which occurs when AI assistant interactions risk leading users to form distorted perceptions of reality, make inauthentic value judgments, or act in ways misaligned with their values. Quantitatively, we find that severe forms of disempowerment potential occur in fewer than one in a thousand conversations, though rates are substantially higher in personal domains like relationships and lifestyle. Qualitatively, we uncover several concerning patterns, such as validation of persecution narratives and grandiose identities with emphatic sycophantic language, definitive moral judgments about third parties, and complete scripting of value-laden personal communications that users appear to implement verbatim. Analysis of historical trends reveals an increase in the prevalence of disempowerment potential over time. We also find that interactions with greater disempowerment potential receive higher user approval ratings, possibly suggesting a tension between short-term user preferences and long-term human empowerment. Our findings highlight the need for AI systems designed to robustly support human autonomy and flourishing.

2410.13189 2026-01-28 quant-ph cs.NA math.NA

Fast-forwarding quantum algorithms for linear dissipative differential equations

Dong An, Akwum Onwunta, Gengzhi Yang

Comments 32+11 pages

Journal ref Quantum 10, 1986 (2026)

Abstract

We establish improved complexity estimates of quantum algorithms for linear dissipative ordinary differential equations (ODEs) and show that the time dependence can be fast-forwarded to be sub-linear. Specifically, we show that a quantum algorithm based on truncated Dyson series can prepare history states of dissipative ODEs up to time $T$ with cost $\widetilde{\mathcal{O}}(\log(T) (\log(1/\epsilon))^2 )$, which is an exponential speedup over the best previous result. For final state preparation at time $T$, we show that its complexity is $\widetilde{\mathcal{O}}(\sqrt{T} (\log(1/\epsilon))^2 )$, achieving a polynomial speedup in $T$. We also analyze the complexity of simpler lower-order quantum algorithms, such as the forward Euler method and the trapezoidal rule, and find that even lower-order methods can still achieve $\widetilde{\mathcal{O}}(\sqrt{T})$ cost with respect to time $T$ for preparing final states of dissipative ODEs. As applications, we show that quantum algorithms can simulate dissipative non-Hermitian quantum dynamics and heat processes with fast-forwarded complexity sub-linear in time.

2407.10635 2026-01-28 quant-ph cs.DM cs.DS math.CO math.OC

NPA Hierarchy for Quantum Isomorphism and Homomorphism Indistinguishability

Prem Nigam Kar, David E. Roberson, Tim Seppelt, Peter Zeman

Journal ref Quantum 10, 1989 (2026)

Abstract

Mančinska and Roberson [FOCS'20] showed that two graphs are quantum isomorphic if and only if they admit the same number of homomorphisms from any planar graph. Atserias et al. [JCTB'19] proved that quantum isomorphism is undecidable in general, which motivates the study of its relaxations. In the classical setting, Roberson and Seppelt [ICALP'23] characterized the feasibility of each level of the Lasserre hierarchy of semidefinite programming relaxations of graph isomorphism in terms of equality of homomorphism counts from an appropriate graph class. The NPA hierarchy, a noncommutative generalization of the Lasserre hierarchy, provides a sequence of semidefinite programming relaxations for quantum isomorphism. In the quantum setting, we show that the feasibility of each level of the NPA hierarchy for quantum isomorphism is equivalent to equality of homomorphism counts from an appropriate class of planar graphs. Combining this characterization with the convergence of the NPA hierarchy, and noting that the union of these classes is the set of all planar graphs, we obtain a new proof of the result of Mančinska and Roberson [FOCS'20] that avoids the use of quantum groups. Moreover, this homomorphism indistinguishability characterization also yields a randomized polynomial-time algorithm deciding exact feasibility of each fixed level of the NPA hierarchy of SDP relaxations for quantum isomorphism.

2304.06325 2026-01-28 quant-ph cs.CR

How to Sign Quantum Messages

Mohammed Barhoush, Louis Salvail

Comments 51 pages

Journal ref Quantum 10, 1980 (2026)

Abstract

Signing quantum messages has long been considered impossible even under computational assumptions. In this work, we challenge this notion and provide three innovative approaches to sign quantum messages that are the first to ensure authenticity with public verifiability. Our contributions can be summarized as follows: 1) We introduce the concept of time-dependent (TD) signatures, where the signature of a quantum message depends on the time of signing and the verification process depends on the time of the signature reception. We construct this primitive assuming the existence of post-quantum secure one-way functions (pq-OWFs) and time-lock puzzles (TLPs). 2) By utilizing verification keys that evolve over time, we eliminate the need for TLPs in our construction. This leads to TD signatures from pq-OWFs with dynamic verification keys. 3) We then consider the bounded quantum storage model, where adversaries are limited with respect to their quantum memories. We show that quantum messages can be signed with information-theoretic security in this model. Moreover, we leverage TD signatures to achieve the following objectives, relying solely on pq-OWFs: (a) We design a public key encryption scheme featuring authenticated quantum public keys that resist adversarial tampering. (b) We present a novel TD public-key quantum money scheme.

2212.07197 2026-01-28 math.OA cs.IT math.FA math.IT

Relative position between a pair of spin model subfactors

Keshab Chandra Bakshi, Satyajit Guin

Comments 124 pages, 17 figures, substantial improvement, angle operators computed, new section added

Journal ref J. Aust. Math. Soc. 119 (2025) 1-38

Abstract

Jones proposed the study of two subfactors of a $II_1$ factor as a quantization of two closed subspaces in a Hilbert space. The Pimsner-Popa probabilistic constant, Sano-Watatani angle, interior and exterior angle, and Connes-Størmer relative entropy (along with a slight variant of it) are a few key invariants for a pair of subfactors that analyze their relative position. In practice, however, the explicit computation of these invariants is often difficult. In this article, we provide an in-depth analysis of a special class of two subfactors, namely a pair of spin model subfactors of the hyperfinite type $II_1$ factor $R$. We first characterize when two distinct $n\times n$ complex Hadamard matrices give rise to distinct spin model subfactors. Then, a detailed investigation has been carried out for pairs of (Hadamard equivalent) complex Hadamard matrices of order $2\times 2$ as well as Hadamard inequivalent complex Hadamard matrices of order $4\times 4$. To the best of our knowledge, this article is the first instance in the literature where the exact value of the Pimsner-Popa probabilistic constant and the noncommutative relative entropy for pairs of (non-trivial) subfactors have been obtained. Furthermore, we prove the factoriality of the intersection of the corresponding pair of subfactors using the "commuting square technique". En route, we construct an infinite family of potentially new subfactors of $R$. All these subfactors are irreducible with Jones index $4n$, $n\geq 2$. As a consequence, the rigidity of the interior angle between the spin model subfactors is established. Last but not least, we explicitly compute the Sano-Watatani angle between the spin model subfactors.

2104.12269 2026-01-28 cs.CL cs.AI cs.IR

A Bi-Encoder LSTM Model For Learning Unstructured Dialogs

Danny Brahman, Pooran S. Negi, Mohammad Mahoor

Abstract

Creating a data-driven model that is trained on a large dataset of unstructured dialogs is a crucial step in developing retrieval-based chatbot systems. This paper presents a Long Short-Term Memory (LSTM) based architecture that learns unstructured multi-turn dialogs and provides results on the task of selecting the best response from a collection of given responses. Ubuntu Dialog Corpus Version 2 was used as the corpus for training. We show that our model achieves 0.8%, 1.0% and 0.3% higher accuracy for Recall@1, Recall@2 and Recall@5 respectively than the benchmark model. We also show results on experiments performed by using several similarity functions, model hyper-parameters and word embeddings on the proposed architecture.
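
Editor's note on the metric: Recall@k for response selection is commonly computed as the fraction of dialog contexts whose correct response is ranked among the top k scored candidates. The sketch below illustrates that standard definition only; the function name, the score layout, and the toy numbers are assumptions for illustration and do not come from the paper.

```python
from typing import Sequence


def recall_at_k(
    scores: Sequence[Sequence[float]],  # scores[i][j]: model score of candidate j for context i (hypothetical layout)
    gold: Sequence[int],                # gold[i]: index of the correct candidate for context i
    k: int,
) -> float:
    """Fraction of contexts whose correct response is among the k highest-scored candidates."""
    hits = 0
    for cand_scores, gold_idx in zip(scores, gold):
        # Rank candidate indices from highest to lowest score.
        ranked = sorted(range(len(cand_scores)), key=lambda j: cand_scores[j], reverse=True)
        if gold_idx in ranked[:k]:
            hits += 1
    return hits / len(scores)


# Toy data: three contexts, three candidate responses each (made-up scores).
toy_scores = [
    [0.9, 0.1, 0.3],
    [0.2, 0.8, 0.5],
    [0.4, 0.6, 0.7],
]
toy_gold = [0, 2, 0]

r1 = recall_at_k(toy_scores, toy_gold, 1)
r2 = recall_at_k(toy_scores, toy_gold, 2)
r3 = recall_at_k(toy_scores, toy_gold, 3)
```

With a candidate pool of fixed size, Recall@k necessarily rises with k and reaches 1 when k equals the pool size, which is why gains are typically largest at Recall@1.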

2601.19899 2026-01-28 cs.CL

Evaluation of Oncotimia: An LLM based system for supporting tumour boards

Luis Lorenzo, Marcos Montana-Mendez, Sergio Figueiras, Miguel Boubeta, Cristobal Bernardo-Castineira

Comments 9 pages, 2 figures

Abstract

Multidisciplinary tumour boards (MDTBs) play a central role in oncology decision-making but require manual processes and structuring large volumes of heterogeneous clinical information, resulting in a substantial documentation burden. In this work, we present ONCOTIMIA, a modular and secure clinical tool designed to integrate generative artificial intelligence (GenAI) into oncology workflows and evaluate its application to the automatic completion of lung cancer tumour board forms using large language models (LLMs). The system combines a multi-layer data lake, hybrid relational and vector storage, retrieval-augmented generation (RAG) and a rule-driven adaptive form model to transform unstructured clinical documentation into structured and standardised tumour board records. We assess the performance of six LLMs deployed through AWS Bedrock on ten lung cancer cases, measuring both form completion accuracy and end-to-end latency. The results demonstrate high performance across models, with the best performing configuration achieving 80% correct field completion and clinically acceptable response times for most LLMs. Larger and more recent models exhibit the best accuracies without incurring prohibitive latency. These findings provide empirical evidence that LLM-assisted form autocompletion is technically feasible and operationally viable in multidisciplinary lung cancer workflows and support its potential to significantly reduce documentation burden while preserving data quality.

2601.19898 2026-01-28 cs.CV

DuwatBench: Bridging Language and Visual Heritage through an Arabic Calligraphy Benchmark for Multimodal Understanding

Shubham Patle, Sara Ghaboura, Hania Tariq, Mohammad Usman Khan, Omkar Thawakar, Rao Muhammad Anwer, Salman Khan

Comments Accepted to EACL-2026 (Main Track)

Abstract

Arabic calligraphy represents one of the richest visual traditions of the Arabic language, blending linguistic meaning with artistic form. Although multimodal models have advanced across languages, their ability to process Arabic script, especially in artistic and stylized calligraphic forms, remains largely unexplored. To address this gap, we present DuwatBench, a benchmark of 1,272 curated samples containing about 1,475 unique words across six classical and modern calligraphic styles, each paired with sentence-level detection annotations. The dataset reflects real-world challenges in Arabic writing, such as complex stroke patterns, dense ligatures, and stylistic variations that often challenge standard text recognition systems. Using DuwatBench, we evaluated 13 leading Arabic and multilingual multimodal models and showed that while they perform well on clean text, they struggle with calligraphic variation, artistic distortions, and precise visual-text alignment. By publicly releasing DuwatBench and its annotations, we aim to advance culturally grounded multimodal research, foster fair inclusion of the Arabic language and visual heritage in AI systems, and support continued progress in this area. Our dataset (https://huggingface.co/datasets/MBZUAI/DuwatBench) and evaluation suite (https://github.com/mbzuai-oryx/DuwatBench) are publicly available.

2601.19897 2026-01-28 cs.LG

Self-Distillation Enables Continual Learning

Idan Shenfeld, Mehul Damani, Jonas Hübotter, Pulkit Agrawal

Abstract

Continual learning, enabling models to acquire new skills and knowledge without degrading existing capabilities, remains a fundamental challenge for foundation models. While on-policy reinforcement learning can reduce forgetting, it requires explicit reward functions that are often unavailable. Learning from expert demonstrations, the primary alternative, is dominated by supervised fine-tuning (SFT), which is inherently off-policy. We introduce Self-Distillation Fine-Tuning (SDFT), a simple method that enables on-policy learning directly from demonstrations. SDFT leverages in-context learning by using a demonstration-conditioned model as its own teacher, generating on-policy training signals that preserve prior capabilities while acquiring new skills. Across skill learning and knowledge acquisition tasks, SDFT consistently outperforms SFT, achieving higher new-task accuracy while substantially reducing catastrophic forgetting. In sequential learning experiments, SDFT enables a single model to accumulate multiple skills over time without performance regression, establishing on-policy distillation as a practical path to continual learning from demonstrations.

2601.19893 2026-01-28 cs.ET cs.DC

Enabling SSI-Compliant Use of EUDI Wallet Credentials through Trusted Execution Environment and Zero-Knowledge Proof

Nacereddine Sitouah, Francesco Bruschi, Stefano De Cillis

Abstract

The passing of the eIDAS amendment marks an important milestone for EU countries and changes how they must manage digital credentials for both public services and businesses. Italy has led in adopting eIDAS, first with the CIE and SPID identity schemes, and now with the Italian Wallet (IO app) aligned to eIDAS 2.0. Self-Sovereign Identity (SSI) is a decentralized model born from the success of Distributed Ledgers, giving individuals full control over their digital identity. The current eIDAS 2.0 and its implementing acts diverge from SSI principles, rendering the European Digital Identity Wallet (EUDIW) centralized and merely user-centric, prioritizing security and legal protection over true self-sovereignty. This paper proposes an architecture that enables the use of IT Wallet credentials and services in an SSI-compliant environment through Trusted Execution Environments and Zero-Knowledge Proofs.

2601.19884 2026-01-28 cs.CV cs.LG

SONIC: Spectral Oriented Neural Invariant Convolutions

Gijs Joppe Moens, Regina Beets-Tan, Eduardo H. P. Pooch

Comments 10 pages, 4 figures. Accepted at ICLR 2026

Abstract

Convolutional Neural Networks (CNNs) rely on fixed-size kernels scanning local patches, which limits their ability to capture global context or long-range dependencies without very deep architectures. Vision Transformers (ViTs), in turn, provide global connectivity but lack spatial inductive bias, depend on explicit positional encodings, and remain tied to the initial patch size. Bridging these limitations requires a representation that is both structured and global. We introduce SONIC (Spectral Oriented Neural Invariant Convolutions), a continuous spectral parameterisation that models convolutional operators using a small set of shared, orientation-selective components. These components define smooth responses across the full frequency domain, yielding global receptive fields and filters that adapt naturally across resolutions. Across synthetic benchmarks, large-scale image classification, and 3D medical datasets, SONIC shows improved robustness to geometric transformations, noise, and resolution shifts, and matches or exceeds convolutional, attention-based, and prior spectral architectures with an order of magnitude fewer parameters. These results demonstrate that continuous, orientation-aware spectral parameterisations provide a principled and scalable alternative to conventional spatial and spectral operators.

2601.19877 2026-01-28 math.NA cs.NA

An Energy-Preserving Domain of Dependence Stabilization for the Linear Wave Equation on Cut-Cell Meshes

Gunnar Birke, Christian Engwer, Sandra May, Louis Petri, Hendrik Ranocha

Abstract

We present an energy-preserving (either energy-conservative or energy-dissipative) domain of dependence stabilization method for the linear wave equation on cut-cell meshes. Our scheme is based on a standard discontinuous Galerkin discretization in space and an explicit (strong stability preserving) Runge-Kutta method in time. Tailored stabilization terms allow for selecting the time step length based on the size of the background cells rather than the small cut cells by propagating information across small cut cells. The stabilization terms preserve the energy stability or energy conservation property of the underlying discontinuous Galerkin space discretization. Numerical results display the high accuracy and stability properties of our scheme.

2601.19871 2026-01-28 cs.CL

Reflective Translation: Improving Low-Resource Machine Translation via Structured Self-Reflection

Nicholas Cheng

Comments 12 pages, 3 figures, 6 tables. Accepted to the NeurIPS 2025 Workshop on Multilingual Representation Learning (Mexico City) and the AAAI 2025 Workshop on Language Models for Under-Resourced Communities (LM4UC). Code and data available at: https://github.com/Nickcheng123/reflective-translation-mt

Abstract

Low-resource languages such as isiZulu and isiXhosa face persistent challenges in machine translation due to limited parallel data and linguistic resources. Recent advances in large language models suggest that self-reflection, prompting a model to critique and revise its own outputs, can improve reasoning quality and factual consistency. Building on this idea, this paper introduces Reflective Translation, a prompt-based framework in which a model generates an initial translation, produces a structured self-critique, and then uses this reflection to generate a refined translation. The approach is evaluated on English-isiZulu and English-isiXhosa translation using OPUS-100 and NTREX-African, across multiple prompting strategies and confidence thresholds. Results show consistent improvements in both BLEU and COMET scores between first- and second-pass translations, with average gains of up to +0.22 BLEU and +0.18 COMET. Statistical significance testing using paired nonparametric tests confirms that these improvements are robust. The proposed method is model-agnostic, requires no fine-tuning, and introduces a reflection-augmented dataset that can support future supervised or analysis-driven work. These findings demonstrate that structured self-reflection is a practical and effective mechanism for improving translation quality in low-resource settings.

2601.19867 2026-01-28 cs.LG

Bandits in Flux: Adversarial Constraints in Dynamic Environments

Tareq Si Salem

Comments Accepted to AISTATS 2026

Abstract

We investigate the challenging problem of adversarial multi-armed bandits operating under time-varying constraints, a scenario motivated by numerous real-world applications. To address this complex setting, we propose a novel primal-dual algorithm that extends online mirror descent through the incorporation of suitable gradient estimators and effective constraint handling. We provide theoretical guarantees establishing sublinear dynamic regret and sublinear constraint violation for our proposed policy. Our algorithm achieves state-of-the-art performance in terms of both regret and constraint violation. Empirical evaluations demonstrate the superiority of our approach.

2601.19862 2026-01-28 cs.LG cs.GT

Calibration without Ground Truth

Yuqing Kong, Mingyu Song, Yizhou Wang, Yifan Wu

Abstract

Villalobos et al. [2024] predict that publicly available human text will be exhausted within the next decade. Thus, improving models without access to ground-truth labels becomes increasingly important. We propose a label-free post-processing framework that improves a strong but miscalibrated model using a weaker yet better-calibrated reference. Our framework guarantees a strict performance improvement under any proper loss. Our approach is based on a characterization of when strict improvement is possible: when the strong and reference models are not mutually calibrated. We formalize this condition, connect it to arbitrage and no-trade results from economics, and develop an efficient Bregman projection algorithm that guarantees worst-case loss reduction without labels. Experiments on representative LLMs across varying scales demonstrate that our label-free method significantly reduces proper losses and calibration errors, achieving performance competitive with supervised baselines.

2601.19856 2026-01-28 cs.RO

Estimating Trust in Human-Robot Collaboration through Behavioral Indicators and Explainability

Giulio Campagna, Marta Lagomarsino, Marta Lorenzini, Dimitrios Chrysostomou, Matthias Rehm, Arash Ajoudani

Journal ref IEEE Robotics and Automation Letters (Volume: 10, Issue: 10, October 2025)

Abstract

Industry 5.0 focuses on human-centric collaboration between humans and robots, prioritizing safety, comfort, and trust. This study introduces a data-driven framework to assess trust using behavioral indicators. The framework employs a Preference-Based Optimization algorithm to generate trust-enhancing trajectories based on operator feedback. This feedback serves as ground truth for training machine learning models to predict trust levels from behavioral indicators. The framework was tested in a chemical industry scenario where a robot assisted a human operator in mixing chemicals. Machine learning models classified trust with over 80% accuracy, with the Voting Classifier achieving 84.07% accuracy and an AUC-ROC score of 0.90. These findings underscore the effectiveness of data-driven methods in assessing trust within human-robot collaboration, emphasizing the valuable role behavioral indicators play in predicting the dynamics of human trust.

2601.19853 2026-01-28 eess.SP cs.LG

Generative Latent Alignment for Interpretable Radar Based Occupancy Detection in Ambient Assisted Living

Huy Trinh

Abstract

In this work, we study how to make mmWave radar presence detection more interpretable for Ambient Assisted Living (AAL) settings, where camera-based sensing raises privacy concerns. We propose a Generative Latent Alignment (GLA) framework that combines a lightweight convolutional variational autoencoder with a frozen CLIP text encoder to learn a low-dimensional latent representation of radar Range-Angle (RA) heatmaps. The latent space is softly aligned with two semantic anchors corresponding to "empty room" and "person present", and Grad-CAM is applied in this aligned latent space to visualize which spatial regions support each presence decision. On our mmWave radar dataset, we qualitatively observe that the "person present" class produces compact Grad-CAM blobs that coincide with strong RA returns, whereas "empty room" samples yield diffuse or no evidence. We also conduct an ablation study using unrelated text prompts, which degrades both reconstruction and localization, suggesting that radar-specific anchors are important for meaningful explanations in this setting.

2601.19851 2026-01-28 cs.HC cs.RO

How Does Delegation in Social Interaction Evolve Over Time? Navigation with a Robot for Blind People

Rayna Hata, Masaki Kuribayashi, Allan Wang, Hironobu Takagi, Chieko Asakawa

Comments Pre-print of paper accepted into CHI 2026

Abstract

Autonomy and independent navigation are vital to daily life but remain challenging for individuals with blindness. Robotic systems can enhance mobility and confidence by providing intelligent navigation assistance. However, fully autonomous systems may reduce users' sense of control, even when they wish to remain actively involved. Although collaboration between user and robot has been recognized as important, little is known about how perceptions of this relationship change with repeated use. We present a repeated exposure study with six blind participants who interacted with a navigation-assistive robot in a real-world museum. Participants completed tasks such as navigating crowds, approaching lines, and encountering obstacles. Findings show that participants refined their strategies over time, developing clearer preferences about when to rely on the robot versus act independently. This work provides insights into how strategies and preferences evolve with repeated interaction and offers design implications for robots that adapt to user needs over time.

2601.19850 2026-01-28 cs.CV

EgoHandICL: Egocentric 3D Hand Reconstruction with In-Context Learning

Binzhu Xie, Shi Qiu, Sicheng Zhang, Yinqiao Wang, Hao Xu, Muzammal Naseer, Chi-Wing Fu, Pheng-Ann Heng

Comments Accepted in ICLR 2026, Codebase: https://github.com/Nicous20/EgoHandICL

Abstract

Robust 3D hand reconstruction in egocentric vision is challenging due to depth ambiguity, self-occlusion, and complex hand-object interactions. Prior methods mitigate these issues by scaling training data or adding auxiliary cues, but they often struggle in unseen contexts. We present EgoHandICL, the first in-context learning (ICL) framework for 3D hand reconstruction that improves semantic alignment, visual consistency, and robustness under challenging egocentric conditions. EgoHandICL introduces complementary exemplar retrieval guided by vision-language models (VLMs), an ICL-tailored tokenizer for multimodal context, and a masked autoencoder (MAE)-based architecture trained with hand-guided geometric and perceptual objectives. Experiments on ARCTIC and EgoExo4D show consistent gains over state-of-the-art methods. We also demonstrate real-world generalization and improve EgoVLM hand-object interaction reasoning by using reconstructed hands as visual prompts. Code and data: https://github.com/Nicous20/EgoHandICL

2601.19849 2026-01-28 cs.CV

HexFormer: Hyperbolic Vision Transformer with Exponential Map Aggregation

Haya Alyoussef, Ahmad Bdeir, Diego Coello de Portugal Mecke, Tom Hanika, Niels Landwehr, Lars Schmidt-Thieme

Abstract

Data across modalities such as images, text, and graphs often contains hierarchical and relational structures, which are challenging to model within Euclidean geometry. Hyperbolic geometry provides a natural framework for representing such structures. Building on this property, this work introduces HexFormer, a hyperbolic vision transformer for image classification that incorporates exponential map aggregation within its attention mechanism. Two designs are explored: a hyperbolic ViT (HexFormer) and a hybrid variant (HexFormer-Hybrid) that combines a hyperbolic encoder with a Euclidean linear classification head. HexFormer incorporates a novel attention mechanism based on exponential map aggregation, which yields more accurate and stable aggregated representations than standard centroid-based averaging, showing that simpler approaches retain competitive merit. Experiments across multiple datasets demonstrate consistent performance improvements over Euclidean baselines and prior hyperbolic ViTs, with the hybrid variant achieving the strongest overall results. Additionally, this study provides an analysis of gradient stability in hyperbolic transformers. The results reveal that hyperbolic models exhibit more stable gradients and reduced sensitivity to warmup strategies compared to Euclidean architectures, highlighting their robustness and efficiency in training. Overall, these findings indicate that hyperbolic geometry can enhance vision transformer architectures by improving gradient stability and accuracy. In addition, relatively simple mechanisms such as exponential map aggregation can provide strong practical benefits.

2601.19847 2026-01-28 cs.CL

Identifying and Transferring Reasoning-Critical Neurons: Improving LLM Inference Reliability via Activation Steering

Fangan Dong, Zuming Yan, Xuri Ge, Zhiwei Xu, Mengqi Zhang, Xuanang Chen, Ben He, Xin Xin, Zhumin Chen, Ying Zhou

Abstract

Despite the strong reasoning capabilities of recent large language models (LLMs), achieving reliable performance on challenging tasks often requires post-training or computationally expensive sampling strategies, limiting their practical efficiency. In this work, we first show that a small subset of neurons in LLMs exhibits strong predictive correlations with reasoning correctness. Based on this observation, we propose AdaRAS (Adaptive Reasoning Activation Steering), a lightweight test-time framework that improves reasoning reliability by selectively intervening on neuron activations. AdaRAS identifies Reasoning-Critical Neurons (RCNs) via a polarity-aware mean-difference criterion and adaptively steers their activations during inference, enhancing incorrect reasoning traces while avoiding degradation on already-correct cases. Experiments on 10 mathematics and coding benchmarks demonstrate consistent improvements, including over 13% gains on AIME-24 and AIME-25. Moreover, AdaRAS exhibits strong transferability across datasets and scalability to stronger models, outperforming post-training methods without additional training or sampling cost.

2601.19839 2026-01-28 cs.RO cs.AI cs.HC

HARMONI: Multimodal Personalization of Multi-User Human-Robot Interactions with LLMs

Jeanne Malécot, Hamed Rahimi, Jeanne Cattoni, Marie Samson, Mouad Abrini, Mahdi Khoramshahi, Maribel Pino, Mohamed Chetouani

Abstract

Existing human-robot interaction systems often lack mechanisms for sustained personalization and dynamic adaptation in multi-user environments, limiting their effectiveness in real-world deployments. We present HARMONI, a multimodal personalization framework that leverages large language models to enable socially assistive robots to manage long-term multi-user interactions. The framework integrates four key modules: (i) a perception module that identifies active speakers and extracts multimodal input; (ii) a world modeling module that maintains representations of the environment and short-term conversational context; (iii) a user modeling module that updates long-term speaker-specific profiles; and (iv) a generation module that produces contextually grounded and ethically informed responses. Through extensive evaluation and ablation studies on four datasets, as well as a real-world, scenario-driven user study in a nursing home environment, we demonstrate that HARMONI supports robust speaker identification, online memory updating, and ethically aligned personalization, outperforming baseline LLM-driven approaches in user modeling accuracy, personalization quality, and user satisfaction.

2601.19838 2026-01-28 math.NA cs.NA

Modified splitting methods for Gross-Pitaevskii systems modelling Bose-Einstein condensates: Time evolution and ground state computation

Mechthild Thalhammer, Gregor Thalhammer-Thurner

English abstract

The year 2025 marks the 100th and 30th anniversaries of the discovery of Bose-Einstein condensation and of its successful experimental realisation, respectively. Inspired by these important research achievements, a conceptually simple approach is proposed to facilitate reliable and efficient numerical simulations. The structure of the underlying systems of coupled Gross-Pitaevskii equations suggests the use of optimised high-order operator splitting methods for dynamical evolution and ground state computation. A second-order barrier, however, prevents the application of standard operator splitting methods to both time evolution and imaginary time propagation. An innovative alternative approach yields novel modified operator splitting methods that remain stable under moderate smallness assumptions on the time increments. The core idea is to incorporate commutators of the defining differential and nonlinear multiplication operators, since this makes it possible to fulfil the basic stability requirement of positive method coefficients. Further improvements with respect to convergence at the targeted precision arise from automatic adjustments of the time stepsizes by an inexpensive local error control. The presented numerical experiments confirm the favourable performance of a specific fourth-order modified operator splitting method. Among other things, it is demonstrated that the excellent mass and energy conservation in long-term evolutions, an intrinsic attribute of geometric numerical integrators for Hamiltonian systems, is maintained for a sensible variation of the time stepsizes. Moreover, the benefits of adaptive higher-order approximations in ground state computations are illustrated.
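For orientation, the standard second-order (Strang) splitting that such modified methods build on can be written out as below. This is background under simplifying assumptions, not the modified fourth-order scheme of the paper: a single Gross-Pitaevskii equation is used instead of a coupled system, and the potential $V$, the coupling constant $g$, and the time increment $\tau$ are illustrative symbols.

```latex
% Model problem (single equation, illustrative):
%   i \partial_t \psi = A\psi + B(\psi)\psi,
%   A\psi = -\tfrac{1}{2}\Delta\psi,  B(\psi) = V + g|\psi|^2  (V, g real).
% Since |\psi| is invariant under the flow of B, that sub-step is solved exactly.
\begin{align*}
\psi^{(1)} &= \exp\!\Bigl(-\mathrm{i}\,\tfrac{\tau}{2}\bigl(V + g\,|\psi_n|^2\bigr)\Bigr)\,\psi_n, \\
\psi^{(2)} &= \exp\!\Bigl(\mathrm{i}\,\tfrac{\tau}{2}\,\Delta\Bigr)\,\psi^{(1)}, \\
\psi_{n+1} &= \exp\!\Bigl(-\mathrm{i}\,\tfrac{\tau}{2}\bigl(V + g\,|\psi^{(2)}|^2\bigr)\Bigr)\,\psi^{(2)}.
\end{align*}
```

All coefficients here ($\tfrac{1}{2}$, $1$, $\tfrac{1}{2}$) are positive. The classical order barrier states that splitting methods of order higher than two built from such compositions with real coefficients must contain negative coefficients, which is the obstacle the abstract refers to.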

2601.19834 2026-01-28 cs.AI

Visual Generation Unlocks Human-Like Reasoning through Multimodal World Models

Jialong Wu, Xiaoying Zhang, Hongyi Yuan, Xiangcheng Zhang, Tianhao Huang, Changjing He, Chaoyi Deng, Renrui Zhang, Youbin Wu, Mingsheng Long

Comments Project page: https://thuml.github.io/Reasoning-Visual-World

English abstract

Humans construct internal world models and reason by manipulating the concepts within these models. Recent advances in AI, particularly chain-of-thought (CoT) reasoning, approximate such human cognitive abilities, where world models are believed to be embedded within large language models. Expert-level performance in formal and abstract domains such as mathematics and programming has been achieved in current systems by relying predominantly on verbal reasoning. However, they still lag far behind humans in domains like physical and spatial intelligence, which require richer representations and prior knowledge. The emergence of unified multimodal models (UMMs) capable of both verbal and visual generation has therefore sparked interest in more human-like reasoning grounded in complementary multimodal pathways, though their benefits remain unclear. From a world-model perspective, this paper presents the first principled study of when and how visual generation benefits reasoning. Our key position is the visual superiority hypothesis: for certain tasks, particularly those grounded in the physical world, visual generation more naturally serves as a world model, whereas purely verbal world models encounter bottlenecks arising from representational limitations or insufficient prior knowledge. Theoretically, we formalize internal world modeling as a core component of CoT reasoning and analyze distinctions among different forms of world models. Empirically, we identify tasks that necessitate interleaved visual-verbal CoT reasoning, constructing a new evaluation suite, VisWorld-Eval. Controlled experiments on a state-of-the-art UMM show that interleaved CoT significantly outperforms purely verbal CoT on tasks that favor visual world modeling, but offers no clear advantage otherwise. Together, this work clarifies the potential of multimodal world modeling for more powerful, human-like multimodal AI.

2601.19833 2026-01-28 cs.LG

A Multi-directional Meta-Learning Framework for Class-Generalizable Anomaly Detection

Padmaksha Roy, Lamine Mili, Almuatazbellah Boker

English abstract

In this paper, we address the problem of class-generalizable anomaly detection, where the objective is to develop a unified model that learns from the available normal data and a small amount of anomaly data in order to detect completely unseen anomalies, also referred to as out-of-distribution (OOD) classes. Adding to this challenge is the fact that anomaly data is rare and costly to label. To achieve this, we propose a multidirectional meta-learning algorithm: at the inner level, the model aims to learn the manifold of the normal data (representation); at the outer level, the model is meta-tuned with a few anomaly samples to maximize the softmax confidence margin between the normal and anomaly samples (decision surface calibration), treating normals as in-distribution (ID) and anomalies as out-of-distribution (OOD). By iteratively repeating this process over multiple episodes of predominantly normal and a small number of anomaly samples, we realize a multidirectional meta-learning framework. This two-level optimization, enhanced by multidirectional training, enables stronger generalization to unseen anomaly classes.
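One plausible reading of the "softmax confidence margin" in this abstract is the gap between the average maximum softmax probability on normal samples and on anomaly samples. The sketch below computes that quantity; it is an assumption about the objective, not the authors' definition, and the helper names and toy logits are hypothetical.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def confidence(logits):
    """Maximum softmax probability, a common confidence score."""
    return max(softmax(logits))

def confidence_margin(normal_logits, anomaly_logits):
    """Mean confidence on normal samples minus mean confidence on anomaly samples."""
    mean_normal = sum(confidence(z) for z in normal_logits) / len(normal_logits)
    mean_anomaly = sum(confidence(z) for z in anomaly_logits) / len(anomaly_logits)
    return mean_normal - mean_anomaly

# Toy logits: a peaked prediction for a normal sample, a flat one for an anomaly.
margin = confidence_margin([[4.0, 0.0, 0.0]], [[0.0, 0.0, 0.0]])
```

Under this reading, an outer-level update would adjust the model so that the margin grows, pushing anomaly predictions toward a flat, low-confidence distribution.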

2601.19832 2026-01-28 cs.RO

Information-Theoretic Detection of Bimanual Interactions for Dual-Arm Robot Plan Generation

Elena Merlo, Marta Lagomarsino, Arash Ajoudani

Journal ref in IEEE Robotics and Automation Letters, vol. 10, no. 5, pp. 4532-4539, May 2025

English abstract

Programming by demonstration is a strategy to simplify the robot programming process for non-experts via human demonstrations. However, its adoption for bimanual tasks is an underexplored problem due to the complexity of hand coordination, which also hinders data recording. This paper presents a novel one-shot method for processing a single RGB video of a bimanual task demonstration to generate an execution plan for a dual-arm robotic system. To detect hand coordination policies, we apply Shannon's information theory to analyze the information flow between scene elements and leverage scene graph properties. The generated plan is a modular behavior tree that assumes different structures based on the desired arm coordination. We validated the effectiveness of this framework through video demonstrations by multiple subjects, which we collected and made open-source, and by using data from an external, publicly available dataset. Comparisons with existing methods revealed significant improvements in generating a centralized execution plan for coordinating two-arm systems.
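As a small illustration of the kind of Shannon quantity such an analysis can rest on, the sketch below computes the mutual information between two discrete sequences. It is a generic example, not the paper's method: the abstract does not say which measure is used (mutual information is symmetric, so a directed measure may be what is meant by "information flow"), and the sequences standing in for per-frame states of two scene elements are made up.

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """Mutual information I(X;Y) in bits between two equal-length discrete sequences."""
    n = len(xs)
    count_x = Counter(xs)
    count_y = Counter(ys)
    count_xy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in count_xy.items():
        p_xy = c / n
        p_x = count_x[x] / n
        p_y = count_y[y] / n
        mi += p_xy * math.log2(p_xy / (p_x * p_y))
    return mi

# Hypothetical per-frame binary states (1 = moving, 0 = still).
left_hand = [0, 0, 1, 1]
held_object = [0, 0, 1, 1]   # moves exactly when the left hand moves
right_hand = [0, 1, 0, 1]    # unrelated to the left hand in this toy data

coupled = mutual_information(left_hand, held_object)
uncoupled = mutual_information(left_hand, right_hand)
```

In this toy data the hand and the object it carries share one full bit of information, while the two hands share none, which is the sort of contrast that could separate coordinated from independent motion.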

2601.19828 2026-01-28 math.NA cs.NA

Galerkin-type time discretizations for parabolic and hyperbolic problems: stability and a priori error analysis

Sergio Gómez

English abstract

We present a unified framework for the analysis of space-time methods based on Galerkin-type time discretizations for parabolic and hyperbolic problems. Crucially, the stability analysis relies on a suitable choice of test functions to establish the continuous dependence of the discrete solution on the data in $L^{\infty}(0, T; X)$ norms, which is then used to derive a priori error estimates. This approach closes the gap in the analysis of some methods in this class caused by the limitation of standard energy arguments, and is characterized by the absence of Grönwall estimates, applicability to arbitrary approximation degrees, reduced regularity assumptions, and robustness with respect to the model parameters.

2601.19826 2026-01-28 cs.RO cs.HC

Whether We Care, How We Reason: The Dual Role of Anthropomorphism and Moral Foundations in Robot Abuse

Fan Yang, Renkai Ma, Yaxin Hu, Lingyao Li

English abstract

As robots become increasingly integrated into daily life, understanding responses to robot mistreatment carries important ethical and design implications. This mixed-methods study (N = 201) examined how anthropomorphic levels and moral foundations shape reactions to robot abuse. Participants viewed videos depicting physical mistreatment of robots varying in humanness (Spider, Twofoot, Humanoid) and completed measures assessing moral foundations, anger, and social distance. Results revealed that anthropomorphism determines whether people extend moral consideration to robots, while moral foundations shape how they reason about such consideration. Qualitative analysis revealed distinct reasoning patterns: low-progressivism individuals employed character-based judgments, while high-progressivism individuals engaged in future-oriented moral deliberation. Findings offer implications for robot design and policy communication.

2601.19825 2026-01-28 cs.AI cs.DB

Routing End User Queries to Enterprise Databases

Saikrishna Sudarshan, Tanay Kulkarni, Manasi Patwardhan, Lovekesh Vig, Ashwin Srinivasan, Tanmay Tulsidas Verlekar

Comments 6 pages, 2 figures

English abstract

We address the task of routing natural language queries in multi-database enterprise environments. We construct realistic benchmarks by extending existing NL-to-SQL datasets. Our study shows that routing becomes increasingly challenging with larger, domain-overlapping DB repositories and ambiguous queries, motivating the need for more structured and robust reasoning-based solutions. By explicitly modelling schema coverage, structural connectivity, and fine-grained semantic alignment, the proposed modular, reasoning-driven reranking strategy consistently outperforms embedding-only and direct LLM-prompting baselines across all the metrics.