arXivDaily arXiv每日学术速递 周一至周五更新
重置
全部学科分类 1125
2602.18006 2026-02-23 cs.CV

MUOT_3M: A 3 Million Frame Multimodal Underwater Benchmark and the MUTrack Tracking Method

Ahsan Baidar Bakht, Mohamad Alansari, Muhayy Ud Din, Muzammal Naseer, Sajid Javed, Irfan Hussain, Jiri Matas, Arif Mahmood

详情
英文摘要

Underwater Object Tracking (UOT) is crucial for efficient marine robotics, large scale ecological monitoring, and ocean exploration; however, progress has been hindered by the scarcity of large, multimodal, and diverse datasets. Existing benchmarks remain small and RGB only, limiting robustness under severe color distortion, turbidity, and low visibility conditions. We introduce MUOT_3M, the first pseudo multimodal UOT benchmark comprising 3 million frames from 3,030 videos (27.8h) annotated with 32 tracking attributes, 677 fine grained classes, and synchronized RGB, estimated enhanced RGB, estimated depth, and language modalities validated by a marine biologist. Building upon MUOT_3M, we propose MUTrack, a SAM-based multimodal to unimodal tracker featuring visual geometric alignment, vision language fusion, and four level knowledge distillation that transfers multimodal knowledge into a unimodal student model. Extensive evaluations across five UOT benchmarks demonstrate that MUTrack achieves up to 8.40% higher AUC and 7.80% higher precision than the strongest SOTA baselines while running at 24 FPS. MUOT_3M and MUTrack establish a new foundation for scalable, multimodally trained yet practically deployable underwater tracking.

2602.18002 2026-02-23 cs.LG

Asynchronous Heavy-Tailed Optimization

Junfei Sun, Dixi Yao, Xuchen Gong, Tahseen Rabbani, Manzil Zaheer, Tian Li

Comments 8-page main body, 25-page appendix, 5 figures

详情
英文摘要

Heavy-tailed stochastic gradient noise, commonly observed in transformer models, can destabilize the optimization process. Recent works mainly focus on developing and understanding approaches to address heavy-tailed noise in the centralized or distributed, synchronous setting, leaving the interactions between such noise and asynchronous optimization underexplored. In this work, we investigate two communication schemes that handle stragglers with asynchronous updates in the presence of heavy-tailed gradient noise. We propose and theoretically analyze algorithmic modifications based on delay-aware learning rate scheduling and delay compensation to enhance the performance of asynchronous algorithms. Our convergence guarantees under heavy-tailed noise match the rate of the synchronous counterparts and improve delay tolerance compared with existing asynchronous approaches. Empirically, our approaches outperform prior synchronous and asynchronous methods in terms of accuracy/runtime trade-offs and are more robust to hyperparameters in both image and language tasks.

2602.18000 2026-02-23 cs.CV

Image Quality Assessment: Exploring Quality Awareness via Memory-driven Distortion Patterns Matching

Xuting Lan, Mingliang Zhou, Xuekai Wei, Jielu Yan, Yueting Huang, Huayan Pu, Jun Luo, Weijia Jia

详情
英文摘要

Existing full-reference image quality assessment (FR-IQA) methods achieve high-precision evaluation by analysing feature differences between reference and distorted images. However, their performance is constrained by the quality of the reference image, which limits real-world applications where ideal reference sources are unavailable. Notably, the human visual system has the ability to accumulate visual memory, allowing image quality assessment on the basis of long-term memory storage. Inspired by this biological memory mechanism, we propose a memory-driven quality-aware framework (MQAF), which establishes a memory bank for storing distortion patterns and dynamically switches between dual-mode quality assessment strategies to reduce reliance on high-quality reference images. When reference images are available, MQAF obtains reference-guided quality scores by adaptively weighting reference information and comparing the distorted image with stored distortion patterns in the memory bank. When the reference image is absent, the framework relies on distortion patterns in the memory bank to infer image quality, enabling no-reference quality assessment (NR-IQA). The experimental results show that our method outperforms state-of-the-art approaches across multiple datasets while adapting to both no-reference and full-reference tasks.

2602.17998 2026-02-23 cs.LG cs.AI cs.CE cs.SY eess.SY

PHAST: Port-Hamiltonian Architecture for Structured Temporal Dynamics Forecasting

Shubham Bhardwaj, Chandrajit Bajaj

Comments 50 pages

详情
英文摘要

Real physical systems are dissipative -- a pendulum slows, a circuit loses charge to heat -- and forecasting their dynamics from partial observations is a central challenge in scientific machine learning. We address the \emph{position-only} (q-only) problem: given only generalized positions~$q_t$ at discrete times (momenta~$p_t$ latent), learn a structured model that (a)~produces stable long-horizon forecasts and (b)~recovers physically meaningful parameters when sufficient structure is provided. The port-Hamiltonian framework makes the conservative-dissipative split explicit via $\dot{x}=(J-R)\nabla H(x)$, guaranteeing $dH/dt\le 0$ when $R\succeq 0$. We introduce \textbf{PHAST} (Port-Hamiltonian Architecture for Structured Temporal dynamics), which decomposes the Hamiltonian into potential~$V(q)$, mass~$M(q)$, and damping~$D(q)$ across three knowledge regimes (KNOWN, PARTIAL, UNKNOWN), uses efficient low-rank PSD/SPD parameterizations, and advances dynamics with Strang splitting. Across thirteen q-only benchmarks spanning mechanical, electrical, molecular, thermal, gravitational, and ecological systems, PHAST achieves the best long-horizon forecasting among competitive baselines and enables physically meaningful parameter recovery when the regime provides sufficient anchors. We show that identification is fundamentally ill-posed without such anchors (gauge freedom), motivating a two-axis evaluation that separates forecasting stability from identifiability.

2602.17993 2026-02-23 cs.LG cs.AI

Turbo Connection: Reasoning as Information Flow from Higher to Lower Layers

Mohan Tang, Sidi Lu

详情
英文摘要

Complex problems, whether in math, logic, or planning, are solved by humans through a sequence of steps where the result of one step informs the next. In this work, we adopt the perspective that the reasoning power of Transformers is fundamentally limited by a fixed maximum number of steps along any latent path of computation. To address this, we introduce Turbo Connection (TurboConn), a novel architecture that overcomes the fixed-depth constraint by routing multiple residual connections from the higher-layer hidden states of each token $t$ to the lower layers of token $t+1$. Fine-tuning pre-trained LLMs with our method not only yields accuracy gains of 0.9% to over 10% on benchmarks like GSM8K, Parity, and multi-step arithmetic, but also demonstrates that the density of these backward connections is critical; our dense interaction significantly outperforms "sparse" alternatives that only pass a single hidden state or vector. Notably, TurboConn can be integrated into pre-trained LLMs to overcome task-specific plateaus: while a fine-tuned Qwen-3-1.7B achieves only 53.78% on Parity, adding our architectural modification enables the model to reach 100% accuracy, all without the necessity to retrain the full model from scratch or sophisticated curriculum learning. Our results provide strong empirical evidence that the depth of the computational path is a key factor in reasoning ability, also offering a new mechanism to enhance LLMs without significantly affecting generation latency.

2602.17985 2026-02-23 cs.LG stat.ML

Learning Without Training

Ryan O'Dowd

Comments PhD Dissertation of Ryan O'Dowd, defended successfully at Claremont Graduate University on 1/28/2026

详情
英文摘要

Machine learning is at the heart of managing the real-world problems associated with massive data. With the success of neural networks on such large-scale problems, more research in machine learning is being conducted now than ever before. This dissertation focuses on three different projects rooted in mathematical theory for machine learning applications. The first project deals with supervised learning and manifold learning. In theory, one of the main problems in supervised learning is that of function approximation: that is, given some data set $\mathcal{D}=\{(x_j,f(x_j))\}_{j=1}^M$, can one build a model $F\approx f$? We introduce a method which aims to remedy several of the theoretical shortcomings of the current paradigm for supervised learning. The second project deals with transfer learning, which is the study of how an approximation process or model learned on one domain can be leveraged to improve the approximation on another domain. We study such liftings of functions when the data is assumed to be known only on a part of the whole domain. We are interested in determining subsets of the target data space on which the lifting can be defined, and how the local smoothness of the function and its lifting are related. The third project is concerned with the classification task in machine learning, particularly in the active learning paradigm. Classification has often been treated as an approximation problem as well, but we propose an alternative approach leveraging techniques originally introduced for signal separation problems. We introduce theory to unify signal separation with classification and a new algorithm which yields competitive accuracy to other recent active learning algorithms while providing results much faster.

2602.17981 2026-02-23 cs.CL cs.IR

Decomposing Retrieval Failures in RAG for Long-Document Financial Question Answering

Amine Kobeissi, Philippe Langlais

详情
英文摘要

Retrieval-augmented generation is increasingly used for financial question answering over long regulatory filings, yet reliability depends on retrieving the exact context needed to justify answers in high stakes settings. We study a frequent failure mode in which the correct document is retrieved but the page or chunk that contains the answer is missed, leading the generator to extrapolate from incomplete context. Despite its practical significance, this within-document retrieval failure mode has received limited systematic attention in the Financial Question Answering (QA) literature. We evaluate retrieval at multiple levels of granularity, document, page, and chunk level, and introduce an oracle based analysis to provide empirical upper bounds on retrieval and generative performance. On a 150 question subset of FinanceBench, we reproduce and compare diverse retrieval strategies including dense, sparse, hybrid, and hierarchical methods with reranking and query reformulation. Across methods, gains in document discovery tend to translate into stronger page recall, yet oracle performance still suggests headroom for page and chunk level retrieval. To target this gap, we introduce a domain fine-tuned page scorer that treats pages as an intermediate retrieval unit between documents and chunks. Unlike prior passage-based hierarchical retrieval, we fine-tune a bi-encoder specifically for page-level relevance on financial filings, exploiting the semantic coherence of pages. Overall, our results demonstrate a significant improvement in page recall and chunk retrieval.

2602.17978 2026-02-23 cs.LG cs.AI

Learning Optimal and Sample-Efficient Decision Policies with Guarantees

Daqian Shao

Comments A thesis submitted for the degree of DPhil in Computer Science at Oxford

详情
英文摘要

The paradigm of decision-making has been revolutionised by reinforcement learning and deep learning. Although this has led to significant progress in domains such as robotics, healthcare, and finance, the use of RL in practice is challenging, particularly when learning decision policies in high-stakes applications that may require guarantees. Traditional RL algorithms rely on a large number of online interactions with the environment, which is problematic in scenarios where online interactions are costly, dangerous, or infeasible. However, learning from offline datasets is hindered by the presence of hidden confounders. Such confounders can cause spurious correlations in the dataset and can mislead the agent into taking suboptimal or adversarial actions. Firstly, we address the problem of learning from offline datasets in the presence of hidden confounders. We work with instrumental variables (IVs) to identify the causal effect, which is an instance of a conditional moment restrictions (CMR) problem. Inspired by double/debiased machine learning, we derive a sample-efficient algorithm for solving CMR problems with convergence and optimality guarantees, which outperforms state-of-the-art algorithms. Secondly, we relax the conditions on the hidden confounders in the setting of (offline) imitation learning, and adapt our CMR estimator to derive an algorithm that can learn effective imitator policies with convergence rate guarantees. Finally, we consider the problem of learning high-level objectives expressed in linear temporal logic (LTL) and develop a provably optimal learning algorithm that improves sample efficiency over existing methods. Through evaluation on reinforcement learning benchmarks and synthetic and semi-synthetic datasets, we demonstrate the usefulness of the methods developed in this thesis in real-world decision making.

2602.17976 2026-02-23 cs.LG cs.AI

In-Context Learning for Pure Exploration in Continuous Spaces

Alessio Russo, Yin-Ching Lee, Ryan Welch, Aldo Pacchiano

详情
英文摘要

In active sequential testing, also termed pure exploration, a learner is tasked with the goal to adaptively acquire information so as to identify an unknown ground-truth hypothesis with as few queries as possible. This problem, originally studied by Chernoff in 1959, has several applications: classical formulations include Best-Arm Identification (BAI) in bandits, where actions index hypotheses, and generalized search problems, where strategically chosen queries reveal partial information about a hidden label. In many modern settings, however, the hypothesis space is continuous and naturally coincides with the query/action space: for example, identifying an optimal action in a continuous-armed bandit, localizing an $ε$-ball contained in a target region, or estimating the minimizer of an unknown function from a sequence of observations. In this work, we study pure exploration in such continuous spaces and introduce Continuous In-Context Pure Exploration for this regime. We introduce C-ICPE-TS, an algorithm that meta-trains deep neural policies to map observation histories to (i) the next continuous query action and (ii) a predicted hypothesis, thereby learning transferable sequential testing strategies directly from data. At inference time, C-ICPE-TS actively gathers evidence on previously unseen tasks and infers the true hypothesis without parameter updates or explicit hand-crafted information models. We validate C-ICPE-TS across a range of benchmarks, spanning continuous best-arm identification, region localization, and function minimizer identification.

2602.17975 2026-02-23 cs.LG cs.SY eess.SY

Generating adversarial inputs for a graph neural network model of AC power flow

Robert Parker

详情
英文摘要

This work formulates and solves optimization problems to generate input points that yield high errors between a neural network's predicted AC power flow solution and solutions to the AC power flow equations. We demonstrate this capability on an instance of the CANOS-PF graph neural network model, as implemented by the PF$Δ$ benchmark library, operating on a 14-bus test grid. Generated adversarial points yield errors as large as 3.4 per-unit in reactive power and 0.08 per-unit in voltage magnitude. When minimizing the perturbation from a training point necessary to satisfy adversarial constraints, we find that the constraints can be met with as little as an 0.04 per-unit perturbation in voltage magnitude on a single bus. This work motivates the development of rigorous verification and robust training methods for neural network surrogate models of AC power flow.

2602.17972 2026-02-23 cs.LG

Student Flow Modeling for School Decongestion via Stochastic Gravity Estimation and Constrained Spatial Allocation

Sebastian Felipe R. Bundoc, Paula Joy B. Martinez, Sebastian C. Ibañez, Erika Fille T. Legara

详情
英文摘要

School congestion, where student enrollment exceeds school capacity, is a major challenge in low- and middle-income countries. It highly impacts learning outcomes and deepens inequities in education. While subsidy programs that transfer students from public to private schools offer a mechanism to alleviate congestion without capital-intensive construction, they often underperform due to fragmented data systems that hinder effective implementation. The Philippine Educational Service Contracting program, one of the world's largest educational subsidy programs, exemplifies these challenges, falling short of its goal to decongest public schools. This prevents the science-based and data-driven analyses needed to understand what shapes student enrollment flows, particularly how families respond to economic incentives and spatial constraints. We introduce a computational framework for modeling student flow patterns and simulating policy scenarios. By synthesizing heterogeneous government data across nearly 3,000 institutions, we employ a stochastic gravity model estimated via negative binomial regression to derive behavioral elasticities for distance, net tuition cost, and socioeconomic determinants. These elasticities inform a doubly constrained spatial allocation mechanism that simulates student redistribution under varying subsidy amounts while respecting both origin candidate pools and destination slot capacities. We find that geographic proximity constrains school choice four times more strongly than tuition cost and that slot capacity, not subsidy amounts, is the binding constraint. Our work demonstrates that subsidy programs alone cannot resolve systemic overcrowding, and computational modeling can empower education policymakers to make equitable, data-driven decisions by revealing the structural constraints that shape effective resource allocation, even when resources are limited.

2602.17962 2026-02-23 cs.LG

Improving Generalizability of Hip Fracture Risk Prediction via Domain Adaptation Across Multiple Cohorts

Shuo Sun, Meiling Zhou, Chen Zhao, Joyce H. Keyak, Nancy E. Lane, Jeffrey D. Deng, Kuan-Jui Su, Hui Shen, Hong-Wen Deng, Kui Zhang, Weihua Zhou

Comments 26 pages, 3 tables, 1 figure

详情
英文摘要

Clinical risk prediction models often fail to be generalized across cohorts because underlying data distributions differ by clinical site, region, demographics, and measurement protocols. This limitation is particularly pronounced in hip fracture risk prediction, where the performance of models trained on one cohort (the source cohort) can degrade substantially when deployed in other cohorts (target cohorts). We used a shared set of clinical and DXA-derived features across three large cohorts - the Study of Osteoporotic Fractures (SOF), the Osteoporotic Fractures in Men Study (MrOS), and the UK Biobank (UKB), to systematically evaluate the performance of three domain adaptation methods - Maximum Mean Discrepancy (MMD), Correlation Alignment (CORAL), and Domain - Adversarial Neural Networks (DANN) and their combinations. For a source cohort with males only and a source cohort with females only, domain-adaptation methods consistently showed improved performance than the no-adaptation baseline (source-only training), and the use of combinations of multiple domain adaptation methods delivered the largest and most stable gains. The method that combines MMD, CORAL, and DANN achieved the highest discrimination with the area under curve (AUC) of 0.88 for a source cohort with males only and 0.95 for a source cohort with females only), demonstrating that integrating multiple domain adaptation methods could produce feature representations that are less sensitive to dataset differences. Unlike existing methods that rely heavily on supervised tuning or assume known outcomes of samples in target cohorts, our outcome-free approaches enable the model selection under realistic deployment conditions and improve generalization of models in hip fracture risk prediction.

2602.17958 2026-02-23 cs.LG

Bayesian Online Model Selection

Aida Afshar, Yuke Zhang, Aldo Pacchiano

详情
英文摘要

Online model selection in Bayesian bandits raises a fundamental exploration challenge: When an environment instance is sampled from a prior distribution, how can we design an adaptive strategy that explores multiple bandit learners and competes with the best one in hindsight? We address this problem by introducing a new Bayesian algorithm for online model selection in stochastic bandits. We prove an oracle-style guarantee of $O\left( d^* M \sqrt{T} + \sqrt{(MT)} \right)$ on the Bayesian regret, where $M$ is the number of base learners, $d^*$ is the regret coefficient of the optimal base learner, and $T$ is the time horizon. We also validate our method empirically across a range of stochastic bandit settings, demonstrating performance that is competitive with the best base learner. Additionally, we study the effect of sharing data among base learners and its role in mitigating prior mis-specification.

2602.17952 2026-02-23 cs.LG

Hardware-Friendly Input Expansion for Accelerating Function Approximation

Hu Lou, Yin-Jun Gao, Dong-Xiao Zhang, Tai-Jiao Du, Jun-Jie Zhang, Jia-Rui Zhang

Comments 22 pages, 4 figures

详情
英文摘要

One-dimensional function approximation is a fundamental problem in scientific computing and engineering applications. While neural networks possess powerful universal approximation capabilities, their optimization process is often hindered by flat loss landscapes induced by parameter-space symmetries, leading to slow convergence and poor generalization, particularly for high-frequency components. Inspired by the principle of \emph{symmetry breaking} in physics, this paper proposes a hardware-friendly approach for function approximation through \emph{input-space expansion}. The core idea involves augmenting the original one-dimensional input (e.g., $x$) with constant values (e.g., $π$) to form a higher-dimensional vector (e.g., $[π, π, x, π, π]$), effectively breaking parameter symmetries without increasing the network's parameter count. We evaluate the method on ten representative one-dimensional functions, including smooth, discontinuous, high-frequency, and non-differentiable functions. Experimental results demonstrate that input-space expansion significantly accelerates training convergence (reducing LBFGS iterations by 12\% on average) and enhances approximation accuracy (reducing final MSE by 66.3\% for the optimal 5D expansion). Ablation studies further reveal the effects of different expansion dimensions and constant selections, with $π$ consistently outperforming other constants. Our work proposes a low-cost, efficient, and hardware-friendly technique for algorithm design.

2602.17951 2026-02-23 cs.CV cs.AI

ROCKET: Residual-Oriented Multi-Layer Alignment for Spatially-Aware Vision-Language-Action Models

Guoheng Sun, Tingting Du, Kaixi Feng, Chenxiang Luo, Xingguo Ding, Zheyu Shen, Ziyao Wang, Yexiao He, Ang Li

详情
英文摘要

Vision-Language-Action (VLA) models enable instruction-following robotic manipulation, but they are typically pretrained on 2D data and lack 3D spatial understanding. An effective approach is representation alignment, where a strong vision foundation model is used to guide a 2D VLA model. However, existing methods usually apply supervision at only a single layer, failing to fully exploit the rich information distributed across depth; meanwhile, naïve multi-layer alignment can cause gradient interference. We introduce ROCKET, a residual-oriented multi-layer representation alignment framework that formulates multi-layer alignment as aligning one residual stream to another. Concretely, ROCKET employs a shared projector to align multiple layers of the VLA backbone with multiple layers of a powerful 3D vision foundation model via a layer-invariant mapping, which reduces gradient conflicts. We provide both theoretical justification and empirical analyses showing that a shared projector is sufficient and outperforms prior designs, and further propose a Matryoshka-style sparse activation scheme for the shared projector to balance multiple alignment losses. Our experiments show that, combined with a training-free layer selection strategy, ROCKET requires only about 4% of the compute budget while achieving 98.5% state-of-the-art success rate on LIBERO. We further demonstrate the superior performance of ROCKET across LIBERO-Plus and RoboTwin, as well as multiple VLA models. The code and model weights can be found at https://github.com/CASE-Lab-UMD/ROCKET-VLA.

2602.17948 2026-02-23 cs.LG

A Geometric Probe of the Accuracy-Robustness Trade-off: Sharp Boundaries in Symmetry-Breaking Dimensional Expansion

Yu Bai, Zhe Wang, Jiarui Zhang, Dong-Xiao Zhang, Yinjun Gao, Jun-Jie Zhang

Comments 22 pages, 3 figures

详情
英文摘要

The trade-off between clean accuracy and adversarial robustness is a pervasive phenomenon in deep learning, yet its geometric origin remains elusive. In this work, we utilize Symmetry-Breaking Dimensional Expansion (SBDE) as a controlled probe to investigate the mechanism underlying this trade-off. SBDE expands input images by inserting constant-valued pixels, which breaks translational symmetry and consistently improves clean accuracy (e.g., from $90.47\%$ to $95.63\%$ on CIFAR-10 with ResNet-18) by reducing parameter degeneracy. However, this accuracy gain comes at the cost of reduced robustness against iterative white-box attacks. By employing a test-time \emph{mask projection} that resets the inserted auxiliary pixels to their training values, we demonstrate that the vulnerability stems almost entirely from the inserted dimensions. The projection effectively neutralizes the attacks and restores robustness, revealing that the model achieves high accuracy by creating \emph{sharp boundaries} (steep loss gradients) specifically along the auxiliary axes. Our findings provide a concrete geometric explanation for the accuracy-robustness paradox: the optimization landscape deepens the basin of attraction to improve accuracy but inevitably erects steep walls along the auxiliary degrees of freedom, creating a fragile sensitivity to off-manifold perturbations.

2602.17947 2026-02-23 cs.LG

Understanding the Generalization of Bilevel Programming in Hyperparameter Optimization: A Tale of Bias-Variance Decomposition

Yubo Zhou, Jun Shu, Junmin Liu, Deyu Meng

详情
英文摘要

Gradient-based hyperparameter optimization (HPO) have emerged recently, leveraging bilevel programming techniques to optimize hyperparameter by estimating hypergradient w.r.t. validation loss. Nevertheless, previous theoretical works mainly focus on reducing the gap between the estimation and ground-truth (i.e., the bias), while ignoring the error due to data distribution (i.e., the variance), which degrades performance. To address this issue, we conduct a bias-variance decomposition for hypergradient estimation error and provide a supplemental detailed analysis of the variance term ignored by previous works. We also present a comprehensive analysis of the error bounds for hypergradient estimation. This facilitates an easy explanation of some phenomena commonly observed in practice, like overfitting to the validation set. Inspired by the derived theories, we propose an ensemble hypergradient strategy to reduce the variance in HPO algorithms effectively. Experimental results on tasks including regularization hyperparameter learning, data hyper-cleaning, and few-shot learning demonstrate that our variance reduction strategy improves hypergradient estimation. To explain the improved performance, we establish a connection between excess error and hypergradient estimation, offering some understanding of empirical observations.

2602.17941 2026-02-23 cs.LG cs.AI

Optimizing Graph Causal Classification Models: Estimating Causal Effects and Addressing Confounders

Simi Job, Xiaohui Tao, Taotao Cai, Haoran Xie, Jianming Yong, Xin Wang

详情
英文摘要

Graph data is becoming increasingly prevalent due to the growing demand for relational insights in AI across various domains. Organizations regularly use graph data to solve complex problems involving relationships and connections. Causal learning is especially important in this context, since it helps to understand cause-effect relationships rather than mere associations. Since many real-world systems are inherently causal, graphs can efficiently model these systems. However, traditional graph machine learning methods including graph neural networks (GNNs), rely on correlations and are sensitive to spurious patterns and distribution changes. On the other hand, causal models enable robust predictions by isolating true causal factors, thus making them more stable under such shifts. Causal learning also helps in identifying and adjusting for confounders, ensuring that predictions reflect true causal relationships and remain accurate even under interventions. To address these challenges and build models that are robust and causally informed, we propose CCAGNN, a Confounder-Aware causal GNN framework that incorporates causal reasoning into graph learning, supporting counterfactual reasoning and providing reliable predictions in real-world settings. Comprehensive experiments on six publicly available datasets from diverse domains show that CCAGNN consistently outperforms leading state-of-the-art models.

2602.17940 2026-02-23 cs.LG

Tighter Regret Lower Bound for Gaussian Process Bandits with Squared Exponential Kernel in Hypersphere

Shogo Iwazaki

Comments 27 pages, 2 figures

详情
英文摘要

We study an algorithm-independent, worst-case lower bound for the Gaussian process (GP) bandit problem in the frequentist setting, where the reward function is fixed and has a bounded norm in the known reproducing kernel Hilbert space (RKHS). Specifically, we focus on the squared exponential (SE) kernel, one of the most widely used kernel functions in GP bandits. One of the remaining open questions for this problem is the gap in the \emph{dimension-dependent} logarithmic factors between upper and lower bounds. This paper partially resolves this open question under a hyperspherical input domain. We show that any algorithm suffers $Ω(\sqrt{T (\ln T)^{d} (\ln \ln T)^{-d}})$ cumulative regret, where $T$ and $d$ represent the total number of steps and the dimension of the hyperspherical domain, respectively. Regarding the simple regret, we show that any algorithm requires $Ω(ε^{-2}(\ln \frac{1}ε)^d (\ln \ln \frac{1}ε)^{-d})$ time steps to find an $ε$-optimal point. We also provide the improved $O((\ln T)^{d+1}(\ln \ln T)^{-d})$ upper bound on the maximum information gain for the SE kernel. Our results guarantee the optimality of the existing best algorithm up to \emph{dimension-independent} logarithmic factors under a hyperspherical input domain.

2602.17937 2026-02-23 cs.CL cs.PL

Analyzing LLM Instruction Optimization for Tabular Fact Verification

Xiaotang Du, Giwon Hong, Wai-Chung Kwan, Rohit Saxena, Ivan Titov, Pasquale Minervini, Emily Allaway

详情
英文摘要

Instruction optimization provides a lightweight, model-agnostic approach to enhancing the reasoning performance of large language models (LLMs). This paper presents the first systematic comparison of instruction optimization, based on the DSPy optimization framework, for tabular fact verification. We evaluate four out-of-the-box prompting techniques that cover both text-only prompting and code use: direct prediction, Chain-of-Thought (CoT), ReAct with SQL tools, and CodeAct with Python execution. We study three optimizers from the DSPy framework -- COPRO, MiPROv2, and SIMBA -- across four benchmarks and three model families. We find that instruction optimization consistently improves verification accuracy, with MiPROv2 yielding the most stable gains for CoT, and SIMBA providing the largest benefits for ReAct agents, particularly at larger model scales. Behavioral analyses reveal that SIMBA encourages more direct reasoning paths by applying heuristics, thereby improving numerical comparison abilities in CoT reasoning and helping avoid unnecessary tool calls in ReAct agents. Across different prompting techniques, CoT remains effective for tabular fact checking, especially with smaller models. Although ReAct agents built with larger models can achieve competitive performance, they require careful instruction optimization.

2602.17934 2026-02-23 cs.LG cs.AI

Causal Neighbourhood Learning for Invariant Graph Representations

Simi Job, Xiaohui Tao, Taotao Cai, Haoran Xie, Jianming Yong

详情
英文摘要

Graph data often contain noisy and spurious correlations that mask the true causal relationships, which are essential for enabling graph models to make predictions based on the underlying causal structure of the data. Dependence on spurious connections makes it challenging for traditional Graph Neural Networks (GNNs) to generalize effectively across different graphs. Furthermore, traditional aggregation methods tend to amplify these spurious patterns, limiting model robustness under distribution shifts. To address these issues, we propose Causal Neighbourhood Learning with Graph Neural Networks (CNL-GNN), a novel framework that performs causal interventions on graph structure. CNL-GNN effectively identifies and preserves causally relevant connections and reduces spurious influences through the generation of counterfactual neighbourhoods and adaptive edge perturbation guided by learnable importance masking and an attention-based mechanism. In addition, by combining structural-level interventions with the disentanglement of causal features from confounding factors, the model learns invariant node representations that are robust and generalize well across different graph structures. Our approach improves causal graph learning beyond traditional feature-based methods, resulting in a robust classification model. Extensive experiments on four publicly available datasets, including multiple domain variants of one dataset, demonstrate that CNL-GNN outperforms state-of-the-art GNN models.

2602.17931 2026-02-23 cs.LG cs.AI

Memory-Based Advantage Shaping for LLM-Guided Reinforcement Learning

Narjes Nourzad, Carlee Joe-Wong

Comments Association for the Advancement of Artificial Intelligence (AAAI)

详情
英文摘要

In environments with sparse or delayed rewards, reinforcement learning (RL) incurs high sample complexity due to the large number of interactions needed for learning. This limitation has motivated the use of large language models (LLMs) for subgoal discovery and trajectory guidance. While LLMs can support exploration, frequent reliance on LLM calls raises concerns about scalability and reliability. We address these challenges by constructing a memory graph that encodes subgoals and trajectories from both LLM guidance and the agent's own successful rollouts. From this graph, we derive a utility function that evaluates how closely the agent's trajectories align with prior successful strategies. This utility shapes the advantage function, providing the critic with additional guidance without altering the reward. Our method relies primarily on offline input and only occasional online queries, avoiding dependence on continuous LLM supervision. Preliminary experiments in benchmark environments show improved sample efficiency and faster early learning compared to baseline RL methods, with final returns comparable to methods that require frequent LLM interaction.

2602.17930 2026-02-23 cs.LG cs.AI

MIRA: Memory-Integrated Reinforcement Learning Agent with Limited LLM Guidance

Narjes Nourzad, Carlee Joe-Wong

Comments International Conference on Learning Representations (ICLR'26)

详情
英文摘要

Reinforcement learning (RL) agents often suffer from high sample complexity in sparse or delayed reward settings due to limited prior structure. Large language models (LLMs) can provide subgoal decompositions, plausible trajectories, and abstract priors that facilitate early learning. However, heavy reliance on LLM supervision introduces scalability constraints and dependence on potentially unreliable signals. We propose MIRA (Memory-Integrated Reinforcement Learning Agent), which incorporates a structured, evolving memory graph to guide early training. The graph stores decision-relevant information, including trajectory segments and subgoal structures, and is constructed from both the agent's high-return experiences and LLM outputs. This design amortizes LLM queries into a persistent memory rather than requiring continuous real-time supervision. From this memory graph, we derive a utility signal that softly adjusts advantage estimation to influence policy updates without modifying the underlying reward function. As training progresses, the agent's policy gradually surpasses the initial LLM-derived priors, and the utility term decays, preserving standard convergence guarantees. We provide theoretical analysis showing that utility-based shaping improves early-stage learning in sparse-reward environments. Empirically, MIRA outperforms RL baselines and achieves returns comparable to approaches that rely on frequent LLM supervision, while requiring substantially fewer online LLM queries. Project webpage: https://narjesno.github.io/MIRA/

2602.17926 2026-02-23 cs.RO

Homotopic information gain for sparse active target tracking

Jennifer Wakulicz, Ki Myung Brian Lee, Teresa Vidal-Calleja, Robert Fitch

Comments 12 pages, 12 figures, accepted to Transactions on Robotics

详情
英文摘要

The problem of planning sensing trajectories for a mobile robot to collect observations of a target and predict its future trajectory is known as active target tracking. Enabled by probabilistic motion models, one may solve this problem by exploring the belief space of all trajectory predictions given future sensing actions to maximise information gain. However, for multi-modal motion models the notion of information gain is often ill-defined. This paper proposes a planning approach designed around maximising information regarding the target's homotopy class, or high-level motion. We introduce homotopic information gain, a measure of the expected high-level trajectory information given by a measurement. We show that homotopic information gain is a lower bound for metric or low-level information gain, and is as sparsely distributed in the environment as obstacles are. Planning sensing trajectories to maximise homotopic information results in highly accurate trajectory estimates with fewer measurements than a metric information approach, as supported by our empirical evaluation on real and simulated pedestrian data.

2602.17921 2026-02-23 cs.RO cs.LG

Latent Diffeomorphic Co-Design of End-Effectors for Deformable and Fragile Object Manipulation

Kei Ikemura, Yifei Dong, Florian T. Pokorny

详情
英文摘要

Manipulating deformable and fragile objects remains a fundamental challenge in robotics due to complex contact dynamics and strict requirements on object integrity. Existing approaches typically optimize either end-effector design or control strategies in isolation, limiting achievable performance. In this work, we present the first co-design framework that jointly optimizes end-effector morphology and manipulation control for deformable and fragile object manipulation. We introduce (1) a latent diffeomorphic shape parameterization enabling expressive yet tractable end-effector geometry optimization, (2) a stress-aware bi-level co-design pipeline coupling morphology and control optimization, and (3) a privileged-to-pointcloud policy distillation scheme for zero-shot real-world deployment. We evaluate our approach on challenging food manipulation tasks, including grasping and pushing jelly and scooping fillets. Simulation and real-world experiments demonstrate the effectiveness of the proposed method.

2602.17910 2026-02-23 cs.AI

Alignment in Time: Peak-Aware Orchestration for Long-Horizon Agentic Systems

Hanjing Shi, Dominic DiFranzo

详情
英文摘要

Traditional AI alignment primarily focuses on individual model outputs; however, autonomous agents in long-horizon workflows require sustained reliability across entire interaction trajectories. We introduce APEMO (Affect-aware Peak-End Modulation for Orchestration), a runtime scheduling layer that optimizes computational allocation under fixed budgets by operationalizing temporal-affective signals. Instead of modifying model weights, APEMO detects trajectory instability through behavioral proxies and targets repairs at critical segments, such as peak moments and endings. Evaluation across multi-agent simulations and LLM-based planner--executor flows demonstrates that APEMO consistently enhances trajectory-level quality and reuse probability over structural orchestrators. Our results reframe alignment as a temporal control problem, offering a resilient engineering pathway for the development of long-horizon agentic systems.

2602.17902 2026-02-23 cs.AI cs.MA cs.SE physics.chem-ph

El Agente Gráfico: Structured Execution Graphs for Scientific Agents

Jiaru Bai, Abdulrahman Aldossary, Thomas Swanick, Marcel Müller, Yeonghun Kang, Zijian Zhang, Jin Won Lee, Tsz Wai Ko, Mohammad Ghazi Vakili, Varinia Bernales, Alán Aspuru-Guzik

详情
英文摘要

Large language models (LLMs) are increasingly used to automate scientific workflows, yet their integration with heterogeneous computational tools remains ad hoc and fragile. Current agentic approaches often rely on unstructured text to manage context and coordinate execution, generating often overwhelming volumes of information that may obscure decision provenance and hinder auditability. In this work, we present El Agente Gráfico, a single-agent framework that embeds LLM-driven decision-making within a type-safe execution environment and dynamic knowledge graphs for external persistence. Central to our approach is a structured abstraction of scientific concepts and an object-graph mapper that represents computational state as typed Python objects, stored either in memory or persisted in an external knowledge graph. This design enables context management through typed symbolic identifiers rather than raw text, thereby ensuring consistency, supporting provenance tracking, and enabling efficient tool orchestration. We evaluate the system by developing an automated benchmarking framework across a suite of university-level quantum chemistry tasks previously evaluated on a multi-agent system, demonstrating that a single agent, when coupled to a reliable execution engine, can robustly perform complex, multi-step, and parallel computations. We further extend this paradigm to two other large classes of applications: conformer ensemble generation and metal-organic framework design, where knowledge graphs serve as both memory and reasoning substrates. Together, these results illustrate how abstraction and type safety can provide a scalable foundation for agentic scientific automation beyond prompt-centric designs.

2602.17898 2026-02-23 cs.LG

Breaking the Correlation Plateau: On the Optimization and Capacity Limits of Attention-Based Regressors

Jingquan Yan, Yuwei Miao, Peiran Yu, Junzhou Huang

Comments Accepted by ICLR 2026

详情
英文摘要

Attention-based regression models are often trained by jointly optimizing Mean Squared Error (MSE) loss and Pearson correlation coefficient (PCC) loss, emphasizing the magnitude of errors and the order or shape of targets, respectively. A common but poorly understood phenomenon during training is the PCC plateau: PCC stops improving early in training, even as MSE continues to decrease. We provide the first rigorous theoretical analysis of this behavior, revealing fundamental limitations in both optimization dynamics and model capacity. First, in regard to the flattened PCC curve, we uncover a critical conflict where lowering MSE (magnitude matching) can paradoxically suppress the PCC gradient (shape matching). This issue is exacerbated by the softmax attention mechanism, particularly when the data to be aggregated is highly homogeneous. Second, we identify a limitation in the model capacity: we derived a PCC improvement limit for any convex aggregator (including the softmax attention), showing that the convex hull of the inputs strictly bounds the achievable PCC gain. We demonstrate that data homogeneity intensifies both limitations. Motivated by these insights, we propose the Extrapolative Correlation Attention (ECA), which incorporates novel, theoretically-motivated mechanisms to improve the PCC optimization and extrapolate beyond the convex hull. Across diverse benchmarks, including challenging homogeneous data setting, ECA consistently breaks the PCC plateau, achieving significant improvements in correlation without compromising MSE performance.

2602.17893 2026-02-23 cs.LG

COMBA: Cross Batch Aggregation for Learning Large Graphs with Context Gating State Space Models

Jiajun Shen, Yufei Jin, Yi He, xingquan Zhu

详情
英文摘要

State space models (SSMs) have recently emerged for modeling long-range dependency in sequence data, with much simplified computational costs than modern alternatives, such as transformers. Advancing SMMs to graph structured data, especially for large graphs, is a significant challenge because SSMs are sequence models and the shear graph volumes make it very expensive to convert graphs as sequences for effective learning. In this paper, we propose COMBA to tackle large graph learning using state space models, with two key innovations: graph context gating and cross batch aggregation. Graph context refers to different hops of neighborhood for each node, and graph context gating allows COMBA to use such context to learn best control of neighbor aggregation. For each graph context, COMBA samples nodes as batches, and train a graph neural network (GNN), with information being aggregated cross batches, allowing COMBA to scale to large graphs. Our theoretical study asserts that cross-batch aggregation guarantees lower error than training GNN without aggregation. Experiments on benchmark networks demonstrate significant performance gains compared to baseline approaches. Code and benchmark datasets will be released for public access.

2602.17888 2026-02-23 cs.LG cs.AI

Machine Learning Based Prediction of Surgical Outcomes in Chronic Rhinosinusitis from Clinical Data

Sayeed Shafayet Chowdhury, Karen D'Souza, V. Siva Kakumani, Snehasis Mukhopadhyay, Shiaofen Fang, Rodney J. Schlosser, Daniel M. Beswick, Jeremiah A. Alt, Jess C. Mace, Zachary M. Soler, Timothy L. Smith, Vijay R. Ramakrishnan

详情
英文摘要

Artificial intelligence (AI) has increasingly transformed medical prognostics by enabling rapid and accurate analysis across imaging and pathology. However, the investigation of machine learning predictions applied to prospectively collected, standardized data from observational clinical intervention trials remains underexplored, despite its potential to reduce costs and improve patient outcomes. Chronic rhinosinusitis (CRS), a persistent inflammatory disease of the paranasal sinuses lasting more than three months, imposes a substantial burden on quality of life (QoL) and societal cost. Although many patients respond to medical therapy, others with refractory symptoms often pursue surgical intervention. Surgical decision-making in CRS is complex, as it must weigh known procedural risks against uncertain individualized outcomes. In this study, we evaluated supervised machine learning models for predicting surgical benefit in CRS, using the Sino-Nasal Outcome Test-22 (SNOT-22) as the primary patient-reported outcome. Our prospectively collected cohort from an observational intervention trial comprised patients who all underwent surgery; we investigated whether models trained only on preoperative data could identify patients who might not have been recommended surgery prior to the procedure. Across multiple algorithms, including an ensemble approach, our best model achieved approximately 85% classification accuracy, providing accurate and interpretable predictions of surgical candidacy. Moreover, on a held-out set of 30 cases spanning mixed difficulty, our model achieved 80% accuracy, exceeding the average prediction accuracy of expert clinicians (75.6%), demonstrating its potential to augment clinical decision-making and support personalized CRS care.