arXivDaily arXiv每日学术速递 周一至周五更新
全部学科分类 1582
专题追踪
2512.22399 2026-02-11 astro-ph.IM cs.AI physics.space-ph

Space AI: Leveraging Artificial Intelligence for Space to Improve Life on Earth

Ziyang Wang

详情
英文摘要

Artificial Intelligence (AI) is transforming domains from healthcare and agriculture to finance and industry. As progress on Earth meets growing constraints, the next frontier is outer space, where AI can enable autonomous, resilient operations under extreme uncertainty and limited human oversight. This paper introduces Space AI as a unified interdisciplinary field at the intersection of artificial intelligence and space science and technology. We consolidate historical developments and contemporary progress, and propose a systematic framework that organises Space AI into four mission contexts: 1 AI on Earth, covering intelligent mission planning, spacecraft design optimisation, simulation, and ground-based data analytics; 2 AI in Orbit, focusing on satellite and station autonomy, space robotics, on-board/near-real-time data processing, communication optimisation, and orbital safety; 3 AI in Deep Space, enabling autonomous navigation, adaptive scientific discovery, resource mapping, and long-duration human-AI collaboration under communication constraints; and 4 AI for Multi-Planetary Life, supporting in-situ resource utilisation, habitat and infrastructure construction, life-support and ecological management, and resilient interplanetary networks. Ultimately, Space AI can accelerate humanity's capability to explore and operate in space, while translating advances in sensing, robotics, optimisation, and trustworthy AI into broad societal impact on Earth.

2512.21747 2026-02-11 cs.HC cs.CV

Modified TSception for Analyzing Driver Drowsiness and Mental Workload from EEG

Gourav Siddhad, Anurag Singh, Rajkumar Saini, Partha Pratim Roy

Comments 8 Pages, 4 Figures, 1 Table

详情
英文摘要

Driver drowsiness is a leading cause of traffic accidents, necessitating real-time, reliable detection systems to ensure road safety. This study proposes a Modified TSception architecture for robust assessment of driver fatigue and mental workload using Electroencephalography (EEG). The model introduces a five-layer hierarchical temporal refinement strategy to capture multi-scale brain dynamics, surpassing the original TSception's three-layer approach. Key innovations include the use of Adaptive Average Pooling (ADP) for structural flexibility across varying EEG dimensions and a two-stage fusion mechanism to optimize spatiotemporal feature integration for improved stability. Evaluated on the SEED-VIG dataset, the Modified TSception achieves 83.46% accuracy, comparable to the original model (83.15%), but with a significantly reduced confidence interval (0.24 vs. 0.36), indicating better performance stability. The architecture's generalizability was further validated on the STEW mental workload dataset, achieving state-of-the-art accuracies of 95.93% and 95.35% for 2-class and 3-class classification, respectively. These results show that the proposed modifications improve consistency and cross-task generalizability, making the model a reliable framework for EEG-based safety monitoring.

2512.19172 2026-02-11 math.OC cs.LG cs.SY eess.SY

Finite-sample guarantees for data-driven forward-backward operator methods

Filippo Fabiani, Barbara Franci

详情
英文摘要

We establish finite sample certificates on the quality of solutions produced by data-based forward-backward (FB) operator splitting schemes. As frequently happens in stochastic regimes, we consider the problem of finding a zero of the sum of two operators, where one is either unavailable in closed form or computationally expensive to evaluate, and shall therefore be approximated using a finite number of noisy oracle samples. Under the lens of algorithmic stability, we then derive probabilistic bounds on the distance between a true zero and the FB output without making specific assumptions about the underlying data distribution. We show that under weaker conditions ensuring the convergence of FB schemes, stability bounds grow proportionally to the number of iterations. Conversely, stronger assumptions yield stability guarantees that are independent of the iteration count. We then specialize our results to a popular FB stochastic Nash equilibrium seeking algorithm and validate our theoretical bounds on a control problem for smart grids, where the energy price uncertainty is approximated by means of historical data.

2512.11276 2026-02-11 cs.HC cs.AI

Words to Describe What I'm Feeling: Exploring the Potential of AI Agents for High Subjectivity Decisions in Advance Care Planning

Kellie Yu Hui Sim, Pin Sym Foong, Chenyu Zhao, Melanie Yi Ning Quek, Swarangi Subodh Mehta, Kenny Tsu Wei Choo

Comments Accepted at CHI 2026. 34 pages, 10 figures

详情
英文摘要

Loss of decisional capacity, coupled with the increasing absence of reliable human proxies, raises urgent questions about how individuals' values can be represented in Advance Care Planning (ACP). To probe this fraught design space of high-risk, high-subjectivity decision support, we built an experience prototype (\acpagent{}) and asked 15 participants in 4 workshops to train it to be their personal ACP proxy. We analysed their coping strategies and feature requests and mapped the results onto axes of agent autonomy and human control. Our findings show a surprising 86.7\% agreement with \acpagent{}, arguing for a potential new role of AI in ACP where agents act as personal advocates for individuals, building mutual intelligibility over time. We propose that the key areas of future risk that must be addressed are the moderation of users' expectations and designing accountability and oversight over agent deployment and cutoffs.

2512.09974 2026-02-11 cs.SI cs.LG

Enhancing Fake-News Detection with Node-Level Topological Features

Kaiyuan Xu

详情
英文摘要

In recent years, the proliferation of misinformation and fake news has posed serious threats to individuals and society, spurring intense research into automated detection methods. Previous work showed that integrating content, user preferences, and propagation structure achieves strong performance, but leaves all graph-level representation learning entirely to the GNN, hiding any explicit topological cues. To close this gap, we introduce a lightweight enhancement: for each node, we append two classical graph-theoretic metrics, degree centrality and local clustering coefficient, to its original BERT and profile embeddings, thus explicitly flagging the roles of hub and community. In the UPFD Politifact subset, this simple modification boosts macro F1 from 0.7753 to 0.8344 over the original baseline. Our study not only demonstrates the practical value of explicit topology features in fake-news detection but also provides an interpretable, easily reproducible template for fusing graph metrics in other information-diffusion tasks.

2512.08493 2026-02-11 cs.CR cs.AI

LLM-based Vulnerable Code Augmentation: Generate or Refactor?

Dyna Soumhane Ouchebara, Stéphane Dupont

Comments 15 pages, Accepted by ESAAN 2026, version with added appendix

详情
英文摘要

Vulnerability code-bases often suffer from severe imbalance, limiting the effectiveness of Deep Learning-based vulnerability classifiers. Data Augmentation could help solve this by mitigating the scarcity of under-represented vulnerability types. In this context, we investigate LLM-based augmentation for vulnerable functions, comparing controlled generation of new vulnerable samples with semantics-preserving refactoring of existing ones. Using Qwen2.5-Coder to produce augmented data and CodeBERT as a classifier on the SVEN dataset, we find that our approaches are indeed effective in enriching vulnerable code-bases through a simple process and with reasonable quality, and that a hybrid strategy best boosts vulnerability classifiers' performance. Code repository is available here : https://github.com/DynaSoumhaneOuchebara/LLM-based-code-augmentation-Generate-or-Refactor-

2511.17528 2026-02-11 cs.NI cs.AI

Evaluating Device-First Continuum AI (DFC-AI) for Autonomous Operations in the Energy Sector

Siavash M. Alamouti, Fay Arjomandi, Michel Burger, Bashar Altakrouri

Comments 10 pages, 4 figures, 6 tables

详情
英文摘要

Industrial automation in the energy sector requires AI systems that can operate autonomously regardless of network availability, a requirement that cloud-centric architectures cannot meet. This paper evaluates the application of Device-First Continuum AI (DFC-AI) to critical energy sector operations. DFC-AI, a specialized architecture within the Hybrid Edge Cloud paradigm, implements intelligent agents using a microservices architecture that originates at end devices and extends across the computational continuum. Through comprehensive simulations of energy sector scenarios including drone inspections, sensor networks, and worker safety systems, we demonstrate that DFC-AI maintains full operational capability during network outages while cloud and gateway-based systems experience complete or partial failure. Our analysis reveals that zero-configuration GPU discovery and heterogeneous device clustering are particularly well-suited for energy sector deployments, where specialized nodes can handle intensive AI workloads for entire fleets of inspection drones or sensor networks. The evaluation shows that DFC-AI achieves significant latency reduction and energy savings compared to cloud architectures. Additionally, we find that gateway based edge solutions can paradoxically cost more than cloud solutions for certain energy sector workloads due to infrastructure overhead, while DFC-AI can consistently provide cost savings by leveraging enterprise-owned devices. These findings, validated through rigorous statistical analysis, establish that DFC-AI addresses the unique challenges of energy sector operations, ensuring intelligent agents remain available and functional in remote oil fields, offshore platforms, and other challenging environments characteristic of the industry.

2510.12832 2026-02-11 eess.SY cs.AI cs.LG cs.SY eess.SP

Coherent Load Profile Synthesis with Conditional Diffusion for LV Distribution Network Scenario Generation

Alistair Brash, Junyi Lu, Bruce Stephen, Blair Brown, Robert Atkinson, Craig Michie, Fraser MacIntyre, Christos Tachtatzis

详情
英文摘要

Limited visibility of distribution network power flows at the low voltage level presents challenges to both distribution network operators from a planning perspective and distribution system operators from a congestion management perspective. More representative loads are required to support meaningful analysis of LV substations; otherwise, such analysis risks misinforming future decisions. Traditional load profiling relies on typical profiles, oversimplifying substation-level complexity. Generative models have attempted to address this through synthesising representative loads from historical exemplars; however, while these approaches can approximate load shapes to a convincing degree of fidelity, analysis of the co-behaviour between substations is limited, which ultimately impacts higher voltage level network operation. This limitation will become even more pronounced with the increasing integration of low-carbon technologies, as estimates of base loads fail to capture load diversity. To address this gap, Conditional Diffusion models for synthesising daily active and reactive power profiles at the low voltage distribution substation level are proposed. The evaluation of fidelity is demonstrated through conventional metrics capturing temporal and statistical realism, as well as power flow modelling. Multiple models are proposed to handle varying levels of data availability, ranging from unconditional synthesis to an informed generation driven by metadata and daily statistics. The results show synthesised load profiles are plausible both independently and as a cohort in a wider power systems context. The Conditional Diffusion model is benchmarked against both naive and state-of-the-art models to demonstrate its effectiveness in producing realistic scenarios on which to base sub-regional power distribution network planning and operations.

2510.07342 2026-02-11 q-bio.NC cs.LG eess.IV

Beyond Grid-Locked Voxels: Neural Response Functions for Continuous Brain Encoding

Haomiao Chen, Keith W Jamison, Mert R. Sabuncu, Amy Kuceyeski

详情
英文摘要

Neural encoding models aim to predict fMRI-measured brain responses to natural images. fMRI data is acquired as a 3D volume of voxels, where each voxel has a defined spatial location in the brain. However, conventional encoding models often flatten this volume into a 1D vector and treat voxel responses as independent outputs. This removes spatial context, discards anatomical information, and ties each model to a subject-specific voxel grid. We introduce the Neural Response Function (NRF), a framework that models fMRI activity as a continuous function over anatomical space rather than a flat vector of voxels. NRF represents brain activity as a continuous implicit function: given an image and a spatial coordinate (x, y, z) in standardized MNI space, the model predicts the response at that location. This formulation decouples predictions from the training grid, supports querying at arbitrary spatial resolutions, and enables resolution-agnostic analyses. By grounding the model in anatomical space, NRF exploits two key properties of brain responses: (1) local smoothness -- neighboring voxels exhibit similar response patterns; modeling responses continuously captures these correlations and improves data efficiency, and (2) cross-subject alignment -- MNI coordinates unify data across individuals, allowing a model pretrained on one subject to be fine-tuned on new subjects. In experiments, NRF outperformed baseline models in both intrasubject encoding and cross-subject adaptation, achieving high performance while reducing the data size needed by orders of magnitude. To our knowledge, NRF is the first anatomically aware encoding model to move beyond flattened voxels, learning a continuous mapping from images to brain responses in 3D space.

2509.21996 2026-02-11 stat.ML cs.LG

A Nonparametric Discrete Hawkes Model with a Collapsed Gaussian-Process Prior

Trinnhallen Brisley, Gordon Ross, Daniel Paulin

详情
英文摘要

Hawkes process models are used in settings where past events increase the likelihood of future events occurring. Many applications record events as counts on a regular grid, yet discrete-time Hawkes models remain comparatively underused and are often constrained by fixed-form baselines and excitation kernels. In particular, there is a lack of flexible, nonparametric treatments of both the baseline and the excitation in discrete time. To this end, we propose the Gaussian Process Discrete Hawkes Process (GP-DHP), a nonparametric framework that places Gaussian process priors on both the baseline and the excitation and performs inference through a collapsed latent representation. This yields smooth, data-adaptive structure without prespecifying trends, periodicities, or decay shapes, and enables maximum a posteriori (MAP) estimation with near-linear-time \(O(T\log T)\) complexity. A closed-form projection recovers interpretable baseline and excitation functions from the optimized latent trajectory. In simulations, GP-DHP recovers diverse excitation shapes and evolving baselines. In case studies on U.S. terrorism incidents and weekly Cryptosporidiosis counts, it improves test predictive log-likelihood over standard parametric discrete Hawkes baselines while capturing bursts, delays, and seasonal background variation. The results indicate that flexible discrete-time self-excitation can be achieved without sacrificing scalability or interpretability.

2509.13166 2026-02-11 eess.SY cs.LG cs.SY eess.SP math.OC

Concentration inequalities for semidefinite least squares based on data

Filippo Fabiani, Andrea Simonetto

详情
英文摘要

We study data-driven least squares (LS) problems with semidefinite (SD) constraints and derive finite-sample guarantees on the spectrum of their optimal solutions when these constraints are relaxed. In particular, we provide a high confidence bound allowing one to solve a simpler program in place of the full SDLS problem, while ensuring that the eigenvalues of the resulting solution are $\varepsilon$-close of those enforced by the SD constraints. The developed certificate, which consistently shrinks as the number of data increases, turns out to be easy-to-compute, distribution-free, and only requires independent and identically distributed samples. Moreover, when the SDLS is used to learn an unknown quadratic function, we establish bounds on the error between a gradient descent iterate minimizing the surrogate cost obtained with no SD constraints and the true minimizer.

2508.08438 2026-02-11 cs.CR cs.LG cs.OS

Selective KV-Cache Sharing to Mitigate Timing Side-Channels in LLM Inference

Kexin Chu, Zecheng Lin, Dawei Xiang, Zixu Shen, Jianchang Su, Cheng Chu, Yiwei Yang, Wenhui Zhang, Wenfei Wu, Wei Zhang

Comments 14 pages,15 figures

详情
英文摘要

Global KV-cache sharing is an effective optimization for accelerating large language model (LLM) inference, yet it introduces an API-visible timing side channel that lets adversaries infer sensitive user inputs from shared entries, leading to cross-tenant privacy risks. To address this problem, we introduce SafeKV (Secure and Flexible KV-cache Sharing), a system-level co-design of privacy enforcement and KV-cache management. SafeKV integrates lightweight detection and isolation directly into the serving runtime to eliminate cross-tenant reuse of sensitive KV-cache blocks under our threat model, while recovering most of the performance benefits of global sharing. Our key contributions are: (1) a three-tier asynchronous detection pipeline that decouples privacy classification from inference and supports streaming workloads, (2) a unified radix-tree-based memory manager with path compression and sensitivity-aware eviction for scalable selective isolation, and (3) an RDR-guided (Reuse Diversity Ratio) runtime safeguard that detects and bounds residual leakage. On large LLM backends, SafeKV reduces the time-to-first-token (TTFT) overhead compared to full isolation by up to 40.58% and raises throughput by up to 2.66x. Overall, SafeKV restores the efficiency of KV reuse while enforcing strong, practical privacy for multi-tenant LLM inference.

2508.06734 2026-02-11 cs.CR cs.LG

Quantifying the Generalization Gap: A New Benchmark for Out-of-Distribution Graph-Based Android Malware Classification

Ngoc N. Tran, Anwar Said, Waseem Abbas, Tyler Derr, Xenofon D. Koutsoukos

Comments 14 pages, 5 figures, 10 tables, under review

详情
英文摘要

While graph-based Android malware classifiers achieve over 94% accuracy on standard benchmarks, they exhibit a significant generalization gap under distribution shift, suffering up to 45% performance degradation when encountering unseen malware variants from known families. This work systematically investigates this critical yet overlooked challenge for real-world deployment by introducing a benchmarking suite designed to simulate two prevalent scenarios: MalNet-Tiny-Common for covariate shift, and MalNet-Tiny-Distinct for domain shift. Furthermore, we identify an inherent limitation in existing benchmarks where the inputs are structure-only function call graphs, which fails to capture the latent semantic patterns necessary for robust generalization. To verify this, we construct a semantic enrichment framework that augments the original topology with function-level attributes, including lightweight metadata and LLM-based code embeddings. By providing this expanded feature set, we aim to equip future research with richer behavioral information to facilitate the development of more sophisticated detection techniques. Empirical evaluations confirm the effectiveness of our data-centric methodology, with which classification performs better under distribution shift compared to model-based approaches, and consistently further enhances robustness when used in conjunction. We release our precomputed datasets, along with an extensible implementation of our comprehensive pipeline, to lay the groundwork for building resilient malware detection systems for evolving threat environments.

2508.06133 2026-02-11 math.OC cs.AI cs.LG

LLM Serving Optimization with Variable Prefill and Decode Lengths

Meixuan Wang, Yinyu Ye, Zijie Zhou

详情
英文摘要

We study offline scheduling for large language model (LLM) serving under a fixed KV-cache memory budget, where requests have heterogeneous prompt (prefill) and response (decode) lengths. Prompt tokens determine initial KV usage, and each generated token increases memory by one unit. Given a backlog of n requests arriving together, we schedule mixed prefill and decode batches to minimize total end-to-end latency. We show that heterogeneity in prompt lengths makes the problem computationally intractable and that widely used heuristics such as first-come-first-served and shortest-first can be arbitrarily suboptimal. We propose Sorted-F, which repeatedly forms feasible batches using a new selection metric that balances batch size against downstream decode cost, and prove it achieves a constant-factor guarantee on total latency. We further develop practical variants -- an exact solver for small instances and fast heuristics for larger ones -- and evaluate them on a public workload spanning short conversations and long-document summarization, where they consistently reduce average latency relative to standard baselines. Our results highlight that during peak-hour tidal backlogs, greedy GPU packing or short-request prioritization can perform poorly when prompt lengths vary widely, and provide a principled, tunable framework for designing production batch schedulers and planning capacity in memory-constrained LLM serving systems.

2507.13638 2026-02-11 q-bio.NC cs.LG

State Space Models Naturally Produce Time Cell and Oscillatory Behaviors and Scale to Abstract Cognitive Functions

Sen Lu, Xiaoyu Zhang, Mingtao Hu, Eric Yeu-Jer Lee, Soohyeon Kim, Wei D. Lu

Comments Sen Lu and Xiaoyu Zhang contributed equally. Wei D. Lu is the corresponding author. 5 figures are included in 15 pages

详情
英文摘要

A grand challenge in modern neuroscience is to bridge the gap between the detailed mapping of microscale neural circuits and mechanistic understanding of cognitive functions. While extensive knowledge exists about neuronal connectivity and biophysics, how these low-level phenomena eventually produce abstract behaviors remains largely unresolved. Here, we propose that a model based on State Space Models, an emerging class of deep learning architectures, can be a potential biological model for analysis. We suggest that the differential equations governing elements in a State Space Model are conceptually consistent with the dynamics of biophysical processes, while the model offers a scalable framework to build on the dynamics to produce emergent behaviors observed in experimental neuroscience. We test this model by training a network employing a diagonal state transition matrix on temporal discrimination tasks with reinforcement learning. Our results suggest that neural behaviors such as time cells naturally emerge from two fundamental principles: optimal pre-configuration and rotational dynamics. These features are shown mathematically to optimize history compression, and naturally generate structured temporal dynamics even prior to training, mirroring recent findings in biological circuits. We show that learning acts primarily as a selection mechanism that fine-tunes these pre-configured oscillatory modes, rather than constructing temporal codes de novo. The model can be readily scaled to abstract cognitive functions such as event counting, supporting the use of State Space Models as a computationally tractable framework for understanding neural activities.

2507.09093 2026-02-11 stat.ML cs.LG math.OC

Sharp High-Probability Rates for Nonlinear SGD under Heavy-Tailed Noise via Symmetrization

Aleksandar Armacki, Dragana Bajovic, Dusan Jakovetic, Soummya Kar

Comments 43 pages, 1 figure

详情
英文摘要

We study convergence in high-probability of SGD-type methods in non-convex optimization and the presence of heavy-tailed noise. To combat the heavy-tailed noise, a general black-box nonlinear framework is considered, subsuming nonlinearities like sign, clipping, normalization and their smooth counterparts. Our first result shows that nonlinear SGD (N-SGD) achieves the rate $\widetilde{\mathcal{O}}(t^{-1/2})$, for any noise with unbounded moments and a symmetric probability density function (PDF). Crucially, N-SGD has exponentially decaying tails, matching the performance of linear SGD under light-tailed noise. To handle non-symmetric noise, we propose two novel estimators, based on the idea of noise symmetrization. The first, dubbed Symmetrized Gradient Estimator (SGE), assumes a noiseless gradient at any reference point is available at the start of training, while the second, dubbed Mini-batch SGE (MSGE), uses mini-batches to estimate the noiseless gradient. Combined with the nonlinear framework, we get N-SGE and N-MSGE methods, respectively, both achieving the same convergence rate and exponentially decaying tails as N-SGD, while allowing for non-symmetric noise with unbounded moments and PDF satisfying a mild technical condition, with N-MSGE additionally requiring bounded noise moment of order $p \in (1,2]$. Compared to works assuming noise with bounded $p$-th moment, our results: 1) are based on a novel symmetrization approach; 2) provide a unified framework and relaxed moment conditions; 3) imply optimal oracle complexity of N-SGD and N-SGE, strictly better than existing works when $p < 2$, while the complexity of N-MSGE is close to existing works. Compared to works assuming symmetric noise with unbounded moments, we: 1) provide a sharper analysis and improved rates; 2) facilitate state-dependent symmetric noise; 3) extend the strong guarantees to non-symmetric noise.

2507.02883 2026-02-11 q-bio.BM cs.LG

DISPROTBENCH: Uncovering the Functional Limits of Protein Structure Prediction Models in Intrinsically Disordered Regions

Xinyue Zeng, Tuo Wang, Adithya Kulkarni, Alexander Lu, Alexandra Ni, Phoebe Xing, Junhan Zhao, Siwei Chen, Dawei Zhou

详情
英文摘要

Intrinsically disordered regions (IDRs) play central roles in cellular function, yet remain poorly evaluated by existing protein structure prediction benchmarks. Current evaluations largely focus on well-folded domains, overlooking three fundamental challenges in realistic biological settings: the structural complexity of proteins, the resulting low availability of reliable ground truth, and prediction uncertainty that can propagate into high-risk downstream failures, such as in drug discovery, protein-protein interaction modeling, and functional annotation. We present DisProtBench, an IDR-centric benchmark that explicitly incorporates prediction uncertainty into the evaluation of protein structure prediction models (PSPMs). To address structural complexity and ground-truth scarcity, we curate and unify a large-scale, multi-modal dataset spanning disease-relevant IDRs, GPCR-ligand interactions, and multimeric protein complexes. To assess predictive uncertainty, we introduce Functional Uncertainty Sensitivity (FUS), a novel prediction uncertainty-stratified metric that quantifies downstream task performance under prediction uncertainty. Using this benchmark, we conduct a systematic evaluation of state-of-the-art PSPMs and reveal clear, task-dependent failure modes. Protein-protein interaction prediction degrades sharply in IDRs, while structure-based drug discovery remains comparatively robust. These effects are largely invisible to standard global accuracy metrics, which overestimate functional reliability under prediction uncertainty. We have open-sourced our benchmark and the codebase at https://github.com/Susan571/DisProtBench.

2506.14957 2026-02-11 q-bio.NC cs.LG

POCO: Scalable Neural Forecasting through Population Conditioning

Yu Duan, Hamza Tahir Chaudhry, Misha B. Ahrens, Christopher D Harvey, Matthew G Perich, Karl Deisseroth, Kanaka Rajan

Journal ref Advances in Neural Information Processing Systems (2025)

详情
英文摘要

Predicting future neural activity is a core challenge in modeling brain dynamics, with applications ranging from scientific investigation to closed-loop neurotechnology. While recent models of population activity emphasize interpretability and behavioral decoding, neural forecasting-particularly across multi-session, spontaneous recordings-remains underexplored. We introduce POCO, a unified forecasting model that combines a lightweight univariate forecaster with a population-level encoder to capture both neuron-specific and brain-wide dynamics. Trained across five calcium imaging datasets spanning zebrafish, mice, and C. elegans, POCO achieves state-of-the-art accuracy at cellular resolution in spontaneous behaviors. After pre-training, POCO rapidly adapts to new recordings with minimal fine-tuning. Notably, POCO's learned unit embeddings recover biologically meaningful structure-such as brain region clustering-without any anatomical labels. Our comprehensive analysis reveals several key factors influencing performance, including context length, session diversity, and preprocessing. Together, these results position POCO as a scalable and adaptable approach for cross-session neural forecasting and offer actionable insights for future model design. By enabling accurate, generalizable forecasting models of neural dynamics across individuals and species, POCO lays the groundwork for adaptive neurotechnologies and large-scale efforts for neural foundation models. Code is available at https://github.com/yuvenduan/POCO.

2506.13865 2026-02-11 quant-ph cond-mat.dis-nn cs.LG cs.NE stat.ML

Connecting phases of matter to the flatness of the loss landscape in analog variational quantum algorithms

Kasidit Srimahajariyapong, Supanut Thanasilp, Thiparat Chotibut

Comments 17+9 pages, 9+7 figures

详情
英文摘要

Variational quantum algorithms (VQAs) promise near-term quantum advantage, yet parametrized quantum states commonly built from the digital gate-based approach often suffer from scalability issues such as barren plateaus, where the loss landscape becomes flat. We study an analog VQA ansätze composed of $M$ quenches of a disordered Ising chain, whose dynamics is native to several quantum simulation platforms. By tuning the disorder strength we place each quench in either a thermalized phase or a many-body-localized (MBL) phase and analyse (i) the ansätze's expressivity and (ii) the scaling of loss variance. Numerics shows that both phases reach maximal expressivity at large $M$, but barren plateaus emerge at far smaller $M$ in the thermalized phase than in the MBL phase. Exploiting this gap, we propose an MBL initialisation strategy: initialise the ansätze in the MBL regime at intermediate quench $M$, enabling an initial trainability while retaining sufficient expressivity for subsequent optimization. The results link quantum phases of matter and VQA trainability, and provide practical guidelines for scaling analog-hardware VQAs.

2506.02044 2026-02-11 q-bio.NC cs.LG

A Brain Graph Foundation Model: Pre-Training and Prompt-Tuning across Broad Atlases and Disorders

Xinxu Wei, Kanhao Zhao, Yong Jiao, Lifang He, Yu Zhang

Comments 30pages

详情
英文摘要

As large language models (LLMs) continue to revolutionize AI research, there is a growing interest in building large-scale brain foundation models to advance neuroscience. While most existing brain foundation models are pre-trained on time-series signals or connectome features, we propose a novel graph-based pre-training paradigm for constructing a brain graph foundation model. In this paper, we introduce the Brain Graph Foundation Model, termed BrainGFM, a unified framework that leverages graph contrastive learning and graph masked autoencoders for large-scale fMRI-based pre-training. BrainGFM is pre-trained on a diverse mixture of brain atlases with varying parcellations, significantly expanding the pre-training corpus and enhancing the model's ability to generalize across heterogeneous fMRI-derived brain representations. To support efficient and versatile downstream transfer, we integrate both graph prompts and language prompts into the model design, enabling BrainGFM to flexibly adapt to a wide range of atlases, neurological and psychiatric disorders, and task settings. Furthermore, we employ meta-learning to optimize the graph prompts, facilitating strong generalization to previously unseen disorders under both few-shot and zero-shot learning conditions via language-guided prompting. BrainGFM is pre-trained on 27 neuroimaging datasets spanning 25 common neurological and psychiatric disorders, encompassing 2 types of brain atlases (functional and anatomical) across 8 widely-used parcellations, and covering over 25,000 subjects, 60,000 fMRI scans, and a total of 400,000 graph samples aggregated across all atlases and parcellations.

2505.21208 2026-02-11 stat.ML cs.LG math.OC

Input Convex Kolmogorov Arnold Networks

Thomas Deschatre, Xavier Warin

详情
英文摘要

This article presents an input convex neural network architecture using Kolmogorov-Arnold networks (ICKAN). Two specific networks are presented: the first is based on a low-order, linear-by-part, representation of functions, and a universal approximation theorem is provided. The second is based on cubic splines, for which only numerical results support convergence. We demonstrate on simple tests that these networks perform competitively with classical input convex neural networks (ICNNs). In a second part, we use the networks to solve some optimal transport problems needing a convex approximation of functions and demonstrate their effectiveness. Comparisons with ICNNs show that cubic ICKANs produce results similar to those of classical ICNNs.

2505.17133 2026-02-11 stat.ML cs.AI cs.LG

Learning Probabilities of Causation with Mask-Augmented Data

Shuai Wang, Yizhou Sun, Judea Pearl, Ang Li

Comments arXiv admin note: text overlap with arXiv:2502.08858

详情
英文摘要

Probabilities of causation play a central role in modern decision making. Tian and Pearl first introduced formal definitions and derived tight bounds for three binary probabilities of causation, such as the probability of necessity and sufficiency (PNS). However, estimating these probabilities requires both experimental and observational distributions specific to each subpopulation, which are often unreliable or impractical to obtain from limited population-level data. To solve this problem, we propose two machine learning models: Exact-MLP and Mask-MLP, which are trained on a small set of reliable subpopulations and are able to predict PNS bounds for all other subpopulations. We validate our models across four Structural Causal Models (SCMs), each evaluated on population-level data with sample sizes between 100k and 200k. Our models achieve average mean absolute errors (MAEs) of roughly 0.03 on main tasks, reducing MAE by about 80% relative to the corresponding baselines. These results demonstrate both the feasibility of machine learning models for learning probabilities of causation and the effectiveness of the proposed approach.

2505.15935 2026-02-11 cs.DB cs.CL cs.CR

MAPS: A Multilingual Benchmark for Agent Performance and Security

Omer Hofman, Jonathan Brokman, Oren Rachmil, Shamik Bose, Vikas Pahuja, Toshiya Shimizu, Trisha Starostina, Kelly Marchisio, Seraphina Goldfarb-Tarrant, Roman Vainshtein

Comments Accepted to EACL 2026 findings

详情
英文摘要

Agentic AI systems, which build on Large Language Models (LLMs) and interact with tools and memory, have rapidly advanced in capability and scope. Yet, since LLMs have been shown to struggle in multilingual settings, typically resulting in lower performance and reduced safety, agentic systems risk inheriting these limitations. This raises concerns about the accessibility of such systems, as users interacting in languages other than English may encounter unreliable or security-critical agent behavior. Despite growing interest in evaluating agentic AI and recent initial efforts toward multilingual interaction, existing benchmarks do not yet provide a comprehensive, multi-domain, security-aware evaluation of multilingual agentic systems. To address this gap, we propose MAPS, a multilingual benchmark suite designed to evaluate agentic AI systems across diverse languages and tasks. MAPS builds on four widely used agentic benchmarks - GAIA (real-world tasks), SWE-Bench (code generation), MATH (mathematical reasoning), and the Agent Security Benchmark (security). We translate each dataset into eleven diverse languages, resulting in 805 unique tasks and 9,660 total language-specific instances - enabling a systematic analysis of the Multilingual Effect on AI agents' performance and robustness. Empirically, we observe a degradation in both performance and security when transitioning from English to other languages, with severity varying by task and correlating with the amount of translated input. This work establishes the first standardized evaluation framework for multilingual agentic AI, encouraging future research towards equitable, reliable, and accessible agentic AI. MAPS benchmark suite is publicly available at https://huggingface.co/datasets/Fujitsu-FRE/MAPS

2505.10919 2026-02-11 physics.flu-dyn cs.LG stat.ML

A Physics-Informed Spatiotemporal Deep Learning Framework for Turbulent Systems

Luca Menicali, Andrew Grace, David H. Richter, Stefano Castruccio

详情
英文摘要

Fluid thermodynamics underpins atmospheric dynamics, climate science, industrial applications, and energy systems. However, direct numerical simulations (DNS) of such systems can be computationally prohibitive. To address this, we present a novel physics-informed spatiotemporal surrogate model for Rayleigh-Benard convection (RBC), a canonical example of convective fluid flow. Our approach combines convolutional neural networks, for spatial dimension reduction, with an innovative recurrent architecture, inspired by large language models, to model long-range temporal dynamics. Inference is penalized with respect to the governing partial differential equations to ensure physical interpretability. Since RBC exhibits turbulent behavior, we quantify uncertainty using a conformal prediction framework. This model replicates key physical features of RBC dynamics while significantly reducing computational cost, offering a scalable alternative to DNS for long-term simulations.

2504.12777 2026-02-11 cs.MA cs.AI

Multi-Agent Reinforcement Learning Simulation for Environmental Policy Synthesis

James Rudd-Jones, Mirco Musolesi, María Pérez-Ortiz

Comments Published in AAMAS'25 Blue Sky Ideas Track

详情
英文摘要

Climate policy development faces significant challenges due to deep uncertainty, complex system dynamics, and competing stakeholder interests. Climate simulation methods, such as Earth System Models, have become valuable tools for policy exploration. However, their typical use is for evaluating potential polices, rather than directly synthesizing them. The problem can be inverted to optimize for policy pathways, but the traditional optimization approaches often struggle with non-linear dynamics, heterogeneous agents, and comprehensive uncertainty quantification. We propose a framework for augmenting climate simulations with Multi-Agent Reinforcement Learning (MARL) to address these limitations. We identify key challenges at the interface between climate simulations and the application of MARL in the context of policy synthesis, including reward definition, scalability with increasing agents and state spaces, uncertainty propagation across linked systems, and solution validation. Additionally, we discuss challenges in making MARL-derived solutions interpretable and useful for policy-makers. Our framework provides a foundation for more sophisticated climate policy exploration while acknowledging important limitations and areas for future research.

2504.03784 2026-02-11 stat.ML cs.AI cs.LG

Robust Reinforcement Learning from Human Feedback for Large Language Models Fine-Tuning

Kai Ye, Hongyi Zhou, Jin Zhu, Francesco Quinzan, Chengchun Shi

详情
英文摘要

Reinforcement learning from human feedback (RLHF) has emerged as a key technique for aligning the output of large language models (LLMs) with human preferences. To learn the reward function, most existing RLHF algorithms use the Bradley-Terry model, which relies on assumptions about human preferences that may not reflect the complexity and variability of real-world judgments. In this paper, we propose a robust algorithm to enhance the performance of existing approaches under such reward model misspecifications. Theoretically, our algorithm reduces the variance of reward and policy estimators, leading to improved regret bounds. Empirical evaluations on LLM benchmark datasets demonstrate that the proposed algorithm consistently outperforms existing methods, with 77-81% of responses being favored over baselines on the Anthropic Helpful and Harmless dataset. The code is available at https://github.com/VRPO/VRPO.

2503.20879 2026-02-11 quant-ph cs.LG

Quantum advantage for learning shallow neural networks with natural data distributions

Laura Lewis, Dar Gilboa, Jarrod R. McClean

Comments 10 pages, 1 figure + 82-page appendix; updated with stronger classical hardness

Journal ref Nat. Commun. 17, 1341 (2025)

详情
英文摘要

Without large quantum computers to empirically evaluate performance, theoretical frameworks such as the quantum statistical query (QSQ) are a primary tool to study quantum algorithms for learning classical functions and search for quantum advantage in machine learning tasks. However, we only understand quantum advantage in this model at two extremes: either exponential advantages for uniform input distributions or no advantage for arbitrary distributions. Our work helps close the gap between these two regimes by designing an efficient quantum algorithm for learning periodic neurons in the QSQ model over a variety of non-uniform distributions and the first explicit treatment of real-valued functions. We prove that this problem is hard not only for classical gradient-based algorithms, which are the workhorses of machine learning, but also for a more general class of SQ algorithms, establishing an exponential quantum advantage.

2503.16449 2026-02-11 cs.HC cs.AI cs.RO

Affective and Conversational Predictors of Re-Engagement in Human-Robot Interactions -- A Student-Centered Study with A Humanoid Social Robot

Hangyeol Kang, Thiago Freitas dos Santos, Maher Ben Moussa, Nadia Magnenat-Thalmann

Comments 27 pages, 7 figures

详情
英文摘要

Humanoid social robots are increasingly present in daily life, making sustained user engagement a critical factor for their effectiveness and acceptance. While prior work has often examined affective evaluations or anthropomorphic design, less is known about the relative influence of dynamic conversational qualities and perceived robot characteristics in determining a user's intention to re-engage with Large Language Model (LLM)-driven social robots. In this study, 68 participants interacted in open-ended conversations with the Nadine humanoid social robot, completing pre- and post-interaction surveys to assess changes in robot perception, conversational quality, and intention to re-engage. The results showed that verbal interaction significantly improved the robot's perceived characteristics, with statistically significant increases in pleasantness ($p<.0001$) and approachability ($p<.0001$), and a reduction in creepiness ($p<.001$). However, these affective changes were not strong and unique predictors of users' intention to re-engage in a multiple regression model. Instead, participants' perceptions of the interestingness ($β=0.60$, $p<.001$) and naturalness ($β=0.31$, $p=0.015$) of the robot's conversation emerged as the most significant and robust independent predictors of intention to re-engage. Overall, the results highlight that conversational quality, specifically perceived interestingness and naturalness, is the dominant driver of re-engagement, indicating that LLM-driven robot design should prioritize engaging, natural dialogue over affective impression management or anthropomorphic cues.

2502.16292 2026-02-11 cond-mat.dis-nn cs.LG

Generative diffusion for perceptron problems: statistical physics analysis and efficient algorithms

Elizaveta Demyanenko, Davide Straziota, Carlo Baldassi, Carlo Lucibello

Comments v3 minor revision after NeurIPS's rebuttal

Journal ref NeurIPS 2025

详情
英文摘要

We consider random instances of non-convex perceptron problems in the high-dimensional limit of a large number of examples $M$ and weights $N$, with finite load $α= M/N$. We develop a formalism based on replica theory to predict the fundamental limits of efficiently sampling the solution space using generative diffusion algorithms, conjectured to be saturated when the score function is provided by Approximate Message Passing. For the spherical perceptron with negative margin $κ$, we find that the uniform distribution over solutions can be efficiently sampled in most of the Replica Symmetric region of the $α$-$κ$ plane. In contrast, for binary weights, sampling from the uniform distribution remains intractable. A theoretical analysis of this obstruction leads us to identify a potential $U(s) = -\log(s)$, under which the corresponding tilted distribution becomes efficiently samplable via diffusion. Moreover, we show numerically that an annealing procedure over the shape of this potential yields a fast and robust Markov Chain Monte Carlo algorithm for sampling the solution space of the binary perceptron.

2502.05850 2026-02-11 cs.AR cs.LG

MetaML-Pro: Cross-Stage Design Flow Automation for Efficient Deep Learning Acceleration

Zhiqiang Que, Jose G. F. Coutinho, Ce Guo, Hongxiang Fan, Wayne Luk

Comments Accepted by ACM TRETS

详情
英文摘要

This paper presents a unified framework for codifying and automating optimization strategies to efficiently deploy deep neural networks (DNNs) on resource-constrained hardware, such as FPGAs, while maintaining high performance, accuracy, and resource efficiency. Deploying DNNs on such platforms involves addressing the significant challenge of balancing performance, resource usage (e.g., DSPs and LUTs), and inference accuracy, which often requires extensive manual effort and domain expertise. Our novel approach addresses two core key issues: (i)~encoding custom optimization strategies and (ii)~enabling cross-stage optimization search. In particular, our proposed framework seamlessly integrates programmatic DNN optimization techniques with high-level synthesis (HLS)-based metaprogramming, leveraging advanced design space exploration (DSE) strategies like Bayesian optimization to automate both top-down and bottom-up design flows. Hence, we reduce the need for manual intervention and domain expertise. In addition, the framework introduces customizable optimization, transformation, and control blocks to enhance DNN accelerator performance and resource efficiency. Experimental results demonstrate up to a 92\% DSP and 89\% LUT usage reduction for select networks, while preserving accuracy, along with a 15.6-fold reduction in optimization time compared to grid search. These results highlight the potential for automating the generation of resource-efficient DNN accelerator designs with minimum effort.