arXivDaily: arXiv daily academic digest, updated Monday through Friday
2601.10955 2026-03-12 cs.CR cs.AI

Beyond Max Tokens: Stealthy Resource Amplification via Tool Calling Chains in LLM Agents

Kaiyu Zhou, Yongsen Zheng, Yicheng He, Meng Xue, Xueluan Gong, Yuji Wang, Xuanye Zhang, Kwok-Yan Lam

Abstract

The agent-tool interaction loop is a critical attack surface for modern Large Language Model (LLM) agents. Existing denial-of-service (DoS) attacks typically function at the user-prompt or retrieval-augmented generation (RAG) context layer and are inherently single-turn in nature. This limitation restricts cost amplification and diminishes stealth in goal-oriented workflows. To address these issues, we propose a stealthy, multi-turn economic DoS attack at the tool layer under the Model Context Protocol (MCP). By simply editing text-visible fields and implementing a template-driven return policy, our malicious server preserves function signatures and the terminal benign payload while steering agents into prolonged, verbose tool-calling chains. We optimize these text-only edits with Monte Carlo Tree Search (MCTS) to maximize cost under a task-success constraint. Across six LLMs on the ToolBench and BFCL benchmarks, our attack yields trajectories over 60K tokens, increases per-query cost by up to 658 times, raises energy consumption by 100 to 560 times, and pushes GPU key-value (KV) cache occupancy to 35-74%. Standard prompt filters and output trajectory monitors seldom detect these attacks, highlighting the need for defenses that safeguard agentic processes rather than focusing solely on final outcomes. We will release the code soon.

2601.09006 2026-03-12 eess.IV cs.CV physics.med-ph

GOUHFI 2.0: A Next-Generation Toolbox for Brain Segmentation and Cortex Parcellation at Ultra-High Field MRI

Marc-Antoine Fortin, Anne Louise Kristoffersen, Paal Erik Goa

Abstract

Ultra-High Field MRI (UHF-MRI) is increasingly used in large-scale neuroimaging studies, yet automatic brain segmentation and cortical parcellation remain challenging due to signal inhomogeneities, heterogeneous contrasts and resolutions, and the limited availability of tools optimized for UHF data. Standard software packages such as FastSurferVINN and SynthSeg+ often yield suboptimal results when applied directly to UHF images, thereby restricting region-based quantitative analyses. To address this need, we introduce GOUHFI 2.0, an updated implementation of GOUHFI that incorporates increased training data variability and additional functionalities, including cortical parcellation and volumetry. GOUHFI 2.0 preserves the contrast- and resolution-agnostic design of the original toolbox while introducing two independently trained 3D U-Net segmentation tasks. The first performs whole-brain segmentation into 35 labels across contrasts, resolutions, field strengths and populations, using a domain-randomization strategy and a training dataset of 238 subjects. Using the same training data, the second network performs cortical parcellation into 62 labels following the Desikan-Killiany-Tourville (DKT) protocol. Across multiple datasets, GOUHFI 2.0 demonstrated improved segmentation accuracy relative to the original toolbox, particularly in heterogeneous cohorts, and produced reliable cortical parcellations. In addition, the integrated volumetry pipeline yielded results consistent with standard volumetric workflows. Overall, GOUHFI 2.0 provides a comprehensive solution for brain segmentation, parcellation and volumetry across field strengths, and constitutes the first deep-learning toolbox enabling robust cortical parcellation at UHF-MRI.

2601.08527 2026-03-12 math.NA cs.LG cs.NA math.PR stat.ML

Sampling via Stochastic Interpolants by Langevin-based Velocity and Initialization Estimation in Flow ODEs

Chenguang Duan, Yuling Jiao, Gabriele Steidl, Christian Wald, Jerry Zhijian Yang, Ruizhe Zhang

Abstract

We propose a novel method for sampling from unnormalized Boltzmann densities based on a probability flow ordinary differential equation (ODE) derived from linear stochastic interpolants. The key innovation of our approach is the use of a sequence of Langevin samplers to enable efficient simulation of the flow. Specifically, these Langevin samplers are employed (i) to generate samples from the interpolant distribution at intermediate times and (ii) to construct, starting from these intermediate times, a robust estimator of the velocity field governing the probability flow ODE. Theoretically, we provide convergence guarantees for both Langevin components, and establish a non-asymptotic convergence rate for the probability flow ODE. Extensive numerical experiments demonstrate the efficiency of the proposed method on challenging multimodal distributions across a range of dimensions, as well as its effectiveness in Bayesian inference tasks.
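As a rough illustration of the Langevin building block mentioned in this abstract, here is a minimal sketch of the generic unadjusted Langevin algorithm. It is not the authors' sampler: the target (a standard one-dimensional Gaussian), the step size, the number of steps, and all function names are assumptions chosen for illustration only.

```python
import math
import random


def grad_log_density(x):
    """Score of a hypothetical target: unnormalized density exp(-x^2 / 2)."""
    return -x


def langevin_sample(grad_log_p, x0, step, n_steps, rng):
    """Unadjusted Langevin: x <- x + step * grad log p(x) + sqrt(2 * step) * noise."""
    x = x0
    for _ in range(n_steps):
        noise = rng.gauss(0.0, 1.0)
        x = x + step * grad_log_p(x) + math.sqrt(2.0 * step) * noise
    return x


def sample_many(n_samples, seed=0):
    """Run independent chains from a deliberately poor starting point (x0 = 3)."""
    rng = random.Random(seed)
    return [
        langevin_sample(grad_log_density, 3.0, 0.05, 200, rng)
        for _ in range(n_samples)
    ]
```

Only the gradient of the log-density is needed, so the normalizing constant of the Boltzmann density never has to be computed. The fixed step size introduces a small bias in the stationary distribution, which is one reason convergence guarantees for such components are of interest.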

2601.06627 2026-03-12 cs.CR cs.AI

Burn-After-Use for Preventing Data Leakage through a Secure Multi-Tenant Architecture in Enterprise LLM

Qiang Zhang, Elena Emma Wang, Jiaming Li, Xichun Wang

Comments 16 pages, 5 figures

Abstract

This study presents a Secure Multi-Tenant Architecture (SMTA) combined with a novel Burn-After-Use (BAU) mechanism for enterprise LLM environments to effectively prevent data leakage. As institutions increasingly adopt LLMs across departments, the risks of data leakage have become a critical security and compliance concern. The proposed SMTA isolates LLM instances across departments and enforces rigorous context ownership boundaries within an internally deployed infrastructure. The BAU mechanism introduces data confidentiality by enforcing ephemeral conversational contexts that are automatically destroyed after use, preventing cross-session or cross-user inference. SMTA and BAU are evaluated through two sets of realistic and reproducible experiments comprising 127 test iterations. The first set assesses prompt-based and semantic leakage attacks in a multi-tenant architecture (Appendix A) across 55 infrastructure-level attack tests, including vector-database credential compromise and shared logging pipeline exposure. SMTA achieves a 92% defense success rate, demonstrating strong semantic isolation while highlighting residual risks from credential misconfiguration and observability pipelines. The second set evaluates the robustness of BAU under realistic failure scenarios (Appendix B) using four empirical metrics: Local Residual Persistence Rate (LRPR), Remote Residual Persistence Rate (RRPR), Image Frame Exposure Rate (IFER), and Burn Timer Persistence Rate (BTPR). Across 72 test iterations, BAU achieves a 76.75% success rate in mitigating post-session leakage threats across the client, server, application, infrastructure, and cache layers. These results show that SMTA and BAU together enforce strict isolation, complete session ephemerality, strong confidentiality guarantees, non-persistence, and policy-aligned behavior for enterprise LLMs.

2512.19733 2026-03-12 physics.chem-ph cs.LG

NMIRacle: Multi-modal Generative Molecular Elucidation from IR and NMR Spectra

Federico Ottomano, Yingzhen Li, Alex M. Ganose

Abstract

Molecular structure elucidation from spectroscopic data is a long-standing challenge in chemistry, traditionally requiring expert interpretation. We introduce NMIRacle, a two-stage generative framework that builds upon recent paradigms in AI-driven spectroscopy with minimal assumptions. In the first stage, NMIRacle learns to reconstruct molecular structures from count-aware fragment representations, capturing both fragment identities and their occurrences. In the second stage, a spectral encoder maps input spectra (IR, 1H-NMR, 13C-NMR) into a latent embedding used to condition the pre-trained generator, which is fine-tuned for direct spectra-to-molecule generation. This formulation bridges fragment-level chemical modeling with spectral evidence, yielding accurate molecular predictions. Empirical results demonstrate that NMIRacle outperforms existing baselines on molecular elucidation, while maintaining robust performance across increasing levels of molecular complexity.

2512.11957 2026-03-12 astro-ph.IM cs.CV

Pre-training vision models for the classification of alerts from wide-field time-domain surveys

Nabeel Rehemtulla, Adam A. Miller, Mike Walmsley, Ved G. Shah, Theophile Jegou du Laz, Michael W. Coughlin, Argyro Sasli, Joshua Bloom, Christoffer Fremling, Matthew J. Graham, Steven L. Groom, David Hale, Ashish A. Mahabal, Daniel A. Perley, Josiah Purdum, Ben Rusholme, Jesper Sollerman, Mansi M. Kasliwal

Comments Accepted for publication in PASP

Abstract

Modern wide-field time-domain surveys facilitate the study of transient, variable and moving phenomena by conducting image differencing and relaying alerts to their communities. Machine learning tools have been used on data from these surveys and their precursors for more than a decade, and convolutional neural networks (CNNs), which make predictions directly from input images, saw particularly broad adoption through the 2010s. Since then, continually rapid advances in computer vision have transformed the standard practices around using such models. It is now commonplace to use standardized architectures pre-trained on large corpora of everyday images (e.g., ImageNet). In contrast, time-domain astronomy studies still typically design custom CNN architectures and train them from scratch. Here, we explore the effects of adopting various pre-training regimens and standardized model architectures on the performance of alert classification. We find that the resulting models match or outperform a custom, specialized CNN like what is typically used for filtering alerts. Moreover, our results show that pre-training on galaxy images from Galaxy Zoo tends to yield better performance than pre-training on ImageNet or training from scratch. We observe that the standardized architectures are much better optimized than the custom CNN baseline, requiring significantly less time and memory for inference despite having more trainable parameters. On the eve of the Legacy Survey of Space and Time and other image-differencing surveys, these findings advocate for a paradigm shift in the creation of vision models for alerts, demonstrating that greater performance and efficiency, in time and in data, can be achieved by adopting the latest practices from the computer vision field.

2512.10445 2026-03-12 stat.ML cs.AI cs.LG stat.ME

Maximum Risk Minimization with Random Forests

Francesco Freni, Anya Fries, Linus Kühne, Markus Reichstein, Jonas Peters

Comments 47 pages, 13 figures

Abstract

We consider a regression setting where observations are collected in different environments modeled by different data distributions. The field of out-of-distribution (OOD) generalization aims to design methods that generalize better to test environments whose distributions differ from those observed during training. One line of such works has proposed to minimize the maximum risk across environments, a principle that we refer to as MaxRM (Maximum Risk Minimization). In this work, we introduce variants of random forests based on the principle of MaxRM. We provide computationally efficient algorithms and prove statistical consistency for our primary method. Our proposed method can be used with each of the following three risks: the mean squared error, the negative reward, and the regret (which quantifies the excess risk relative to the best predictor). For MaxRM with regret as the risk, we prove a novel out-of-sample guarantee over unseen test distributions. Finally, we evaluate the proposed methods on both simulated and real-world data.
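To make the MaxRM principle in this abstract concrete, the sketch below compares a pooled least-squares constant predictor with the constant that minimizes the maximum per-environment mean squared error. It is a toy illustration, not the authors' random forest method: the two environments, their values, the candidate grid, and the function names are all hypothetical.

```python
# Hypothetical toy data: responses observed in two environments.
environments = {
    "A": [0.0, 0.0, 0.0, 0.0],  # large environment centered at 0
    "B": [10.0],                # small environment centered at 10
}


def risk(constant, responses):
    """Mean squared error of a constant predictor in one environment."""
    return sum((y - constant) ** 2 for y in responses) / len(responses)


def max_risk(constant, envs):
    """Worst-case (maximum) risk of the predictor across environments."""
    return max(risk(constant, ys) for ys in envs.values())


# Pooled least squares ignores environment labels and predicts the overall mean.
pooled = [y for ys in environments.values() for y in ys]
pooled_mean = sum(pooled) / len(pooled)

# MaxRM over a coarse grid of candidate constants 0.0, 0.5, ..., 10.0.
candidates = [i * 0.5 for i in range(21)]
maxrm_constant = min(candidates, key=lambda c: max_risk(c, environments))
```

Here the pooled mean is 2.0, with a worst-case risk of 64.0 (incurred in environment B), while the MaxRM choice is 5.0, with a worst-case risk of 25.0. The pooled fit favors the larger environment; the MaxRM fit balances the two.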

2512.07737 2026-03-12 quant-ph cs.LG

A scalable and real-time neural decoder for topological quantum codes

Andrew W. Senior, Thomas Edlich, Francisco J. H. Heras, Lei M. Zhang, Oscar Higgott, James S. Spencer, Taylor Applebaum, Sam Blackwell, Justin Ledford, Akvilė Žemgulytė, Augustin Žídek, Noah Shutty, Andrew Cowie, Yin Li, George Holland, Peter Brooks, Charlie Beattie, Michael Newman, Alex Davies, Cody Jones, Sergio Boixo, Hartmut Neven, Pushmeet Kohli, Johannes Bausch

Comments with color code realtime decoding results

Abstract

Fault-tolerant quantum computing will require error rates far below those achievable with physical qubits. Quantum error correction (QEC) bridges this gap, but depends on decoders being simultaneously fast, accurate, and scalable. No machine-learning decoder has yet met this combination of requirements, nor has any decoder for promising resource-efficient codes such as the color code. Here we introduce AlphaQubit 2, a neural-network decoder that achieves near-optimal logical error rates for both surface and color codes at scale under realistic noise. For the color code, it is orders of magnitude faster than other high-accuracy decoders. We demonstrate real-time decoding faster than 1μs per cycle on commercial accelerators: for the surface code to distance 11, with better accuracy than leading real-time decoders; and the first real-time decoding of the color code to distance 9. These results support the practical application of a wider class of promising QEC codes, and establish a credible path towards high-accuracy, real-time neural decoding at the scales required for fault-tolerant quantum computation.

2511.04361 2026-03-12 q-fin.CP cs.LG stat.ME stat.OT

Causal Regime Detection in Energy Markets With Augmented Time Series Structural Causal Models

Dennis Thumm

Comments EurIPS 2025 Workshop Causality for Impact: Practical challenges for real-world applications of causal methods

Abstract

Energy markets exhibit complex causal relationships between weather patterns, generation technologies, and price formation, with regime changes occurring continuously rather than at discrete break points. Current approaches model electricity prices without explicit causal interpretation or counterfactual reasoning capabilities. We introduce Augmented Time Series Causal Models (ATSCM) for energy markets, extending counterfactual reasoning frameworks to multivariate temporal data with learned causal structure. Our approach models energy systems through interpretable factors (weather, generation mix, demand patterns), rich grid dynamics, and observable market variables. We integrate neural causal discovery to learn time-varying causal graphs without requiring ground truth DAGs. Applied to real-world electricity price data, ATSCM enables novel counterfactual queries such as "What would prices be under different renewable generation scenarios?".

2510.12947 2026-03-12 eess.AS cs.AI cs.LG cs.SD

HyWA: Hypernetwork Weight Adapting Personalized Voice Activity Detection

Mahsa Ghazvini Nejad, Hamed Jafarzadeh Asl, Amin Edraki, Mohammadreza Sadeghi, Masoud Asgharian, Yuanhao Yu, Vahid Partovi Nia

Comments Mahsa Ghazvini Nejad and Hamed Jafarzadeh Asl contributed equally to this work. Submitted to Interspeech 2026

Abstract

Personalized Voice Activity Detection (PVAD) systems activate only in response to a specific target speaker. Speaker-conditioning methods are employed to inject information about the target speaker into a VAD pipeline to achieve personalization. Existing speaker-conditioning methods typically modify the inputs or activations of a VAD model. We propose an alternative perspective on speaker conditioning. Our approach, HyWA, employs a hypernetwork to generate personalized weights for a few selected layers of a standard VAD model. We evaluate HyWA against multiple baseline speaker-conditioning techniques using a fixed backbone VAD. Our comparison shows consistent improvements in PVAD performance. This new approach improves on current speaker-conditioning techniques in two ways: (i) it increases the mean average precision, and (ii) it facilitates deployment by reusing the same VAD architecture.

2510.07960 2026-03-12 cs.HC cs.AI cs.LG

A Systematic Evaluation of Self-Supervised Learning for Label-Efficient Sleep Staging with Wearable EEG

Emilio Estevan, María Sierra-Torralba, Eduardo López-Larraz, Luis Montesano

Comments 15 pages, 4 figures

Abstract

Wearable EEG devices have emerged as a promising alternative to polysomnography (PSG). As affordable and scalable solutions, their widespread adoption results in the collection of massive volumes of unlabeled data that cannot be analyzed by clinicians at scale. Meanwhile, the recent success of deep learning for sleep scoring has relied on large annotated datasets. Self-supervised learning (SSL) offers an opportunity to bridge this gap, leveraging unlabeled signals to address label scarcity and reduce annotation effort. In this paper, we present the first systematic evaluation of SSL for sleep staging using wearable EEG. We introduce a structured benchmarking framework encompassing a range of SSL paradigms and propose a specialized pipeline tailored to the wearable EEG domain, evaluating them on two sleep databases acquired with the Ikon Sleep wearable headband: BOAS, a high-quality benchmark containing PSG and wearable EEG recordings with consensus labels, and HOGAR, a large collection of home-based, self-recorded, and unlabeled recordings. Three evaluation scenarios are defined to study label efficiency, representation quality, and cross-dataset generalization. Results show that SSL consistently improves classification performance by up to 10% over supervised baselines, with gains particularly evident when labeled data is scarce. SSL achieves clinical-grade accuracy above 80% leveraging only 5% to 10% of labeled data, while the supervised approach requires twice the labels. Additionally, the proposed domain-specific SSL pipeline outperforms the evaluated general-purpose EEG foundation models across all scenarios. Our findings demonstrate the potential of SSL to enable label-efficient sleep staging with wearable EEG, reducing reliance on manual annotations and advancing the development of affordable sleep monitoring systems.

2510.02182 2026-03-12 q-bio.NC cs.CV cs.LG

Uncovering Semantic Selectivity of Latent Groups in Higher Visual Cortex with Mutual Information-Guided Diffusion

Yule Wang, Joseph Yu, Chengrui Li, Weihan Li, Anqi Wu

Abstract

Understanding how neural populations in higher visual areas encode object-centered visual information remains a central challenge in computational neuroscience. Prior works have investigated representational alignment between artificial neural networks and the visual cortex. Nevertheless, these findings are indirect and offer limited insight into the structure of neural populations themselves. Similarly, decoding-based methods have quantified semantic features from neural populations but have not uncovered their underlying organization. This leaves open a scientific question: "how feature-specific visual information is distributed across neural populations in higher visual areas, and whether it is organized into structured, semantically meaningful subspaces." To tackle this problem, we present MIG-Vis, a method that leverages the generative power of diffusion models to visualize and validate the visual-semantic attributes encoded in neural latent subspaces. Our method first uses a variational autoencoder to infer a group-wise disentangled neural latent subspace from neural populations. Subsequently, we propose a mutual information (MI)-guided diffusion synthesis procedure to visualize the specific visual-semantic features encoded by each latent group. We validate MIG-Vis on multi-session neural spiking datasets from the inferior temporal (IT) cortex of two macaques. The synthesized results demonstrate that our method identifies neural latent groups with clear semantic selectivity to diverse visual features, including object pose, inter-category transformations, and intra-class content. These findings provide direct, interpretable evidence of structured semantic representation in the higher visual cortex and advance our understanding of its encoding principles.

2509.20985 2026-03-12 stat.ML cs.LG

Empirical PAC-Bayes Bounds for Markov Chains

Vahe Karagulyan, Pierre Alquier

Comments To appear in the proceedings of AISTATS 2026

Abstract

The core of generalization theory was developed for independent observations. Some PAC and PAC-Bayes bounds are available for data that exhibit a temporal dependence. However, there are constants in these bounds that depend on properties of the data-generating process: mixing coefficients, mixing time, spectral gap... Such constants are unknown in practice. In this paper, we prove a new PAC-Bayes bound for Markov chains. This bound depends on a quantity called the pseudo-spectral gap. The main novelty is that we can provide an empirical bound on the pseudo-spectral gap when the state space is finite. Thus, we obtain the first fully empirical PAC-Bayes bound for Markov chains. This extends beyond the finite case, although this requires additional assumptions. On simulated experiments, the empirical version of the bound is essentially as tight as the non-empirical one.

2509.18149 2026-03-12 math.NA cs.LG cs.NA eess.SP math.OC stat.CO stat.ML

Tensor Train Completion from Fiberwise Observations Along a Single Mode

Shakir Showkat Sofi, Lieven De Lathauwer

Comments 26 pages, 12 figures

Journal ref Mathematics 2026, 14(5), 922
Abstract

Tensor completion is an extension of matrix completion aimed at recovering a multiway data tensor by leveraging a given subset of its entries (observations) and the pattern of observation. The low-rank assumption is key in establishing a relationship between the observed and unobserved entries of the tensor. The low-rank tensor completion problem is typically solved using numerical optimization techniques, where the rank information is used either implicitly (in the rank minimization approach) or explicitly (in the error minimization approach). Current theories concerning these techniques often study probabilistic recovery guarantees under conditions such as random uniform observations and incoherence requirements. However, if an observation pattern exhibits some low-rank structure that can be exploited, more efficient algorithms with deterministic recovery guarantees can be designed by leveraging this structure. This work shows how to use only standard linear algebra operations to compute the tensor train decomposition of a specific type of "fiber-wise" observed tensor, where some of the fibers of a tensor (along a single specific mode) are either fully observed or entirely missing, unlike the usual entry-wise observations. From an application viewpoint, this setting is relevant when it is easier to sample or collect a multiway data tensor along a specific mode (e.g., temporal). The proposed completion method is fast and is guaranteed to work under reasonable deterministic conditions on the observation pattern. Through numerical experiments, we showcase interesting applications and use cases that illustrate the effectiveness of the proposed approach.

2509.14053 2026-03-12 physics.soc-ph cs.SD eess.AS q-bio.NC

Trade-offs between structural richness and communication efficiency in music network representations

Lluc Bono Rosselló, Robert Jankowski, Hugues Bersini, Marián Boguñá, M. Ángeles Serrano

Abstract

Music is a structured and perceptually rich sequence of sounds in time, whose perception is shaped by the interplay of expectation and uncertainty about what comes next. Yet the uncertainty we infer from music depends on how the musical piece is encoded as an event sequence. In this work, we use network representations, in which event types are nodes and observed transitions are directed edges, to compare how different feature encodings shape the transition structure we recover and what this implies for both the descriptive uncertainty and expectation under imperfect memory and noise. We systematically analyse eight encodings of piano music, from single-feature vocabularies to richer multi-feature combinations. These representational choices reorganize the state space and fundamentally reshape network topology, shifting how uncertainty is distributed across transitions. To connect these descriptive differences to perception, we adopt a perceptual-constraint model that captures imperfect access to transition statistics. Overall, compressed single-feature representations yield dense transition structures with higher entropy rates, corresponding to higher average uncertainty per step, yet low model error, indicating that the constrained estimate stays close to the corpus transitions. In contrast, richer multi-feature representations preserve finer distinctions but expand the state space, sharpen transition profiles, lower entropy rates, and increase model error. Finally, across representations, uncertainty concentrates in diffusion-central nodes while model error remains low there, suggesting an informational landscape in which predictable flow coexists with localized surprise. Overall, our results show that feature choice shapes not only the networks we reconstruct, but also whether their resulting uncertainty is a plausible proxy for the expectations listeners can realistically learn and use.
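The entropy rate referred to in this abstract can be illustrated with the textbook definition for a first-order Markov chain, H = -sum_i pi_i sum_j P_ij log2 P_ij, where P is the transition matrix and pi its stationary distribution. The sketch below is an assumption about how such a quantity could be computed from an event sequence, not the authors' pipeline. The symbol sequences and function names are hypothetical, and the code assumes every state in the sequence has at least one outgoing transition.

```python
import math
from collections import Counter, defaultdict


def transition_probabilities(sequence):
    """Row-normalized transition counts between consecutive events."""
    counts = defaultdict(Counter)
    for current, following in zip(sequence, sequence[1:]):
        counts[current][following] += 1
    return {
        state: {nxt: n / sum(row.values()) for nxt, n in row.items()}
        for state, row in counts.items()
    }


def stationary_distribution(P, n_iter=1000):
    """Lazy power iteration (pi <- 0.5 * pi + 0.5 * pi P), which also
    converges for periodic chains. Assumes every state appears as a row of P."""
    states = sorted(P)
    pi = {s: 1.0 / len(states) for s in states}
    for _ in range(n_iter):
        flow = {s: 0.0 for s in states}
        for a in states:
            for b, p in P[a].items():
                flow[b] += pi[a] * p
        pi = {s: 0.5 * pi[s] + 0.5 * flow[s] for s in states}
    return pi


def entropy_rate(sequence):
    """Average uncertainty per step, in bits, of the fitted Markov chain."""
    P = transition_probabilities(sequence)
    pi = stationary_distribution(P)
    return -sum(
        pi[a] * p * math.log2(p) for a, row in P.items() for p in row.values()
    )
```

A strictly alternating sequence such as `list("CDCDCDC")` has an entropy rate of 0 bits, since every next event is certain. The sequence `list("AABBAABBA")`, in which each symbol is followed by either symbol equally often, has an entropy rate of 1 bit per step.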

2509.12583 2026-03-12 eess.AS cs.SD

Robust Audio-Visual Target Speaker Extraction with Emotion-Aware Multiple Enrollment Fusion

Zhan Jin, Bang Zeng, Peijun Yang, Jiarong Du, Wei Ju, Yao Tian, Juan Liu, Ming Li

Comments submitted to Interspeech 2026

Abstract

Audio-Visual Target Speaker Extraction (AVTSE) is crucial for cocktail party scenarios. Leveraging multiple cues (such as utterance-level speaker embeddings or steady face images, and frame-level lip motion or facial expression features) can significantly improve performance. However, real-world applications often suffer from intermittent signal loss, especially for frame-level cues. This paper systematically investigates the robustness of multi-enrollment fusion under varying degrees of missing modalities. Results show that while full multimodal fusion excels under ideal conditions, its performance degrades sharply when it encounters unseen missing modalities at test time. Crucially, training with a high missing rate dramatically enhances robustness, maintaining stable performance even when modalities are severely missing at test time. We demonstrate that fusing a single complementary face image frame with frame-level lip features achieves both strong performance and robustness for the AVTSE task. The model and code are shared.

2508.19075 2026-03-12 quant-ph cond-mat.quant-gas cond-mat.str-el cs.LG cs.SY eess.SY

Universal Dynamics with Globally Controlled Analog Quantum Simulators

Hong-Ye Hu, Abigail McClain Gomez, Liyuan Chen, Aaron Trowbridge, Andy J. Goldschmidt, Zachary Manchester, Frederic T. Chong, Arthur Jaffe, Susanne F. Yelin

Comments The updated version adds new applications and discussions on information scrambling with globally controlled analog quantum systems. 11 pages, 6 figures with Methods. HYH, AMG, and LC contributed equally to this work. Updated acknowledgement and references

Abstract

Analog quantum simulators with global control fields have emerged as powerful platforms for exploring complex quantum phenomena. Despite these advances, a fundamental theoretical question remains unresolved: to what extent can such systems realize universal quantum dynamics under global control? Here we establish a necessary and sufficient condition for universal quantum computation using only global pulse control, proving that a broad class of analog quantum simulators is, in fact, universal. We further extend this framework to fermionic and bosonic systems, including modern platforms such as ultracold atoms in optical superlattices. Moreover, we observe that analog simulators driven by random global pulses exhibit information scrambling comparable to random unitary circuits. In a dual-species neutral-atom array setup, the measurement outcomes anti-concentrate on a log N timescale despite the presence of only temporal randomness, opening opportunities for efficient randomness generation. To bridge theoretical possibility with experimental reality, we introduce direct quantum optimal control, a control framework that enables the synthesis of complex effective Hamiltonians while incorporating realistic hardware constraints. Using this approach, we experimentally engineer three-body interactions outside the blockade regime and demonstrate topological dynamics on a Rydberg-atom array. Experimental measurements reveal dynamical signatures of symmetry-protected-topological edge modes, confirming both the expressivity and feasibility of our method. Our work opens a new avenue for quantum simulation beyond native hardware Hamiltonians, enabling the engineering of effective multi-body interactions and advancing the frontier of quantum information processing with globally-controlled analog platforms.

2507.19743 2026-03-12 cs.SE cs.AI

What Makes Code Generation Ethically Sourced?

Zhuolin Xu, Chenglin Li, Qiushi Li, Shin Hwei Tan

Journal ref Proc. 48th International Conference on Software Engineering (ICSE 2026)
Abstract

Several code generation models have been proposed to help reduce time and effort in solving software-related tasks. To ensure responsible AI, there is growing interest in various ethical issues (e.g., unclear licensing, privacy, fairness, and environmental impact). These studies have the overarching goal of ensuring ethically sourced generation, which has gained growing attention in speech synthesis and image generation. In this paper, we introduce the novel notion of Ethically Sourced Code Generation (ES-CodeGen) to refer to managing all processes involved in code generation model development, from data collection to post-deployment, via ethical and sustainable practices. To build a taxonomy of ES-CodeGen, we perform a two-phase literature review in which we read 803 papers across various domains and specific to AI-based code generation. We identified 71 relevant papers with 10 initial dimensions of ES-CodeGen. To refine our dimensions and gain insights into the consequences of ES-CodeGen, we surveyed 32 practitioners, including six developers who submitted GitHub issues to opt out of the Stack dataset (these impacted users have real-world experience of ethical sourcing issues in code generation models). The results lead to 11 dimensions of ES-CodeGen, with a new dimension on code quality, as practitioners noted its importance. We also identified consequences, artifacts, and stages relevant to ES-CodeGen. Our post-survey reflection showed that most practitioners tend to ignore social-related dimensions despite their importance. Most practitioners either agreed or strongly agreed that our survey helped improve their understanding of ES-CodeGen. Our study calls for attention to the various ethical issues surrounding ES-CodeGen.

2507.19218 2026-03-12 cs.HC cs.AI q-bio.NC

Technological folie à deux: Feedback Loops Between AI Chatbots and Mental Illness

Sebastian Dohnány, Zeb Kurth-Nelson, Eleanor Spens, Lennart Luettgau, Alastair Reid, Iason Gabriel, Christopher Summerfield, Murray Shanahan, Matthew M Nour

Journal ref Nature Mental Health (2026)
Abstract

Artificial intelligence chatbots have achieved unprecedented adoption, with millions now using these systems for emotional support and companionship in contexts of widespread social isolation and capacity-constrained mental health services. While some users report psychological benefits, concerning edge cases are emerging, including reports of suicide, violence, and delusional thinking linked to perceived emotional relationships with chatbots. To understand this new risk profile we need to consider the interaction between human cognitive and emotional biases, and chatbot behavioural tendencies such as agreeableness (sycophancy) and adaptability (in-context learning). We argue that individuals with mental health conditions face increased risks of chatbot-induced belief destabilization and dependence, owing to altered belief-updating, impaired reality-testing, and social isolation. Current AI safety measures are inadequate to address these interaction-based risks. To address this emerging public health concern, we need coordinated action across clinical practice, AI development, and regulatory frameworks.

2505.21162 2026-03-12 cs.DL cs.CL cs.SI

Leveraging GANs for citation intent classification and its impact on citation network analysis

Davi A. Bezerra, Filipi N. Silva, Diego R. Amancio

Details
Journal ref
Journal of Informetrics 20 (2) 101791, 2026
English abstract

Citations play a fundamental role in the scientific ecosystem, serving as a foundation for tracking the flow of knowledge, acknowledging prior work, and assessing scholarly influence. In scientometrics, they are also central to the construction of quantitative indicators. Not all citations, however, serve the same function: some provide background, others introduce methods or compare results. Therefore, understanding citation intent allows for a more nuanced interpretation of scientific impact. In this paper, we adopted a GAN-based method to classify citation intents. Our results revealed that the proposed method achieves competitive classification performance, closely matching state-of-the-art results with substantially fewer parameters. This demonstrates the effectiveness and efficiency of leveraging GAN architectures combined with contextual embeddings in the intent classification task. We also investigated whether filtering citation intents affects the centrality of papers in citation networks. Analyzing the network constructed from the unArXiv dataset, we found that paper rankings can be significantly influenced by citation intent. All four centrality metrics examined (degree, PageRank, closeness, and betweenness) were sensitive to the filtering of citation types. Betweenness centrality displayed the greatest sensitivity, showing substantial changes in ranking when specific citation intents were removed.
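To illustrate the filtering experiment described in this abstract, here is a minimal sketch of how removing one citation intent can change a degree-based ranking. The edge list, paper names, and intent labels are hypothetical toy data, and in-degree stands in for the four centrality metrics the authors examined; this is not their pipeline.

```python
from collections import Counter

# Hypothetical toy citation edges: (citing paper, cited paper, intent label).
EDGES = [
    ("p1", "A", "background"),
    ("p2", "A", "background"),
    ("p3", "A", "method"),
    ("p1", "B", "method"),
    ("p2", "B", "method"),
    ("p4", "B", "result"),
    ("p3", "C", "background"),
]


def in_degree_ranking(edges, drop_intents=()):
    """Rank cited papers by in-degree after removing edges with the given intents."""
    counts = Counter(
        cited for _, cited, intent in edges if intent not in drop_intents
    )
    # Sort by descending count, then by name so ties are deterministic.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


if __name__ == "__main__":
    print(in_degree_ranking(EDGES))
    print(in_degree_ranking(EDGES, drop_intents=("background",)))
```

In this toy network, papers A and B tie when all citations count, but B clearly leads once background citations are dropped, and C disappears from the ranking entirely.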

2504.14373 2026-03-12 cs.GR cs.CV

SEGA: Drivable 3D Gaussian Head Avatar from a Single Image

Chen Guo, Zhuo Su, Liao Wang, Jian Wang, Shuang Li, Xu Chang, Zhaohu Li, Yang Zhao, Guidong Wang, Yebin Liu, Ruqi Huang

Details
English abstract

Creating photorealistic 3D head avatars from limited input has become increasingly important for applications in virtual reality, telepresence, and digital entertainment. While recent advances like neural rendering and 3D Gaussian splatting have enabled high-quality digital human avatar creation and animation, most methods rely on multiple images or multi-view inputs, limiting their practicality for real-world use. In this paper, we propose SEGA, a novel approach for Single-imagE-based 3D drivable Gaussian head Avatar creation that combines generalized prior models with a new hierarchical UV-space Gaussian Splatting framework. SEGA seamlessly combines priors derived from large-scale 2D datasets with 3D priors learned from multi-view, multi-expression, and multi-ID data, achieving robust generalization to unseen identities while ensuring 3D consistency across novel viewpoints and expressions. We further present a hierarchical UV-space Gaussian Splatting framework that leverages FLAME-based structural priors and employs a dual-branch architecture to disentangle dynamic and static facial components effectively. The dynamic branch encodes expression-driven fine details, while the static branch focuses on expression-invariant regions, enabling efficient parameter inference and precomputation. This design maximizes the utility of limited 3D data and achieves real-time performance for animation and rendering. Additionally, SEGA performs person-specific fine-tuning to further enhance the fidelity and realism of the generated avatars. Experiments show our method outperforms state-of-the-art approaches in generalization ability, identity preservation, and expression realism, advancing one-shot avatar creation for practical applications.

2504.09836 2026-03-12 math.OC cs.LG cs.RO cs.SY eess.SY

Score Matching Diffusion Based Feedback Control and Planning of Nonlinear Systems

Karthik Elamvazhuthi, Darshan Gadginmath, Fabio Pasqualetti

Details
English abstract

In this paper, we propose a deterministic diffusion-based framework for controlling the probability density of nonlinear control-affine systems, with theoretical guarantees for drift-free and linear time-invariant (LTI) dynamics. The central idea is to first excite the system with white noise so that a forward diffusion process explores the reachable regions of state space, and then to design a deterministic feedback law that acts as a denoising mechanism driving the system back toward a desired target distribution supported on the target set. This denoising phase provides a feedback controller that steers the control system to the target set. In this framework, control synthesis reduces to constructing a deterministic reverse process that reproduces the desired evolution of state densities. We derive existence conditions ensuring such deterministic realizations of time-reversals for controllable drift-free and LTI systems, and show that the resulting feedback laws provide a tractable alternative to nonlinear control by viewing density control as a relaxation of controlling a system to target sets. Numerical studies on a unicycle model with obstacles, a five-dimensional driftless system, and a four-dimensional LTI system demonstrate reliable diffusion-inspired density control.

2504.09723 2026-03-12 cs.HC cs.CL

AgentA/B: Automated and Scalable Web A/B Testing with Interactive LLM Agents

Yuxuan Lu, Ting-Yao Hsu, Hansu Gu, Limeng Cui, Yaochen Xie, William Headden, Bingsheng Yao, Akash Veeragouni, Jiapeng Liu, Sreyashi Nag, Jessie Wang, Dakuo Wang

Details
English abstract

A/B testing is a widely adopted method for evaluating UI/UX design decisions in modern web applications. Yet traditional A/B testing remains constrained by its dependence on large-scale live traffic from human participants and by the long wait for test results. Through formative interviews with six experienced industry practitioners, we identified critical bottlenecks in current A/B testing workflows. In response, we present AgentA/B, a novel system that leverages Large Language Model-based autonomous agents (LLM Agents) to automatically simulate user interaction behaviors with real webpages. AgentA/B enables scalable deployment of LLM agents with diverse personas, each capable of navigating the dynamic webpage and interactively executing multi-step interactions such as searching, clicking, filtering, and purchasing. In a demonstrative controlled experiment, we employ AgentA/B to simulate a between-subject A/B test with 1,000 LLM agents on Amazon.com, and compare agent behaviors with real human shopping behaviors at scale. Our findings suggest AgentA/B can emulate human-like behavior patterns.

2504.08937 2026-03-12 cs.GR cs.CV cs.LG eess.IV stat.ML

Rethinking Few-Shot Image Fusion: Granular Ball Priors Enable General-Purpose Deep Fusion

Minjie Deng, Yan Wei, An Wu, Yuncan Ouyang, Hao Zhai, Qianyao Peng

Details
English abstract

In image fusion tasks, the absence of real fused images as supervision signals poses significant challenges for supervised learning. Existing deep learning methods typically address this issue either by designing handcrafted priors or by relying on large-scale datasets to learn model parameters. Different from previous approaches, this paper introduces the concept of incomplete priors, which formally describe handcrafted priors at the algorithmic level and estimate their confidence. Based on this idea, we couple incomplete priors with the neural network through a sample-level adaptive loss function, enabling the network to learn and re-infer fusion rules under conditions that approximate the real fusion process. To generate incomplete priors, we propose a Granular Ball Pixel Computation (GBPC) algorithm based on the principles of granular computing. The algorithm models fused-image pixels as information units, estimating pixel weights at a fine-grained level while statistically evaluating prior reliability at a coarse-grained level. This design enables the algorithm to perceive cross-modal discrepancies and perform adaptive inference. Experimental results demonstrate that even under few-shot conditions, a lightweight neural network can still learn effective fusion rules by training only on image patches extracted from ten image pairs. Extensive experiments across multiple fusion tasks and datasets further show that the proposed method achieves superior performance in both visual quality and model compactness. The code is available at: https://github.com/DMinjie/GBFF

2502.02558 2026-03-12 hep-ex cs.CV cs.LG

Particle Trajectory Representation Learning with Masked Point Modeling

Sam Young, Yeon-jae Jwa, Kazuhiro Terao

Comments Preprint. 28 pages, 18 figures. v3 includes new results on data efficiency and attention maps

Details
English abstract

Effective self-supervised learning (SSL) techniques have been key to unlocking large datasets for representation learning. While many promising methods have been developed using online corpora and captioned photographs, their application to scientific domains, where data encodes highly specialized knowledge, remains a challenge. Liquid Argon Time Projection Chambers (LArTPCs) provide high-resolution 3D imaging for fundamental physics, but analysis of their sparse, complex point cloud data often relies on supervised methods trained on large simulations, introducing potential biases. We introduce the Point-based Liquid Argon Masked Autoencoder (PoLAr-MAE), applying masked point modeling to unlabeled LArTPC images using domain-specific volumetric tokenization and energy prediction. We show this SSL approach learns physically meaningful trajectory representations directly from data. This yields remarkable data efficiency: fine-tuning on just 100 labeled events achieves track/shower semantic segmentation performance comparable to the state-of-the-art supervised baseline trained on more than 100,000 events. Furthermore, internal attention maps exhibit emergent instance segmentation of particle trajectories. While challenges remain, particularly for fine-grained features, we make concrete SSL's potential for building a foundation model for LArTPC image analysis capable of serving as a common base for all data reconstruction tasks. To facilitate further progress, we release PILArNet-M, a large dataset of 1M LArTPC events. Project site: https://youngsm.com/polarmae.

2501.07437 2026-03-12 stat.ML cs.LG

Pairwise Comparisons without Stochastic Transitivity: Model, Theory and Applications

Sze Ming Lee, Yunxiao Chen

Comments 55 pages, 2 figures

Details
English abstract

Most statistical models for pairwise comparisons, including the Bradley-Terry (BT) and Thurstone models and many extensions, make a relatively strong assumption of stochastic transitivity. This assumption imposes the existence of an unobserved global ranking among all the players/teams/items and monotone constraints on the comparison probabilities implied by the global ranking. However, the stochastic transitivity assumption does not hold in many real-world scenarios of pairwise comparisons, especially games involving multiple skills or strategies. As a result, models relying on this assumption can have suboptimal predictive performance. In this paper, we propose a general family of statistical models for pairwise comparison data without a stochastic transitivity assumption, substantially extending the BT and Thurstone models. In this model, the pairwise probabilities are determined by an (approximately) low-dimensional skew-symmetric matrix. Likelihood-based estimation methods and computational algorithms are developed, which allow for sparse data with only a small proportion of observed pairs. Theoretical analysis shows that the proposed estimator achieves minimax-rate optimality, which adapts effectively to the sparsity level of the data. The spectral theory for skew-symmetric matrices plays a crucial role in the implementation and theoretical analysis. The proposed method's superiority over the BT model, along with its broad applicability across diverse scenarios, is further supported by simulations and real data analysis.
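The contrast between a BT-style model and a general skew-symmetric parameterization can be sketched in a few lines. This is only an illustration under stated assumptions: the logistic link, the function names, and the 3x3 `CYCLE` matrix are hypothetical choices made here, not the authors' estimator. With M[i][j] = theta[i] - theta[j] the model reduces to BT and is transitive; a general skew-symmetric M can encode a rock-paper-scissors cycle.

```python
import math


def win_prob(M, i, j):
    """Probability that item i beats item j, assuming a logistic link on M[i][j]."""
    return 1.0 / (1.0 + math.exp(-M[i][j]))


def bradley_terry_matrix(theta):
    """BT as a special case: M[i][j] = theta[i] - theta[j], which is skew-symmetric."""
    n = len(theta)
    return [[theta[i] - theta[j] for j in range(n)] for i in range(n)]


def is_skew_symmetric(M, tol=1e-12):
    """Check M[i][j] == -M[j][i] for all pairs (so the diagonal is zero)."""
    n = len(M)
    return all(abs(M[i][j] + M[j][i]) <= tol for i in range(n) for j in range(n))


# Hypothetical intransitive example: 0 tends to beat 1, 1 beats 2, 2 beats 0.
CYCLE = [
    [0.0, 1.0, -1.0],
    [-1.0, 0.0, 1.0],
    [1.0, -1.0, 0.0],
]

if __name__ == "__main__":
    bt = bradley_terry_matrix([1.0, 0.0, -1.0])
    print("BT: P(0 beats 2) =", round(win_prob(bt, 0, 2), 3))
    print("Cycle: P(2 beats 0) =", round(win_prob(CYCLE, 2, 0), 3))
```

Skew-symmetry guarantees that `win_prob(M, i, j) + win_prob(M, j, i)` equals 1 under the logistic link, so the two directions of a comparison stay consistent without any global ranking.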

2411.00143 2026-03-12 eess.IV cs.LG

Enhancing Brain Source Reconstruction by Initializing 3D Neural Networks with Physical Inverse Solutions

Marco Morik, Ali Hashemi, Klaus-Robert Müller, Stefan Haufe, Shinichi Nakajima

Comments Accepted in IEEE Transactions on Medical Imaging

Details
Journal ref
IEEE Transactions on Medical Imaging (Volume: 45, Issue: 1, pp. 231-242, 2026)
English abstract

Reconstructing brain sources is a fundamental challenge in neuroscience, crucial for understanding brain function and dysfunction. Electroencephalography (EEG) signals have a high temporal resolution. However, identifying the correct spatial location of brain sources from these signals remains difficult due to the ill-posed structure of the problem. Traditional methods predominantly rely on manually crafted priors, missing the flexibility of data-driven learning, while recent deep learning approaches focus on end-to-end learning, typically using the physical information of the forward model only for generating training data. We propose the novel hybrid method 3D-PIUNet for EEG source localization that effectively integrates the strengths of traditional and deep learning techniques. First, 3D-PIUNet computes an initial physics-informed estimate by using the pseudoinverse to map from measurements to source space. Second, by viewing the brain as a 3D volume, we use a 3D convolutional U-Net to capture spatial dependencies and refine the solution according to the learned data prior. Training the model relies on simulated pseudo-realistic brain source data, covering different source distributions. Trained on this data, our model significantly improves spatial accuracy, demonstrating superior performance over both traditional and end-to-end data-driven methods. Additionally, we validate our findings with real EEG data from a visual task, where 3D-PIUNet successfully identifies the visual cortex and reconstructs the expected temporal behavior, thereby showcasing its practical applicability.

2410.08727 2026-03-12 stat.ML cs.LG

Losing dimensions: Geometric memorization in generative diffusion

Beatrice Achilli, Enrico Ventura, Gianluigi Silvestri, Bao Pham, Gabriel Raya, Dmitry Krotov, Carlo Lucibello, Luca Ambrogioni

Comments 17 pages, 9 figures

Details
English abstract

Diffusion models power leading generative AI, but when and how they memorize training data, especially on low-dimensional manifolds, remains unclear. We find memorization emerges gradually, not abruptly: as data become scarce, diffusion models experience a smooth collapse where their capacity to vary across independent directions diminishes. Measuring latent dimensionality via the learned score field, we reveal how generative behavior increasingly centers on a few examples while other variations "freeze out". We propose a geometric memorization theory, showing that salient features collapse first, then finer details, leading to near point-wise replication. This mirrors physical systems condensing into a few low-energy configurations. Our theoretical predictions align with both synthetic and real data, identifying geometric memorization as a distinct phase between generalization and exact copying.

2410.08226 2026-03-12 physics.geo-ph cs.LG stat.AP stat.ML

EarthquakeNPP: A Benchmark for Earthquake Forecasting with Neural Point Processes

Samuel Stockman, Daniel Lawson, Maximilian Werner

Comments Accepted to Transactions on Machine Learning Research (TMLR), 2026

Details
English abstract

For decades, classical point process models, such as the epidemic-type aftershock sequence (ETAS) model, have been widely used for forecasting the event times and locations of earthquakes. Recent advances have led to Neural Point Processes (NPPs), which promise greater flexibility and improvements over such classical models. However, the currently-used benchmark for NPPs does not represent an up-to-date challenge in the seismological community, since it contains data leakage and omits the largest earthquake sequence from the region. Additionally, initial earthquake forecasting benchmarks fail to compare NPPs with state-of-the-art forecasting models commonly used in seismology. To address these gaps, we introduce EarthquakeNPP: a benchmarking platform that curates and standardizes existing public resources: globally available earthquake catalogs, the ETAS model, and evaluation protocols from the seismology community. The datasets cover a range of small to large target regions within California, dating from 1971 to 2021, and include different methodologies for dataset generation. Benchmarking experiments, using both log-likelihood and generative evaluation metrics widely recognised in seismology, show that none of the five NPPs tested outperform ETAS. These findings suggest that current NPP implementations are not yet suitable for practical earthquake forecasting. Nonetheless, EarthquakeNPP provides a platform to foster future collaboration between the seismology and machine learning communities.
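For readers unfamiliar with the ETAS baseline mentioned in this abstract, the following is a minimal sketch of the commonly used temporal ETAS conditional intensity: a constant background rate plus Omori-Utsu aftershock terms scaled by magnitude. It covers event times only (no spatial kernel), the parameter values are hypothetical, and it is not the implementation shipped with EarthquakeNPP.

```python
import math


def etas_intensity(t, history, mu, K, alpha, c, p, m0):
    """Temporal ETAS conditional intensity at time t.

    history: list of (t_i, m_i) pairs (event time, magnitude).
    Only events strictly before t contribute.
    """
    rate = mu  # background seismicity rate
    for t_i, m_i in history:
        if t_i < t:
            productivity = K * math.exp(alpha * (m_i - m0))  # larger events trigger more
            omori = (t - t_i + c) ** (-p)  # Omori-Utsu decay in time
            rate += productivity * omori
    return rate


if __name__ == "__main__":
    # Hypothetical catalog and parameters, chosen only for illustration.
    catalog = [(0.0, 5.0), (2.0, 3.5)]
    params = dict(mu=0.1, K=0.05, alpha=1.0, c=0.01, p=1.1, m0=3.0)
    for t in (0.5, 2.5, 10.0):
        print(t, etas_intensity(t, catalog, **params))
```

The intensity jumps after each event and decays back toward `mu`, which is the self-exciting behavior that neural point processes are benchmarked against.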

2408.09335 2026-03-12 math.OC cs.LG q-fin.MF stat.ML

Exploratory Optimal Stopping: A Singular Control Formulation

Jodi Dianetti, Giorgio Ferrari, Renyuan Xu

Comments 49 pages, 3 figures

Details
English abstract

This paper explores continuous-time and state-space optimal stopping problems from a reinforcement learning perspective. We begin by formulating the stopping problem using randomized stopping times, where the decision maker's control is represented by the probability of stopping within a given time; specifically, a bounded, non-decreasing, càdlàg control process. To encourage exploration and facilitate learning, we introduce a regularized version of the problem by penalizing the performance criterion with the cumulative residual entropy of the randomized stopping time. The regularized problem takes the form of an (n+1)-dimensional degenerate singular stochastic control problem with finite fuel, where the regularized free boundary becomes the graph of a function mapping the state variable of the original stopping problem into the probability of stopping. We address this singular control problem through the dynamic programming principle, which enables us to identify the unique optimal exploratory strategy. Finally, we propose both model-based and model-free reinforcement learning algorithms tailored for exploratory optimal stopping problems. We establish policy improvement guarantees for the proposed algorithms. Moreover, the model-free method is of actor-critic type and is scalable to high dimensions under neural network parameterization.