arXivDaily: arXiv daily academic digest, updated Monday through Friday
All subject categories: 1552
2603.16895 2026-03-19 eess.SP cs.LG

EEG-SeeGraph: Interpreting functional connectivity disruptions in dementias via sparse-explanatory dynamic EEG-graph learning

Fengcheng Wu, Zhenxi Song, Guoyang Xu, Kaisong Hu, Zirui Wang, Yi Guo, Zhiguo Zhang

Abstract

Robust and interpretable dementia diagnosis from noisy, non-stationary electroencephalography (EEG) is clinically essential yet remains challenging. To this end, we propose SeeGraph, a Sparse-Explanatory dynamic EEG-graph network that models time-evolving functional connectivity and employs a node-guided sparse edge mask to reveal the connections that drive diagnostic decisions, while remaining robust to noise and cross-site variability. SeeGraph comprises four components: (1) a dual-trajectory temporal encoder that models dynamic EEG with two streams, where node signals capture regional oscillations and edge signals capture interregional coupling; (2) a topology-aware positional encoder that derives graph-spectral Laplacian coordinates from the fused connectivity and augments node embeddings; (3) a node-guided sparse explanatory edge mask that gates the connectivity into a compact subgraph; and (4) a gated graph predictor that operates on the sparsified graph. The framework is trained using cross-entropy loss together with a sparsity regularizer on the mask, yielding noise-robust and interpretable diagnoses. The effectiveness of SeeGraph is validated on public and in-house EEG cohorts, including patients with neurodegenerative dementias and healthy controls, under both raw and noise-perturbed conditions. Its sparse, node-guided explanations highlight disease-relevant connections and align with established clinical findings on functional connectivity alterations, thereby offering transparent cues for neurological evaluation.
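The topology-aware positional encoder in component (2) can be illustrated with a minimal sketch that derives graph-spectral coordinates from a connectivity matrix. The helper name `laplacian_positional_encoding`, the choice of the symmetric normalized Laplacian, the assumption of non-negative weights, and the toy 4-node matrix are all assumptions made for illustration; the abstract does not specify these details.

```python
import numpy as np

def laplacian_positional_encoding(adj: np.ndarray, k: int) -> np.ndarray:
    """Return k Laplacian eigenvector coordinates per node (hypothetical helper)."""
    adj = 0.5 * (adj + adj.T)                 # symmetrize (creates a new array)
    np.fill_diagonal(adj, 0.0)                # drop self-loops
    deg = adj.sum(axis=1)
    # 1/sqrt(degree), with isolated nodes mapped to 0 to avoid division by zero
    inv_sqrt = np.where(deg > 0, 1.0 / np.sqrt(np.maximum(deg, 1e-12)), 0.0)
    lap = np.eye(len(adj)) - inv_sqrt[:, None] * adj * inv_sqrt[None, :]
    _, eigvecs = np.linalg.eigh(lap)          # eigenvalues in ascending order
    return eigvecs[:, 1:k + 1]                # skip the first (trivial) eigenvector

# Toy symmetric connectivity between 4 channels (made-up values).
conn = np.array([
    [0.0, 0.8, 0.1, 0.0],
    [0.8, 0.0, 0.5, 0.2],
    [0.1, 0.5, 0.0, 0.9],
    [0.0, 0.2, 0.9, 0.0],
])
pe = laplacian_positional_encoding(conn, k=2)
print(pe.shape)
```

Eigenvectors are defined only up to sign, so any model that consumes such coordinates typically needs to be robust to sign flips.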

2603.16891 2026-03-19 eess.SP cs.AI cs.HC cs.LG

A Novel end-to-end Digital Health System Using Deep Learning-based ECG Analysis

Artemis Kontou, Natalia Miroshnikova, Costakis Matheou, Sophocles Sophocleous, Nicholas Tsekouras, Kleanthis Malialis, Panayiotis Kolios

Comments Preprint submitted to the International Journal of Information Management Data Insights (Elsevier). 15 pages, 5 figures

Abstract

This study presents AI-HEART, a cloud-based information system for managing and analysing long-duration ambulatory electrocardiogram (ECG) recordings and supporting clinician decision-making. The platform operationalises an end-to-end pipeline that ingests multi-day three-lead ECGs, normalises inputs, performs signal preprocessing, and applies dedicated deep neural networks for wave delineation, noise/quality detection, and beat- and rhythm-level multi-class arrhythmia classification. To address class imbalance and real-world signal variability, model development combines large clinically annotated datasets with expert-in-the-loop curation and generative augmentation for under-represented rhythms. Empirical evaluation on three-lead ambulatory ECG data shows that delineation accuracy is sufficient for automated interval measurement, noise detection reliably flags poor-quality segments, and arrhythmia classification achieves high specificity with clinically useful macro-averaged performance across common and rarer rhythms. Beyond predictive accuracy, AI-HEART provides a scalable deployment approach for integrating AI into routine ECG services, enabling traceable outputs, audit-friendly storage of recordings and derived annotations, and clinician review/editing that captures feedback for controlled model improvement. The findings demonstrate the technical feasibility and operational value of a noise-aware AI-ECG platform as a digital health information system.

2603.16890 2026-03-19 cs.MM cs.SD eess.AS

Amanous: Distribution-Switching for Superhuman Piano Density on Disklavier

Joonhyung Bae

Abstract

The automated piano enables note densities, polyphony, and register changes far beyond human physical limits, yet the three dominant traditions for composing such textures--Nancarrow's tempo canons, Xenakis's stochastic distributions, and L-system grammars--have developed in isolation. This paper presents Amanous, a hardware-aware composition system for Yamaha Disklavier that unifies these methodologies through distribution-switching: L-system symbols select distinct distributional regimes rather than merely modulating parameters within a fixed family. Four contributions are reported. (1) A four-layer architecture (symbolic, parametric, numeric, physical) produces statistically distinct sections with large effect sizes (d = 3.70-5.34), validated by per-layer degradation and ablation experiments. (2) A hardware abstraction layer formalizes velocity-dependent latency and key reset constraints, keeping superhuman textures within the Disklavier's actuable envelope. (3) A density sweep reveals a computational saturation transition at 24-30 notes/s (bootstrap 95% CI: 23.3-50.0), beyond which single-domain melodic metrics lose discriminative power and cross-domain coupling becomes necessary. (4) A convergence point calculus operationalizes tempo-canon geometry as a control interface, enabling convergence events to trigger distribution switches linking macro-temporal structure to micro-level texture. All results are computational; a psychoacoustic validation protocol is proposed for future work. The pipeline has been deployed on a physical Disklavier, demonstrating algorithmic self-consistency and sub-millisecond software precision. Supplementary materials (Excerpts 1-4): https://www.amanous.xyz. Source code: https://github.com/joonhyungbae/Amanous.

2603.16886 2026-03-19 q-fin.ST cs.LG q-fin.GN

A Controlled Comparison of Deep Learning Architectures for Multi-Horizon Financial Forecasting: Evidence from 918 Experiments

Nabeel Ahmad Saidd

Abstract

Multi-horizon price forecasting is central to portfolio allocation, risk management, and algorithmic trading, yet deep learning architectures have proliferated faster than rigorous financial benchmarks can evaluate them. This study provides a controlled comparison of nine architectures (Autoformer, DLinear, iTransformer, LSTM, ModernTCN, N-HiTS, PatchTST, TimesNet, and TimeXer) spanning Transformer, MLP, CNN, and RNN families across cryptocurrency, forex, and equity index markets at 4-hour and 24-hour horizons. A total of 918 experiments were conducted under a strict five-stage protocol including fixed-seed Bayesian hyperparameter optimization, configuration freezing per asset class, multi-seed retraining, uncertainty aggregation, and statistical validation. ModernTCN achieves the best mean rank (1.333) with a 75 percent first-place rate, followed by PatchTST (2.000). Results reveal a clear three-tier ranking structure and show that architecture explains nearly all performance variance, while seed randomness is negligible. Rankings remain stable across horizons despite 2 to 2.5 times error amplification. Directional accuracy remains near 50 percent across all configurations, indicating that MSE-trained models lack directional skill at hourly resolution. The findings highlight the importance of architectural inductive bias over raw parameter count and provide reproducible guidance for multi-step financial forecasting.
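The two summary statistics quoted above, mean rank and directional accuracy, can be sketched as follows. The function names, the tie-breaking rule, and all numbers are made up for illustration; the abstract does not give the paper's exact definitions.

```python
import numpy as np

def mean_rank(errors: np.ndarray) -> np.ndarray:
    """errors[i, j] is the error of model j in setting i; lower is better."""
    # Double argsort converts errors to ranks (1 = best); ties broken by column order.
    ranks = errors.argsort(axis=1).argsort(axis=1) + 1
    return ranks.mean(axis=0)

def directional_accuracy(y_true, y_pred, last_obs) -> float:
    """Fraction of forecasts whose predicted direction matches the realized one."""
    true_dir = np.sign(np.asarray(y_true) - np.asarray(last_obs))
    pred_dir = np.sign(np.asarray(y_pred) - np.asarray(last_obs))
    return float(np.mean(true_dir == pred_dir))

# Toy example: 3 settings x 3 hypothetical models.
errors = np.array([
    [0.10, 0.12, 0.30],
    [0.11, 0.09, 0.25],
    [0.08, 0.10, 0.40],
])
ranks = mean_rank(errors)

last = np.array([100.0, 100.0, 100.0, 100.0])
truth = np.array([101.0, 99.0, 102.0, 98.0])
preds = np.array([100.5, 100.2, 101.0, 97.0])
da = directional_accuracy(truth, preds, last)
print(ranks, da)
```

A model can have low squared error and still score near 0.5 on directional accuracy, which is the pattern the abstract reports.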

2603.16885 2026-03-19 eess.SP cs.CL cs.HC cs.LG

DECODE: Dual-Enhanced Conditioned Diffusion for EEG Forecasting

Mehran Shabanpour, Sadaf Khademi, Konstantinos N Plataniotis, Arash Mohammadi

Abstract

Forecasting Electroencephalography (EEG) signals during cognitive events remains a fundamental challenge in neuroscience and Brain-Computer Interfaces (BCIs), as existing methods struggle to capture both the stochastic nature of neural dynamics and the semantic context of behavioral tasks. We present the Dual-Enhanced COnditioned Diffusion (DECODE) for EEG, a novel framework that unifies semantic guidance from natural language descriptions with temporal dynamics from historical signals to generate event-specific neural responses. DECODE leverages pre-trained language models to condition the diffusion process on rich textual descriptions of cognitive events, while maintaining temporal coherence through history-based Langevin dynamics. Evaluated on a real-world driving task dataset with five distinct behaviors, DECODE achieves sub-microvolt prediction accuracy (MAE = 0.626 microvolts) over 75-timestep horizons while maintaining well-calibrated uncertainty estimates. Our framework demonstrates that natural language can effectively bridge high-level cognitive descriptions and low-level neural dynamics, opening new possibilities for zero-shot generalization to novel behaviors and interpretable BCIs. By generating physiologically plausible, event-specific EEG trajectories conditioned on semantic descriptions, DECODE establishes a new paradigm for understanding and predicting context-dependent neural activity.

2603.16882 2026-03-19 eess.SY cs.RO cs.SY math.DG

The Port-Hamiltonian Structure of Vehicle Manipulator Systems

Ramy Rashad

Abstract

This paper presents a port-Hamiltonian formulation of vehicle-manipulator systems (VMS), a broad class of robotic systems including aerial manipulators, underwater manipulators, space robots, and omnidirectional mobile manipulators. Unlike existing Lagrangian formulations that obscure the underlying energetic structure, the proposed port-Hamiltonian formulation explicitly reveals the energy flow and conservation properties of these complex mechanical systems. We derive the port-Hamiltonian dynamics from first principles using Hamiltonian reduction theory. Two complementary formulations are presented: a standard form that directly exposes the energy structure, and an inertially-decoupled form that leverages the principal bundle structure of the VMS configuration space and is particularly suitable for control design and numerical simulation. The coordinate-free geometric approach we follow avoids singularities associated with local parameterizations of the base orientation. We rigorously establish the mathematical equivalence between our port-Hamiltonian formulations and existing reduced Euler-Lagrange and Boltzmann-Hamel equations found in the robotics and geometric mechanics literature.
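For readers unfamiliar with the formalism, the generic finite-dimensional input-state-output port-Hamiltonian template is reproduced below. This is the textbook form, not the paper's VMS-specific equations; the symbols $x$ (state), $H$ (Hamiltonian), $J$ (interconnection), $R$ (dissipation), $G$ (input map), $u$ and $y$ (port variables) are generic and not taken from the abstract.

```latex
\dot{x} = \bigl(J(x) - R(x)\bigr)\,\frac{\partial H}{\partial x}(x) + G(x)\,u,
\qquad
y = G(x)^{\top}\,\frac{\partial H}{\partial x}(x),
\qquad
J = -J^{\top},\quad R = R^{\top} \succeq 0 .
```

Skew-symmetry of $J$ and positive semidefiniteness of $R$ give the energy balance $\dot{H} = -\frac{\partial H}{\partial x}^{\top} R\,\frac{\partial H}{\partial x} + y^{\top} u \le y^{\top} u$, which is the sense in which such a formulation makes energy flow explicit.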

2603.16879 2026-03-19 eess.SY cs.AI cs.LG cs.SY

PowerModelsGAT-AI: Physics-Informed Graph Attention for Multi-System Power Flow with Continual Learning

Chidozie Ezeakunne, Jose E. Tabarez, Reeju Pokharel, Anup Pandey

Comments 26 pages, 11 figures, 1 ancillary supplementary PDF

Abstract

Solving the alternating current power flow equations in real time is essential for secure grid operation, yet classical Newton-Raphson solvers can be slow under stressed conditions. Existing graph neural networks for power flow are typically trained on a single system and often degrade on different systems. We present PowerModelsGAT-AI, a physics-informed graph attention network that predicts bus voltages and generator injections. The model uses bus-type-aware masking to handle different bus types and balances multiple loss terms, including a power-mismatch penalty, using learned weights. We evaluate the model on 14 benchmark systems (4 to 6,470 buses) and train a unified model on 13 of these under N-2 (two-branch outage) conditions, achieving an average normalized mean absolute error of 0.89% for voltage magnitudes and R^2 > 0.99 for voltage angles. We also show continual learning: when adapting a base model to a new 1,354-bus system, standard fine-tuning causes severe forgetting with error increases exceeding 1000% on base systems, while our experience replay and elastic weight consolidation strategy keeps error increases below 2% and in some cases improves base-system performance. Interpretability analysis shows that learned attention weights correlate with physical branch parameters (susceptance: r = 0.38; thermal limits: r = 0.22), and feature importance analysis supports that the model captures established power flow relationships.

2603.16875 2026-03-19 cs.HC cs.AI

Attention Guidance through Video Script: A Case Study of Object Focusing on 360° VR Video Tours

Paulo Vitor Santana Silva, Arthur Ricardo Sousa Vitória, Diogo Fernandes Costa Silva, Arlindo Rodrigues Galvão Filho

Journal ref
SVR 2024: Proceedings of the 26th Symposium on Virtual and Augmented Reality
Abstract

Within the expansive domain of virtual reality (VR), 360° VR videos immerse viewers in a spherical environment, allowing them to explore and interact with the virtual world from all angles. While this video representation offers unparalleled levels of immersion, it often lacks effective methods to guide viewers' attention toward specific elements within the virtual environment. This paper combines the Grounding DINO and Segment Anything (SAM) models to guide attention through object focusing based on video scripts. As a case study, this work conducts experiments on a 360° video tour of the University of Reading. The experimental results show that video scripts can improve the user experience in 360° VR video tours by helping to direct the user's attention.

2603.16874 2026-03-19 cs.HC cs.AI

Disclosure By Design: Identity Transparency as a Behavioural Property of Conversational AI Models

Anna Gausen, Sarenne Wallbridge, Hannah Rose Kirk, Jennifer Williams, Christopher Summerfield

Comments 25 pages, 5 figures

Abstract

As conversational AI systems become more realistic and widely deployed, users are increasingly uncertain about whether they are interacting with a human or an AI system. When AI identity is unclear, users may unwittingly share sensitive information, place unwarranted trust in AI-generated advice, or fall victim to AI-enabled fraud. More broadly, a persistent lack of transparency can erode trust in mediated communication. While regulations like the EU AI Act and California's BOT Act require AI systems to identify themselves, they provide limited guidance on reliable disclosure in real-time conversation. Existing transparency mechanisms also leave gaps: interface indicators can be omitted by deployers, and provenance tools require coordinated infrastructure and cannot provide reliable real-time verification. We ask how conversational AI systems should maintain identity transparency as human-AI interactions become more ambiguous and diverse. We advocate for disclosure by design, where AI systems explicitly disclose their artificial identity when directly asked. Implemented as model behaviour, disclosure can persist across deployment contexts without relying on user interfaces, while preserving user agency to verify identity on demand without disrupting immersive uses like role-playing. To assess current practice, we present the first multi-modal (text and voice) evaluation of disclosure behaviour in deployed systems across baseline, role-playing, and adversarial settings. We find that baseline disclosure rates are often high but drop substantially in role-play and can be suppressed under adversarial prompting. Importantly, disclosure rates vary significantly across providers and modalities, highlighting the fragility of current disclosure behaviour. We conclude with technical interventions to help developers embed disclosure as a fundamental property of conversational AI models.

2603.16873 2026-03-19 cs.HC cs.CV

The Truth, the Whole Truth, and Nothing but the Truth: Automatic Visualization Evaluation from Reconstruction Quality

Roxana Bujack, Li-Ta Lo, Ethan Stam, Ayan Biswas, David Rogers

Abstract

Recent advances in AI enable the automatic generation of visualizations directly from textual prompts using agentic workflows. However, visualizations produced via one-shot generative methods often suffer from insufficient quality, typically requiring a human in the loop to refine the outputs. Human evaluation, though effective, is costly and impractical at scale. To alleviate this problem, we propose an automated metric that evaluates visualization quality without relying on extensive human-labeled datasets. Instead, our approach uses the original underlying data as implicit ground truth. Specifically, we introduce a method that measures visualization quality by assessing the reconstruction accuracy of the original data from the visualization itself. This reconstruction-based metric provides an autonomous and scalable proxy for thorough human evaluation, facilitating more efficient and reliable AI-driven visualization workflows.

2603.16239 2026-03-19 math.NA cs.LG cs.NA

Neural Pushforward Samplers for the Fokker-Planck Equation on Embedded Riemannian Manifolds

Andrew Qing He, Wei Cai

Comments 13 pages, 2 figures, 1 table, 1 algorithm

Abstract

In this paper, we extend the Weak Adversarial Neural Pushforward Method to the Fokker--Planck equation on compact embedded Riemannian manifolds. The method represents the solution as a probability distribution via a neural pushforward map that is constrained to the manifold by a retraction layer, enforcing manifold membership and probability conservation by construction. Training is guided by a weak adversarial objective using ambient plane-wave test functions, whose intrinsic differential operators are derived in closed form from the geometry of the embedding, yielding a fully mesh-free and chart-free algorithm. Both steady-state and time-dependent formulations are developed, and numerical results on a double-well problem on the two-sphere demonstrate the capability of the method in capturing multimodal invariant distributions on curved spaces.
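The idea of constraining a pushforward map to the manifold by a retraction layer can be sketched on the two-sphere, where normalizing to unit length is one standard retraction. The tiny random two-layer map below stands in for a trained network; the names `retract_to_sphere` and `pushforward`, the layer sizes, and the weights are assumptions, and the paper's actual retraction and architecture may differ.

```python
import numpy as np

def retract_to_sphere(points: np.ndarray, eps: float = 1e-12) -> np.ndarray:
    """Map ambient points in R^3 onto the unit two-sphere by normalization."""
    norms = np.linalg.norm(points, axis=-1, keepdims=True)
    return points / np.maximum(norms, eps)

rng = np.random.default_rng(0)
W1 = rng.normal(size=(3, 16))   # hypothetical, untrained weights
W2 = rng.normal(size=(16, 3))

def pushforward(z: np.ndarray) -> np.ndarray:
    """Push base samples z through a small map, then retract onto the sphere."""
    hidden = np.tanh(z @ W1)
    return retract_to_sphere(hidden @ W2)

base_samples = rng.normal(size=(1000, 3))   # draws from a Gaussian base distribution
samples = pushforward(base_samples)
print(np.linalg.norm(samples, axis=1)[:3])
```

Because every output is a retracted point, the samples lie on the sphere by construction, and the empirical distribution they define always has total mass one.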

2603.15080 2026-03-19 cs.DB cs.AI q-bio.QM

Open Biomedical Knowledge Graphs at Scale: Construction, Federation, and AI Agent Access with Samyama Graph Database

Madhulatha Mandarapu, Sandeep Kunkunuru

Comments 12 pages, 7 tables, open-source code and data

Abstract

Biomedical knowledge is fragmented across siloed databases -- Reactome for pathways, STRING for protein interactions, ClinicalTrials.gov for study registries, DrugBank for drug vocabularies, DGIdb for drug-gene interactions, SIDER for side effects. We present three open-source biomedical knowledge graphs -- Pathways KG (118,686 nodes, 834,785 edges from 5 sources), Clinical Trials KG (7,774,446 nodes, 26,973,997 edges from 5 sources), and Drug Interactions KG (32,726 nodes, 191,970 edges from 3 sources) -- built on Samyama, a high-performance graph database written in Rust. Our contributions are threefold. First, we describe a reproducible ETL pattern for constructing large-scale KGs from heterogeneous public data sources, with cross-source deduplication, batch loading (Python Cypher and Rust native loaders), and portable snapshot export. Second, we demonstrate cross-KG federation: loading all three snapshots into a single graph tenant enables property-based joins across datasets. Third, we introduce schema-driven MCP server generation for LLM agent access, evaluated on a new BiomedQA benchmark (40 pharmacology questions): domain-specific MCP tools achieve 98% accuracy vs. 85% for schema-aware text-to-Cypher and 75% for standalone GPT-4o, with zero schema errors. All data sources are open-license. The combined federated graph (7.9M nodes, 28M edges) loads in approximately 3 minutes on commodity cloud hardware, with single-KG queries completing in 80-100 ms and cross-KG federation joins in 1-4 s.

2603.14832 2026-03-19 eess.IV cs.CV cs.LG

Halfway to 3D: Ensembling 2.5D and 3D Models for Robust COVID-19 CT Diagnosis

Tuan-Anh Yang, Bao V. Q. Bui, Chanh-Quang Vo-Van, Truong-Son Hy

Abstract

We propose a deep learning framework for COVID-19 detection and disease classification from chest CT scans that integrates both 2.5D and 3D representations to capture complementary slice-level and volumetric information. The 2.5D branch processes multi-view CT slices (axial, coronal, sagittal) using a DINOv3 vision transformer to extract robust visual features, while the 3D branch employs a ResNet-18 architecture to model volumetric context and is pretrained with Variance Risk Extrapolation (VREx) followed by supervised contrastive learning to improve cross-source robustness. Predictions from both branches are combined through logit-level ensemble inference. Experiments on the PHAROS-AIF-MIH benchmark demonstrate the effectiveness of the proposed approach: for binary COVID-19 detection, the ensemble achieves 94.48% accuracy and a 0.9426 Macro F1-score, outperforming both individual models, while for multi-class disease classification the 2.5D DINOv3 model achieves the best performance with 79.35% accuracy and a 0.7497 Macro F1-score. These results highlight the benefit of combining pretrained slice-based representations with volumetric modeling for robust multi-source medical imaging analysis. Code is available at https://github.com/HySonLab/PHAROS-AIF-MIH
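Logit-level ensemble inference, as mentioned above, can be sketched as a weighted average of the two branches' logits followed by a softmax. The equal weighting, the function names, and the toy logits are assumptions for illustration; the abstract does not state how the branches are weighted.

```python
import numpy as np

def softmax(logits: np.ndarray) -> np.ndarray:
    """Numerically stable softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

def ensemble_logits(logits_a: np.ndarray, logits_b: np.ndarray,
                    weight_a: float = 0.5) -> np.ndarray:
    """Combine two models at the logit level (hypothetical equal weighting)."""
    return weight_a * logits_a + (1.0 - weight_a) * logits_b

# Toy logits for two scans and two classes (made-up values).
logits_25d = np.array([[2.0, 0.5], [0.2, 0.4]])   # stand-in for the 2.5D branch
logits_3d = np.array([[1.0, 1.5], [-1.0, 2.0]])   # stand-in for the 3D branch

combined = ensemble_logits(logits_25d, logits_3d)
probs = softmax(combined)
pred = probs.argmax(axis=1)
print(pred)
```

Averaging logits rather than probabilities lets a confident branch outweigh an uncertain one, which is one common motivation for combining at this level.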

2603.14441 2026-03-19 stat.ML cs.LG

AR-Flow VAE: A Structured Autoregressive Flow Prior Variational Autoencoder for Unsupervised Blind Source Separation

Yuan-Hao Wei, Fu-Hao Deng, Lin-Yong Cui, Yan-Jie Sun

Abstract

Blind source separation (BSS) seeks to recover latent source signals from observed mixtures. Variational autoencoders (VAEs) offer a natural perspective for this problem: the latent variables can be interpreted as source components, the encoder can be viewed as a demixing mapping from observations to sources, and the decoder can be regarded as a remixing process from inferred sources back to observations. In this work, we propose AR-Flow VAE, a novel VAE-based framework for BSS in which each latent source is endowed with a parameter-adaptive autoregressive flow prior. This prior significantly enhances the flexibility of latent source modeling, enabling the framework to capture complex non-Gaussian behaviors and structured dependencies, such as temporal correlations, that are difficult to represent with conventional priors. In addition, the structured prior design assigns distinct priors to different latent dimensions, thereby encouraging the latent components to separate into different source signals under heterogeneous prior constraints. Experimental results validate the effectiveness of the proposed architecture for blind source separation. More importantly, this work provides a foundation for future investigations into the identifiability and interpretability of AR-Flow VAE.

2603.14045 2026-03-19 cs.IR cs.CL

The Reasoning Bottleneck in Graph-RAG: Structured Prompting and Context Compression for Multi-Hop QA

Yasaman Zarrinkia, Venkatesh Srinivasan, Alex Thomo

Comments 11 pages, 2 figures, 9 tables; under review

Abstract

Graph-RAG systems achieve strong multi-hop question answering by indexing documents into knowledge graphs, but strong retrieval does not guarantee strong answers. Evaluating KET-RAG, a leading Graph-RAG system, on three multi-hop QA benchmarks (HotpotQA, MuSiQue, 2WikiMultiHopQA), we find that 77% to 91% of questions have the gold answer in the retrieved context, yet accuracy is only 35% to 78%, and 73% to 84% of errors are reasoning failures. We propose two augmentations: (i) SPARQL chain-of-thought prompting, which decomposes questions into triple-pattern queries aligned with the entity-relationship context, and (ii) graph-walk compression, which compresses the context by ~60% via knowledge-graph traversal with no LLM calls. SPARQL CoT improves accuracy by +2 to +14 pp; graph-walk compression adds +6 pp on average when paired with structured prompting on smaller models. Surprisingly, we show that, with question-type routing, a fully augmented budget open-weight Llama-8B model matches or exceeds the unaugmented Llama-70B baseline on all three benchmarks at ~12x lower cost. A replication on LightRAG confirms that our augmentations transfer across Graph-RAG systems.

2603.10485 2026-03-19 stat.ML cs.LG

Dual Space Preconditioning for Gradient Descent in the Overparameterized Regime

Reza Ghane, Danil Akhtiamov, Babak Hassibi

Abstract

In this work we study the convergence properties of the Dual Space Preconditioned Gradient Descent, encompassing optimizers such as Normalized Gradient Descent, Gradient Clipping and Adam. We consider preconditioners of the form $\nabla K$, where $K: \mathbb{R}^p \to \mathbb{R}$ is convex and assume that the latter is applied to train an over-parameterized linear model with loss of the form $\ell({X} {W} - {Y})$, for weights ${W} \in \mathbb{R}^{d \times k}$, labels ${Y} \in \mathbb{R}^{n \times k}$ and data ${X} \in \mathbb{R}^{n \times d}$. Under the aforementioned assumptions, we prove that the iterates of the preconditioned gradient descent always converge to a point ${W}_{\infty} \in \mathbb{R}^{d \times k}$ satisfying ${X}{W}_{\infty} = {Y}$. Our proof techniques are of independent interest as we introduce a novel version of the Bregman Divergence with accompanying identities that allow us to establish convergence. We also study the implicit bias of Dual Space Preconditioned Gradient Descent. First, we demonstrate empirically that, for general $K(\cdot)$, ${W}_\infty$ depends on the chosen learning rate, hindering a precise characterization of the implicit bias. Then, for preconditioners of the form $K({G}) = h(\|{G}\|_F)$, known as \textit{isotropic preconditioners}, we show that ${W}_\infty$ minimizes $\|{W}_\infty - {W}_0\|_F^2$ subject to ${X}{W}_\infty = {Y}$, where ${W}_0$ is the initialization. Denoting the convergence point of GD initialized at ${W}_0$ by ${W}_{\text{GD}, \infty}$, we thus note ${W}_{\infty} = {W}_{\text{GD}, \infty}$ for isotropic preconditioners. Finally, we show that a similar fact holds for general preconditioners up to a multiplicative constant, namely, $\|{W}_0 - {W}_{\infty}\|_F \le c \|{W}_0 - {W}_{\text{GD}, \infty}\|_F$ for a constant $c>0$.
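The isotropic case can be checked numerically with a small sketch. It assumes the squared loss $\ell(R) = \tfrac{1}{2}\|R\|_F^2$ and the particular isotropic choice $K(G) = \sqrt{1 + \|G\|_F^2} - 1$, neither of which is taken from the abstract; the dimensions, seed, and step size are also illustrative. The update $W \leftarrow W - \eta\,\nabla K(\nabla L(W))$ is run on an over-parameterized linear model and compared with the interpolating solution closest to the initialization.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, k = 5, 20, 2                       # over-parameterized: d > n
X = rng.normal(size=(n, d))
Y = rng.normal(size=(n, k))
W0 = 0.1 * rng.normal(size=(d, k))

def grad_K(G: np.ndarray) -> np.ndarray:
    """Gradient of the assumed isotropic K(G) = sqrt(1 + ||G||_F^2) - 1."""
    return G / np.sqrt(1.0 + np.sum(G * G))

eta = 1.0 / np.linalg.norm(X, 2) ** 2    # 1 / largest eigenvalue of X^T X
W = W0.copy()
for _ in range(5000):
    G = X.T @ (X @ W - Y)                # gradient of 0.5 * ||XW - Y||_F^2
    W -= eta * grad_K(G)                 # dual-space preconditioned step

# Interpolating solution that minimizes ||W - W0||_F subject to XW = Y.
W_min_dist = W0 + np.linalg.pinv(X) @ (Y - X @ W0)

residual = np.linalg.norm(X @ W - Y)
gap = np.linalg.norm(W - W_min_dist)
print(residual, gap)
```

Each step here is a positive scalar multiple of the plain gradient, so the iterates never leave $W_0$ plus the row space of $X$; that is why the limit coincides with the minimum-distance interpolant in this isotropic example.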

2603.05904 2026-03-19 cs.AR cs.AI

LUMINA: LLM-Guided GPU Architecture Exploration via Bottleneck Analysis

Tao Zhang, Rui Ma, Shuotao Xu, Yongqiang Xiong, Peng Cheng

Abstract

GPU design space exploration (DSE) for modern AI workloads, such as large language model (LLM) inference, is challenging because of GPUs' vast, multi-modal design spaces, high simulation costs, and complex design optimization objectives (e.g., performance, power, and area trade-offs). Existing automated DSE methods are often prohibitively expensive, either requiring an excessive number of exploration samples or depending on intricate, manually crafted analyses of interdependent critical paths guided by human heuristics. We present LUMINA, an LLM-driven GPU architecture exploration framework that leverages AI to enhance DSE efficiency and efficacy for GPUs. LUMINA extracts architectural knowledge from simulator code and performs sensitivity studies to automatically compose DSE rules, which are auto-corrected during exploration. A core component of LUMINA is a DSE Benchmark that comprehensively evaluates and enhances LLMs' capabilities across three fundamental skills required for architecture optimization, providing a principled and reproducible basis for model selection and ensuring consistent architectural reasoning. In a design space with 4.7 million possible samples, LUMINA efficiently identifies 6 designs with better performance and area than an A100 GPU, using only 20 steps via LLM-assisted bottleneck analysis. In comparison with machine-learning baselines, LUMINA achieves 17.5x higher design space exploration efficiency and 32.9% better designs (i.e., Pareto hypervolume), showcasing its ability to deliver high-quality design guidance with minimal search cost.

2601.19567 2026-03-19 cond-mat.stat-mech cs.LG

Learning the Intrinsic Dimensionality of Fermi-Pasta-Ulam-Tsingou Trajectories: A Nonlinear Approach using a Deep Autoencoder Model

Gionni Marchetti

Comments 12 pages, 12 figures. Preliminary results were presented in November 2025 at the IUPAP Conference on Computational Physics, CP2025 XXXVI, Oak Ridge National Laboratory in Oak Ridge. The revised version contains further results and analysis

Abstract

We address the intrinsic dimensionality (ID) of high-dimensional trajectories, comprising $n_s = 4\,000\,000$ data points, of the Fermi-Pasta-Ulam-Tsingou (FPUT) $\beta$ model with $N = 32$ oscillators. To this end, a deep autoencoder (DAE) is used to infer the ID in the weakly nonlinear regime where energy recurrences are observed ($\beta \lesssim 1$). We find that the trajectories lie on a nonlinear Riemannian manifold of dimension $m^{\ast} = 2$ embedded in a $64$-dimensional phase space. By contrast, principal component analysis (PCA) together with the Participation Ratio (PR) method provides only a reasonable upper bound on the ID for each value of $\beta$. Our DAE further reveals that the ID increases to $m^{\ast} = 3$ at $\beta = 1.1$, coinciding with a symmetry-breaking (SB) phenomenon characteristic of the $\beta$ model, in which additional energy modes with even wave numbers $k = 2, 4$ become excited. Notably, the SB phenomenon cannot be detected by the linear approach provided by PCA.
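The Participation Ratio mentioned above is commonly computed from the eigenvalues $\lambda_i$ of the data covariance matrix as $(\sum_i \lambda_i)^2 / \sum_i \lambda_i^2$. The sketch below applies that standard definition to a toy dataset, a circle embedded in six dimensions; the helper name and the toy data are assumptions and have nothing to do with the paper's FPUT trajectories.

```python
import numpy as np

def participation_ratio(data: np.ndarray) -> float:
    """PR = (sum of PCA eigenvalues)^2 / (sum of squared PCA eigenvalues)."""
    centered = data - data.mean(axis=0)
    cov = centered.T @ centered / (len(data) - 1)
    eigvals = np.clip(np.linalg.eigvalsh(cov), 0.0, None)  # clip tiny negatives
    return float(eigvals.sum() ** 2 / np.sum(eigvals ** 2))

# Toy data: a circle (intrinsic dimension 1) embedded in a 6-D ambient space.
t = np.linspace(0.0, 2.0 * np.pi, 400, endpoint=False)
data = np.zeros((400, 6))
data[:, 0] = np.cos(t)
data[:, 1] = np.sin(t)

pr = participation_ratio(data)
print(pr)
```

The circle is a one-dimensional curve, yet its variance is split evenly across two linear directions, so a PCA-based estimate reports about 2. This is the sense in which a linear estimate is only an upper bound on the ID of a curved manifold.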

2601.17054 2026-03-19 cs.CY cs.AI

Failing on Bias Mitigation: A Case Study on the Challenges of Fairness in Government Data

Hongbo Bo, Jingyu Hu, Debbie Watson, Weiru Liu

Abstract

The potential for bias and unfairness in AI-supported government services raises ethical and legal concerns. Using crime rate prediction with the Bristol City Council data as a case study, we examine how these issues persist. Rather than auditing real-world deployed systems, our goal is to understand why widely adopted bias mitigation techniques often fail when applied to government data. Our findings reveal that bias mitigation approaches applied to government data are not always effective -- not because of flaws in model architecture or metric selection, but due to the inherent properties of the data itself. Through comparing a comprehensive set of models and fairness methods, our experiments consistently show that the mitigation efforts cannot overcome the embedded unfairness in the data -- further reinforcing that the origin of bias lies in the structure and history of government datasets. We then explore the reasons for the mitigation failures in predictive models on government data and highlight the potential sources of unfairness posed by data distribution shifts, the accumulation of historical bias, and delays in data release. Through a set of intersectional fairness experiments, we also expose the blind spots of fairness analysis and bias mitigation methods that target only a single sensitive feature. Although this study is limited to one city, the findings are suggestive and can serve as an early warning that biases in government data may persist even with standard mitigation methods.

2601.13708 2026-03-19 quant-ph cs.LG

Generative Adversarial Networks for Resource State Generation

Shahbaz Shaik, Sourav Chatterjee, Sayantan Pramanik, Indranil Chakrabarty

Abstract

We introduce a physics-informed Generative Adversarial Network framework that recasts quantum resource-state generation as an inverse-design task. By embedding task-specific utility functions into training, the model learns to generate valid two-qubit states optimized for teleportation and entanglement broadcasting. Comparing decomposition-based and direct-generation architectures reveals that structural enforcement of Hermiticity, trace-one, and positivity yields higher fidelity and training stability than loss-only approaches. The framework reproduces theoretical resource boundaries for Werner-like and Bell-diagonal states with fidelities exceeding ~98%, establishing adversarial learning as a lightweight yet effective method for constraint-driven quantum-state discovery. This approach provides a scalable foundation for automated design of tailored quantum resources for information-processing applications, exemplified with teleportation and broadcasting of entanglement, and it opens up the possibility of using such states in efficient quantum network design.
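Structural enforcement of Hermiticity, trace-one, and positivity can be illustrated with one common construction: build $\rho = A A^{\dagger} / \operatorname{Tr}(A A^{\dagger})$ from an unconstrained complex matrix $A$. This is a sketch under the assumption that a generator emits the real and imaginary parts of $A$; the function name is hypothetical and the paper's decomposition-based architecture may use a different factorization.

```python
import numpy as np

def density_matrix_from_raw(raw_real: np.ndarray, raw_imag: np.ndarray) -> np.ndarray:
    """Map unconstrained real outputs to a valid density matrix (hypothetical helper)."""
    A = raw_real + 1j * raw_imag
    rho = A @ A.conj().T                  # Hermitian and positive semidefinite
    return rho / np.trace(rho).real       # normalize to unit trace

rng = np.random.default_rng(0)
# Stand-in for generator outputs: a two-qubit state is a 4 x 4 matrix.
rho = density_matrix_from_raw(rng.normal(size=(4, 4)), rng.normal(size=(4, 4)))

eigenvalues = np.linalg.eigvalsh(rho)
print(np.trace(rho).real, eigenvalues.min())
```

With this parameterization every generated sample is a valid state by construction, so the training loss does not need penalty terms for the three physical constraints.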

2512.22552 2026-03-19 cs.GT cs.LG

Computing Pure-Strategy Nash Equilibria in a Two-Party Policy Competition: Existence and Algorithmic Approaches

Chuang-Chieh Lin, Chi-Jen Lu, Po-An Chen, Chih-Chieh Hung

Comments A full version of the extended abstract in AAMAS 2026

English abstract

We formulate two-party policy competition as a two-player non-cooperative game, generalizing Lin et al.'s work (2021). Each party selects a real-valued policy vector as its strategy from a compact subset of Euclidean space, and a voter's utility for a policy is given by the inner product with their preference vector. To capture the uncertainty in the competition, we assume that a policy's winning probability increases monotonically with its total utility across all voters, and we formalize this via an affine isotonic function. A player's payoff is defined as the expected utility received by its supporters. In this work, we first test and validate the isotonicity hypothesis through voting simulations. Next, we prove the existence of a pure-strategy Nash equilibrium (PSNE) in both one- and multi-dimensional settings. Although we construct a counterexample demonstrating the game's non-monotonicity, our experiments show that a decentralized gradient-based algorithm typically converges rapidly to an approximate PSNE. Finally, we present a grid-based search algorithm that finds an $\varepsilon$-approximate PSNE of the game in time polynomial in the input size and $1/\varepsilon$.
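The two modeling ingredients named in the abstract, inner-product voter utility and an affine winning probability that increases with total utility, can be sketched as follows. The specific affine form `1/2 + (u_a - u_b) / (2 * scale)`, the `scale` constant, and all function names are assumptions made for illustration; the abstract does not give the exact function or the payoff formula.

```python
def total_utility(policy, preferences):
    """Sum over voters of the inner product <preference, policy>."""
    return sum(sum(p * x for p, x in zip(pref, policy)) for pref in preferences)

def win_probability(u_a, u_b, scale):
    """Hypothetical affine rule, non-decreasing in u_a: 1/2 + (u_a - u_b) / (2 * scale)."""
    return 0.5 + (u_a - u_b) / (2.0 * scale)

# Three voters with two-dimensional preference vectors (illustrative values).
preferences = [(1.0, 0.0), (0.5, 0.5), (0.0, 1.0)]
policy_a = (0.8, 0.2)
policy_b = (0.4, 0.4)

u_a = total_utility(policy_a, preferences)
u_b = total_utility(policy_b, preferences)

# Assumed bound on |u_a - u_b| so that the result stays inside [0, 1]:
# each voter's utility lies in [0, 1] for these inputs, and there are 3 voters.
scale = 3.0
p_a = win_probability(u_a, u_b, scale)
```

Under this rule the two parties' winning probabilities sum to one, and raising a party's total utility can only raise its chance of winning, which is the isotonicity property the abstract tests in simulation.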

2512.06933 2026-03-19 cs.CE cs.CL cs.HC

TxSum: User-Centered Ethereum Transaction Understanding with Micro-Level Semantic Grounding

Zifan Peng, Jingyi Zheng, Yule Liu, Huaiyu Jia, Qiming Ye, Jingyu Liu, Xufeng Yang, Mingchen Li, Qingyuan Gong, Xuechao Wang, Xinlei He

English abstract

Understanding the economic intent of Ethereum transactions is critical for user safety, yet current tools expose only raw on-chain data or surface-level intent, leading to widespread "blind signing" (approving transactions without understanding them). Through interviews with 16 Web3 users, we find that effective explanations should be structured, risk-aware, and grounded at the token-flow level. Motivated by these findings, we formulate TxSum, a new user-centered NLP task for Ethereum transaction understanding, and construct a dataset of 187 complex Ethereum transactions annotated with transaction-level summaries and token flow-level semantic labels. We further introduce MATEX, a grounded multi-agent framework for high-stakes transaction explanation. It selectively retrieves external knowledge under uncertainty and audits explanations against raw traces to improve token-flow-level factual consistency. MATEX achieves the strongest overall explanation quality, especially on micro-level factuality and intent quality. It improves user comprehension on complex transactions from 52.9% to 76.5% over the strongest baseline and raises malicious-transaction rejection from 36.0% to 88.0%, while maintaining a low false-rejection rate on benign transactions.

2511.07367 2026-03-19 cond-mat.str-el cs.AI

Reduced Density Matrices Through Machine Learning

Awwab A. Azam, Lexu Zhao, Jiabin Yu

Comments 8+32 pages, 7+6 figures, 0+6 tables

English abstract

$n$-particle reduced density matrices ($n$-RDMs) play a central role in understanding correlated phases of matter, but their calculation is often computationally inefficient for strongly-correlated states at large system sizes. In this work, we use neural network (NN) architectures to accelerate and even predict $n$-RDMs for large systems. Our underlying intuition is that, for gapped states, $n$-RDMs are often smooth functions over the Brillouin zone (BZ) and are therefore interpolable, allowing NNs trained on small-size systems to predict large-size ones. Building on this, we devise two NNs: (i) a self-attention NN that maps random RDMs to physical ones, and (ii) a Sinusoidal Representation Network (SIREN) that directly maps momentum-space coordinates to RDM values. We test the NNs on RDMs in three 2D models: the pair-pair correlation functions of the Richardson model of superconductivity, the translationally-invariant Hartree-Fock (HF) 1-RDM in a four-band repulsive model, and the translation-breaking HF 1-RDM in the half-filled Hubbard model. We find that a SIREN trained on a $6\times 6$ momentum mesh and a SIREN trained on $4$ tilted meshes (each of which has $12$ momentum points) can predict the $18\times 18$ pair-pair correlation function with a relative accuracy of $94.29\%$ and $93.77\%$, respectively. NNs trained on $6\times 6$ and $8\times 8$ meshes provide high-quality initial guesses for $50\times 50$ translation-invariant HF and $30\times 30$ fully translation-breaking-allowed HF, reducing the required number of iterations by up to $91.63\%$ and $92.78\%$, respectively, compared to random initializations. Our results illustrate the potential of NN-based methods for interpolable $n$-RDMs, which might open a new avenue for future research on strongly correlated phases.
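For readers unfamiliar with SIRENs, the idea of mapping momentum-space coordinates directly to RDM values amounts to a small multilayer perceptron with sine activations. The forward pass below is a minimal sketch with random, untrained weights; the layer sizes, the frequency factor `omega_0 = 30`, the initialization bounds, and the scalar output are illustrative assumptions rather than the configuration used in the paper.

```python
import math
import random

def siren_forward(coords, layers, omega_0=30.0):
    """Forward pass of a small sine-activated MLP; the last layer is linear."""
    h = list(coords)
    for idx, (weights, biases) in enumerate(layers):
        pre = [sum(w * x for w, x in zip(row, h)) + b for row, b in zip(weights, biases)]
        is_last = idx == len(layers) - 1
        # Hidden layers apply sin(omega_0 * (W h + b)); the output layer stays linear.
        h = pre if is_last else [math.sin(omega_0 * p) for p in pre]
    return h

def random_layer(rng, n_in, n_out, bound):
    """Uniform weights in [-bound, bound] and zero biases (illustrative initialization)."""
    weights = [[rng.uniform(-bound, bound) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

rng = random.Random(0)
hidden = 8
layers = [
    random_layer(rng, 2, hidden, 1.0 / 2),                                # input: (k_x, k_y)
    random_layer(rng, hidden, hidden, math.sqrt(6.0 / hidden) / 30.0),
    random_layer(rng, hidden, 1, math.sqrt(6.0 / hidden) / 30.0),         # output: one RDM entry
]

# Evaluate the untrained network at one momentum point (arbitrary units).
value = siren_forward((0.25, -0.5), layers)
```

Because the network is a continuous function of the coordinates, it can be queried on a denser mesh than the one it was fitted on, which is the interpolation property the abstract relies on.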

2510.19995 2026-03-19 cs.MA cs.CL

Communication to Completion: Modeling Collaborative Workflows with Intelligent Multi-Agent Communication

Yiming Lu, Xun Wang, Simin Ma, Shujian Liu, Sathish Reddy Indurthi, Song Wang, Haoyun Deng, Fei Liu, Kaiqiang Song

Comments 13 pages

English abstract

Multi-agent LLM systems have demonstrated impressive capabilities in complex collaborative tasks, yet most frameworks treat communication as instantaneous and free, overlooking a fundamental constraint in real-world teamwork: collaboration cost. We propose a scalable framework implemented via Communication to Completion (C2C), which explicitly models communication as a constrained resource with realistic temporal costs. We introduce the Alignment Factor (AF), a dynamic metric inspired by Shared Mental Models, to quantify the link between task understanding and work efficiency. Through experiments on 15 software engineering workflows spanning three complexity tiers and team sizes from 5 to 17 agents, we demonstrate that cost-aware strategies achieve over 40% higher efficiency compared to unconstrained interaction. Our analysis reveals emergent coordination patterns: agents naturally adopt manager-centric hub-and-spoke topologies, strategically escalate from asynchronous to synchronous channels based on complexity, and prioritize high-value help requests. These patterns remain consistent across multiple frontier models (GPT-5.2, Claude Sonnet 4.5, Gemini 2.5 Pro). This study moves beyond simple agent construction, offering a theoretical foundation for quantifying and optimizing the dynamics of collaboration in future digital workplaces.

2510.17903 2026-03-19 stat.ML cs.LG

Learning Time-Varying Graphs from Incomplete Graph Signals

Chuansen Peng, Xiaojing Shen

English abstract

This paper tackles the challenging problem of jointly inferring time-varying network topologies and imputing missing data from partially observed graph signals. We propose a unified non-convex optimization framework to simultaneously recover a sequence of graph Laplacian matrices while reconstructing the unobserved signal entries. Unlike conventional decoupled methods, our integrated approach facilitates a bidirectional flow of information between the graph and signal domains, yielding superior robustness, particularly in high missing-data regimes. To capture realistic network dynamics, we introduce a fused-lasso type regularizer on the sequence of Laplacians. This penalty promotes temporal smoothness by penalizing large successive changes, thereby preventing spurious variations induced by noise while still permitting gradual topological evolution. For solving the joint optimization problem, we develop an efficient Alternating Direction Method of Multipliers (ADMM) algorithm, which leverages the problem's structure to yield closed-form solutions for both the graph and signal subproblems. This design ensures scalability to large-scale networks and long time horizons. On the theoretical front, despite the inherent non-convexity, we establish a convergence guarantee, proving that the proposed ADMM scheme converges to a stationary point. Furthermore, we derive non-asymptotic statistical guarantees, providing high-probability error bounds for the graph estimator as a function of sample size, signal smoothness, and the intrinsic temporal variability of the graph. Extensive numerical experiments validate the approach, demonstrating that it significantly outperforms state-of-the-art baselines in both convergence speed and the joint accuracy of graph learning and signal recovery.
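The fused-lasso type regularizer described here penalizes differences between successive Laplacians. A minimal sketch of such a penalty, assuming an elementwise L1 norm on L_{t+1} - L_t (the abstract does not specify the exact norm or weighting), is shown below; the helper names are hypothetical.

```python
def laplacian(weights, n):
    """Combinatorial Laplacian L = D - W from a dict {(i, j): weight} of undirected edges."""
    lap = [[0.0] * n for _ in range(n)]
    for (i, j), w in weights.items():
        lap[i][j] -= w
        lap[j][i] -= w
        lap[i][i] += w
        lap[j][j] += w
    return lap

def temporal_l1_penalty(laplacians):
    """Sum over consecutive time steps of the elementwise L1 norm of L_{t+1} - L_t."""
    total = 0.0
    for prev, curr in zip(laplacians, laplacians[1:]):
        total += sum(abs(c - p)
                     for row_p, row_c in zip(prev, curr)
                     for p, c in zip(row_p, row_c))
    return total

# A 3-node path graph that stays fixed for one step, then weakens edge (1, 2).
g0 = {(0, 1): 1.0, (1, 2): 1.0}
g1 = {(0, 1): 1.0, (1, 2): 0.5}
sequence = [laplacian(g0, 3), laplacian(g0, 3), laplacian(g1, 3)]
penalty = temporal_l1_penalty(sequence)
```

An unchanged graph contributes nothing to the penalty, while a single edge-weight change of 0.5 touches two off-diagonal and two diagonal entries, so the optimizer is discouraged from introducing changes that the data do not support.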

2510.10496 2026-03-19 cs.HC cs.AI

Personalized Motion Guidance Framework for Athlete-Centric Coaching

Ryota Takamido, Chiharu Suzuki, Hiroki Nakamoto

English abstract

A critical challenge in contemporary sports science lies in filling the gap between group-level insights derived from controlled hypothesis-driven experiments and the real-world need for personalized coaching tailored to individual athletes' unique movement patterns. This study developed a Personalized Motion Guidance Framework (PMGF) to enhance athletic performance by generating individualized motion-refinement guides using generative artificial intelligence techniques. PMGF leverages a vertical autoencoder to encode motion sequences into athlete-specific latent representations, which can then be directly manipulated to generate meaningful guidance motions. Two manipulation strategies were explored: (1) smooth interpolation between the learner's motion and a target (e.g., expert) motion to facilitate observational learning, and (2) shifting the motion pattern in an optimal direction in the latent space using a local optimization technique. The results of the validation experiment with data from 51 baseball pitchers revealed that (1) PMGF successfully generated smooth transitions in motion patterns between individuals across all 1,275 pitcher pairs, and (2) the features significantly altered through PMGF manipulations reflected known performance-enhancing characteristics, such as increased stride length and knee extension associated with higher ball velocity, indicating that PMGF induces biomechanically plausible improvements. We propose a future extension called general-PMGF to enhance the applicability of this framework. This extension incorporates bodily, environmental, and task constraints into the generation process, aiming to provide more realistic and versatile guidance across diverse sports contexts.
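The first manipulation strategy, interpolating between a learner's and a target's latent representation, can be sketched as plain linear interpolation in latent space. The encoder and decoder are omitted here, the two-dimensional latent vectors are made up for illustration, and linear interpolation is itself an assumption since the abstract only says "smooth interpolation". The same snippet also checks that 51 pitchers yield the 1,275 unordered pairs quoted in the abstract.

```python
import math

def interpolate_latents(z_learner, z_target, steps):
    """Return steps + 1 latent vectors moving linearly from z_learner to z_target."""
    path = []
    for s in range(steps + 1):
        alpha = s / steps
        path.append([(1 - alpha) * a + alpha * b for a, b in zip(z_learner, z_target)])
    return path

# Number of unordered pitcher pairs among 51 pitchers: C(51, 2).
n_pairs = math.comb(51, 2)

# Hypothetical 2-D latent codes for a learner and an expert.
path = interpolate_latents([0.0, 1.0], [1.0, -1.0], steps=4)
```

Each vector in `path` would then be passed through the decoder to obtain an intermediate guidance motion between the two athletes.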

2509.26003 2026-03-19 cs.NE cs.LG

Scaling Equilibrium Propagation to Deeper Neural Network Architectures

Sankar Vinayak Elayedam, Gopalakrishnan Srinivasan

Journal ref
Proc. 2025 First International Conference on Intelligent Computing and Systems at the Edge (ICEdge), vol. 1, 2025, pp. 1-7
English abstract

Equilibrium propagation has been proposed as a biologically plausible alternative to the backpropagation algorithm. The local nature of gradient computations, combined with the use of convergent RNNs to reach equilibrium states, makes this approach well-suited for implementation on neuromorphic hardware. However, previous studies on equilibrium propagation have been restricted to networks containing only dense layers or relatively small architectures with a few convolutional layers followed by a final dense layer. These networks have a significant gap in accuracy compared to similarly sized feedforward networks trained with backpropagation. In this work, we introduce the Hopfield-Resnet architecture, which incorporates residual (or skip) connections in Hopfield networks with clipped $\mathrm{ReLU}$ as the activation function. The proposed architectural enhancements enable the training of networks with nearly twice the number of layers reported in prior works. For example, Hopfield-Resnet13 achieves 93.92% accuracy on CIFAR-10, which is $\approx$3.5% higher than the previous best result and comparable to that provided by Resnet13 trained using backpropagation.

2509.25857 2026-03-19 cs.GR cs.AI cs.CV

Vector sketch animation generation with differentiable motion trajectories

Xinding Zhu, Xinye Yang, Shuyang Zheng, Zhexin Zhang, Fei Gao, Jing Huang, Jiazhou Chen

Comments 14 pages, 12 figures

English abstract

Sketching is a direct and inexpensive means of visual expression. Though image-based sketching has been well studied, video-based sketch animation generation is still very challenging due to the temporal coherence requirement. In this paper, we propose a novel end-to-end automatic generation approach for vector sketch animation. To solve the flickering issue, we introduce a Differentiable Motion Trajectory (DMT) representation that describes the frame-wise movement of stroke control points using differentiable polynomial-based trajectories. DMT enables global semantic gradient propagation across multiple frames, significantly improving the semantic consistency and temporal coherence, and producing high-framerate output. DMT employs a Bernstein basis to balance the sensitivity of polynomial parameters, thus achieving more stable optimization. Instead of implicit fields, we introduce sparse track points for explicit spatial modeling, which improves efficiency and supports long-duration video processing. Evaluations on DAVIS and LVOS datasets demonstrate the superiority of our approach over SOTA methods. Cross-domain validation on 3D models and text-to-video data confirms the robustness and compatibility of our approach.
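A polynomial trajectory in the Bernstein basis writes the position of a stroke control point at normalized time t as a weighted sum of coefficients with weights B_{i,n}(t) = C(n, i) t^i (1 - t)^(n - i). The sketch below evaluates such a trajectory for one control point; the cubic degree, the coefficient values, and the frame count are illustrative assumptions, not values from the paper.

```python
import math

def bernstein(i, n, t):
    """Bernstein basis polynomial B_{i,n}(t) = C(n, i) * t^i * (1 - t)^(n - i)."""
    return math.comb(n, i) * t**i * (1 - t)**(n - i)

def trajectory_point(coeffs, t):
    """Position at normalized time t in [0, 1] for 2-D trajectory coefficients."""
    n = len(coeffs) - 1
    x = sum(bernstein(i, n, t) * cx for i, (cx, _) in enumerate(coeffs))
    y = sum(bernstein(i, n, t) * cy for i, (_, cy) in enumerate(coeffs))
    return x, y

# Hypothetical cubic trajectory for one stroke control point.
coeffs = [(0.0, 0.0), (1.0, 2.0), (3.0, 2.0), (4.0, 0.0)]

# Sample the trajectory at 9 evenly spaced frames; any frame rate can be sampled
# because the trajectory is a continuous function of t.
frames = [trajectory_point(coeffs, k / 8) for k in range(9)]
```

The basis functions are non-negative and sum to one on [0, 1], so every coefficient influences the curve on a comparable scale, which is consistent with the abstract's remark about balancing parameter sensitivity.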

2509.10967 2026-03-19 physics.chem-ph cs.LG physics.bio-ph

Predictive Free Energy Simulations Through Hierarchical Distillation of Quantum Hamiltonians

Chenghan Li, Garnet Kin-Lic Chan

English abstract

Obtaining the free energies of condensed phase chemical reactions remains computationally prohibitive for high-level quantum mechanical methods. We introduce a hierarchical machine learning framework that bridges this gap by distilling knowledge from a small number of high-fidelity quantum calculations into increasingly coarse-grained, machine-learned quantum Hamiltonians. By retaining explicit electronic degrees of freedom, our approach further enables a faithful embedding of quantum and classical degrees of freedom that captures long-range electrostatics and the quantum response to a classical environment to infinite order. As validation, we compute the proton dissociation constants of weak acids and the kinetic rate of an enzymatic reaction entirely from first principles, reproducing experimental measurements within chemical accuracy or their uncertainties. Our work demonstrates a path to condensed phase simulations of reaction free energies at the highest levels of accuracy with converged statistics.

2507.03570 2026-03-19 cs.CY cs.IT cs.LG math.IT

From Street Form to Spatial Justice: Explaining Urban Exercise Inequality via a Triadic SHAP-Informed Framework

Minwei Zhao, Guosheng Yang, Zhuoni Zhang, Filip Biljecki, Hanzhi Zu, Cai Wu

Comments 41 pages, 4 tables and 15 figures

English abstract

Urban streets are essential everyday health infrastructure, yet their capacity to support physical activity is unevenly distributed. This study develops a theory-informed and explainable framework to diagnose street-level exercise deprivation by integrating Lefebvre's spatial triad with multi-source urban data and SHAP-based analysis. Using Shenzhen as a case study, we show that while conceived spatial attributes have the strongest overall influence on exercise intensity, local deprivation mechanisms vary substantially across contexts. We identify a seven-mode typology of deprivation and locate high-demand but low-support street segments as priority areas for intervention. The study offers both a theory-grounded analytical framework and a practical diagnostic tool for promoting spatial justice in everyday physical activity.