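arXivDaily daily academic digest: synchronized with the full arXiv dataset, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and more.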

2602.08998 2026-03-24 math.AT cs.LG math.OA stat.ML

Universal Coefficients and Mayer-Vietoris Sequence for Groupoid Homology

Luciano Melodia

Comments Master's thesis, Code available at https://codeberg.org/Jiren/MSc

Abstract

We study homology of ample groupoids via the compactly supported Moore complex of the nerve. Let $A$ be a topological abelian group. For $n\ge 0$ set $C_n(\mathcal G;A) := C_c(\mathcal G_n,A)$ and define $\partial_n^A=\sum_{i=0}^n(-1)^i(d_i)_*$. This defines $H_n(\mathcal G;A)$. The theory is functorial for continuous étale homomorphisms. It is compatible with standard reductions, including restriction to saturated clopen subsets. In the ample setting it is invariant under Kakutani equivalence. We reprove Matui type long exact sequences and identify the comparison maps at chain level. For discrete $A$ we prove a natural universal coefficient short exact sequence $$0\to H_n(\mathcal G)\otimes_{\mathbb Z}A\xrightarrow{\ \iota_n^{\mathcal G}\ }H_n(\mathcal G;A)\xrightarrow{\ \kappa_n^{\mathcal G}\ }\operatorname{Tor}_1^{\mathbb Z}\bigl(H_{n-1}(\mathcal G),A\bigr)\to 0.$$ The key input is the chain level isomorphism $C_c(\mathcal G_n,\mathbb Z)\otimes_{\mathbb Z}A\cong C_c(\mathcal G_n,A)$, which reduces the groupoid statement to the classical algebraic UCT for the free complex $C_c(\mathcal G_\bullet,\mathbb Z)$. We also isolate the obstruction for non-discrete coefficients. For a locally compact totally disconnected Hausdorff space $X$ with a basis of compact open sets, the image of $\Phi_X:C_c(X,\mathbb Z)\otimes_{\mathbb Z}A\to C_c(X,A)$ is exactly the compactly supported functions with finite image. Thus $\Phi_X$ is surjective if and only if every $f\in C_c(X,A)$ has finite image, and for suitable $X$ one can produce compactly supported continuous maps $X\to A$ with infinite image. Finally, for a clopen saturated cover $\mathcal G_0=U_1\cup U_2$ we construct a short exact sequence of Moore complexes and derive a Mayer-Vietoris long exact sequence for $H_\bullet(\mathcal G;A)$ for explicit computations.
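
As an illustration that is not part of the abstract, the universal coefficient sequence can be specialized to the discrete coefficient group $A=\mathbb Z/m$. The sketch below uses only two standard facts about abelian groups $M$: $M\otimes_{\mathbb Z}\mathbb Z/m\cong M/mM$ and $\operatorname{Tor}_1^{\mathbb Z}(M,\mathbb Z/m)\cong M[m]$, the $m$-torsion subgroup. The notation $M[m]$ is introduced here for convenience and is an assumption of this note.

```latex
\[
0 \to H_n(\mathcal G)/mH_n(\mathcal G)
  \xrightarrow{\ \iota_n^{\mathcal G}\ } H_n(\mathcal G;\mathbb Z/m)
  \xrightarrow{\ \kappa_n^{\mathcal G}\ } H_{n-1}(\mathcal G)[m] \to 0,
\qquad M[m] := \{x\in M : mx = 0\}.
\]
```

In degree $n=0$ the torsion term vanishes because $H_{-1}(\mathcal G)=0$, so the sequence would give $H_0(\mathcal G;\mathbb Z/m)\cong H_0(\mathcal G)/mH_0(\mathcal G)$.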

2602.07098 2026-03-24 stat.CO cs.LG stat.ML

BayesFlow 2: Multi-Backend Amortized Bayesian Inference in Python

Lars Kühmichel, Jerry M. Huang, Valentin Pratz, Jonas Arruda, Hans Olischläger, Daniel Habermann, Simon Kucharsky, Lasse Elsemüller, Aayush Mishra, Niels Bracher, Svenja Jedhoff, Marvin Schmitt, Paul-Christian Bürkner, Stefan T. Radev

2602.00004 2026-03-24 cs.IR cs.CL cs.DL cs.LG

C$^2$-Cite: Contextual-Aware Citation Generation for Attributed Large Language Models

Yue Yu, Ting Bai, HengZhi Lan, Li Qian, Li Peng, Jie Wu, Wei Liu, Jian Luan, Chuan Shi

Comments WSDM26

2601.20626 2026-03-24 physics.ins-det cs.LG physics.data-an

Trigger Optimization and Event Classification for Dark Matter Searches in the CYGNO Experiment Using Machine Learning

F. D. Amaro, R. Antonietti, E. Baracchini, L. Benussi, C. Capoccia, M. Caponero, L. G. M. de Carvalho, G. Cavoto, I. A. Costa, A. Croce, M. D'Astolfo, G. D'Imperio, G. Dho, E. Di Marco, J. M. F. dos Santos, D. Fiorina, F. Iacoangeli, Z. Islam, E. Kemp, H. P. Lima, G. Maccarrone, R. D. P. Mano, D. J. G. Marques, G. Mazzitelli, P. Meloni, A. Messina, C. M. B. Monteiro, R. A. Nobrega, G. M. Oppedisano, I. F. Pains, E. Paoletti, F. Petrucci, S. Piacentini, D. Pierluigi, D. Pinci, F. Renga, A. Russo, G. Saviano, P. A. O. C. Silva, N. J. Spooner, R. Tesauro, S. Tomassini, D. Tozzi

Comments 6 pages, 1 figure. Proceedings of 14th Young Researcher Meeting (14YRM2025). Published in PoS(14YRM2025)003 (2026); updated to match published version

DOI: 10.22323/1.519.0003
Journal ref: PoS(14YRM2025)003 (2026)

Abstract

The CYGNO experiment employs an optical-readout Time Projection Chamber (TPC) to search for rare low-energy interactions using finely resolved scintillation images. While the optical readout provides rich topological information, it produces large, sparse megapixel images that challenge real-time triggering, data reduction, and background discrimination. We summarize two complementary machine-learning approaches developed within CYGNO. First, we present a fast and fully unsupervised strategy for online data reduction based on reconstruction-based anomaly detection. A convolutional autoencoder trained exclusively on pedestal images (i.e. frames acquired with GEM amplification disabled) learns the detector noise morphology and highlights particle-induced structures through localized reconstruction residuals, from which compact Regions of Interest (ROIs) are extracted. On real prototype data, the selected configuration retains (93.0 +/- 0.2)% of reconstructed signal intensity while discarding (97.8 +/- 0.1)% of the image area, with ~25 ms per-frame inference time on a consumer GPU. Second, we report a weakly supervised application of the Classification Without Labels (CWoLa) framework to data acquired with an Americium--Beryllium neutron source. Using only mixed AmBe and standard datasets (no event-level labels), a convolutional classifier learns to identify nuclear-recoil-like topologies. The achieved performance approaches the theoretical limit imposed by the mixture composition and isolates a high-score population with compact, approximately circular morphologies consistent with nuclear recoils.
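
The residual-based ROI extraction described in the abstract can be illustrated with a toy sketch. This is not the CYGNO pipeline: the function names, the fixed threshold, the single bounding box, and the stand-in "reconstruction" (a flat pedestal instead of an autoencoder output) are all assumptions made for illustration.

```python
def extract_roi(frame, reconstruction, threshold):
    """Return (row_min, row_max, col_min, col_max) bounding the pixels whose
    absolute reconstruction residual exceeds `threshold`, or None if none do."""
    rows, cols = [], []
    for r, (frame_row, recon_row) in enumerate(zip(frame, reconstruction)):
        for c, (pixel, recon) in enumerate(zip(frame_row, recon_row)):
            if abs(pixel - recon) > threshold:
                rows.append(r)
                cols.append(c)
    if not rows:
        return None
    return min(rows), max(rows), min(cols), max(cols)


def discarded_area_fraction(roi, n_rows, n_cols):
    """Fraction of the image area that lies outside the ROI bounding box."""
    if roi is None:
        return 1.0
    r0, r1, c0, c1 = roi
    kept = (r1 - r0 + 1) * (c1 - c0 + 1)
    return 1.0 - kept / (n_rows * n_cols)


# Toy 4x5 frame: flat pedestal of 10 counts with a bright 2x2 "track".
frame = [
    [10, 10, 10, 10, 10],
    [10, 10, 30, 32, 10],
    [10, 10, 31, 29, 10],
    [10, 10, 10, 10, 10],
]
# Hypothetical stand-in for the autoencoder output: it reproduces only the pedestal.
reconstruction = [[10] * 5 for _ in range(4)]

roi = extract_roi(frame, reconstruction, threshold=5)
fraction = discarded_area_fraction(roi, n_rows=4, n_cols=5)
print(roi, fraction)
```

A real implementation would likely extract several ROIs per frame and tune the threshold against retained signal intensity, which this sketch does not attempt.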

2601.08806 2026-03-24 cs.SE cs.AI cs.CL

APEX-SWE

Abhi Kottamasu, Chirag Mahapatra, Sam Lee, Ben Pan, Aakash Barthwal, Akul Datta, Anurag Gupta, Pranav Mehta, Ajay Arun, Silas Alberti, Adarsh Hiremath, Brendan Foody, Bertie Vidgen

2601.08104 2026-03-24 nlin.CD cs.AI

High-Fidelity Modeling of Stochastic Chemical Dynamics on Complex Manifolds: A Multi-Scale SIREN-PINN Framework for the Curvature-Perturbed Ginzburg-Landau Equation

Julian Evan Chrisnanto, Salsabila Rahma Alia, Nurfauzi Fadillah, Yulison Herry Chrisnanto

Comments 25 pages, 9 figures

Abstract

The accurate identification and control of spatiotemporal chaos in reaction-diffusion systems remains a grand challenge in chemical engineering, particularly when the underlying catalytic surface possesses complex, unknown topography. In the \textit{Defect Turbulence} regime, system dynamics are governed by topological phase singularities (spiral waves) whose motion couples to manifold curvature via geometric pinning. Conventional Physics-Informed Neural Networks (PINNs) using ReLU or Tanh activations suffer from fundamental \textit{spectral bias}, failing to resolve high-frequency gradients and causing amplitude collapse or phase drift. We propose a Multi-Scale SIREN-PINN architecture leveraging periodic sinusoidal activations with frequency-diverse initialization, embedding the appropriate inductive bias for wave-like physics directly into the network structure. This enables simultaneous resolution of macroscopic wave envelopes and microscopic defect cores. Validated on the complex Ginzburg-Landau equation evolving on latent Riemannian manifolds, our architecture achieves relative state prediction error $\varepsilon_{L_2} \approx 1.92 \times 10^{-2}$, outperforming standard baselines by an order of magnitude while preserving topological invariants ($|\Delta N_{\mathrm{defects}}| < 1$). We solve the ill-posed \textit{inverse pinning problem}, reconstructing hidden Gaussian curvature fields solely from partial observations of chaotic wave dynamics (Pearson correlation $\rho = 0.965$). Training dynamics reveal a distinctive Spectral Phase Transition at epoch $\sim 2{,}100$, where cooperative minimization of physics and geometry losses drives the solver to Pareto-optimal solutions. This work establishes a new paradigm for Geometric Catalyst Design, offering a mesh-free, data-driven tool for identifying surface heterogeneity and engineering passive control strategies in turbulent chemical reactors.
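
To make the idea of sinusoidal activations with frequency-diverse initialization concrete, here is a minimal sketch of a multi-branch sine layer. It is not the authors' architecture: the branch structure, the frequency scales `omega_0`, the uniform weight range, and the helper names are all hypothetical choices made for illustration.

```python
import math
import random


def sine_layer(x, weights, biases, omega_0):
    """One SIREN-style layer: sin(omega_0 * (W x + b)), applied per output unit."""
    return [
        math.sin(omega_0 * (sum(w * xi for w, xi in zip(row, x)) + b))
        for row, b in zip(weights, biases)
    ]


def make_branch(n_in, n_out, omega_0, rng):
    """Draw weights and biases uniformly from [-1/n_in, 1/n_in] (an assumed range)."""
    bound = 1.0 / n_in
    weights = [[rng.uniform(-bound, bound) for _ in range(n_in)] for _ in range(n_out)]
    biases = [rng.uniform(-bound, bound) for _ in range(n_out)]
    return weights, biases, omega_0


def multi_scale_features(x, branches):
    """Concatenate the outputs of sine layers that use different omega_0 values."""
    features = []
    for weights, biases, omega_0 in branches:
        features.extend(sine_layer(x, weights, biases, omega_0))
    return features


rng = random.Random(0)
# Hypothetical frequency scales; the abstract does not list the actual values.
branches = [make_branch(3, 4, omega_0, rng) for omega_0 in (1.0, 10.0, 30.0)]
features = multi_scale_features([0.1, -0.2, 0.5], branches)  # input could be (x, y, t)
print(len(features))
```

Low `omega_0` branches vary slowly with the input while high `omega_0` branches oscillate quickly, which is one plausible way to expose both coarse and fine scales to later layers.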

2601.05162 2026-03-24 cs.GR cs.CV

GenAI-DrawIO-Creator: A Framework for Automated Diagram Generation

Jinze Yu, Dayuan Jiang

2512.23743 2026-03-24 cs.SE cs.AI

Hybrid-Code v2: Zero-Hallucination Clinical ICD-10 Coding via Neuro-Symbolic Verification and Automated Knowledge Base Expansion

Yunguo Yu

Comments Version 2: Substantially extended version with (1) multi-layer verification framework (format, evidence, negation, temporal, exclusion), (2) automated knowledge base expansion from unlabeled clinical text, (3) formal zero Type-I hallucination guarantees, and (4) expanded experimental evaluation on 5,000 cases with detailed error analysis. 28 pages, 3 figures, original research paper.

2511.12842 2026-03-24 physics.comp-ph cs.LG

Scalable learning of macroscopic stochastic dynamics

Mengyi Chen, Pengru Huang, Kostya S. Novoselov, Qianxiao Li

2511.12260 2026-03-24 cond-mat.mtrl-sci cs.LG physics.comp-ph

Reinforcement Learning for Chemical Ordering in Alloy Nanoparticles

Jonas Elsborg, Emma L. Hovmand, Arghya Bhowmik

Comments 22 pages, 9 figures, 1 table

2510.24727 2026-03-24 cs.CE cs.LG

Stiff Circuit System Modeling via Transformer

Weiman Yan, Yi-Chia Chang, Wanyu Zhao

2510.24358 2026-03-24 cs.SE cs.CL

Automatically Benchmarking LLM Code Agents through Agent-Driven Annotation and Evaluation

Lingyue Fu, Bolun Zhang, Hao Guan, Yaoming Zhu, Lin Qiu, Weiwen Liu, Xuezhi Cao, Xunliang Cai, Weinan Zhang, Yong Yu

Comments Accepted by AAMAS 2026

DOI: 10.65109/HJFB4234
Journal ref: Proc. of the 25th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2026), Paphos, Cyprus, May 25-29, 2026

Abstract

Recent advances in code agents have enabled automated software development at the project level, supported by large language models (LLMs). However, existing benchmarks for code agent evaluation face two major limitations. First, creating high-quality project-level evaluation datasets requires extensive domain expertise, leading to prohibitive annotation costs and limited diversity. Second, while recent Agent-as-a-Judge paradigms address the rigidity of traditional unit tests by enabling flexible metrics, their reliance on In-Context Learning (ICL) with general LLMs often results in inaccurate assessments that misalign with human standards. To address these challenges, we propose an agent-driven benchmark construction pipeline that leverages human supervision to efficiently generate diverse project-level tasks. Based on this, we introduce PRDBench, comprising 50 real-world Python projects across 20 domains, each with structured Product Requirement Documents (PRDs) and comprehensive criteria. Furthermore, to overcome the inaccuracy of general LLM judges, we propose a highly reliable evaluation framework powered by a specialized, fine-tuned model. Based on Qwen3-Coder-30B, our dedicated PRDJudge achieves over 90% human alignment in fixed-interface scenarios. Extensive experiments demonstrate that our suite provides a scalable, robust, and highly accurate framework for assessing state-of-the-art code agents.

2510.08084 2026-03-24 cs.CR cs.AI cs.LG

A Novel Ensemble Learning Approach for Enhanced IoT Attack Detection: Redefining Security Paradigms in Connected Systems

Hikmat A. M. Abdeljaber, Md. Alamgir Hossain, Sultan Ahmad, Ahmed Alsanad, Md Alimul Haque, Sudan Jha, Jabeen Nazeer

Comments 14 pages, 5 figures, 7 tables

2510.05181 2026-03-24 cs.CR cs.AI cs.CY

Auditing Pay-Per-Token in Large Language Models

Ander Artola Velasco, Stratis Tsirtsis, Manuel Gomez-Rodriguez

Comments AISTATS 2026

2509.19988 2026-03-24 stat.ML cs.LG q-bio.QM

BioBO: Biology-informed Bayesian Optimization for Perturbation Design

Yanke Li, Tianyu Cui, Tommaso Mansi, Mangal Prakash, Rui Liao

Comments ICLR 2026

2509.05909 2026-03-24 cond-mat.mtrl-sci cs.LG

Learning Magnetic Order Classification from Large-Scale Materials Databases

Ahmed E. Fahmy

Comments Main Text: 10 pages + 10 Figures & 3 Supplementary Tables. (Under Review)

2508.15276 2026-03-24 cs.DB cs.CL

AmbiSQL: Interactive Ambiguity Detection and Resolution for Text-to-SQL

Zhongjun Ding, Yin Lin, Tianjing Zeng, Rong Zhu, Bolin Ding, Jingren Zhou

2508.14936 2026-03-24 q-bio.QM cs.AI cs.LG stat.AP stat.ML

Can synthetic data reproduce real-world findings in epidemiology? A replication study using adversarial random forests

Jan Kapar, Kathrin Günther, Lori Ann Vallis, Klaus Berger, Nadine Binder, Hermann Brenner, Stefanie Castell, Beate Fischer, Volker Harth, Bernd Holleczek, Timm Intemann, Till Ittermann, André Karch, Thomas Keil, Lilian Krist, Berit Lange, Michael F. Leitzmann, Katharina Nimptsch, Nadia Obi, Iris Pigeot, Tobias Pischon, Tamara Schikowski, Börge Schmidt, Carsten Oliver Schmidt, Anja M. Sedlmair, Justine Tanoey, Harm Wienbergen, Andreas Wienke, Claudia Wigmann, Marvin N. Wright

Abstract

Synthetic data holds substantial potential to address practical challenges in epidemiology due to restricted data access and privacy concerns. However, many current methods suffer from limited quality, high computational demands, and complexity for non-experts. Furthermore, common evaluation strategies for synthetic data often fail to directly reflect statistical utility and measure privacy risks sufficiently. Against this background, a critical underexplored question is whether synthetic data can reliably reproduce key findings from epidemiological research while preserving privacy. We propose adversarial random forests (ARF) as an efficient and convenient method for synthesizing tabular epidemiological data. To evaluate its performance, we replicated statistical analyses from six epidemiological publications covering blood pressure, anthropometry, myocardial infarction, accelerometry, loneliness, and diabetes, from the German National Cohort (NAKO Gesundheitsstudie), the Bremen STEMI Registry U45 Study, and the Guelph Family Health Study. We further assessed how dataset dimensionality and variable complexity affect the quality of synthetic data, and contextualized ARF's performance by comparison with commonly used tabular data synthesizers in terms of utility, privacy, generalisation, and runtime. Across all replicated studies, results on ARF-generated synthetic data consistently aligned with original findings. Even for datasets with relatively low sample size-to-dimensionality ratios, replication outcomes closely matched the original results across descriptive and inferential analyses. Reduced dimensionality and variable complexity further enhanced synthesis quality. ARF demonstrated favourable performance regarding utility, privacy preservation, and generalisation relative to other synthesizers and superior computational efficiency.

2507.20115 2026-03-24 cs.NI cs.AI

Packet-Level DDoS Data Augmentation Using Dual-Stream Temporal-Field Diffusion

Gongli Xi, Ye Tian, Yannan Hu, Yuchao Zhang, Yapeng Niu, Xiangyang Gong

Comments Accepted by IEEE SECON 2026. 11 pages, 5 figures

2507.03156 2026-03-24 cs.SE cs.AI cs.HC

The Impact of LLM-Assistants on Software Developer Productivity: A Systematic Review and Mapping Study

Amr Mohamed, Maram Assi, Mariam Guizani

Comments 43 pages

Abstract

Large language model assistants (LLM-assistants) present new opportunities to transform software development. Developers are increasingly adopting these tools across tasks, including coding, testing, debugging, documentation, and design. Yet, despite growing interest, there is no synthesis of how LLM-assistants affect software developer productivity. In this paper, we present a systematic review and mapping of 39 peer-reviewed studies published between January 2014 and December 2024 that examine this impact. Our analysis reveals that the majority of studies report considerable benefits from LLM-assistants, though a notable subset identifies critical risks. Commonly reported gains include accelerated development, minimized code search, and the automation of trivial and repetitive tasks. However, studies also highlight concerns around cognitive offloading and reduced team collaboration. Our study reveals that whether LLM-based assistants improve or degrade code quality remains unresolved, as existing studies report contradictory outcomes contingent on context and evaluation criteria. While the majority of studies (90%) adopt a multi-dimensional perspective by examining at least two SPACE dimensions, reflecting increased awareness of the complexity of developer productivity, only 15% extend beyond three dimensions, indicating substantial room for more integrated evaluations. Satisfaction, Performance, and Efficiency are the most frequently investigated dimensions, whereas Communication and Activity remain underexplored. Most studies are exploratory (59%) and methodologically diverse, but lack longitudinal and team-based evaluations. This review surfaces key research gaps and provides recommendations for future research and practice. All artifacts associated with this study are publicly available at https://zenodo.org/records/18489222

2506.09161 2026-03-24 eess.IV cs.CV

From Explanations to Architecture: Explainability-Driven CNN Refinement for Brain Tumor Classification in MRI

Rajan Das Gupta, Md Imrul Hasan Showmick, Lei Wei, Mushfiqur Rahman Abir, Shanjida Akter, Md. Yeasin Rahat, Md. Jakir Hossen

Comments This is the preprint version of the manuscript. It is currently being prepared for submission to an academic conference

2504.14145 2026-03-24 cs.DC cs.AI

DIP: Efficient Large Multimodal Model Training with Dynamic Interleaved Pipeline

Zhenliang Xue, Hanpeng Hu, Xing Chen, Yimin Jiang, Yixin Song, Zeyu Mi, Yibo Zhu, Daxin Jiang, Yubin Xia, Haibo Chen

Comments To be published in ASPLOS'26

2503.11851 2026-03-24 eess.IV cs.AI cs.CV cs.LG

Interpretable Deep Learning Framework for Improved Disease Classification in Medical Imaging

Jutika Borah, Hidam Kumarjit Singh

Comments 18 pages, 8 figures, 5 tables

2503.04071 2026-03-24 stat.ML cs.LG

Tightening optimality gap with confidence through conformal prediction

Miao Li, Michael Klamkin, Russell Bent, Pascal Van Hentenryck

2502.04907 2026-03-24 stat.ML cs.LG

Scalable Learning from Probability Measures with Mean Measure Quantization

Erell Gachon, Elsa Cazelles, Jérémie Bigot

2501.06404 2026-03-24 econ.EM cs.AI cs.LG stat.ML

A Hybrid Framework for Reinsurance Optimization: Integrating Generative Models and Reinforcement Learning

Stella C. Dong

2412.03083 2026-03-24 quant-ph cs.ET cs.LG

A Novel Single-Layer Quantum Neural Network for Approximate SRBB-Based Unitary Synthesis

Giacomo Belli, Marco Mordacci, Michele Amoretti

Comments 39+26 pages, 37 figures

DOI: 10.22331/q-2026-03-20-2034
Journal ref: Quantum 10, 2034 (2026)

Abstract

In this work, a novel quantum neural network is introduced as a means to approximate any unitary evolution through the Standard Recursive Block Basis (SRBB) and is subsequently redesigned with the number of CNOTs asymptotically reduced by an exponential contribution. This algebraic approach to the problem of unitary synthesis exploits Lie algebras and their topological features to obtain scalable parameterizations of unitary operators. First, the original SRBB-based scalability scheme, already known in the literature only from a theoretical point of view, is reformulated for efficient algorithm implementation and complexity management. Remarkably, 2-qubit operators emerge as a special case of the original scaling scheme. Furthermore, an algorithm is proposed to reduce the number of CNOT gates in the scalable variational quantum circuit, thus deriving a new implementable scaling scheme that requires only one layer of approximation. The single layer CNOT-reduced quantum neural network is implemented, and its performance is assessed with a variety of different unitary matrices, both sparse and dense, up to 6 qubits via the PennyLane library. The effectiveness of the approximation is measured with different metrics in relation to two optimizers: a gradient-based method and the Nelder-Mead method. The approximate CNOT-reduced SRBB-based synthesis algorithm is also tested on real hardware and compared with other valid approximation and decomposition methods available in the literature.
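
The abstract does not say which metrics are used to score the approximation. As one commonly used possibility, and purely as an assumption of this note, the sketch below computes the normalized trace overlap $|\operatorname{Tr}(U^\dagger V)|/d$ between a target unitary $U$ and an approximation $V$. It equals 1 when the two agree up to a global phase. The function name and the example matrices are hypothetical.

```python
import cmath
import math


def trace_overlap(u, v):
    """Return |Tr(U^dagger V)| / d for two d x d matrices given as nested lists."""
    d = len(u)
    # Tr(U^dagger V) = sum over i, k of conj(U[k][i]) * V[k][i].
    trace = sum(u[k][i].conjugate() * v[k][i] for i in range(d) for k in range(d))
    return abs(trace) / d


s = 1 / math.sqrt(2)
hadamard = [[complex(s), complex(s)], [complex(s), complex(-s)]]
identity = [[complex(1), complex(0)], [complex(0), complex(1)]]

# The same gate multiplied by a global phase should still score (numerically) 1.
phase = cmath.exp(1j * 0.7)
phased_hadamard = [[phase * entry for entry in row] for row in hadamard]

overlap_same = trace_overlap(hadamard, phased_hadamard)
overlap_identity = trace_overlap(hadamard, identity)
print(overlap_same, overlap_identity)
```

An optimizer could minimize one minus this overlap, although whether the paper's gradient-based and Nelder-Mead runs use such a cost is not stated in the abstract.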

2410.09514 2026-03-24 cs.IR cs.AI

Eco-Aware Graph Neural Networks for Sustainable Recommendations

Antonio Purificato, Fabrizio Silvestri

Comments 9 pages, 2 tables, 3 figures, RecSoGood Workshop

2409.20431 2026-03-24 math.NA cs.LG cs.NA math.PR

Multilevel Picard approximations and deep neural networks with ReLU, leaky ReLU, and softplus activation overcome the curse of dimensionality when approximating semilinear parabolic partial differential equations in $L^p$-sense

Ariel Neufeld, Tuan Anh Nguyen

2408.05819 2026-03-24 stat.ML cs.LG

Fast convergence of a Federated Expectation-Maximization Algorithm

Zhixu Tao, Rajita Chandak, Sanjeev Kulkarni