arXivDaily: arXiv Daily Academic Digest, updated Monday through Friday
2604.05312 2026-04-17 nucl-th cs.LG physics.atom-ph

Predictions of charge density distributions for nuclei with $Z \geq 8$

Yun Dong Wang, Tian Shuai Shang, Hui Hui Xie, Peng Xiang Du, Jian Li, Haozhao Liang

Comments 56 pages, 4 tables, 3 figures

Journal ref
NUCL SCI TECH 37, 93 (2026)
Abstract

A deep neural network (DNN) has been developed to accurately predict nuclear charge density distributions for nuclei with proton numbers $Z \geq 8$. By incorporating essential nuclear structure features, the model achieves a significant improvement in predictive accuracy over conventional methods. The charge density distributions are analyzed using a Fourier-Bessel (FB) series expansion, and the DNN is trained on a comprehensive dataset derived from relativistic continuum Hartree-Bogoliubov (RCHB) theory calculations. The model demonstrates exceptional performance, with root-mean-square deviations of 0.0123 fm and 0.0198 fm for charge radii on the training and validation sets, respectively, remarkably surpassing the precision of the original RCHB calculations. Beyond advancing nuclear physics research, this high-precision model provides critical data for applications in atomic physics, nuclear astrophysics, and related fields.
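
The Fourier-Bessel parameterization mentioned in the abstract is commonly written as $\rho(r) = \sum_{\nu} a_\nu\, j_0(\nu \pi r / R)$ for $r \le R$ and zero beyond the cutoff radius $R$. The sketch below is only an illustration of how a charge radius follows from such coefficients; the coefficient value and cutoff are made-up assumptions, not numbers from the paper, and the DNN itself is not reproduced.

```python
import math

def j0(x):
    """Spherical Bessel function of order zero, sin(x)/x, with the x -> 0 limit."""
    return 1.0 if abs(x) < 1e-12 else math.sin(x) / x

def fb_density(r, coeffs, cutoff):
    """Fourier-Bessel density: sum_v a_v * j0(v*pi*r/R) for r <= R, zero beyond."""
    if r > cutoff:
        return 0.0
    return sum(a * j0(v * math.pi * r / cutoff)
               for v, a in enumerate(coeffs, start=1))

def radial_moment(coeffs, cutoff, power, steps=4000):
    """Trapezoidal estimate of the integral of rho(r) * r**power over [0, R]."""
    h = cutoff / steps
    total = 0.0
    for i in range(steps + 1):
        r = i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * fb_density(r, coeffs, cutoff) * r ** power
    return total * h

def rms_radius(coeffs, cutoff):
    """Root-mean-square radius: sqrt(<r^2>) = sqrt(int rho r^4 dr / int rho r^2 dr)."""
    return math.sqrt(radial_moment(coeffs, cutoff, 4) / radial_moment(coeffs, cutoff, 2))

# Hypothetical inputs: one FB coefficient and an 8 fm cutoff (illustrative only).
coeffs = [0.05]
cutoff = 8.0
radius = rms_radius(coeffs, cutoff)
```

With a single coefficient the result can be checked analytically: $\langle r^2 \rangle = R^2 (1 - 6/\pi^2)$, independent of the coefficient's value.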

2603.27118 2026-04-17 eess.IV cs.CV cs.SY eess.SP eess.SY

Quantitative measurements of biological/chemical concentrations using smartphone cameras

Zhendong Cao, Hongji Dai, Zhida Li, Ash Parameswaran

Abstract

This paper presents a smartphone-based imaging system capable of quantifying the concentration of an assortment of biological/chemical assay samples. The main objective is to construct an image database which characterizes the relationship between color information and concentrations of the biological/chemical assay sample. For this aim, a designated optical setup combined with image processing and data analyzing techniques was implemented. A series of experiments conducted on selected assays, including fluorescein, RNA Mango, homogenized milk and yeast have demonstrated that the proposed system estimates the concentration of fluorescent materials and colloidal mixtures comparable to currently used commercial and laboratory instruments. Furthermore, by utilizing the camera and computational power of smartphones, eventual development can be directed toward extremely compact, inexpensive and portable analysis and diagnostic systems which will allow experiments and tests to be conducted in remote or impoverished areas.

2603.09643 2026-04-17 cs.ET cs.AI

MM-tau-p$^2$: Persona-Adaptive Prompting for Robust Multi-Modal Agent Evaluation in Dual-Control Settings

Anupam Purwar, Aditya Choudhary

Comments A benchmark for evaluating multimodal (both voice and text) LLM agents in dual-control settings. We introduce persona-adaptive prompting and 12 new metrics to assess robustness, safety, efficiency, and recovery in customer support scenarios.

Abstract

Current evaluation frameworks and benchmarks for LLM-powered agents focus on text-chat-driven agents. These frameworks do not expose the user's persona to the agent and thus operate in a user-agnostic environment. Importantly, in the customer experience management domain, the agent's behaviour evolves as the agent learns about the user's personality. With the proliferation of real-time TTS and multi-modal language models, LLM-based agents will gradually become multi-modal. To this end, we propose the MM-tau-p$^2$ benchmark, with metrics for evaluating the robustness of multi-modal agents in a dual-control setting with and without persona adaptation to the user, while also taking user inputs into account in the planning process to resolve a user query. In particular, our work shows that even with state-of-the-art frontier LLMs such as GPT-5 and GPT-4.1, introducing multi-modality into LLM-based agents raises additional considerations, which we measure using metrics such as multi-modal robustness and turn overhead. Overall, MM-tau-p$^2$ builds on our prior work FOCAL and provides a holistic, automated way of evaluating multi-modal agents by introducing 12 novel metrics. We also provide estimates of these metrics on the telecom and retail domains using the LLM-as-judge approach, with carefully crafted prompts and well-defined rubrics for evaluating each conversation.

2602.22699 2026-04-17 cs.CR cs.DB cs.LG

DPSQL+: A Differentially Private SQL Library with a Minimum Frequency Rule

Tomoya Matsumoto, Shokichi Takakura, Shun Takagi, Satoshi Hasegawa

Comments Accepted to TPDP 2026. Code is available at https://github.com/lycorp-jp/DPSQL_Plus

Abstract

SQL is the de facto interface for exploratory data analysis; however, releasing exact query results can expose sensitive information through membership or attribute inference attacks. Differential privacy (DP) provides rigorous privacy guarantees, but in practice, DP alone may not satisfy governance requirements such as the \emph{minimum frequency rule}, which requires each released group (cell) to include contributions from at least $k$ distinct individuals. In this paper, we present \textbf{DPSQL+}, a privacy-preserving SQL library that simultaneously enforces user-level $(\varepsilon,\delta)$-DP and the minimum frequency rule. DPSQL+ adopts a modular architecture consisting of: (i) a \emph{Validator} that statically restricts queries to a DP-safe subset of SQL; (ii) an \emph{Accountant} that consistently tracks cumulative privacy loss across multiple queries; and (iii) a \emph{Backend} that interfaces with various database engines, ensuring portability and extensibility. Experiments on the TPC-H benchmark demonstrate that DPSQL+ achieves practical accuracy across a wide range of analytical workloads -- from basic aggregates to quadratic statistics and join operations -- and allows substantially more queries under a fixed global privacy budget than prior libraries in our evaluation.
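
To make the interplay between noise and a minimum group size concrete, here is a textbook noisy-threshold sketch for a grouped distinct-user count. It is not DPSQL+'s API or its actual mechanism; the function names are hypothetical, and the sketch assumes each user contributes to exactly one group, so that adding or removing a user changes one count by one.

```python
import math
import random
from collections import defaultdict

def noisy_threshold(k, epsilon, delta):
    """Threshold chosen so that a group with fewer than k users
    passes the noisy check with probability at most delta."""
    return k + math.log(1.0 / (2.0 * delta)) / epsilon

def private_group_counts(rows, k, epsilon, delta, rng):
    """rows: iterable of (user_id, group) pairs.
    Assumption: each user appears in exactly one group."""
    users = defaultdict(set)
    for user_id, group in rows:
        users[group].add(user_id)
    tau = noisy_threshold(k, epsilon, delta)
    released = {}
    for group, members in users.items():
        # Laplace(0, 1/epsilon) noise as the difference of two exponential draws.
        noise = rng.expovariate(epsilon) - rng.expovariate(epsilon)
        noisy = len(members) + noise
        if noisy >= tau:  # suppress groups whose noisy count is too small
            released[group] = noisy
    return released

# Hypothetical data: a large group "a" and a two-person group "b".
rows = [(f"u{i}", "a") for i in range(1000)] + [("x1", "b"), ("x2", "b")]
result = private_group_counts(rows, k=10, epsilon=1.0, delta=1e-6,
                              rng=random.Random(0))
```

The threshold is applied to the noisy count rather than the exact one, because suppressing on the exact count would itself reveal information about small groups.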

2602.13529 2026-04-17 cs.CR cs.CL

SecureGate: Learning When to Reveal PII Safely via Token-Gated Dual-Adapters for Federated LLMs

Mohamed Shaaban, Mohamed Elmahallawy

Abstract

Federated learning (FL) enables collaborative training across organizational silos without sharing raw data, making it attractive for privacy-sensitive applications. With the rapid adoption of large language models (LLMs), federated fine-tuning of generative LLMs has gained attention as a way to leverage distributed data while preserving confidentiality. However, this setting introduces fundamental challenges: (i) privacy leakage of personally identifiable information (PII) due to LLM memorization, and (ii) a persistent tension between global generalization and local utility under heterogeneous data. Existing defenses, such as data sanitization and differential privacy, reduce leakage but often degrade downstream performance. We propose SecureGate, a privacy-aware federated fine-tuning framework for LLMs that provides fine-grained privacy control without sacrificing utility. SecureGate employs a dual-adapter LoRA architecture: a secure adapter that learns sanitized, globally shareable representations, and a revealing adapter that captures sensitive, organization-specific knowledge. A token-controlled gating module selectively activates these adapters at inference time, enabling controlled information disclosure without retraining. Extensive experiments across multiple LLMs and real-world datasets show that SecureGate improves task utility while substantially reducing PII leakage, achieving up to a 31.66X reduction in inference attack accuracy and a 17.07X reduction in extraction recall for unauthorized requests. Additionally, it maintains 100% routing reliability to the correct adapter and incurs only minimal computational and communication overhead.

2602.04930 2026-04-17 cs.CR cs.AI

Attack Selection Reduces Safety in Concentrated AI Control Settings against Trusted Monitoring

Joachim Schaeffer, Arjun Khandelwal, Tyler Tracy

Abstract

Future AI deployments will likely be monitored for malicious behaviour. The ability of these AIs to subvert monitors by adversarially selecting against them - attack selection - is particularly concerning. To study this, we let a red team create attack policies that attempt to insert attacks into code without being caught by a monitor in the concentrated BigCodeBench backdooring setting. We decompose attack selection into two problems: mapping attacks to a quality score and mapping quality scores to submission probabilities. We frame attack selection as a classification problem and show that safety is significantly more sensitive to FPR than TPR. We find that prompting the attacker model to reason about the monitor while being cautious with attack selection reduces safety from a baseline of 99% to 59% at 0.5% auditing budget, emphasizing that eliciting attack selection capabilities of models is vital to avoid overly optimistic safety scores in control evaluations.

2601.12407 2026-04-17 cs.CR cs.CL cs.LG

De-Anonymization at Scale via Tournament-Style Attribution

Lirui Zhang, Huishuai Zhang

Comments 14 pages, ACL 2026 Oral

Abstract

As LLMs rapidly advance and enter real-world use, their privacy implications are increasingly important. We study an authorship de-anonymization threat: using LLMs to link anonymous documents to their authors, potentially compromising settings such as double-blind peer review. We propose De-Anonymization at Scale (DAS), a large language model-based method for attributing authorship among tens of thousands of candidate texts. DAS uses a sequential progression strategy: it randomly partitions the candidate corpus into fixed-size groups, prompts an LLM to select the text most likely written by the same author as a query text, and iteratively re-queries the surviving candidates to produce a ranked top-k list. To make this practical at scale, DAS adds a dense-retrieval prefilter to shrink the search space and a majority-voting style aggregation over multiple independent runs to improve robustness and ranking precision. Experiments on anonymized review data show DAS can recover same-author texts from pools of tens of thousands with accuracy well above chance, demonstrating a realistic privacy risk for anonymous platforms. On standard authorship benchmarks (Enron emails and blog posts), DAS also improves both accuracy and scalability over prior approaches, highlighting a new LLM-enabled de-anonymization vulnerability.
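
The group-and-advance procedure described in the abstract can be sketched roughly as follows. The LLM call is replaced by a toy character-trigram overlap score, which is purely an assumption for illustration; the dense-retrieval prefilter is omitted, and all function names are hypothetical rather than taken from the DAS code.

```python
import random
from collections import Counter

def trigrams(text):
    """Set of character trigrams in a string."""
    return {text[i:i + 3] for i in range(len(text) - 2)}

def toy_pick(query, group):
    """Stand-in for the LLM call: choose the candidate whose trigram set
    has the highest Jaccard overlap with the query."""
    q = trigrams(query)
    def score(candidate):
        t = trigrams(candidate)
        union = q | t
        return len(q & t) / len(union) if union else 0.0
    return max(group, key=score)

def tournament(query, candidates, group_size, pick, rng):
    """Randomly partition survivors into fixed-size groups and keep one
    winner per group until a single candidate remains."""
    if group_size < 2:
        raise ValueError("group_size must be at least 2")
    survivors = list(candidates)
    while len(survivors) > 1:
        rng.shuffle(survivors)
        groups = [survivors[i:i + group_size]
                  for i in range(0, len(survivors), group_size)]
        survivors = [pick(query, group) for group in groups]
    return survivors[0]

def vote(query, candidates, group_size, pick, runs, seed=0):
    """Aggregate several independent tournaments by counting wins."""
    rng = random.Random(seed)
    wins = Counter(tournament(query, candidates, group_size, pick, rng)
                   for _ in range(runs))
    return [candidate for candidate, _ in wins.most_common()]

query = "the quick brown fox jumps over the lazy dog"
candidates = [
    "a quick brown fox jumped over a lazy dog",
    "stock prices fell sharply on monday",
    "the recipe calls for two cups of flour",
    "rain is expected across the region",
    "she plays the violin in an orchestra",
]
ranking = vote(query, candidates, group_size=2, pick=toy_pick, runs=5)
```

With a deterministic selector like `toy_pick`, the best-scoring candidate wins every run, so voting adds nothing here; the aggregation step matters when the selector is noisy, as an LLM would be.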

2512.22174 2026-04-17 cs.DC cs.AI cs.AR cs.CR cs.LG

BitFlipScope: Scalable Fault Localization and Recovery for Bit-Flip Corruptions in LLMs

Muhammad Zeeshan Karamat, Sadman Saif, Christiana Chamon Garcia

Comments Accepted at the IEEE International Symposium on Hardware Oriented Security and Trust (HOST) 2026

Abstract

Large Language Models (LLMs) deployed in practical and safety-critical settings are increasingly susceptible to bit-flip faults caused by hardware degradation, cosmic radiation, or deliberate fault-injection attacks such as Rowhammer. These faults silently corrupt internal parameters and can lead to unpredictable or dangerous model behavior. Localizing these corruptions is essential: without identifying the affected region, it is impossible to diagnose the source of degradation, apply targeted corrective measures, or restore model functionality without resorting to costly fine-tuning or full retraining. This work introduces BitFlipScope, a scalable, software-based framework for identifying fault-affected regions within transformer architectures under two deployment scenarios. When a clean reference model is available, BitFlipScope performs differential analysis of outputs, hidden states, and internal activations for detecting anomalous behavior indicative of corruption to pinpoint or localize faults. When no reference model exists, it uses residual-path perturbation and loss-sensitivity profiling to infer the fault-impacted region directly from the corrupted model. In both settings, the framework not only enables effective fault diagnosis but also supports lightweight performance recovery without fine-tuning, offering a practical path to restoring corrupted models. Together, these capabilities make BitFlipScope an important step toward trustworthy, fault-resilient LLM deployment in hardware-prone and adversarial environments.
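
As a small illustration of why a single bit flip can matter, the sketch below flips one bit in the IEEE 754 single-precision encoding of a weight. It only demonstrates the fault model; it is not part of BitFlipScope, and the helper name is hypothetical.

```python
import struct

def flip_bit(value, bit):
    """Flip one bit (0 = least significant mantissa bit, 31 = sign bit)
    in the float32 encoding of `value` and return the resulting float."""
    (raw,) = struct.unpack("<I", struct.pack("<f", value))
    (flipped,) = struct.unpack("<f", struct.pack("<I", raw ^ (1 << bit)))
    return flipped

weight = 0.5
low_bit = flip_bit(weight, 0)    # tiny change: 0.5 + 2**-24
exp_bit = flip_bit(weight, 30)   # top exponent bit: 0.5 becomes 2**127
sign_bit = flip_bit(weight, 31)  # sign bit: 0.5 becomes -0.5
```

A flip in the low mantissa bits is nearly invisible, while a flip in the top exponent bit changes the weight by dozens of orders of magnitude.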

2512.10159 2026-04-17 cs.CY cs.AI cs.HC

Enhancing Large Language Model-Based Systems for End-to-End Circuit Analysis Problem Solving

Liangliang Chen, Weiyu Sun, Huiru Xie, Yongnuo Cai, Ying Zhang

Comments Liangliang Chen and Weiyu Sun contributed equally to this work

Abstract

LLMs have demonstrated strong performance in data-rich domains such as programming, yet their reliability in engineering tasks remains limited. Circuit analysis--requiring multimodal understanding and precise mathematical reasoning--highlights these challenges. Although Gemini 2.5 Pro shows improved capabilities in diagram interpretation and analog-circuit reasoning, it still struggles to consistently produce correct solutions when given both textual problem descriptions and circuit diagrams. Meanwhile, engineering education demands scalable AI tools capable of generating accurate solutions for applications such as automated homework feedback. This paper presents an enhanced end-to-end circuit problem-solving framework built upon Gemini. We first conduct a systematic benchmark on undergraduate circuit problems and identify two key failure modes: 1) circuit-recognition hallucinations, particularly incorrect source polarity detection, and 2) reasoning-process hallucinations, such as incorrect current direction assumptions. To address recognition errors, we integrate a fine-tuned YOLO detector and OpenCV-based processing to isolate voltage and current sources, enabling Gemini to accurately re-identify source polarities from cropped images. To mitigate reasoning errors, we introduce an ngspice-driven verification loop, in which simulation discrepancies trigger iterative solution refinement with optional HITL feedback. Experimental results demonstrate that the proposed pipeline achieves 97.59% accuracy, substantially outperforming Gemini's baseline of 79.52%. Furthermore, on four variations of hand-drawn circuit diagrams, accuracy improves from 56.06%--71.21% to 93.94%--95.45% with statistically significant gains. These results highlight the robustness, scalability, and practical applicability of the proposed framework for engineering education and real-world circuit analysis tasks.

2512.05024 2026-04-17 stat.ME cs.AI cs.LG

Model-Free Assessment of Simulator Fidelity via Quantile Curves

Garud Iyengar, Yu-Shiou Willy Lin, Kaizheng Wang

Comments 39 pages, 15 figures

Abstract

As generative AI models are increasingly used to simulate real-world systems, quantifying the "sim-to-real" gap is critical. For each input setting of interest -- which we call a \emph{scenario}, such as a survey question or operating condition -- the real and simulated systems are associated with unobserved latent population parameters, and their discrepancy varies across scenarios. A fundamental challenge is that, for any given scenario, this discrepancy cannot be observed directly, since both systems are accessible only through finite samples, often of heterogeneous sizes across scenarios. Standard predictive inference methods are therefore ill-suited, as they quantify uncertainty in observable outputs rather than latent population parameters. To address this, we construct confidence sets for these latent parameters and use them to derive a robust proxy for the sim-to-real discrepancy. We then estimate the quantile function of this proxy to obtain a distribution-level risk profile of the simulator, which supports a broad range of statistical summaries, including statistical inference for the real output distribution in a new scenario, the calculation of risk measures like Conditional Value-at-Risk (CVaR), and principled comparisons across simulators. Our method is model-agnostic and handles general output spaces, such as categorical survey responses and continuous multi-dimensional data. We demonstrate the practical utility of this method by evaluating the alignment of four major LLMs with human populations on the WorldValueBench dataset.
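
For readers unfamiliar with the risk measures named above, here is a minimal sketch of an empirical quantile and an upper-tail CVaR computed from per-scenario discrepancy values. The values are made up, the estimator is one common empirical variant, and the paper's confidence-set construction of the proxy is not reproduced.

```python
import math

def empirical_quantile(values, level):
    """Smallest sample value v such that at least a `level` fraction
    of the samples are <= v."""
    ordered = sorted(values)
    index = max(0, math.ceil(level * len(ordered)) - 1)
    return ordered[index]

def cvar(values, level):
    """Average of the samples at or above the `level` quantile,
    i.e. the mean of the worst-case tail."""
    threshold = empirical_quantile(values, level)
    tail = [v for v in values if v >= threshold]
    return sum(tail) / len(tail)

# Hypothetical discrepancy proxies, one per scenario.
discrepancies = [0.01, 0.02, 0.03, 0.04, 0.05, 0.06, 0.07, 0.08]
q75 = empirical_quantile(discrepancies, 0.75)
tail_risk = cvar(discrepancies, 0.75)
```

Here the 0.75 quantile is 0.06 and the CVaR averages the three largest discrepancies, summarizing how bad the simulator is in its worst quarter of scenarios.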

2511.10909 2026-04-17 cs.AR cs.LG cs.NA math.NA

Bit-Accurate Modeling of GPU Matrix Multiply-Accumulate Units: Demystifying Numerical Discrepancy and Accuracy

Peichen Xie, Shuotao Xu, Yang Wang, Fan Yang, Mao Yang

Abstract

Modern AI accelerators rely on matrix multiply-accumulate units (MMAUs), such as NVIDIA Tensor Cores and AMD Matrix Cores, to accelerate deep neural network workloads. MMAUs expose only instruction-level or API-level interfaces of matrix multiply-accumulate (MMA) operations, while leaving internal floating-point arithmetic behaviors undocumented. Consequently, MMAUs across vendors and architectural generations often produce numerical discrepancies for identical inputs, and sometimes exhibit reduced numerical accuracy that can cause training instability. Diagnosing and understanding the root causes of these effects is challenging without white-box models of their arithmetic behaviors. This paper proposes closed-loop feature probing (CLFP), a generic and systematic framework for constructing complete arithmetic behavior models of MMA operations. Based on this framework, we analyze all MMA instructions on ten GPU architectures spanning from NVIDIA Volta to RTX Blackwell and from AMD CDNA1 to CDNA3, and derive the first bit-accurate arithmetic models for these MMAUs. Our models explain previously observed cross-platform numerical discrepancies and accuracy issues, enable white-box numerical error analysis, reveal four precision bottleneck designs and one numerical asymmetry design that significantly affect numerical accuracy, and provide software workarounds as well as design guidance for future MMAUs. This work is open-source on https://github.com/microsoft/MMA-Sim .

2509.02571 2026-04-17 eess.AS cs.AI cs.LG cs.SD eess.SP

Gaussian Process Regression of Steering Vectors With Physics-Aware Deep Composite Kernels for Augmented Listening

Diego Di Carlo, Shoichi Koyama, Nugraha Aditya Arie, Fontaine Mathieu, Bando Yoshiaki, Yoshii Kazuyoshi

Abstract

This paper investigates continuous representations of steering vectors over frequency and microphone/source positions for augmented listening (e.g., spatial filtering and binaural rendering), enabling user-parameterized control of the reproduced sound field. Steering vectors have typically been used for representing the spatial response of a microphone array as a function of the look-up direction. The basic algebraic representation of these quantities assuming an idealized environment cannot deal with the scattering effect of the sound field. One may thus collect a discrete set of real steering vectors measured in dedicated facilities and super-resolve (i.e., upsample) them. Recently, physics-aware deep learning methods have been effectively used for this purpose. Such deterministic super-resolution, however, suffers from the overfitting problem due to the non-uniform uncertainty over the measurement space. To solve this problem, we integrate an expressive representation based on the neural field (NF) into the principled probabilistic framework based on the Gaussian process (GP). Specifically, we propose a physics-aware composite kernel that models the directional incoming waves and the subsequent scattering effect. Our comprehensive comparative experiment showed the effectiveness of the proposed method under data insufficiency conditions. In downstream tasks such as speech enhancement and binaural rendering using the simulated data of the SPEAR challenge, the oracle performances were attained with less than ten times fewer measurements.

2506.23334 2026-04-17 eess.IV cs.AI cs.CV

Federated Breast Cancer Detection Enhanced by Synthetic Ultrasound Image Augmentation

Hongyi Pan, Ziliang Hong, Gorkem Durak, Ziyue Xu, Ulas Bagci

Comments This paper has been accepted to EMBC2026

Abstract

Federated learning enables collaborative training of deep learning models across institutions without sharing sensitive patient data. However, its performance is often limited by small datasets and non-independent, identically distributed data, which can impair model generalization. In this work, we propose a generative model-based data augmentation framework for breast ultrasound classification. It leverages synthetic images generated by deep convolutional generative adversarial networks and a class-conditioned denoising diffusion probabilistic model. Experiments on three publicly available datasets (BUSI, BUS-BRA, and UDIAT) demonstrated that incorporating a suitable number of synthetic images improved average AUC from 0.9206 to 0.9362 for FedAvg and from 0.9429 to 0.9574 for FedProx. Furthermore, we noticed that excessive use of synthetic data reduced performance. This highlights the importance of balancing real and synthetic samples. Our results underscore the potential of generative model-based augmentation to enhance federated breast ultrasound image classification.

2506.14844 2026-04-17 eess.IV cs.CV cs.LG

Improving Prostate Gland Segmentation Using Transformer based Architectures

Shatha Abudalou, Yasin Yilmaz, Yoganand Balagurunathan

Abstract

Inter-reader variability and cross-site domain shift challenge the automatic segmentation of prostate anatomy using T2-weighted MRI images. This study investigates whether transformer models can retain precision amid such heterogeneity. We compare the performance of UNETR and SwinUNETR in prostate gland segmentation against our previous 3D UNet model [1], based on 546 T2-weighted MRI volumes annotated by two independent experts. Three training strategies were analyzed: single-cohort dataset, 5-fold cross-validated mixed cohort, and gland-size-based dataset. Hyperparameters were tuned with Optuna. The test set, from an independent population of readers, served as the evaluation endpoint (Dice Similarity Coefficient). In single-reader training, SwinUNETR achieved an average Dice score of 0.816 for Reader #1 and 0.860 for Reader #2, while UNETR scored 0.8 and 0.833 for Readers #1 and #2, respectively, compared to the baseline UNet's 0.825 for Reader #1 and 0.851 for Reader #2. SwinUNETR had an average Dice score of 0.8583 for Reader #1 and 0.867 for Reader #2 in cross-validated mixed training. For the gland-size-based dataset, SwinUNETR achieved an average Dice score of 0.902 for the Reader #1 subset and 0.894 for Reader #2, using the five-fold mixed training strategy (Reader #1, n=53; Reader #2, n=87) on the larger gland-size-based subsets, where UNETR performed poorly. Our findings demonstrate that global and shifted-window self-attention effectively reduces label noise and class imbalance sensitivity, resulting in improvements in the Dice score over CNNs by up to five points while maintaining computational efficiency. This contributes to the high robustness of SwinUNETR for clinical deployment.

2506.13408 2026-04-17 eess.SP cs.LG cs.NI

HELENA: High-Efficiency Learning-based channel Estimation using dual Neural Attention

Miguel Camelo Botero, Esra Aycan Beyazit, Nina Slamnik-Kriještorac, Johann M. Marquez-Barja

Abstract

Accurate channel estimation is critical for high-performance Orthogonal Frequency-Division Multiplexing systems such as 5G New Radio, particularly under low signal-to-noise ratio and stringent latency constraints. This letter presents HELENA, a compact deep learning model that combines a lightweight convolutional backbone with two efficient attention mechanisms: patch-wise multi-head self-attention for capturing global dependencies and a squeeze-and-excitation block for local feature refinement. Compared to CEViT, a state-of-the-art vision transformer-based estimator, HELENA reduces inference time by 45.0% (0.175 ms vs. 0.318 ms), achieves comparable accuracy ($-16.78$ dB vs. $-17.30$ dB), and requires $8\times$ fewer parameters (0.11M vs. 0.88M), demonstrating its suitability for low-latency, real-time deployment.

2506.13139 2026-04-17 stat.ML cs.LG

Random Matrix Theory for Deep Learning: Beyond Eigenvalues of Linear Models

Zhenyu Liao, Michael W. Mahoney

Comments 30 pages, 6 figures, to appear in IEEE Signal Processing Magazine

Abstract

Modern Machine Learning (ML) and Deep Neural Networks (DNNs) often operate on high-dimensional data and rely on overparameterized models, where classical low-dimensional intuitions break down. In particular, the proportional regime where the data dimension, sample size, and number of model parameters are all large and comparable, gives rise to novel and sometimes counterintuitive behaviors. This paper extends traditional Random Matrix Theory (RMT) beyond eigenvalue-based analysis of linear models to address the challenges posed by nonlinear ML models such as DNNs in this regime. We introduce the concept of High-dimensional Equivalent, which unifies and generalizes both Deterministic Equivalent and Linear Equivalent, to systematically address three technical challenges: high dimensionality, nonlinearity, and the need to analyze generic eigenspectral functionals. Leveraging this framework, we provide precise characterizations of the training and generalization performance of linear models, nonlinear shallow networks, and deep networks. Our results capture rich phenomena, including scaling laws, double descent, and nonlinear learning dynamics, offering a unified perspective on the theoretical understanding of deep learning in high dimensions.

2506.11251 2026-04-17 stat.ME cs.AI cs.LG

Measuring multi-calibration

Ido Guy, Daniel Haimovich, Fridolin Linder, Nastaran Okati, Lorenzo Perini, Niek Tax, Mark Tygert

Comments 25 pages, 12 tables

Abstract

A suitable scalar metric can help measure multi-calibration, defined as follows. When the expected values of observed responses are equal to corresponding predicted probabilities, the probabilistic predictions are known as "perfectly calibrated." When the predicted probabilities are perfectly calibrated simultaneously across several subpopulations, the probabilistic predictions are known as "perfectly multi-calibrated." In practice, predicted probabilities are seldom perfectly multi-calibrated, so a statistic measuring the distance from perfect multi-calibration is informative. A recently proposed metric for calibration, based on the classical Kuiper statistic, is a natural basis for a new metric of multi-calibration and avoids well-known problems of metrics based on binning or kernel density estimation. The newly proposed metric weights the contributions of different subpopulations in proportion to their signal-to-noise ratios; data analyses' ablations demonstrate that the metric becomes noisy when omitting the signal-to-noise ratios from the metric. Numerical examples on benchmark data sets illustrate the new metric.
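
As a rough illustration of the binning-free idea, the sketch below computes a Kuiper-type statistic for a single population: sort by predicted probability, accumulate the differences between outcomes and predictions, and report the range of that cumulative sum. This is one common form of the statistic and is offered as an assumption; the paper's exact normalization and its signal-to-noise weighting across subpopulations are not reproduced here.

```python
def kuiper_calibration(probs, outcomes):
    """Range (max minus min) of the cumulative differences between observed
    outcomes and predicted probabilities, ordered by predicted probability.
    Ties in `probs` keep their input order, which can affect the result."""
    pairs = sorted(zip(probs, outcomes), key=lambda pair: pair[0])
    n = len(pairs)
    cumulative = 0.0
    highest = lowest = 0.0  # the cumulative sum starts at zero
    for p, y in pairs:
        cumulative += (y - p) / n
        highest = max(highest, cumulative)
        lowest = min(lowest, cumulative)
    return highest - lowest

# Hypothetical predictions: the first set is reasonable, the second too high.
reasonable = kuiper_calibration([0.1, 0.4, 0.6, 0.9], [0, 0, 1, 1])
overconfident = kuiper_calibration([0.6, 0.7, 0.8, 0.9], [0, 0, 0, 1])
```

The second set of predictions systematically overestimates the outcome rate, so its cumulative differences drift further from zero and the statistic is larger.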

2506.03487 2026-04-17 cs.IR cs.CL

ProRank: Prompt Warmup via Reinforcement Learning for Small Language Models Reranking

Xianming Li, Aamir Shakir, Rui Huang, Tsz-fung Andrew Lee, Julius Lipp, Benjamin Clavié, Jing Li

Comments Accepted by ACL2026 Findings

Abstract

Reranking is fundamental to information retrieval and retrieval-augmented generation, with recent Large Language Models (LLMs) significantly advancing reranking quality. Most current works rely on large-scale LLMs (>7B parameters), presenting high computational costs. Small Language Models (SLMs) offer a promising alternative because of computational efficiency. However, our preliminary quantitative analysis reveals key limitations of SLMs: their representation space is narrow, leading to reduced expressiveness, and they struggle with understanding task prompts without fine-tuning. To address these issues, we introduce a novel two-stage training approach, ProRank, for SLM-based document reranking. We propose using reinforcement learning to improve the understanding of task prompts. Additionally, we introduce fine-grained score learning to enhance representation expressiveness and further improve document reranking quality. Extensive experiments suggest that ProRank consistently outperforms both the most advanced open-source and proprietary reranking models. Notably, our 0.5B ProRank even surpasses powerful LLM reranking models on the BEIR benchmark, establishing that properly trained SLMs can achieve superior document reranking performance while maintaining computational efficiency.

2505.14838 2026-04-17 cs.DL cs.AI

In-depth Research Impact Summarization through Fine-Grained Temporal Citation Analysis

Hiba Arnaout, Noy Sternlicht, Tom Hope, Iryna Gurevych

Journal ref
ACL 2026
Abstract

Understanding the impact of scientific publications is crucial for identifying breakthroughs and guiding future research. Traditional metrics based on citation counts often miss the nuanced ways a paper contributes to its field. In this work, we propose a new task: generating nuanced, expressive, and time-aware impact summaries that capture both praise (confirmation citations) and critique (correction citations) through the evolution of fine-grained citation intents. We introduce an evaluation framework tailored to this task, showing moderate to strong human correlation on subjective metrics such as insightfulness. Expert feedback from professors reveals a strong interest in these summaries and suggests future improvements. Data and code are made available.

2503.21432 2026-04-17 hep-ph cs.LG hep-th

Exploring the flavor structure of leptons via diffusion models

Satsuki Nishimura, Hajime Otsuka, Haruki Uchiyama

Comments 23 pages, 5 figures, v2: published version

Journal ref
Phys. Rev. D 113, 055030 (2026)
Abstract

We propose a method to explore the flavor structure of leptons using diffusion models, a class of generative artificial intelligence (generative AI). We consider a simple extension of the Standard Model with the type I seesaw mechanism and train a neural network to generate the neutrino mass matrix. By utilizing transfer learning, the diffusion model generates $10^4$ solutions that are consistent with the neutrino mass squared differences and the leptonic mixing angles. The distributions of the CP phases and the sums of neutrino masses, which are not included in the conditional labels but are calculated from the solutions, exhibit non-trivial tendencies. In addition, the effective mass in neutrinoless double beta decay is concentrated near the boundaries of the existing confidence intervals, allowing us to verify the obtained solutions through future experiments. An inverse approach using the diffusion model is expected to facilitate the experimental verification of flavor models from a perspective distinct from conventional analytical methods.

2410.05882 2026-04-17 eess.IV cs.CV cs.LG cs.NE

Frame forecasting in cine MRI using the PCA respiratory motion model: comparing recurrent neural networks trained online and transformers

Michel Pohl, Mitsuru Uesaka, Hiroyuki Takahashi, Kazuyuki Demachi, Ritu Bhusal Chhatkuli

Comments 43 pages, 19 figures. Revised version with minor corrections and improved figures and language. Accepted for publication in Computerized Medical Imaging and Graphics

Abstract

Respiratory motion complicates accurate irradiation of thoraco-abdominal tumors during radiotherapy, as treatment-system latency entails target-location uncertainties. This work addresses frame forecasting in chest and liver cine MRI to compensate for such delays. We investigate RNNs trained with online learning algorithms, enabling adaptation to changing respiratory patterns via on-the-fly parameter updates, and transformers, increasingly common in time-series forecasting for their ability to capture long-term dependencies. Experiments used 12 sagittal thoracic and upper-abdominal cine-MRI sequences from ETH Zürich and OvGU; the OvGU data exhibited higher motion variability, noise, and lower contrast. PCA decomposes the Lucas-Kanade optical-flow field into static deformation modes and low-dimensional, time-dependent weights. We compare various methods for forecasting these weights: linear filters, population and sequence-specific transformer encoders, and RNNs trained with real-time recurrent learning (RTRL), unbiased online recurrent optimization, decoupled neural interfaces, and sparse one-step approximation (SnAp-1). Predicted displacements were used to warp the reference frame and generate future images. Prediction accuracy decreased with the horizon h. Linear regression performed best at short horizons (1.3mm geometrical error at h=0.32s, ETH Zürich dataset), while RTRL and SnAp-1 outperformed the other algorithms at medium-to-long horizons, with geometrical errors below 1.4mm and 2.8mm on the sequences from ETH Zürich and OvGU, respectively. The sequence-specific transformer was competitive for low-to-medium horizons, but transformers remained overall limited by data scarcity and domain shift between datasets. Predicted frames visually resembled the ground truth, with notable errors occurring near the diaphragm at end-inspiration and regions affected by out-of-plane motion.

2304.02594 2026-04-17 cs.NE cs.AI q-bio.NC

Competitive plasticity to reduce the energetic costs of learning

Mark CW van Rossum

Abstract

The brain is constrained not only by the energy needed to fuel computation but also by the energy needed to form memories. Experiments have shown that learning even simple conditioning tasks carries a significant metabolic cost. Yet learning a task like MNIST to 95% accuracy appears to require at least $10^{8}$ synaptic updates. The brain has therefore likely evolved to learn using as little energy as possible. We explored the energy required for learning in feedforward neural networks. Based on a parsimonious energy model, we propose two plasticity-restricting algorithms that save energy: 1) only modify synapses with large updates, and 2) restrict plasticity to subsets of synapses that form a path through the network. Combining these two methods leads to substantial energy savings while incurring only a small increase in learning time. Biological networks are often much larger than the task requires, and in that case the savings are particularly large. Thus competitively restricting plasticity helps to save the metabolic energy associated with synaptic plasticity. The results might lead to a better understanding of biological plasticity and a better match between artificial and biological learning. Moreover, the algorithms might also benefit hardware, because memory storage is energetically costly in electronics as well.
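A minimal sketch of the first restriction (only modify synapses with large updates) might look like the following. The abstract does not specify the energy model, so the fixed cost per modified synapse used here is an assumption, and the names `restrict_updates`, `modified_synapses`, and `keep_fraction` are hypothetical.

```python
def restrict_updates(updates, keep_fraction):
    """Keep only the largest-magnitude proposed updates; zero the rest.

    `updates` is a flat list of proposed weight changes and `keep_fraction`
    is the share of synapses allowed to change (both hypothetical names).
    """
    n_keep = max(1, round(keep_fraction * len(updates)))
    # Indices ordered from largest to smallest update magnitude.
    order = sorted(range(len(updates)), key=lambda i: abs(updates[i]), reverse=True)
    keep = set(order[:n_keep])
    return [u if i in keep else 0.0 for i, u in enumerate(updates)]


def modified_synapses(updates):
    """Assumed energy proxy: one unit of cost per synapse actually changed."""
    return sum(1 for u in updates if u != 0.0)


proposed = [0.5, -0.01, 0.02, -0.4]
restricted = restrict_updates(proposed, keep_fraction=0.5)
```

In this toy case the two small updates are suppressed, so half as many synapses are modified while the two dominant changes are applied unchanged.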

2604.15313 2026-04-17 astro-ph.CO cond-mat.dis-nn gr-qc

Gravitational-wave lensing beyond rays: a disordered-system approach

Ripalta Amoruso, Ginevra Braga, Alice Garoffolo, Francescopaolo Lopez, Nicola Bartolo, Sabino Matarrese

Comments 26 pages, plus technical appendices, 2 figures

Abstract

We develop a framework to describe gravitational-wave propagation through a stochastic distribution of weak gravitational lenses beyond the geometric-optics limit. We model the lens distribution as a static random background field and formulate the problem in the language of quenched disorder, treating the disorder-averaged density matrix as the fundamental object from which observables are computed. Using the Schwinger-Keldysh formalism, we construct a path-integral representation of the averaged density matrix and derive its explicit form perturbatively for a suitable class of couplings. The result naturally separates into a quadratic exponential term, which governs the suppression of phase-sensitive contributions in the averaged description, and a purely oscillatory contribution, which modifies coherent propagation through a disorder-induced correction to the propagation kernel. This provides a unified description of interference, diffraction, and statistical fluctuations of the lens distribution within a single framework. We also identify the physical scales controlling the onset of coherence loss and illustrate the formalism in the case of Gaussian wave packets. More generally, the derivation applies to any system described by the same class of actions, making the framework relevant beyond gravitational-wave lensing to wave propagation in disordered media.

2604.15307 2026-04-17 quant-ph cs.IT math.IT

Heuristic Search for Minimum-Distance Upper-Bound Witnesses in Quantum APM-LDPC Codes

Kenta Kasai

Abstract

This paper investigates certified upper bounds on the minimum distance of an explicit family of Calderbank-Shor-Steane quantum LDPC codes constructed from affine permutation matrices. All codes considered here have active Tanner graphs of girth eight. Rather than attempting to prove a general lower bound for the full code distance, we focus on constructing low-weight non-stabilizer logical representatives, which yield valid upper bounds once they are verified to lie in the opposite parity-check kernel and outside the stabilizer row space. We develop a unified framework for such witnesses arising from latent row relations, restricted-lift subspaces including block-compressed, selected-fiber, and CRT-stripe constructions, cycle-8 elementary trapping-set structures, and decoder-failure residuals. In every case, search is used only to generate candidates; the reported bounds begin only after explicit kernel and row-space exclusion tests have been passed. For the latent part, we also identify a block-compression criterion under which the certification becomes exact. Applying these methods to representative APM-LDPC codes sharpens previously reported upper bounds and provides concrete certified values across the explored parameter range.
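The two certification tests named above (membership in the opposite parity-check kernel, and exclusion from the stabilizer row space) can be illustrated over GF(2) with a short sketch. This is not the paper's code: the function names are hypothetical, binary vectors are stored as Python integers used as bitmasks, and the toy code is the small $[[4,2,2]]$ CSS code rather than an APM-LDPC code.

```python
def gf2_reduce(vec, basis):
    """Reduce vec against a basis whose elements have distinct leading bits."""
    for b in basis:
        vec = min(vec, vec ^ b)  # XOR with b only if that clears b's leading bit
    return vec


def gf2_basis(rows):
    """Build a basis of the GF(2) row space of the given bitmask rows."""
    basis = []
    for r in rows:
        r = gf2_reduce(r, basis)
        if r:
            basis.append(r)
    return basis


def in_kernel(vec, checks):
    """True if vec has even overlap with every check row."""
    return all(bin(row & vec).count("1") % 2 == 0 for row in checks)


def certify_witness(vec, h_same, h_opposite):
    """Return the weight of vec if it is a certified non-stabilizer logical
    representative, otherwise None. For an X-type candidate, h_same is H_X
    and h_opposite is H_Z."""
    if vec == 0 or not in_kernel(vec, h_opposite):
        return None
    if gf2_reduce(vec, gf2_basis(h_same)) == 0:
        return None  # lies in the stabilizer row space
    return bin(vec).count("1")


# Toy [[4,2,2]] code: a single weight-4 check of each type.
H_X = [0b1111]
H_Z = [0b1111]
bound = certify_witness(0b1100, H_X, H_Z)
```

Here the candidate `0b1100` passes both tests, so its weight 2 is a valid upper bound on the distance of the toy code; the all-ones vector is rejected as a stabilizer, and a weight-1 vector is rejected by the kernel test.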

2604.15305 2026-04-17 math.CO math.MG

Erdős's diameter conjecture for separated distances fails in high dimensions

Boon Suan Ho

Comments 6 pages

Abstract

Erdős asked whether every $n$-point set in Euclidean space whose $\binom{n}{2}$ pairwise distances are mutually at least $1$ apart must have diameter at least $(1+o(1))n^2$. We disprove this statement by constructing for every prime power $q$ a set $\mathcal X_q\subset \mathbb R^{q^2+q}$ of $n=q+1$ points such that all pairwise distances in $\mathcal X_q$ are mutually at least $1$ apart, while $$\operatorname{diam}(\mathcal X_q)\le\Bigl(1-\frac{1}{\pi^2}+o(1)\Bigr)n^2.$$ The proof is fully formalized in Lean 4.

2604.15300 2026-04-17 quant-ph math-ph math.MP

Ensembles of random quantum states tunable from volume law to area law

Héloïse Albot, Sebastian Paeckel

Comments 23 pages, 6 figures

Abstract

A standard approach to generating random pure quantum states relies on sampling from the Haar measure. However, the entanglement properties of such states present a fundamental challenge for their general applicability. Here, we introduce the $\sigma$-ensembles, a family of random quantum states with only a single control parameter. Crucially, these states are designed such that they can be tuned between volume-law and area-law behavior, which has been a major obstacle thus far. We construct representatives of this ensemble by imposing a probability distribution on the eigenvalues of the successive subsystems, and subsequently reconstructing a compatible global state using the matrix product state (MPS) formalism. Due to their area-law entanglement, our approach circumvents the intractability of Haar-random pure states in classical simulations of quantum systems and is more representative of typical Hamiltonian ground states.

2604.15298 2026-04-17 quant-ph cs.DS

Super-Constant Weight Dicke States in Constant Depth Without Fanout

Lucas Gretta, Meghal Gupta, Malvika Raj Joshi

Abstract

An $n$-qubit Dicke state of weight $k$ is the uniform superposition over all $n$-bit strings of Hamming weight $k$. Dicke states are an entanglement resource with important practical applications in the NISQ era and, for instance, play a central role in Decoded Quantum Interferometry (DQI). Furthermore, any symmetric state can be expressed as a superposition of Dicke states. First, we give explicit constant-depth circuits that prepare $n$-qubit Dicke states for all $k \leq \text{polylog}(n)$, using only multi-qubit Toffoli gates and single-qubit unitaries. This gives the first $\text{QAC}^0$ construction of super-constant weight Dicke states. Previous constant-depth constructions for any super-constant $k$ required the FANOUT$_n$ gate, while $\text{QAC}^0$ is only known to implement FANOUT$_k$ for $k$ up to $\text{polylog}(n)$. Moreover, we show that any weight-$k$ Dicke state can be constructed with access to FANOUT$_{\min(k,n-k)}$, rather than FANOUT$_n$. Combined with recent hardness results, this yields a tight characterization: for $k \leq n/2$, weight-$k$ Dicke states can be prepared in $\text{QAC}^0$ if and only if FANOUT$_k \in \text{QAC}^0$. We further extend our techniques to show that, in fact, any superposition of $n$-qubit Dicke states of weight at most $k$ can be prepared in $\text{QAC}^0$ with access to FANOUT$_k$. Taking $k = n$, we obtain the first $O(1)$-depth unitary construction for arbitrary symmetric states. In particular, any symmetric state can be prepared in constant depth on quantum hardware architectures that support FANOUT$_n$, such as trapped ions with native global entangling operations.
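To make the definition in the first sentence concrete, the sketch below enumerates the amplitudes of a small Dicke state classically. It illustrates the definition only, not the constant-depth circuits from the paper, and the function name `dicke_state` is a hypothetical choice.

```python
from itertools import combinations
from math import comb, sqrt


def dicke_state(n, k):
    """Map each n-bit string of Hamming weight k to its (uniform) amplitude."""
    amplitude = 1 / sqrt(comb(n, k))
    state = {}
    for ones in combinations(range(n), k):
        # Place a 1 at each chosen position and a 0 elsewhere.
        bits = "".join("1" if i in ones else "0" for i in range(n))
        state[bits] = amplitude
    return state


state = dicke_state(4, 2)
```

For $n = 4$ and $k = 2$ this yields the six strings with exactly two ones, each with amplitude $1/\sqrt{6}$, so the squared amplitudes sum to one.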

2604.15296 2026-04-17 q-bio.CB

Unity and Diversity of Intracellular pH Maintenance Mechanisms

Guillaume Terradot, Vincent Danos

Comments 45 pages, 16 Figures

Abstract

All cells must sustain ionic motive forces (IMFs) -- the electrochemical gradients of permeant ions, together with the membrane potential they produce -- to regulate intracellular pH, drive secondary transport, and power ATP synthesis. Because membranes are imperfectly impermeable, IMFs continuously dissipate through passive leakage, and active transport must compensate at an energetic cost that competes with growth and biosynthesis. How environmental conditions set this cost, and why cells across the tree of life share a common ionic logic yet deploy strikingly diverse transporter repertoires, has lacked a unifying quantitative account. Here we derive a thermodynamic lower bound on the power required to maintain IMFs at steady state. The bound equals the rate of free-energy dissipation by ion leakage, holds across a broad family of electrophysiological models, and is independent of organism, energy source, or transporter architecture. Cost minimization recovers, from first principles, the universal K+-rich, Na+-poor cytoplasm observed across taxa: asymmetric membrane permeabilities alone are sufficient to explain it. The same framework predicts that extremophiles face higher maintenance costs under extreme pH, salinity, and temperature, and that when sustaining a large proton motive force becomes prohibitive, cells should shift to metabolic regimes compatible with smaller PMF, providing a thermodynamic rationale for stress-induced metabolic reconfiguration. Finally, we show that perfect energetic efficiency is unattainable in practice, and that this very imperfection, combined with environmental variability, selects for the diversity of transport architectures observed in nature: each architecture is optimal within a discrete regime of environmental constraints.

2604.15295 2026-04-17 cs.IT math.IT

Reed--Muller Codes Achieve the Symmetric Capacity on Finite-State Channels

Henry D. Pfister, Navin Kashyap, Jean-Francois Chamberland, Galen Reeves

Comments 14 pages, extended version of paper accepted to ISIT 2026

Abstract

We study reliable communication over finite-state channels (FSCs) using Reed--Muller (RM) codes. Building on recent symmetry-based analyses for memoryless channels, we show that a sequence of binary RM codes (with some random scrambling) can achieve the symmetric capacity (or uniform-input information rate) of a binary-input indecomposable FSC. Our approach has three components. First, we establish a capacity-via-symmetry theorem for doubly-transitive group codes on discrete memoryless channels (DMCs) with non-binary inputs, under some symmetry and puncturing conditions. Then, we reduce a binary-input FSC to an almost memoryless non-binary channel by grouping adjacent input bits into blocks and interleaving non-binary codes onto the channel. Finally, we show that the interleaved non-binary codes can be constructed from a single binary RM code.

2604.15293 2026-04-17 math.DS math.NT

Borel--Bernstein and Hirst-type Theorems for Nearest-Integer Complex Continued Fractions over Euclidean Imaginary Quadratic Fields

Kangrae Park

Comments 28 pages

Abstract

For each $d \in \{1,2,3,7,11\}$, let $T_d$ be the nearest-integer complex continued fraction map associated with the Euclidean ring $\mathcal{O}_d$, and let $(a_n)$ be its digit sequence. We prove two metric results for this five-system family. First, for every sequence $(u_n)_{n\ge 1}$ with $u_n \ge 1$, the set of points for which $|a_n| \ge u_n$ for infinitely many $n$ has full or zero normalized Lebesgue measure according as $\sum_{n=1}^\infty u_n^{-2}$ diverges or converges. This gives a unified Borel--Bernstein theorem, extending the Hurwitz case $d=1$ to all five Euclidean imaginary quadratic fields. Second, for any infinite set $S \subset \mathcal{O}_d$, if $\tau(S)$ denotes its convergence exponent, then the digit-restricted set $F_d(S)=\{z:\ a_n(z)\in S\ \text{for all } n,\ |a_n(z)|\to\infty\}$ satisfies $\dim_H F_d(S)=\tau(S)/2$. More generally, for any cutoff function $f(n)\to\infty$, the set $F_d(S,f)=\{z\in F_d(S):\ |a_n(z)|\le f(n)\ \text{for all } n\}$ is either empty or has the same Hausdorff dimension $\tau(S)/2$. The proof combines quantitative ergodic properties of the nearest-integer systems with a large-digit conformal iterated function subsystem that is $2$-decaying. We also obtain applications to sparse patterns, shrinking targets, and almost-sure Lévy- and Khinchine-type laws.
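For readers unfamiliar with the digit sequence $(a_n)$, here is a small sketch for the Hurwitz case $d=1$ only, where $\mathcal{O}_1$ is the ring of Gaussian integers and the nearest integer is found by rounding the real and imaginary parts separately. It assumes the usual convention $T(z) = 1/z - [1/z]$ with $a_n = [1/T^{n-1}z]$; the function name, the exact rational arithmetic, and the tie-breaking rule (Python's round-half-to-even) are choices made for this illustration, and the other values of $d$ would need a different rounding lattice.

```python
from fractions import Fraction


def hurwitz_digits(re, im, max_digits=20):
    """Nearest-integer continued fraction digits of z = re + i*im over Z[i].

    Returns a list of (real, imag) integer pairs. Exact when re and im are
    Fractions; stops early if the remainder becomes zero.
    """
    digits = []
    for _ in range(max_digits):
        if re == 0 and im == 0:
            break
        norm = re * re + im * im
        # 1/z = conj(z) / |z|^2
        inv_re, inv_im = re / norm, -im / norm
        # Nearest Gaussian integer: round each coordinate.
        a_re, a_im = round(inv_re), round(inv_im)
        digits.append((a_re, a_im))
        # Apply the map: subtract the digit and keep the remainder.
        re, im = inv_re - a_re, inv_im - a_im
    return digits


digits = hurwitz_digits(Fraction(31, 100), Fraction(17, 100))
```

For $z = (31 + 17i)/100$ the expansion terminates after three digits, $2 - i$, $1 + i$, and $3$, which can be checked by hand from $1/z = 2.48 - 1.36i$.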