arXivDaily: arXiv daily academic digest, updated Monday through Friday
2507.10854 2026-02-13 cs.CR cs.AI cs.LG

PhreshPhish: A Real-World, High-Quality, Large-Scale Phishing Website Dataset and Benchmark

Thomas Dalton, Hemanth Gowda, Girish Rao, Sachin Pargi, Alireza Hadj Khodabakhshi, Joseph Rombs, Stephan Jou, Manish Marwah

Abstract

Phishing remains a pervasive and growing threat, inflicting heavy economic and reputational damage. While machine learning has been effective in real-time detection of phishing attacks, progress is hindered by a lack of large, high-quality datasets and benchmarks. In addition to poor quality stemming from challenges in data collection, existing datasets suffer from leakage and unrealistic base rates, leading to overly optimistic performance results. In this paper, we introduce PhreshPhish, a large-scale, high-quality dataset of phishing websites that addresses these limitations. Compared to existing public datasets, PhreshPhish is substantially larger and provides significantly higher quality, as measured by the estimated rate of invalid or mislabeled data points. Additionally, we propose a comprehensive suite of benchmark datasets specifically designed for realistic model evaluation by minimizing leakage, increasing task difficulty, enhancing dataset diversity, and adjusting base rates to levels more likely to be seen in the real world. We train and evaluate multiple solution approaches to provide baseline performance on the benchmark sets. We believe the availability of this dataset and benchmarks will enable realistic, standardized model comparison and foster further advances in phishing detection. The datasets and benchmarks are available on Hugging Face (https://huggingface.co/datasets/phreshphish/phreshphish).
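
Illustrative note (not from the paper): the remark about unrealistic base rates can be made concrete with a small calculation. The sketch below assumes a hypothetical detector with a fixed true-positive rate and false-positive rate and applies Bayes' rule to show how its precision changes with the fraction of phishing pages in the evaluation set. The function name and the example rates are assumptions chosen for illustration only.

```python
def precision_at_base_rate(tpr: float, fpr: float, base_rate: float) -> float:
    """Precision implied by a detector's TPR and FPR at a given phishing base rate.

    Bayes' rule: P(phish | flagged) = TPR*p / (TPR*p + FPR*(1 - p)).
    """
    true_pos = tpr * base_rate            # fraction of all pages that are flagged phish
    false_pos = fpr * (1.0 - base_rate)   # fraction of all pages that are flagged benign
    return true_pos / (true_pos + false_pos)


if __name__ == "__main__":
    # Hypothetical detector: 95% TPR, 1% FPR (illustrative values, not reported results).
    for rate in (0.5, 0.05, 0.001):
        p = precision_at_base_rate(0.95, 0.01, rate)
        print(f"base rate {rate}: precision {p:.3f}")
```

The same detector looks far less precise once phishing pages are rare, which is why a balanced benchmark can overstate deployed performance.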

2507.02890 2026-02-13 stat.AP cs.LG stat.ML

Robust Short-Term OEE Forecasting in Industry 4.0 via Topological Data Analysis

Korkut Anapa, İsmail Güzel, Ceylan Yozgatlıgil

Comments 44 pages

Abstract

In Industry 4.0 manufacturing environments, forecasting Overall Equipment Efficiency (OEE) is critical for data-driven operational control and predictive maintenance. However, the highly volatile and nonlinear nature of OEE time series--particularly in complex production lines and hydraulic press systems--limits the effectiveness of forecasting. This study proposes a novel informational framework that leverages Topological Data Analysis (TDA) to transform raw OEE data into structured engineering knowledge for production management. The framework models hourly OEE data from production lines and systems using persistent homology to extract large-scale topological features that characterize intrinsic operational behaviors. These features are integrated into a SARIMAX (Seasonal Autoregressive Integrated Moving Average with Exogenous Regressors) architecture, where TDA components serve as exogenous variables to capture latent temporal structures. Experimental results demonstrate forecasting accuracy improvements of at least 17% over standard seasonal benchmarks, with Heat Kernel-based features consistently identified as the most effective predictors. The proposed framework was deployed in a Global Lighthouse Network manufacturing facility, providing a new strategic layer for production management and achieving a 7.4% improvement in total OEE. This research contributes a formal methodology for embedding topological signatures into classical stochastic models to enhance decision-making in knowledge-intensive production systems.

2504.13811 2026-02-13 cs.CR cs.LG

Can LLMs Handle WebShell Detection? Overcoming Detection Challenges with Behavioral Function-Aware Framework

Feijiang Han, Jiaming Zhang, Chuyi Deng, Jianheng Tang, Yunhuai Liu

Comments Published as a conference paper at COLM 2025 (The new version has been polished and expanded with more detailed future work ideas)

Abstract

WebShell attacks - where adversaries implant malicious scripts on web servers - remain a persistent threat. Prior machine-learning and deep-learning detectors typically depend on task-specific supervision and can be brittle under data scarcity, rapid concept drift, and out-of-distribution (OOD) deployment. Large language models (LLMs) have recently shown strong code understanding capabilities, but their reliability for WebShell detection remains unclear. We address this gap by (i) systematically evaluating seven LLMs (including GPT-4, LLaMA-3.1-70B, and Qwen-2.5 variants) against representative sequence- and graph-based baselines on 26.59K PHP scripts, and (ii) proposing Behavioral Function-Aware Detection (BFAD), a behavior-centric framework that adapts LLM inference to WebShell-specific execution patterns. BFAD anchors analysis on security-sensitive PHP functions via a Critical Function Filter, constructs compact LLM inputs with Context-Aware Code Extraction, and selects in-context demonstrations using Weighted Behavioral Function Profiling, which ranks examples by a behavior-weighted, function-level similarity. Empirically, we observe a consistent precision-recall asymmetry: larger LLMs often achieve high precision but miss attacks (lower recall), while smaller models exhibit the opposite tendency; moreover, off-the-shelf LLM prompting underperforms established detectors. BFAD substantially improves all evaluated LLMs, boosting F1 by 13.82% on average; notably, GPT-4, LLaMA-3.1-70B, and Qwen-2.5-Coder-14B exceed prior SOTA benchmarks, while Qwen-2.5-Coder-3B becomes competitive with traditional methods. Overall, our results clarify when LLMs succeed or fail on WebShell detection, provide a practical recipe, and highlight future directions for making LLM-based detection more reliable.
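
Illustrative note (not from the paper): the idea of anchoring analysis on security-sensitive PHP functions and passing only nearby code to an LLM can be sketched roughly as follows. The function list, the line-window heuristic, and all names here are assumptions made for illustration; the abstract does not specify how the Critical Function Filter or Context-Aware Code Extraction are implemented.

```python
import re

# Hypothetical list of security-sensitive PHP functions (an assumption, not the paper's list).
CRITICAL_FUNCTIONS = ("eval", "assert", "system", "exec", "shell_exec", "passthru", "base64_decode")

# Match a call such as `eval(` or `shell_exec (`; \b keeps `exec` from matching inside `shell_exec`.
_CALL_RE = re.compile(r"\b(" + "|".join(CRITICAL_FUNCTIONS) + r")\s*\(")


def extract_critical_context(php_source: str, window: int = 1) -> list[str]:
    """Return snippets of +/- `window` lines around each line that calls a listed function."""
    lines = php_source.splitlines()
    snippets = []
    for i, line in enumerate(lines):
        if _CALL_RE.search(line):
            lo, hi = max(0, i - window), min(len(lines), i + window + 1)
            snippets.append("\n".join(lines[lo:hi]))
    return snippets


if __name__ == "__main__":
    sample = "<?php\n$a = $_POST['c'];\neval(base64_decode($a));\necho 'ok';\n?>"
    for snippet in extract_critical_context(sample):
        print(snippet)
```

A compact input built this way keeps the prompt focused on suspicious call sites rather than the whole script, at the cost of missing behavior expressed without any listed function.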

2503.14354 2026-02-13 cs.AR cs.AI cs.CV cs.ET eess.IV

Retrospective: A CORDIC Based Configurable Activation Function for NN Applications

Omkar Kokane, Gopal Raut, Salim Ullah, Mukul Lokhande, Adam Teman, Akash Kumar, Santosh Kumar Vishvakarma

Journal ref IEEE Computer Society Annual Symposium on VLSI (ISVLSI), Kalamata, Greece, 2025

Abstract

A CORDIC-based configuration for the design of Activation Functions (AF) was previously suggested to accelerate ASIC hardware design for resource-constrained systems by providing functional reconfigurability. Since its introduction, this new approach for neural network acceleration has gained widespread popularity, influencing numerous designs for activation functions in both academic and commercial AI processors. In this retrospective analysis, we explore the foundational aspects of this initiative, summarize key developments over recent years, and introduce the DA-VINCI AF tailored for the evolving needs of AI applications. This new generation of dynamically configurable and precision-adjustable activation function cores promises greater adaptability for a range of activation functions in AI workloads, including Swish, SoftMax, SeLU, and GeLU, utilizing the Shift-and-Add CORDIC technique. The previously presented design has been optimized for MAC, Sigmoid, and Tanh functionalities and incorporated into ReLU AFs, culminating in an accumulative NEURIC compute unit. These enhancements position NEURIC as a fundamental component in the resource-efficient vector engine for the realization of AI accelerators that focus on DNNs, RNNs/LSTMs, and Transformers, achieving a quality of results (QoR) of 98.5%.

2412.10251 2026-02-13 nlin.CD cs.LG cs.SY eess.SY

Controlling Dynamical Systems into Unseen Target States Using Machine Learning

Daniel Köglmayr, Alexander Haluszczynski, Christoph Räth

Abstract

We present a novel, model-free, and data-driven methodology for controlling complex dynamical systems into previously unseen target states, including those with significantly different and complex dynamics. Leveraging a parameter-aware realization of next-generation reservoir computing (NGRC), our approach accurately predicts system behavior in unobserved parameter regimes, enabling control over transitions to arbitrary target states utilizing a new prediction evaluation and selection scheme. Crucially, this includes states with dynamics that differ fundamentally from known regimes, such as shifts from periodic to intermittent or chaotic behavior. The method's parameter awareness facilitates non-stationary control with which control scenarios are generated and evaluated on the basis of a predefined control objective. In addition to proving the method for transient-free control to extrapolated chaotic target states over transition times, we demonstrate the method's effectiveness on a nonlinear power system model. Our method successfully navigates transitions even in scenarios where system collapse is observed frequently, while ensuring fast transitions and avoiding prolonged transient behavior. By extending the applicability of machine learning-based control mechanisms to previously inaccessible target dynamics, the methodology opens the door to new control applications while maintaining exceptional efficiency.

2409.17525 2026-02-13 q-bio.NC cs.CL

When a Man Says He Is Pregnant: Event-related Potential Evidence for a Rational Account of Speaker-contextualized Language Comprehension

Hanlin Wu, Zhenguang G. Cai

Journal ref J Cogn Neurosci 2026; 38 (3): 545-560

Abstract

Spoken language is often, if not always, understood in a context formed by the identity of the speaker. For example, we can easily make sense of an utterance such as "I'm going to have a manicure this weekend" or "The first time I got pregnant I had a hard time" when spoken by a woman, but it would be harder to understand when it is spoken by a man. Previous ERP studies have shown mixed results regarding the neurophysiological responses to such speaker-content mismatches, with some reporting an N400 effect and others a P600 effect. In an EEG experiment involving 64 participants, we used social and biological mismatches as test cases to demonstrate how these distinct ERP patterns reflect different aspects of rational inference. We showed that when the mismatch involves social stereotypes (e.g., men getting a manicure), listeners can arrive at a "literal" interpretation by integrating the content with their social knowledge, though this integration requires additional effort due to stereotype violations, resulting in an N400 effect. In contrast, when the mismatch involves biological knowledge (e.g., men getting pregnant), a "literal" interpretation becomes highly implausible or impossible, leading listeners to treat the input as potentially containing errors and engage in correction processes, resulting in a P600 effect. Supporting this rational inference framework, we found that the social N400 effect decreased as a function of the listener's personality trait of openness (as more open-minded individuals maintain more flexible social expectations), while the biological P600 effect remained robust (as biological constraints are recognized regardless of individual personalities). Our findings help to reconcile empirical inconsistencies and reveal how rational inference shapes speaker-contextualized language comprehension.

2409.01869 2026-02-13 math.OC cs.LG

Feature-Based Interpretable Surrogates for Optimization

Marc Goerigk, Michael Hartisch, Sebastian Merten, Kartikey Sharma

Abstract

For optimization models to be used in practice, it is crucial that users trust the results. A key factor in this aspect is the interpretability of the solution process. A previous framework for inherently interpretable optimization models used decision trees to map instances to solutions of the underlying optimization model. Based on this work, we investigate how we can use more general optimization rules to further increase interpretability and, at the same time, give more freedom to the decision-maker. The proposed rules do not map to a concrete solution but to a set of solutions characterized by common features. To find such optimization rules, we present an exact methodology using mixed-integer programming formulations as well as heuristics. We also outline the challenges and opportunities that these methods present. In particular, we demonstrate the improvement in solution quality that our approach offers compared to existing interpretable surrogates for optimization, and we discuss the relationship between interpretability and performance. These findings are supported by experiments using both synthetic and real-world data.

2403.17770 2026-02-13 eess.IV cs.CV

CT Synthesis with Conditional Diffusion Models for Abdominal Lymph Node Segmentation

Yongrui Yu, Hanyu Chen, Zitian Zhang, Qiong Xiao, Wenhui Lei, Linrui Dai, Yu Fu, Hui Tan, Guan Wang, Peng Gao, Xiaofan Zhang

Abstract

Despite the significant success achieved by deep learning methods in medical image segmentation, researchers still struggle in the computer-aided diagnosis of abdominal lymph nodes due to the complex abdominal environment, small and indistinguishable lesions, and limited annotated data. To address these problems, we present a pipeline that integrates the conditional diffusion model for lymph node generation and the nnU-Net model for lymph node segmentation to improve the segmentation performance of abdominal lymph nodes through synthesizing a diversity of realistic abdominal lymph node data. We propose LN-DDPM, a conditional denoising diffusion probabilistic model (DDPM) for lymph node (LN) generation. LN-DDPM utilizes lymph node masks and anatomical structure masks as model conditions. These conditions work in two conditioning mechanisms: global structure conditioning and local detail conditioning, to distinguish between lymph nodes and their surroundings and better capture lymph node characteristics. The obtained paired abdominal lymph node images and masks are used for the downstream segmentation task. Experimental results on the abdominal lymph node datasets demonstrate that LN-DDPM outperforms other generative methods in the abdominal lymph node image synthesis and better assists the downstream abdominal lymph node segmentation task.

2402.01353 2026-02-13 cs.LO cs.AI cs.LG

Compiling High-Level Neural Network Specifications into VNN-LIB Queries

Matthew L. Daggitt, Wen Kokke, Robert Atkey

Abstract

The formal verification of traditional software has been revolutionised by verification-orientated languages such as Dafny and F* which enable developers to write high-level specifications that are automatically compiled down to low-level SMT-LIB queries. In contrast, neural network verification currently lacks such infrastructure, often requiring users to express requirements in formats close to the low-level VNN-LIB query format. This gap persists because targeting VNN-LIB presents unique algorithmic challenges when compared to targeting SMT-LIB: VNN-LIB is restricted to a fixed finite set of variables representing the inputs and outputs of the network, and even toy neural network specifications have an extremely large number of variables. In this paper, we present the first algorithm for compiling high-level neural network specifications into optimised VNN-LIB queries. Our algorithm is numerically sound and supports a far richer logical fragment than existing tools, including transformations of variables, first-class quantifiers, and specifications involving multiple networks or multiple applications of the same network. We implement this algorithm within the Vehicle framework and demonstrate that its performance is asymptotically optimal for benchmark specifications.

2212.06338 2026-02-13 stat.ML cs.LG

Minimax Optimal Estimation of Stability Under Distribution Shift

Hongseok Namkoong, Yuanzhe Ma, Peter W. Glynn

Journal ref Operations Research 2026 74:1, 464-483

Abstract

The performance of decision policies and prediction models often deteriorates when applied to environments different from the ones seen during training. To ensure reliable operation, we analyze the stability of a system under distribution shift, which is defined as the smallest change in the underlying environment that causes the system's performance to deteriorate beyond a permissible threshold. In contrast to standard tail risk measures and distributionally robust losses that require the specification of a plausible magnitude of distribution shift, the stability measure is defined in terms of a more intuitive quantity: the level of acceptable performance degradation. We develop a minimax optimal estimator of stability and analyze its convergence rate, which exhibits a fundamental phase shift behavior. Our characterization of the minimax convergence rate shows that evaluating stability against large performance degradation incurs a statistical cost. Empirically, we demonstrate the practical utility of our stability framework by using it to compare system designs on problems where robustness to distribution shift is critical.

2602.12277 2026-02-13 astro-ph.CO astro-ph.GA

Reionization Bubbles from Real-Space Cross Correlations of Line Intensity Maps

Emilie Thélie, Sarah Libanore, Yonatan Sklansky, Julian B. Muñoz, Ely D. Kovetz

Comments 11 pages, 8 figures, 1 table. Comments welcome

Abstract

We propose a new way to reconstruct the ionized-bubble size distribution during the Epoch of Reionization (EoR) through the real-space cross-correlation of 21-cm and star-forming line-intensity maps. Understanding the evolution and timing of the EoR is crucial for both astrophysics and cosmology, and a wealth of information on the first sources can be extracted from the study of ionized bubbles. Nevertheless, directly mapping bubbles is challenging due to the high redshifts involved, possible selection biases, and foregrounds in 21-cm maps. Here, we exploit the real-space cross-correlation $ξ_{21,ν}$ between 21-cm and line-intensity mapping (LIM) signals to reconstruct the evolution of bubble sizes during reionization. For the first time, we show that $ξ_{21,ν}(r)$ departs from a saturation level for each separation $r$ when bubbles of size $r$ begin to form, providing a handle for the onset of bubbles of each radius. Moreover, we demonstrate that $ξ_{21,ν}$ evolves from positive to negative as the EoR progresses, reaching a minimum (i.e. maximum anti-correlation) when bubbles of radius $r$ reach peak abundance. We show that these results are robust to changes in the astrophysical model as well as the timing/topology of reionization. This real-space observable complements usual Fourier-space estimators by capturing the localized nature of bubbles, offering new insights into the sources driving cosmic reionization.

2602.12272 2026-02-13 astro-ph.HE

The Wandering Supermassive Black Hole Powering the off-nuclear TDE AT2024tvd

M. Guolo, A. Mummery, S. van Velzen, M. Nicholl, S. Gezari, Y. Yao, K. C. Chambers, T. de Boer, M. E. Huber, C. -C. Lin, T. B. Lowe, E. A. Magnier, G. Paek, R. Wainscoat

Comments Submitted to ApJ Letters

Abstract

We present an analysis of the spectral energy distribution (SED) of the off-nuclear tidal disruption event (TDE) AT2024tvd during its late-time plateau phase, combining X-ray spectra and UV/optical photometry. Using a fully relativistic, compact accretion disk model with self-consistent inner-disk Comptonization, we reproduce the observed SED without significant residuals. The inferred black hole mass ${\rm log}_{10}(M_{\bullet}/M_\odot) \approx 6.0 \pm 0.2$, and the inferred disk parameters place AT2024tvd within known TDE-disk scaling relations ($L_{\rm bol}^{\rm disk}/L_{\rm Edd} \propto T_{\rm p}^4 \propto M_{\bullet}^{-1}$, $L_{\rm plat} \propto M_{\bullet}^{2/3}$, $R_{\rm out}/r_{\rm g} \propto M_{\bullet}^{-2/3}$). Our results show that: (i) there is no \textit{detected} star cluster or dwarf galaxy associated with the source, down to a mass limit of $\log_{10}(M_{\rm gal}/M_{\odot}) \leq 7.6$; (ii) the black hole is a wandering supermassive, rather than intermediate-mass, black hole; and (iii) the source represents an extreme case of black hole-to-host mass ratio, with $M_{\bullet}/M_{\rm gal} > 3\%$, consistent with a heavily tidally stripped nucleus. The latter aligns with cosmological simulations predicting that surviving host remnants of most wandering black holes should not retain a detectable stellar overdensity when located at small halo-centric distances. We discuss differences with previous analyses of this source and highlight why our modeling approach provides a more physically consistent solution with more reliable parameter inference.

2602.12269 2026-02-13 quant-ph

Certification of linear optical quantum state preparation

Riko Schadow, Naomi Spier, Stefan N. van den Hoven, Malaquias Correa Anguita, Redlef B. G. Braamhaar, Sara Marzban, Jens Eisert, Jelmer J. Renema, Nathan Walk

Comments 32 pages, 6 figures

Abstract

Certification is important to guarantee the correct functioning of quantum devices. A key certification task is verifying that a device has produced a desired output state. In this work, we study this task in the context of photonic platforms, where single photons are propagated through linear optical interferometers to create large, entangled resource states for metrology, communication, quantum advantage demonstrations and for so-called linear optical quantum computing (LOQC). This setting derives its computational power from the indistinguishability of the photons, i.e., their relative overlap. Therefore, standard fidelity witnesses developed for distinguishable particles (including qubits) do not apply directly, because they merely certify the closeness to some fixed target state. We introduce a measure of fidelity suitable for this setting and show several different ways to witness it, based on earlier proposals for measuring genuine multi-photon indistinguishability. We argue that a witness based upon the discrete Fourier transform is an optimal choice. We experimentally implement this witness and certify the fidelity of several multi-photon states.

2602.12266 2026-02-13 quant-ph gr-qc

Repulsive Gravitational Force as a Witness of the Quantum Nature of Gravity

Pablo L. Saldanha, Chiara Marletto, Vlatko Vedral

Comments 4 pages, 2 figures

Abstract

We show that a single spatially superposed 'source' mass acting on a 'probe' matter wavepacket can reveal the quantum nature of the gravitational field. For this we use a specific state preparation and measurement of the superposed source mass, including a postselection, which altogether results in a repulsive gravitational force on the probe particle. A classical gravitational field can never lead to repulsion, as the effect requires quantum interference of two distinct states of gravity. We also present a calculation in the Heisenberg picture under the formalism of weak values that illustrates how repulsion is achieved. Finally, we estimate the range of parameters (masses and the spatio-temporal extent of interference) for which the experiment is feasible.

2602.12264 2026-02-13 cs.IT cs.NI eess.SP math.IT

Transmit or Idle: Efficient AoI Optimal Transmission Policy for Gossiping Receivers

Irtiza Hasan, Ahmed Arafa

Comments To appear in IEEE ICC 2026

Abstract

We study the optimal transmission and scheduling policy for a transmitter (source) communicating with two gossiping receivers aiming at tracking the source's status over time using the age of information (AoI) metric. Gossiping enables local information exchange in a decentralized manner without relying solely on the transmitter's direct communication, which we assume incurs a transmission cost. On the other hand, gossiping may be communicating stale information, necessitating the transmitter's intervention. With communication links having specific success probabilities, we formulate an average-cost Markov Decision Process (MDP) to jointly minimize the sum AoI and transmission cost for such a system in a time-slotted setting. We employ the Relative Value Iteration (RVI) algorithm to evaluate the optimal policy for the transmitter and then prove several structural properties showing that it has an age-difference threshold structure with minimum age activation in the case where gossiping is relatively more reliable. Specifically, direct transmission is optimal only if the minimum AoI of the receivers is large enough and their age difference is below a certain threshold. Otherwise, the transmitter idles to effectively take advantage of gossiping and reduce direct transmission costs. Numerical evaluations demonstrate the significance of our optimal policy compared to multiple baselines. Our result is a first step towards characterizing optimal freshness and transmission cost trade-offs in gossiping networks.
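
Illustrative note (not from the paper): Relative Value Iteration for an average-cost MDP can be sketched on a much simpler toy problem than the two-receiver gossiping system studied here. The sketch below assumes a single receiver, no gossiping, ages truncated at a maximum value, and hypothetical parameter values; it only shows the shape of the RVI update for a "transmit or idle" decision with an age cost plus a transmission cost.

```python
def relative_value_iteration(max_age=20, p_success=0.8, tx_cost=3.0, iters=2000):
    """RVI for a toy single-receiver AoI problem (an assumption, not the paper's model).

    State: current age a in 1..max_age. Per-slot cost: a, plus tx_cost if transmitting.
    Idle: age grows by one. Transmit: age resets to 1 with probability p_success.
    Returns the estimated optimal average cost and a policy {age: 0 (idle) or 1 (transmit)}.
    """
    ages = range(1, max_age + 1)
    h = {a: 0.0 for a in ages}  # relative value function, pinned to 0 at age 1
    gain = 0.0
    policy = {}
    for _ in range(iters):
        q = {}
        for a in ages:
            nxt = min(a + 1, max_age)  # truncate the age at max_age
            idle = a + h[nxt]
            transmit = a + tx_cost + p_success * h[1] + (1 - p_success) * h[nxt]
            q[a] = (idle, transmit)
        gain = min(q[1])  # Bellman value at the reference state (age 1)
        h = {a: min(q[a]) - gain for a in ages}  # subtract it to keep values bounded
        policy = {a: int(q[a][1] < q[a][0]) for a in ages}
    return gain, policy


if __name__ == "__main__":
    gain, policy = relative_value_iteration()
    print("estimated average cost:", round(gain, 3))
    print("ages at which transmitting is chosen:", [a for a, u in policy.items() if u == 1])
```

Subtracting the value at a fixed reference state in every sweep is what distinguishes RVI from plain value iteration: the iterates stay bounded and the subtracted quantity approaches the optimal average cost.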

2602.12263 2026-02-13 hep-ph nucl-th

Systematic Operator Construction for Non-relativistic Effective Field Theories: Hilbert Series versus Young Tensor

Yong-Kang Li, Yi-Ning Wang, Jiang-Hao Yu

Comments 96 pages, 24 tables, 4 figures

Abstract

This work establishes a systematic framework for operator construction in the non-relativistic effective field theory, incorporating both the three-dimensional Euclidean symmetry and the internal symmetries. By employing the double cover of the rotation group, we extend the Hilbert series to non-relativistic systems and eliminate redundancies introduced by the spin operator. We also generalize the Young tensor method to the non-relativistic cases through the $SU(2)$ semi-standard Young tableaux, which allows for the construction of operator bases with repeated fields at any given mass dimension. Utilizing the Young tensor technique and the Hilbert series as a cross-check, we obtain the complete operator bases for the following cases: heavy particle (and also heavy quark) effective theory operators up to mass dimension 9; pion-less effective theory operators, including nucleon-nucleon contact interactions up to $\mathcal{O}(Q^4)$ and three-nucleon interactions at $\mathcal{O}(Q^2)$; and finally the spin-1/2 dark matter-nucleon operators up to $\mathcal{O}(v^4)$.

2602.12261 2026-02-13 math.PR math.CO

Half-plane non-coexistence without FKG

Frederik Ravn Klausen, Noah Kravitz

Comments 17 pages, 5 figures

Abstract

For $μ$ an edge percolation measure on the infinite square lattice, let $μ_{hp}$ (respectively, $μ^*_{hp}$) denote its marginal (respectively, the marginal of its planar dual process) on the upper half-plane. We show that if $μ$ is translation-invariant and ergodic and almost surely has only finitely many infinite clusters, then either almost surely $μ_{hp}$ has no infinite cluster, or almost surely $μ^*_{hp}$ has no infinite cluster. By the classical Burton--Keane argument, these hypotheses are satisfied if $μ$ is translation-invariant and ergodic and has finite energy. In contrast to previous ``non-coexistence'' theorems, our result does not impose a positive-correlation (FKG) hypothesis on $μ$. Our arguments also apply to the random-cluster model (including the regime $q<1$, which lacks FKG), the uniform spanning tree, and the uniform odd subgraph.

2602.12256 2026-02-13 cs.SE

Automated Test Suite Enhancement Using Large Language Models with Few-shot Prompting

Alex Chudic, Gül Çalıklı

Comments 13 pages, 3 figures, accepted to ICPC 2026 (34th International Conference on Program Comprehension)

Abstract

Unit testing is essential for verifying the functional correctness of code modules (e.g., classes, methods), but manually writing unit tests is often labor-intensive and time-consuming. Unit tests generated by tools that employ traditional approaches, such as search-based software testing (SBST), lack readability, naturalness, and practical usability. LLMs have recently provided promising results and become integral to developers' daily practices. Consequently, software repositories now include a mix of human-written tests, LLM-generated tests, and those from tools employing traditional approaches such as SBST. While LLMs' zero-shot capabilities have been widely studied, their few-shot learning potential for unit test generation remains underexplored. Few-shot prompting enables LLMs to learn from examples in the prompt, and automatically retrieving such examples could enhance test suites. This paper empirically investigates how few-shot prompting with different test artifact sources, comprising human, SBST, or LLM, affects the quality of LLM-generated unit tests as program comprehension artifacts and their contribution to improving existing test suites by evaluating not only correctness and coverage but also readability, cognitive complexity, and maintainability in hybrid human-AI codebases. We conducted experiments on HumanEval and ClassEval datasets using GPT-4o, which is integrated into GitHub Copilot and widely used among developers. We also assessed retrieval-based methods for selecting relevant examples. Our results show that LLMs can generate high-quality tests via few-shot prompting, with human-written examples producing the best coverage and correctness. Additionally, selecting examples based on the combined similarity of problem description and code consistently yields the most effective few-shot prompts.
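
Illustrative note (not from the paper): selecting few-shot examples by a combined similarity of problem description and code can be sketched as below. The abstract does not say which similarity measure or weighting the authors use, so the character-level `difflib.SequenceMatcher` ratio, the equal weighting, and all names and data fields here are assumptions for illustration.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1]; a stand-in for whatever measure is really used."""
    return SequenceMatcher(None, a, b).ratio()


def select_examples(query_desc: str, query_code: str, pool: list, k: int = 2, weight: float = 0.5) -> list:
    """Rank candidate examples by a weighted sum of description and code similarity.

    `pool` is a list of dicts with hypothetical keys 'description', 'code', and 'test'.
    """
    def score(example: dict) -> float:
        return (weight * similarity(query_desc, example["description"])
                + (1.0 - weight) * similarity(query_code, example["code"]))

    return sorted(pool, key=score, reverse=True)[:k]


if __name__ == "__main__":
    pool = [
        {"description": "Return the sum of a list of integers.",
         "code": "def total(xs): return sum(xs)",
         "test": "assert total([1, 2]) == 3"},
        {"description": "Reverse a string.",
         "code": "def rev(s): return s[::-1]",
         "test": "assert rev('ab') == 'ba'"},
    ]
    best = select_examples("Return the sum of a list of floats.", "def total(xs): return sum(xs)", pool, k=1)
    print(best[0]["test"])
```

The tests attached to the selected examples would then be placed in the prompt as demonstrations ahead of the code under test.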

2602.12255 2026-02-13 physics.optics cond-mat.mtrl-sci physics.comp-ph

Vision Transformer for Multi-Domain Phase Retrieval in Coherent Diffraction Imaging

Jialun Liu, David Yang, Ian Robinson

Abstract

Bragg coherent diffraction imaging (BCDI) phase retrieval becomes rapidly difficult in the strong-phase regime, where a crystal contains distortions beyond half a lattice spacing. An important special case is the phase domain problem, where blocks of a crystal are displaced with sharp jumps at domain walls. The strong-phase, here defined as beyond $\pm π/2$, generates split Bragg peaks and dense fringe structure for which classical iterative solvers often stagnate or return different solutions from different initialisations. Here, we introduce an unsupervised Fourier Vision Transformer (Fourier ViT) to solve this block-phase, multi-domain phase-retrieval problem directly from measured 2D Bragg diffraction intensities. Fourier ViT couples reciprocal-space information globally through multiscale Fourier token mixing, while shallow convolutional front and back-ends provide local filtering and reconstruction. We validate the approach on large-scale synthetic datasets of Voronoi multi-domain crystals with strong-phase contrast under realistic noise corruptions, and on experimental diffraction from a $\mathrm{La}_{2-x}\mathrm{Ca}_x\mathrm{MnO}_4$ nanocrystal. Across the regimes considered, Fourier ViT achieves the lowest reciprocal-space mismatch ($χ^2$) among the compared methods and preserves domain-resolved phase reconstructions for increasing numbers of domains. On experimental data, with the same real-space support, Fourier ViT matches the iterative benchmark $χ^2$ while improving robustness to random initialisations, yielding a higher success rate of low-$χ^2$ reconstructions than the complex convolutional neural network baseline.

2602.12243 2026-02-13 cs.MA

Federated Gaussian Process Learning via Pseudo-Representations for Large-Scale Multi-Robot Systems

Sanket A. Salunkhe, George P. Kontoudis

Comments Accepted at 25th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2026)

Journal ref 25th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2026)

Abstract

Multi-robot systems require scalable and federated methods to model complex environments under computational and communication constraints. Gaussian Processes (GPs) offer robust probabilistic modeling, but suffer from cubic computational complexity, limiting their applicability in large-scale deployments. To address this challenge, we introduce the pxpGP, a novel distributed GP framework tailored for both centralized and decentralized large-scale multi-robot networks. Our approach leverages sparse variational inference to generate a local compact pseudo-representation. We introduce a sparse variational optimization scheme that bounds local pseudo-datasets and formulate a global scaled proximal-inexact consensus alternating direction method of multipliers (ADMM) with adaptive parameter updates and warm-start initialization. Experiments on synthetic and real-world datasets demonstrate that pxpGP and its decentralized variant, dec-pxpGP, outperform existing distributed GP methods in hyperparameter estimation and prediction accuracy, particularly in large-scale networks.

2602.12240 2026-02-13 physics.chem-ph

Harmonic-to-anharmonic thermodynamic integration made simple using REG TI

Venkat Kapil

Comments 7 pages, 2 figures

Journal ref J. Chem. Phys. 164, 051101 (2026)

Details
English abstract

Standard harmonic-to-anharmonic thermodynamic integration (TI) is known to develop a near singularity in the integrand for solids exhibiting diffusive degrees of freedom, such as rotating functional groups or migrating defects. This pathology results in numerical challenges for estimating absolute free energies within a single thermodynamic cycle. In this work, we introduce a simple regularization that removes this singularity and yields a well-behaved integrand that can be accurately evaluated on a uniform grid. The approach -- termed Regularized End-point Gradient (REG) TI -- is demonstrated on a model system and on predicting the relative stability of paracetamol polymorphs for which quasi-free methyl rotations lead to a near singularity in standard TI. We expect REG TI to simplify anharmonic free energy calculations for solids and to potentially enable their automation.

2602.12239 2026-02-13 math.CT

Tininess and right adjoints to exponentials

Enrique Ruiz Hernández, Pedro Solórzano

Comments 41 pages. Key words: Tininess, amazing right adjoints, precohesion

Details
English abstract

Objects $T$ whose exponential functor $(-)^T$ admits a right adjoint $(-)_T$ are known under different names. The fact that they exist, while the only set with this property in the category of sets is the singleton, led Lawvere to suggest that they ought to be ``amazingly tiny'' -- hence Lawvere's acronym ``A.T.O.M.'' This report explores how intuitively tiny any such object is. Evidence both in favor and to the contrary is produced by looking at their categorical behavior (subobjects, quotients, retracts, etc.) when the ambient category is a topos. The topological behavior (connectedness, contractibility, connected components, etc.) of both $T$ and $(-)_T$ is further analyzed in toposes that satisfy certain precohesive conditions over their decidable objects, where this tininess is tested against parts of Lawvere's foundational proposal for Synthetic Differential Geometry.

2602.12234 2026-02-13 stat.ME math.OC

Batch-based Bayesian Optimal Experimental Design in Linear Inverse Problems

Sofia Mäkinen, Andrew B. Duncan, Tapio Helin

Comments 25 pages, 5 figures

Details
English abstract

Experimental design is central to science and engineering. A ubiquitous challenge is how to maximize the value of information obtained from expensive or constrained experimental settings. Bayesian optimal experimental design (OED) provides a principled framework for addressing such questions. In this paper, we study experimental design problems such as the optimization of sensor locations over a continuous domain in the context of linear Bayesian inverse problems. We focus in particular on batch design, that is, the simultaneous optimization of multiple design variables, which leads to a notoriously difficult non-convex optimization problem. We tackle this challenge using a promising strategy recently proposed in the frequentist setting, which relaxes A-optimal design to the space of finite positive measures. Our main contribution is the rigorous identification of the Bayesian inference problem corresponding to this relaxed A-optimal OED formulation. Moreover, building on recent work, we develop a Wasserstein gradient-flow-based optimization algorithm for the expected utility and introduce novel regularization schemes that guarantee convergence to an empirical measure. These theoretical results are supported by numerical experiments demonstrating both convergence and the effectiveness of the proposed regularization strategy.

2602.12232 2026-02-13 astro-ph.CO hep-ph hep-th

Extending the Cosmological Collider: New Scaling Regimes and Constraints from BOSS

Daniel Green, Jiashu Han, Benjamin Wallisch

Comments 62 pages, 20 figures

Details
English abstract

Primordial non-Gaussianity generated by additional fields during inflation offers a compelling observational target. Heavy fields imprint characteristic oscillatory signals in non-Gaussian correlation functions of the inflaton, a process sometimes referred to as cosmological-collider physics. These distinct signatures are compelling windows into ultra-high-energy physics, but are often suppressed, making standard equilateral non-Gaussianity the most promising discovery channel in many scenarios. In this paper, we show that direct couplings between the inflaton and additional fields can lead to a wide variety of novel, observationally relevant signals which open new parameter regimes that simultaneously exhibit the characteristics of light and heavy fields. We identify these primordial signatures in the late-time observables of the large-scale structure of the Universe, where they most significantly modify the scale-dependent bias of the galaxy power spectrum to include an oscillatory modulation around a non-trivial power law. We explore the full range of parameters that phenomenologically arise in these models and study the sensitivity of current and future galaxy surveys, finding that this new class of primordial non-Gaussianity is particularly accessible in near-term surveys due to its oscillatory feature. Finally, we perform an analysis of existing data from the final release of the Baryon Oscillation Spectroscopic Survey (BOSS DR12). While we find no evidence for a signal, we demonstrate significant improvements in sensitivity over respective non-oscillatory scenarios and place the first constraints on this extended parameter space of oscillatory non-Gaussianity.

2602.12231 2026-02-13 cs.GT

Adjusted Winner: from Splitting to Selling

Robert Bredereck, Bin Sun, Eyal Briman, Nimrod Talmon

Details
English abstract

The Adjusted Winner (AW) method is a fundamental procedure for the fair division of indivisible resources between two agents. However, its reliance on splitting resources can lead to practical complications. To address this limitation, we propose an extension of AW that allows the sale of selected resources under a budget constraint, with the proceeds subsequently redistributed, thereby aiming for allocations that remain as equitable as possible. Alongside developing this extended framework, we provide an axiomatic analysis that examines how equitability and envy-freeness are modified in our setting. We then formally define the resulting combinatorial problems, establish their computational complexity, and design a fully polynomial-time approximation scheme (FPTAS) to mitigate their inherent intractability. Finally, we complement our theoretical results with computer-based simulations.
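For orientation, the classical AW procedure that this abstract takes as its starting point can be sketched as follows. This is a minimal illustration of the textbook method only (which may split one item), not the sale-based extension proposed in the paper. The function name `adjusted_winner` and the assumption of strictly positive integer valuations are choices made for this example.

```python
from fractions import Fraction

def adjusted_winner(vals_a, vals_b):
    """Classical Adjusted Winner for two agents (illustrative sketch).

    Assumes strictly positive integer valuations, one per item.
    Returns (share_a, util_a, util_b), where share_a[i] is the
    fraction of item i allocated to agent A.
    """
    n = len(vals_a)
    # Step 1: each item initially goes to the agent who values it more
    # (ties go to A, an arbitrary choice).
    share_a = [Fraction(1) if vals_a[i] >= vals_b[i] else Fraction(0)
               for i in range(n)]
    util_a = sum(vals_a[i] for i in range(n) if share_a[i] == 1)
    util_b = sum(vals_b[i] for i in range(n) if share_a[i] == 0)
    if util_a == util_b:
        return share_a, Fraction(util_a), Fraction(util_b)

    # Step 2: identify the current leader ("winner") and the other agent.
    a_leads = util_a > util_b
    win, lose = (vals_a, vals_b) if a_leads else (vals_b, vals_a)
    u_win, u_lose = (util_a, util_b) if a_leads else (util_b, util_a)

    # Step 3: move the winner's items to the loser in increasing order
    # of the ratio win[i] / lose[i], splitting one item if necessary.
    held = [i for i in range(n) if (share_a[i] == 1) == a_leads]
    held.sort(key=lambda i: Fraction(win[i], lose[i]))
    for i in held:
        if u_win - win[i] >= u_lose + lose[i]:
            x = Fraction(1)  # transfer the whole item
        else:
            # Fraction x that equalises both utilities exactly.
            x = Fraction(u_win - u_lose, win[i] + lose[i])
        u_win -= x * win[i]
        u_lose += x * lose[i]
        share_a[i] = 1 - x if a_leads else x
        if u_win == u_lose:
            break

    util_a, util_b = (u_win, u_lose) if a_leads else (u_lose, u_win)
    return share_a, Fraction(util_a), Fraction(util_b)
```

Exact rational arithmetic via `Fraction` keeps the equitability check free of floating-point error. The fractional split in step 3 is the part of the classical method that the abstract describes as a source of practical complications.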

2602.12228 2026-02-13 quant-ph cond-mat.str-el

Non-Abelian Quantum Low-Density Parity Check Codes and Non-Clifford Operations from Gauging Logical Gates via Measurements

Maine Christos, Chiu Fan Bowen Lo, Vedika Khemani, Rahul Sahay

Comments 36 pages total: 29 pages main + 7 pages supplemental

Details
English abstract

In this work, we introduce constructions for non-Abelian qLDPC codes obtained by gauging transversal Clifford gates using measurement and feedback. In particular, we identify two qualitatively different approaches to gauging qLDPC codes to obtain their non-Abelian counterparts. The first approach applies to codes that exhibit a generalized form of Poincaré duality and leads to a qLDPC non-Abelian Clifford stabilizer code, whose stabilizers are reminiscent of the action of a Type-III twisted quantum double. Our second approach applies to general qLDPC codes, and uses a graph of ancilla qubits which may be tailored to properties of the input codes to gauge a single transversal gate. For both constructions, the resulting gauged codes are shown to have properties analogous to 2D non-Abelian topological order -- e.g. the analog of a single anyon on a torus. We conclude by demonstrating that our gauging procedures enable magic state preparation via the measurement of logical Clifford gates. Consequently, our gauging constructions offer a protocol for performing non-Clifford operations on any qLDPC code.

2602.12226 2026-02-13 math.GT

A resistance invariant of special alternating links

Michal Jablonowski

Details
English abstract

We introduce a new numerical invariant for special, reduced, alternating diagrams of oriented knots and links, defined in terms of the Laplacian matrix of the associated Tait graph. For a special alternating diagram, the Laplacian encodes both the combinatorics of the checkerboard graph and the crossing signs. While its spectrum depends on the chosen diagram, we show that a specific quadratic trace expression involving the Laplacian and its Moore-Penrose pseudoinverse is invariant under flype moves. The invariant admits an interpretation in terms of total effective resistance of the associated weighted graph viewed as an electrical network. Explicit computations for pairs of flype-related diagrams demonstrate that, although the Laplacian characteristic polynomials differ, the invariant FP coincides. Values for several prime alternating knots are provided.
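The electrical-network interpretation mentioned here rests on a standard fact: for a connected weighted graph on $n$ vertices with Laplacian $L$, the sum of effective resistances over all vertex pairs equals $n\,\mathrm{tr}(L^{+})$, where $L^{+}$ is the Moore-Penrose pseudoinverse. The sketch below illustrates only that standard quantity on a small graph. It is not the paper's invariant, whose exact quadratic trace expression is not given in the abstract, and the helper names are assumptions made for this example.

```python
import numpy as np

def laplacian(n, weighted_edges):
    """Weighted graph Laplacian from (u, v, weight) triples."""
    L = np.zeros((n, n))
    for u, v, w in weighted_edges:
        L[u, u] += w
        L[v, v] += w
        L[u, v] -= w
        L[v, u] -= w
    return L

def pairwise_resistance(L, i, j):
    """Effective resistance between vertices i and j."""
    Lp = np.linalg.pinv(L)
    return Lp[i, i] + Lp[j, j] - 2.0 * Lp[i, j]

def total_effective_resistance(L):
    """Sum of effective resistances over all unordered vertex pairs
    of a connected graph, computed as n * trace(pinv(L))."""
    n = L.shape[0]
    return n * np.trace(np.linalg.pinv(L))

# Triangle with unit conductances on every edge.
triangle = laplacian(3, [(0, 1, 1.0), (1, 2, 1.0), (0, 2, 1.0)])
```

For the triangle, each pair of vertices is joined by a unit resistor in parallel with two unit resistors in series, so each pairwise resistance is 2/3 and the total over the three pairs is 2.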

2602.12225 2026-02-13 cond-mat.quant-gas nucl-th physics.atom-ph

Second excited state of ${}^4\mathrm{He}$ tetramer

A. Deltuva

Comments 3 figs

Journal ref Physical Review A 113, 013306 (2026)

Details
English abstract

The four-boson universality suggests the existence of the second excited tetramer state in a system of cold ${}^4\mathrm{He}$ atoms. It is not bound but could be seen as a resonance in the atom-trimer scattering. This process is rigorously calculated using the momentum-space transition operator framework with two realistic interatomic potentials. The $S$-wave phase shift and cross section show a resonant behavior below the excited trimer threshold, but there are sizable nonresonant contributions from $P$ and $D$ waves as well. The position and width of the resonant state are determined, and for the latter significant finite-range effects are found.

2602.12223 2026-02-13 cond-mat.mtrl-sci cond-mat.mes-hall cond-mat.supr-con

Kagome edge states under lattice termination, spin-orbit coupling, and magnetic order

Sajid Sekh, Annica M. Black-Schaffer, Andrzej Ptok

Comments main text: 14 pages, 7 figures. supplement: 1 page, 2 figures

Details
English abstract

We study the edge state properties of a two-dimensional kagome lattice using a tight-binding approach, focusing on the role of lattice termination, spin-orbit coupling, and magnetic order. In the pristine limit, we show that the existence of localized edge states is highly sensitive to boundary geometry, with certain terminations completely suppressing edge modes. Kane-Mele spin-orbit coupling opens a bulk gap and stabilizes topologically protected helical edge states, yielding a robust $\mathbb{Z}_2$ insulating phase that is insensitive to termination details. In contrast, the combined effect of a Zeeman field and Rashba spin-orbit coupling drives the system into Chern insulating phases, with Chern numbers consistent with the number of chiral edge modes. We further demonstrate that non-coplanar magnetic textures generate multiple Chern phases through finite scalar spin chirality, with Kane-Mele coupling strongly tuning the topological gaps. Our results provide important insights into the tunability of edge states in the kagome lattice, which can be key to designing materials with novel electronic properties and topological phases.
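As background for the tight-binding setting, the bulk bands of the simplest kagome model can be computed in a few lines. The sketch below assumes a pristine, spinless, nearest-neighbour model with a single hopping amplitude `t` and unit lattice constant. It does not reproduce the paper's ribbon geometries, spin-orbit terms, or magnetic textures, and the function names are chosen for this example.

```python
import numpy as np

# Lattice vectors of the kagome lattice (lattice constant set to 1).
A1 = np.array([1.0, 0.0])
A2 = np.array([0.5, np.sqrt(3.0) / 2.0])
A3 = A2 - A1

def kagome_bloch_hamiltonian(k, t=1.0):
    """3x3 Bloch Hamiltonian of the nearest-neighbour kagome model.

    The three sublattices are coupled pairwise along A1, A2 and A3;
    each bond contributes -2 t cos(k . a / 2).
    """
    c1, c2, c3 = (np.cos(np.dot(k, a) / 2.0) for a in (A1, A2, A3))
    return -2.0 * t * np.array([[0.0, c1, c2],
                                [c1, 0.0, c3],
                                [c2, c3, 0.0]])

def bands(k, t=1.0):
    """Band energies at wave vector k, in ascending order."""
    return np.linalg.eigvalsh(kagome_bloch_hamiltonian(k, t))
```

With this sign convention and `t > 0`, the top band is flat at energy `2 t` for every `k`, and at `k = 0` the spectrum is `-4 t, 2 t, 2 t`. Edge states such as those discussed in the abstract require a finite ribbon with a chosen termination, which this bulk sketch does not include.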

2602.12219 2026-02-13 math.CO

A Chain Ring Analogue of the Erdős-Ko-Rado Theorem

Ivan Landjev, Emiliyan Rogachev, Assia Rousseva

Details
English abstract

In this paper, we prove an analogue of the Erdős-Ko-Rado theorem for intersecting families of subspaces in projective Hjelmslev geometries over finite chain rings of nilpotency index 2. We give an example of maximal families that are not canonically intersecting.