arXivDaily: arXiv daily academic digest, updated Monday through Friday
2508.13831 2026-04-07 stat.ML cs.LG

Smooth Flow Matching for Synthesizing Functional Data

Jianbin Tan, Anru R. Zhang

Functional data, i.e., smooth random functions observed over a continuous domain, are increasingly available in areas such as biomedical research, health informatics, and epidemiology. However, effective statistical analysis for functional data is often hindered by challenges such as privacy constraints, sparse and irregular sampling, infinite-dimensionality, and non-Gaussian structures. To address these challenges, we introduce a novel framework named Smooth Flow Matching (SFM), tailored for generative modeling of functional data that enables statistical analysis without exposing sensitive real data. Under a copula framework, SFM constructs a parsimonious smooth flow to generate infinite-dimensional functional data, free of Gaussianity and low-rank assumptions. It is computationally efficient, handles irregular observations, and guarantees the smoothness of the generated functions, offering a practical and flexible solution in scenarios where existing deep generative methods are not applicable. Through extensive simulation studies, we demonstrate the advantages of SFM in terms of both synthetic data quality and computational efficiency. We then apply SFM to generate clinical trajectory data from the MIMIC-IV patient electronic health records (EHR) longitudinal database. Our analysis showcases the ability of SFM to produce high-quality surrogate data for downstream tasks, highlighting its potential to boost the utility of EHR data for clinical applications.

2508.11175 2026-04-07 quant-ph cs.LG eess.SP

The Role of Entanglement in Quantum Reservoir Computing with Coupled Kerr Nonlinear Oscillators

Ali Karimi, Hadi Zadeh-Haghighi, Youssef Kora, Christoph Simon

Quantum Reservoir Computing (QRC) uses quantum dynamics to efficiently process temporal data. In this work, we investigate a QRC framework based on two coupled Kerr nonlinear oscillators, a system well-suited for time-series prediction tasks due to its complex nonlinear interactions and potentially high-dimensional state space. We explore how its performance in forecasting both linear and nonlinear time-series depends on key physical parameters: input drive strength, Kerr nonlinearity, and oscillator coupling, and analyze the role of entanglement in improving the reservoir's computational performance, focusing on its effect on predicting non-trivial time series. Using logarithmic negativity to quantify entanglement and normalized root mean square error (NRMSE) to evaluate predictive accuracy, individual parameter sweeps show that optimal performance occurs at moderate but non-zero entanglement. Furthermore, an aggregated binned analysis reveals that this moderate entanglement is consistently associated with the optimal average predictive performance across the parameter space, an observation that persists up to a threshold in the input frequency. This relationship persists under some levels of dissipation and dephasing. In particular, we find that higher dissipation rates can enhance performance. These findings contribute to the broader understanding of quantum reservoirs for high performance, efficient quantum machine learning and time-series forecasting.

2508.05012 2026-04-07 cs.DB cs.AI cs.CL

Making Prompts First-Class Citizens for Adaptive LLM Pipelines

Ugur Cetintemel, Shu Chen, Alexander W. Lee, Deepti Raghavan, Duo Lu, Andrew Crotty

Comments 6 pages, 2 figures, appears in CIDR'26

Modern LLM pipelines increasingly resemble complex data-centric applications: they retrieve data, correct errors, call external tools, and coordinate interactions between agents. Yet, the central element controlling this entire process -- the prompt -- remains a brittle, opaque string that is entirely disconnected from the surrounding program logic. This disconnect fundamentally limits opportunities for reuse, optimization, and runtime adaptivity. In this paper, we describe our vision and an initial design of SPEAR (Structured Prompt Execution and Adaptive Refinement), a new approach to prompt management that treats prompts as first-class citizens in the execution model. Specifically, SPEAR enables: (1) structured prompt management, with prompts organized into versioned views to support introspection and reasoning about provenance; (2) adaptive prompt refinement, whereby prompts can evolve dynamically during execution based on runtime feedback; and (3) policy-driven control, a mechanism for the specification of automatic prompt refinement logic as when-then rules. By tackling the problem of runtime prompt refinement, SPEAR plays a complementary role in the vast ecosystem of existing prompt optimization frameworks and semantic query processing engines. We describe a number of related optimization opportunities unlocked by the SPEAR model, and our preliminary results demonstrate the strong potential of this approach.

2506.15461 2026-04-07 cs.DC cs.LG

All is Not Lost: LLM Recovery without Checkpoints

Nikolay Blagoev, Oğuzhan Ersoy, Lydia Yiyu Chen

Training LLMs on decentralized nodes or on spot instances lowers the training cost and enables model democratization. The inevitable challenge here is the transient churn of nodes due to failures and the operator's scheduling policies, leading to the loss of parts of the model (some layers). The conventional approaches to recovering from failures are either checkpointing, where a copy of the entire model is periodically sent to additional storage, or redundant computation. These approaches yield significant communication and/or computation overhead even in non-failure cases and scale poorly in settings with large models. In this paper we propose CheckFree, an efficient recovery method where a failing stage is substituted by a weighted average of the closest neighboring stages. In contrast to the state of the art, CheckFree requires no additional computation or storage. However, because of the nature of averaging neighboring stages, it can only recover failures of intermediate stages. We further extend our method to CheckFree+ with out-of-order pipeline execution to tolerate crashes of the first and last stages. Thanks to out-of-order pipelining, the behavior of the first and last stages is mimicked by their neighboring ones, which allows CheckFree+ to recover them by copying the neighboring stages. To recover the (de-)embedding layers, CheckFree+ copies those layers in the neighboring stages, which requires relatively small storage overhead. We extensively evaluate our method on LLaMa models with sizes from 124M to 1.5B and varying failure frequencies. In the case of low and medium failure rates (5-10%), CheckFree and CheckFree+ outperform both checkpointing and redundant computation in terms of convergence wall-clock time, achieving up to 12% improvement over redundant computation. Both of our proposals can be run via our code available at: https://github.com/gensyn-ai/CheckFree
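
The recovery idea in this abstract, replacing a lost intermediate stage with a weighted average of its two neighbors, can be sketched roughly as follows. This is only an illustration and not the CheckFree code: the `Stage` layout, the `recover_stage` name, and the choice of weights are all assumptions (the abstract does not say how the weights are chosen), and the two neighbors are assumed to share identical parameter names and shapes.

```python
from typing import Dict, List

# Hypothetical layout: parameter name -> flat list of weights.
Stage = Dict[str, List[float]]


def recover_stage(prev_stage: Stage, next_stage: Stage,
                  w_prev: float, w_next: float) -> Stage:
    """Rebuild a lost intermediate stage as a weighted average of its neighbors."""
    total = w_prev + w_next
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    recovered: Stage = {}
    for name, prev_params in prev_stage.items():
        next_params = next_stage[name]  # assumes both stages share parameter names
        recovered[name] = [
            (w_prev * p + w_next * n) / total
            for p, n in zip(prev_params, next_params)
        ]
    return recovered


# Stage 2 failed; stages 1 and 3 are its surviving neighbors.
stage_1 = {"linear.weight": [1.0, 2.0], "linear.bias": [0.0]}
stage_3 = {"linear.weight": [3.0, 6.0], "linear.bias": [4.0]}
stage_2 = recover_stage(stage_1, stage_3, w_prev=1.0, w_next=3.0)
```

Because the average needs a neighbor on each side, a sketch like this cannot rebuild the first or last stage, which matches the limitation the abstract attributes to CheckFree.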

2506.09749 2026-04-07 cs.CE cs.AI

Large Language Models for Combinatorial Optimization of Design Structure Matrix

Shuo Jiang, Min Xie, Jianxi Luo

In complex engineering systems, the dependencies among components or development activities are often modeled and analyzed using the Design Structure Matrix (DSM). Reorganizing elements within a DSM to minimize feedback loops and enhance modularity or process efficiency constitutes a challenging combinatorial optimization (CO) problem in engineering design and operations. As problem sizes increase and dependency networks become more intricate, traditional optimization methods that rely solely on mathematical heuristics often fail to capture the contextual nuances and struggle to deliver effective solutions. In this study, we explore the potential of Large Language Models (LLMs) to address such CO problems by leveraging their capabilities for advanced reasoning and contextual understanding. We propose a novel LLM-based framework that integrates network topology with contextual domain knowledge for iterative optimization of DSM sequencing, a common CO problem. Experiments on various DSM cases demonstrate that our method consistently achieves faster convergence and superior solution quality compared to both stochastic and deterministic baselines. Notably, incorporating contextual domain knowledge significantly enhances optimization performance regardless of the chosen LLM backbone. These findings highlight the potential of LLMs to solve complex engineering CO problems by combining semantic and mathematical reasoning. This approach paves the way towards a new paradigm in LLM-based engineering design optimization.
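
To make the sequencing objective concrete, here is a minimal sketch of how one might score a candidate ordering by counting feedback marks. It is not the paper's framework: the function name is hypothetical, and the convention that `dsm[i][j] == 1` means "element i depends on element j" is an assumption, since DSM conventions differ between sources.

```python
from typing import List, Sequence


def count_feedback(dsm: List[List[int]], order: Sequence[int]) -> int:
    """Count dependency marks that land above the diagonal under `order`.

    Assumed convention: dsm[i][j] == 1 means element i depends on element j,
    so a mark is feedback when i is scheduled before j.
    """
    position = {element: idx for idx, element in enumerate(order)}
    feedback = 0
    for i, row in enumerate(dsm):
        for j, mark in enumerate(row):
            if mark and i != j and position[i] < position[j]:
                feedback += 1
    return feedback


dsm = [
    [0, 0, 1],  # element 0 depends on element 2
    [1, 0, 0],  # element 1 depends on element 0
    [0, 0, 0],  # element 2 has no dependencies
]
original = count_feedback(dsm, [0, 1, 2])   # element 0 runs before element 2
reordered = count_feedback(dsm, [2, 0, 1])  # every dependency is satisfied
```

Any search procedure, whether heuristic or LLM-guided, could use a score like this to compare candidate sequences.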

2506.07816 2026-04-07 stat.ML cs.LG math.PR

Accelerating Constrained Sampling: A Large Deviations Approach

Yingli Wang, Changwei Tu, Xiaoyu Wang, Lingjiong Zhu

Comments 59 pages, 15 figures

The problem of sampling a target probability distribution on a constrained domain arises in many applications including machine learning. For constrained sampling, various Langevin algorithms have been proposed and studied in the literature, such as projected Langevin Monte Carlo (PLMC), based on the discretization of reflected Langevin dynamics (RLD), and more generally skew-reflected non-reversible Langevin Monte Carlo (SRNLMC), based on the discretization of skew-reflected non-reversible Langevin dynamics (SRNLD). This work focuses on the long-time behavior of SRNLD, where a skew-symmetric matrix is added to RLD. Although acceleration for SRNLD has been studied, it is not clear how one should design the skew-symmetric matrix in the dynamics to achieve good performance in practice. We establish a large deviation principle (LDP) for the empirical measure of SRNLD when the skew-symmetric matrix is chosen such that its product with the outward unit normal vector field on the boundary is zero. By explicitly characterizing the rate functions, we show that this choice of the skew-symmetric matrix accelerates the convergence to the target distribution compared to RLD and reduces the asymptotic variance. Numerical experiments for SRNLMC based on the proposed skew-symmetric matrix show superior performance, which validates the theoretical findings from the large deviations theory.

2506.01882 2026-04-07 quant-ph cs.LG

Learning thermodynamic master equations for open quantum systems

Peter Sentz, Stanley Nicholson, Yujin Cho, Sohail Reddy, Brendan Keith, Stefanie Günther

Comments 22 pages, 9 figures, submitted to Quantum

The characterization of Hamiltonians and other components of open quantum dynamical systems plays a crucial role in quantum computing and other applications. Scientific machine learning techniques have been applied to this problem in a variety of ways, including by modeling with deep neural networks. However, the majority of mathematical models describing open quantum systems are linear, and the natural nonlinearities in learnable models have not been incorporated using physical principles. We present a data-driven model for open quantum systems that includes learnable, thermodynamically consistent terms. The trained model is interpretable, as it directly estimates the system Hamiltonian and linear components of coupling to the environment. We validate the model on synthetic two- and three-level data, as well as experimental two-level data collected from a quantum device at Lawrence Livermore National Laboratory.

2505.21510 2026-04-07 physics.soc-ph cs.CL

Complexity counts: global and local perspectives on Indo-Aryan numeral systems

Chundra Cathcart

The numeral systems of Indo-Aryan languages such as Hindi, Gujarati, and Bengali are highly unusual in that unlike most numeral systems (e.g., those of English, Chinese, etc.), forms referring to 1-99 are highly non-transparent and cannot be constructed using straightforward rules for forming combinations of tens and digits. As an example, Hindi/Urdu ikyānve '91' is not decomposable into the composite elements ek 'one' and nave 'ninety' in the way that its English counterpart is. This paper further clarifies the position of Indo-Aryan languages within the typology of numeral systems, and explores the linguistic and non-linguistic factors that may be responsible for the persistence of complex systems in these languages. Using data from multiple databases, we develop and employ a number of cross-linguistically applicable metrics to quantify the complexity of languages' numeral systems, and demonstrate that Indo-Aryan languages have decisively more complex numeral systems than the world's languages as a whole, though individual Indo-Aryan languages differ from each other in terms of the complexity of the patterns they display. We investigate the factors (e.g., religion, geographic isolation, etc.) that underlie complexity in numeral systems, with a focus on South Asia, in an attempt to develop an account of why complex numeral systems developed and persisted in certain Indo-Aryan languages but not elsewhere. Finally, we demonstrate that Indo-Aryan numeral systems adhere to certain general pressures toward efficient communication found cross-linguistically, despite their high complexity. We call for this somewhat overlooked dimension of complexity to be taken seriously when discussing general variation in numeral systems.

2505.18288 2026-04-07 stat.ML cs.LG

Operator Learning for Schrödinger Equation: Unitarity, Error Bounds, and Time Generalization

Yash Patel, Unique Subedi, Ambuj Tewari

Comments 37 pages

We consider the problem of learning the evolution operator for the time-dependent Schrödinger equation, where the Hamiltonian may vary with time. Existing neural network-based surrogates often ignore fundamental properties of the Schrödinger equation, such as linearity and unitarity, and lack theoretical guarantees on prediction error or time generalization. To address this, we introduce a linear estimator for the evolution operator that preserves a weak form of unitarity. We establish both upper bounds and lower bounds on the prediction error of the proposed estimator that hold uniformly over classes of sufficiently smooth initial wave functions. Additionally, we derive time generalization bounds that quantify how the estimator extrapolates beyond the time points seen during training. Experiments across real-world Hamiltonians -- including hydrogen atoms, ion traps for qubit design, and optical lattices -- show that our estimator achieves relative errors up to two orders of magnitude smaller than state-of-the-art methods such as the Fourier Neural Operator and DeepONet.

2503.19381 2026-04-07 cs.SE cs.LG

Towards Build Optimization Using Digital Twins

Henri Aïdasso, Francis Bordeleau, Ali Tizghadam

Comments Accepted at the 21st International Conference on Predictive Models and Data Analytics in Software Engineering (PROMISE 2025)

Despite the indisputable benefits of Continuous Integration (CI) pipelines (or builds), CI still presents significant challenges regarding long durations, failures, and flakiness. Prior studies addressed CI challenges in isolation, yet these issues are interrelated and require a holistic approach for effective optimization. To bridge this gap, this paper proposes a novel idea of developing Digital Twins (DTs) of build processes to enable global and continuous improvement. To support such an idea, we introduce the CI Build process Digital Twin (CBDT) framework as a minimum viable product. This framework offers digital shadowing functionalities, including real-time build data acquisition and continuous monitoring of build process performance metrics. Furthermore, we discuss guidelines and challenges in the practical implementation of CBDTs, including (1) modeling different aspects of the build process using Machine Learning, (2) exploring what-if scenarios based on historical patterns, and (3) implementing prescriptive services such as automated failure and performance repair to continuously improve build processes.

2503.14469 2026-04-07 cs.DB cs.AI

Causality-Based Scores Alignment in Explainable Data Management

Felipe Azua, Leopoldo Bertossi

Comments Extended version of published paper in with final revisions and appendix

Different attribution scores have been proposed to quantify the relevance of database tuples for query answering in databases; e.g. Causal Responsibility, the Shapley Value, the Banzhaf Power-Index, and the Causal Effect. They have been analyzed in isolation. This work is a first investigation of score alignment depending on the query and the database; i.e. on whether they induce compatible rankings of tuples. We concentrate mostly on causality-based scores; and provide a syntactic dichotomy result for queries: on one side, pairs of scores are always aligned, on the other, they are not always aligned. It turns out that the presence of exogenous tuples makes a crucial difference in this regard.

2501.11782 2026-04-07 cs.HC cs.AI

Human-AI Collaborative Game Testing with Vision Language Models

Boran Zhang, Muhan Xu, Zhijun Pan

Comments Experiment Report

As modern video games become increasingly complex, traditional manual testing methods are proving costly and inefficient, limiting the ability to ensure high-quality game experiences. While advancements in Artificial Intelligence (AI) offer the potential to assist human testers, the effectiveness of AI in truly enhancing real-world human performance remains underexplored. This study investigates how AI can improve game testing by developing and experimenting with an AI-assisted workflow that leverages state-of-the-art machine learning models for defect detection. Through an experiment involving 800 test cases and 276 participants of varying backgrounds, we evaluate the effectiveness of AI assistance under four conditions: with or without AI support, and with or without detailed knowledge of defects and design documentation. The results indicate that AI assistance significantly improves defect identification performance, particularly when paired with detailed knowledge. However, challenges arise when AI errors occur, negatively impacting human decision-making. Our findings show the importance of optimizing human-AI collaboration and implementing strategies to mitigate the effects of AI inaccuracies. Through this research, we demonstrate AI's potential and problems in enhancing efficiency and accuracy in game testing workflows and offer practical insights for integrating AI into the testing process.

2411.02225 2026-04-07 stat.ML cs.IT cs.LG math.IT math.ST stat.TH

Sparse Max-Affine Regression

Haitham Kanj, Seonho Kim, Kiryung Lee

This paper presents Sparse Gradient Descent (Sp-GD) as a solution for variable selection in convex piecewise linear regression, where the model is given as the maximum of $k$ affine functions $x \mapsto \max_{j \in [k]} \langle a_j^\star, x \rangle + b_j^\star$. Here, $\{ a_j^\star\}_{j=1}^k$ and $\{b_j^\star\}_{j=1}^k$ denote the ground-truth weight vectors and intercepts. A non-asymptotic local convergence analysis is provided for Sp-GD under sub-Gaussian noise when the covariate distribution satisfies the sub-Gaussianity and anti-concentration properties. When the model order and parameters are fixed, Sp-GD provides an $\epsilon$-accurate estimate given $\mathcal{O}(\max(\epsilon^{-2}\sigma_z^2,1)s\log(d/s))$ observations where $\sigma_z^2$ denotes the noise variance. This also implies the exact parameter recovery by Sp-GD from $\mathcal{O}(s\log(d/s))$ noise-free observations. The proposed initialization scheme uses sparse principal component analysis to estimate the subspace spanned by $\{ a_j^\star\}_{j=1}^k$, then applies an $r$-covering search to estimate the model parameters. A non-asymptotic analysis is presented for this initialization scheme when the covariates and noise samples follow Gaussian distributions. When the model order and parameters are fixed, this initialization scheme provides an $\epsilon$-accurate estimate given $\mathcal{O}(\epsilon^{-2}\max(\sigma_z^4,\sigma_z^2,1)s^2\log^4(d))$ observations. A new transformation named Real Maslov Dequantization (RMD) is proposed to transform sparse generalized polynomials into sparse max-affine models. The error decay rate of RMD is shown to be exponentially small in its temperature parameter. Furthermore, theoretical guarantees for Sp-GD are extended to the bounded noise model induced by RMD. Numerical Monte Carlo results corroborate theoretical findings for Sp-GD and the initialization scheme.
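
For readers unfamiliar with the model class, the max-affine function $x \mapsto \max_{j \in [k]} \langle a_j^\star, x \rangle + b_j^\star$ can be evaluated as in the sketch below. It shows only the forward model, not Sp-GD or the initialization scheme; the function name and the toy parameters are illustrative assumptions, with sparsity shown as weight vectors that all ignore the same coordinates.

```python
from typing import List, Sequence


def max_affine(x: Sequence[float],
               weights: List[Sequence[float]],
               intercepts: Sequence[float]) -> float:
    """Evaluate max_j (<a_j, x> + b_j) over the k affine pieces."""
    return max(
        sum(a_i * x_i for a_i, x_i in zip(a_j, x)) + b_j
        for a_j, b_j in zip(weights, intercepts)
    )


# Toy model with k = 2 pieces in d = 3 dimensions; only coordinate 0 matters.
weights = [[1.0, 0.0, 0.0], [-1.0, 0.0, 0.0]]
intercepts = [0.0, 0.5]
value = max_affine([2.0, 7.0, -3.0], weights, intercepts)
```

Changing the second or third coordinate of the input leaves the output unchanged here, which is the structure that variable selection aims to detect.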

2410.21169 2026-04-07 cs.MM cs.AI cs.CL cs.CV

Document Parsing Unveiled: Techniques, Challenges, and Prospects for Structured Information Extraction

Qintong Zhang, Bin Wang, Victor Shea-Jay Huang, Junyuan Zhang, Zhengren Wang, Hao Liang, Conghui He, Wentao Zhang

Document parsing (DP) transforms unstructured or semi-structured documents into structured, machine-readable representations, enabling downstream applications such as knowledge base construction and retrieval-augmented generation (RAG). This survey provides a comprehensive and timely review of document parsing research. We propose a systematic taxonomy that organizes existing approaches into modular pipeline-based systems and unified models driven by Vision-Language Models (VLMs). We provide a detailed review of key components in pipeline systems, including layout analysis and the recognition of heterogeneous content such as text, tables, mathematical expressions, and visual elements, and then systematically track the evolution of specialized VLMs for document parsing. Additionally, we summarize widely adopted evaluation metrics and high-quality benchmarks that establish current standards for parsing quality. Finally, we discuss key open challenges, including robustness to complex layouts, reliability of VLM-based parsing, and inference efficiency, and outline directions for building more accurate and scalable document intelligence systems.

2408.05857 2026-04-07 cs.ET cs.LG

Comparative Evaluation of Memory Technologies for Synaptic Crossbar Arrays- Part 2: Design Knobs and DNN Accuracy Trends

Jeffry Victor, Chunguang Wang, Sumeet K. Gupta

Journal ref IEEE Transactions on Circuits and Systems I: Regular Papers, vol. 72, no. 10, pp. 5708-5721, Oct. 2025

Crossbar memory arrays have been touted as the workhorse of in-memory computing (IMC)-based acceleration of Deep Neural Networks (DNNs), but the associated hardware non-idealities limit their efficacy. To address this, cross-layer design solutions that reduce the impact of hardware non-idealities on DNN accuracy are needed. In Part 1 of this paper, we established the co-optimization strategies for various memory technologies and their crossbar arrays, and conducted a comparative technology evaluation in the context of IMC robustness. In this part, we analyze various design knobs such as array size and bit-slice (number of bits per device) and their impact on the performance of 8T SRAM, ferroelectric transistor (FeFET), Resistive RAM (ReRAM) and spin-orbit-torque magnetic RAM (SOT-MRAM) in the context of inference accuracy at the 7nm technology node. Further, we study the effect of circuit design solutions such as Partial Wordline Activation (PWA) and custom ADC reference levels that reduce the hardware non-idealities and comparatively analyze the response of each technology to such accuracy-enhancing techniques. Our results on ResNet-20 (with CIFAR-10) show that PWA increases accuracy by up to 32.56% while custom ADC reference levels yield up to 31.62% accuracy enhancement. We observe that compared to the other technologies, FeFET, by virtue of its small layout height and high distinguishability of its memory states, is best suited for large arrays. For higher bit-slices and a more complex dataset (ResNet-50 with CIFAR-100) we found that ReRAM matches the performance of FeFET.

2306.06581 2026-04-07 stat.ML cs.DS cs.LG math.OC

Importance Sparsification for Sinkhorn Algorithm

Mengyu Li, Jun Yu, Tao Li, Cheng Meng

Comments Accepted by Journal of Machine Learning Research

The Sinkhorn algorithm has been used pervasively to approximate the solution to optimal transport (OT) and unbalanced optimal transport (UOT) problems. However, its practical application is limited due to the high computational complexity. To alleviate the computational burden, we propose a novel importance sparsification method, called Spar-Sink, to efficiently approximate entropy-regularized OT and UOT solutions. Specifically, our method employs natural upper bounds for unknown optimal transport plans to establish effective sampling probabilities, and constructs a sparse kernel matrix to accelerate Sinkhorn iterations, reducing the computational cost of each iteration from $O(n^2)$ to $\widetilde{O}(n)$ for a sample of size $n$. Theoretically, we show the proposed estimators for the regularized OT and UOT problems are consistent under mild regularity conditions. Experiments on various synthetic data demonstrate Spar-Sink outperforms mainstream competitors in terms of both estimation error and speed. A real-world echocardiogram data analysis shows Spar-Sink can effectively estimate and visualize cardiac cycles, from which one can identify heart failure and arrhythmia. To evaluate the numerical accuracy of cardiac cycle prediction, we consider the task of predicting the end-systole time point using the end-diastole one. Results show Spar-Sink performs as well as the classical Sinkhorn algorithm, requiring significantly less computational time.
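
As background for the $O(n^2)$ per-iteration cost mentioned in this abstract, the classical dense Sinkhorn iteration can be sketched as follows. This is the standard baseline and not Spar-Sink itself: the sampling probabilities and sparse kernel are not reproduced, and the function name, the regularization value `eps`, and the iteration count are assumptions chosen for the toy example.

```python
import math
from typing import List


def sinkhorn(a: List[float], b: List[float], cost: List[List[float]],
             eps: float = 0.5, n_iter: int = 500) -> List[List[float]]:
    """Dense Sinkhorn iterations for entropy-regularized optimal transport."""
    n, m = len(a), len(b)
    kernel = [[math.exp(-cost[i][j] / eps) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iter):
        # Each update is a kernel matrix-vector product: O(n * m) when dense.
        u = [a[i] / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    # Transport plan: diag(u) * kernel * diag(v).
    return [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]


a = [0.5, 0.5]            # source marginal
b = [0.25, 0.75]          # target marginal
cost = [[0.0, 1.0], [1.0, 0.0]]
plan = sinkhorn(a, b, cost)
```

The two matrix-vector products inside the loop dominate the cost, so a kernel with far fewer nonzero entries would make each iteration proportionally cheaper.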

2604.04928 2026-04-07 math.DG math.AP math.GT

Topology of minimal surfaces in the sphere from capillarity

Benjy Firester, Raphael Tsiamis

We present a general construction of embedded minimal and constant mean curvature surfaces in $\mathbb{S}^n$ and one-phase free boundaries joined by a smooth interpolation by capillary hypersurfaces. This framework recovers all known families and produces new minimal surfaces in the sphere with rich topological structures as sphere bundles over base spaces which include space-form products, projective planes over division algebras, Stiefel manifolds, complex quadrics, and twisted products and quotients of Lie subgroups of $SO(n)$. We show these bundles are non-trivial and study their homotopy types using topological obstructions, including characteristic classes and tools from $K$-theory and stable homotopy theory. Finally, we prove uniqueness results for the rotationally invariant capillary CMC problem.

2604.04926 2026-04-07 cs.CR cs.HC

Comprehensive List of User Deception Techniques in Emails

Maxime Veit, Mattia Mossano, Tobias Länge, Melanie Volkamer

Email remains a central communication medium, yet its long-standing design and interface conventions continue to enable deceptive attacks. This research note presents a structured list of 42 email-based deception techniques, documented with 64 concrete example implementations, organized around the sender, link, and attachment security indicators as well as techniques targeting the email rendering environment. Building on a prior systematic literature review, we consolidate previously reported techniques with newly developed example implementations and introduce novel deception techniques identified through our own examination. Rather than assessing effectiveness or real-world severity, each entry explains the underlying mechanism in isolation, separating the high-level deception goal from its concrete technical implementation. The documented techniques serve as modular building blocks and a structured reference for future work on countermeasures across infrastructure, email client design, and security awareness, supporting researchers as well as developers, operators, and designers working in these areas.

2604.04922 2026-04-07 math.PR cond-mat.stat-mech math-ph math.MP

Elephant random walk on the infinite dihedral group $\mathbb{Z}_2 * \mathbb{Z}_2$

Soumendu Sundar Mukherjee, Himasish Talukdar

Comments 21 pages, 2 figures

Elephant random walks were studied recently in \cite{mukherjee2025elephant} on the groups $\mathbb{Z}^{*d_1} * \mathbb{Z}_2^{*d_2}$ whose Cayley graphs are infinite $d$-regular trees with $d = 2d_1 + d_2$. It was found that for $d \ge 3$, the elephant walk is ballistic with the same asymptotic speed $\frac{d - 2}{d}$ as the simple random walk and the memory parameter appears only in the rate of convergence to the limiting speed. In the $d = 2$ case, there are two such groups, both having the bi-infinite path as their Cayley graph. For $(d_1, d_2) = (1, 0)$, the walk is the usual elephant random walk on $\mathbb{Z}$, which exhibits anomalous diffusion. In this article, we study the other case, namely $(d_1, d_2) = (0, 2)$, which corresponds to the infinite dihedral group $D_\infty \cong \mathbb{Z}_2 * \mathbb{Z}_2$. Unlike the classical ERW on $\mathbb{Z}$, which is a time-inhomogeneous Markov chain, the ERW on $D_{\infty}$ is non-Markovian. We show that the first and second order behaviours of the signed location of the walker agree with those of the simple symmetric random walk on $\mathbb{Z}$, with the memory parameter essentially manifesting itself via a lower order correction term that can be written as an explicit functional of the elephant walk on $\mathbb{Z}$. Our result demonstrates that unlike the simple random walk, the elephant walk is sensitive to local algebraic relations. Indeed, although $D_{\infty}$ is virtually abelian, containing $\mathbb{Z}$ as a finite-index subgroup, the involutive nature of its generators effectively neutralises memory, thereby ruling out any potential superdiffusive behaviour, in contrast to the superdiffusion observed on its abelian cousin $\mathbb{Z}$.

2604.04919 2026-04-07 math.CT math.AT

Categorical Perspectives on Chemical Reaction Networks

Justin Curry, Mauricio Montes

We show that the Schur-complement reduction of a chemical reaction network (CRN) from Hirono et al. is the categorical complement of the stoichiometric arrow in the arrow category $[\mathbf{A}_2,\mathbf{Vect}]$. This identifies the ambient category in which topological reduction of chemical reaction networks is functorial and explains the reduced stoichiometric matrix as a universal diagrammatic construction. We further define a reconstruction functor from a restricted subcategory of $[\mathbf{A}_2, \mathbf{Vect}]$ back to CRNs and prove an adjunction with the stoichiometric functor.

2604.04918 2026-04-07 cs.HC

Comparing Human Oversight Strategies for Computer-Use Agents

Chaoran Chen, Zhiping Zhang, Zeya Chen, Eryue Xu, Yinuo Yang, Ibrahim Khalilov, Simret A Gebreegziabher, Yanfang Ye, Ziang Xiao, Yaxing Yao, Tianshi Li, Toby Jia-Jun Li

LLM-powered computer-use agents (CUAs) are shifting users from direct manipulation to supervisory coordination. Existing oversight mechanisms, however, have largely been studied as isolated interface features, making broader oversight strategies difficult to compare. We conceptualize CUA oversight as a structural coordination problem defined by delegation structure and engagement level, and use this lens to compare four oversight strategies in a mixed-methods study with 48 participants in a live web environment. Our results show that oversight strategy more reliably shaped users' exposure to problematic actions than their ability to correct them once visible. Plan-based strategies were associated with lower rates of agent problematic-action occurrence, but not equally strong gains in runtime intervention success once such actions became visible. On subjective measures, no single strategy was uniformly best, and the clearest context-sensitive differences appeared in trust. Qualitative findings further suggest that intervention depended not only on what controls users retained, but on whether risky moments became legible as requiring judgment during execution. These findings suggest that effective CUA oversight is not achieved by maximizing human involvement alone. Instead, it depends on how supervision is structured to surface decision-critical moments and support their recognition in time for meaningful intervention.

2604.04915 2026-04-07 cs.HC

Exploring Expert Perspectives on Wearable-Triggered LLM Conversational Support for Daily Stress Management

Poorvesh Dongre, Sameer Neupane, Priyanka Jadhav, Nikitha Donekal Chandrashekar, Christian Webb, Denis Gračanin

English abstract

Wearable devices increasingly support stress detection, while LLMs enable conversational mental health support. However, designing systems that meaningfully connect wearable-triggered stress events with generative dialogue remains underexplored, particularly from a design perspective. We present EmBot, a functional mobile application that combines wearable-triggered stress detection with LLM-based conversational support for daily stress management. We used EmBot as a design probe in semi-structured interviews with 15 mental health experts to examine their perspectives and surface early design tensions and considerations that arise from wearable-triggered conversational support, informing the future design of such systems for daily stress management and mental health support.

2604.04912 2026-04-07 cs.DS

Dominating Set with Quotas: Balancing Coverage and Constraints

Sobyasachi Chatterjee, Sushmita Gupta, Saket Saurabh, Sanjay Seetharaman, Anannya Upasana

Comments 24 pages; full version of the paper to appear in IWOCA 2026

English abstract

We study a natural generalization of the classical \textsc{Dominating Set} (DS) problem, called \textsc{Dominating Set with Quotas} (DSQ). In this problem, we are given a graph \( G \), an integer \( k \), and for each vertex \( v \in V(G) \), a lower quota \( \mathrm{lo}_v \) and an upper quota \( \mathrm{up}_v \). The goal is to determine whether there exists a set \( S \subseteq V(G) \) of size at most \( k \) such that for every vertex \( v \in V(G) \), the number of vertices in its closed neighborhood that belong to \( S \), i.e., \( |N[v] \cap S| \), lies within the range \( [\mathrm{lo}_v, \mathrm{up}_v] \). This richer model captures a variety of practical settings where both under- and over-coverage must be avoided -- such as in fault-tolerant infrastructure, load-balanced facility placement, or constrained communication networks. While DS is already known to be computationally hard, we show that the added expressiveness of per-vertex quotas in DSQ introduces additional algorithmic challenges. In particular, we prove that DSQ becomes W[1]-hard even on structurally sparse graphs -- such as those with degeneracy 2, or excluding \( K_{3,3} \) as a subgraph -- despite these classes admitting FPT algorithms for DS. On the positive side, we show that DSQ is fixed-parameter tractable when parameterized by solution size and treewidth, and more generally, on nowhere dense graph classes. Furthermore, we design a subexponential-time algorithm for DSQ on apex-minor-free graphs using the bidimensionality framework. These results collectively offer a refined view of the algorithmic landscape of DSQ, revealing a sharp contrast with the classical DS problem and identifying the key structural properties that govern tractability.
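The DSQ feasibility condition stated in the abstract can be made concrete with a small brute-force sketch. This is only an illustration of the problem definition, not any of the algorithms from the paper; the function names, the adjacency-dictionary representation, and the exhaustive search are all assumptions chosen for readability, and the running time is exponential in the number of vertices.

```python
from itertools import combinations


def closed_neighborhood(adj, v):
    """Return N[v]: the vertex v together with its neighbors."""
    return {v} | set(adj[v])


def satisfies_quotas(adj, lo, up, S):
    """Check lo[v] <= |N[v] ∩ S| <= up[v] for every vertex v."""
    S = set(S)
    return all(
        lo[v] <= len(closed_neighborhood(adj, v) & S) <= up[v]
        for v in adj
    )


def solve_dsq_bruteforce(adj, lo, up, k):
    """Return some set S with |S| <= k meeting all quotas, or None.

    Hypothetical helper: tries every subset of size 0..k, smallest first.
    """
    vertices = sorted(adj)
    for size in range(min(k, len(vertices)) + 1):
        for S in combinations(vertices, size):
            if satisfies_quotas(adj, lo, up, S):
                return set(S)
    return None


# Path graph 0 - 1 - 2 where every vertex must be covered exactly once.
path = {0: [1], 1: [0, 2], 2: [1]}
exactly_once = solve_dsq_bruteforce(
    path, lo={0: 1, 1: 1, 2: 1}, up={0: 1, 1: 1, 2: 1}, k=1
)
```

Setting every lower quota to 1 and every upper quota to the closed-neighborhood size recovers the classical Dominating Set condition, which is one way to see DSQ as a generalization.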

2604.04910 2026-04-07 math.GT math.CO

Morse functions with regular level sets consisting of $2$-dimensional spheres, $2$-dimensional tori, or Klein Bottles

Naoki Kitazawa

Comments 14 pages. 7 figures

English abstract

In this paper, we study Morse functions with regular level sets consisting of spheres, tori, or Klein bottles on $3$-dimensional closed manifolds. We characterize $3$-dimensional manifolds represented by connected sums each of whose summands is the product $S^1 \times S^2$ of the circle $S^1$ and the sphere $S^2$, a lens space, or a non-orientable closed and connected manifold of genus $1$ by a certain subclass of such Morse functions. This extends the orientable case, established by Saeki in 2006. It is also a variant of the author's extension of that result, which characterizes $3$-dimensional orientable manifolds represented by connected sums each of whose summands is the product $S^1 \times S^2$, a lens space, or a torus bundle over $S^1$ by a certain class of Morse-Bott functions. We also classify Morse functions with given regular level sets consisting of $S^2$, $S^1 \times S^1$, or Klein bottles in a certain sense, generalizing some previous work by the author.

2604.04909 2026-04-07 cond-mat.other cs.NA math.NA physics.chem-ph

Weak Solutions to the Bloch Equations with Distant Dipolar Field

Louis-S. Bouchard

Comments 28 pages, 9 figures, 3 tables

English abstract

The distant dipolar field (DDF) is a long-range, nonlocal contribution to liquid-state spin dynamics that arises from intermolecular dipolar couplings and can generate multiple-quantum coherences and novel MRI contrast. Its sign-changing kernel makes Bloch-DDF dynamics strongly geometry dependent, and FFT-based dipolar convolutions naturally assume periodic or padded Cartesian domains rather than bounded samples with reflective diffusion boundaries. We study the Bloch equations with the DDF on bounded domains under homogeneous Neumann diffusion conditions. We derive a finite-element weak formulation that supports spatially varying diffusion and relaxation parameters and uses a short-distance regularization of the secular DDF kernel with length $a>0$. For fixed $a$ we prove boundedness of the DDF operator, establish an $L^2$ energy balance in which precession is neutral while diffusion and transverse relaxation are dissipative, and obtain local well-posedness with continuous dependence on the data, with global existence under energy-neutral transport. For the Galerkin semi-discretization we show a discrete energy identity mirroring the continuum estimate. For computation, we evaluate the DDF in real space with a matrix-free near/far scheme and advance in time using a second-order IMEX splitting method that treats diffusion and relaxation implicitly and precession explicitly. The explicit stage applies a Rodrigues rotation at DDF quadrature points followed by an $L^2$ projection, enabling stable multi-cycle lab-frame simulations. We validate against three closed-form benchmarks and quantify curved-boundary effects by comparing mapped finite elements with a voxel-mask finite-difference baseline on spherical Neumann eigenmode decay. These results provide an analyzable and reproducible route for Bloch-DDF dynamics on bounded domains with complex geometry.

2604.04907 2026-04-07 math.CO

Counting geodesic paths in graphs

Martin Knor, Jelena Sedlar, Riste Škrekovski, Xiao-Dong Zhang

Comments 23 pages, 5 figures

English abstract

A geodesic is a shortest path connecting a pair of vertices of a graph G. In this paper we define the geodesic subpath number gpn(G) of a graph G as the number of geodesics in G. The numbers of subtrees and subpaths have already been studied in the literature, but both are large quantities. Hence the geodesic subpath number, which is related to these quantities but smaller than both, seems worthy of investigation. We first consider extremal graphs with respect to the geodesic subpath number among all connected graphs on n vertices. This number is minimized by the so-called geodetic graphs, i.e. graphs in which each pair of vertices is connected by precisely one geodesic. As for the graphs which maximize the geodesic subpath number, we provide an upper bound on gpn(G) in terms of n, and we further consider several graph families which might have a large gpn(G). Yet their values of gpn(G) still do not attain the established bound, so narrowing the gap remains an open problem. We also consider the class of cactus graphs on n vertices and k cycles and characterize the extremal graphs among them with respect to this new invariant.
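To make the invariant concrete, here is a minimal sketch that counts geodesics with breadth-first search. It assumes an undirected graph given as an adjacency dictionary and counts shortest paths between unordered pairs of distinct vertices; whether the paper also counts trivial one-vertex paths is not stated in the abstract, so that convention is an assumption, as are the function names.

```python
from collections import deque


def count_geodesics_from(adj, source):
    """Return {v: number of shortest source-v paths} via BFS."""
    dist = {source: 0}
    count = {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                # First time w is reached: record its BFS distance.
                dist[w] = dist[u] + 1
                count[w] = 0
                queue.append(w)
            if dist[w] == dist[u] + 1:
                # u precedes w on a shortest path, so extend every
                # shortest source-u path by the edge (u, w).
                count[w] += count[u]
    return count


def geodesic_subpath_number(adj):
    """Count geodesics over unordered pairs of distinct vertices."""
    total = 0
    for s in adj:
        counts = count_geodesics_from(adj, s)
        total += sum(c for v, c in counts.items() if v != s)
    # Each unordered pair was counted once from each endpoint.
    return total // 2


path3 = {0: [1], 1: [0, 2], 2: [1]}
cycle4 = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
gpn_path3 = geodesic_subpath_number(path3)
gpn_cycle4 = geodesic_subpath_number(cycle4)
```

Under this convention a geodetic graph on n vertices, such as a tree, has exactly n(n-1)/2 geodesics, while the 4-cycle has more because each pair of opposite vertices is joined by two shortest paths.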

2604.04904 2026-04-07 cs.HC cs.CY

Demonstrating SIMA-Play: A Serious Game for Forest Management Decision-Making through Board Game and Digital Simulation

Arka Majhi, Daniel Fernández Galeote, Timo Nummenmaa, Juho Hamari, Aaron Petty, Jari Vauhkonen, Heli Peltola

Comments Accepted to the GamiFIN 2026 conference

English abstract

Board games have shown promise as educational tools, but their use in engaging learners with the complex, long-term trade-offs of forest management remains strikingly underdeveloped. Addressing this gap, we investigate how forest growth simulation data can inform decision-making through information visualization and gameplay mechanics. We designed a serious game, SIMA-Play, that enables players to make informed forest management decisions under dynamic environmental and market conditions, simulating forest growth over time and comparing player performance across economic and sustainability outcomes. By using visualization to give players feedback on their choices at the end of the game, it supports systems thinking and makes the trade-offs in forestry practices easier to understand and discuss. The study concludes with a research roadmap that outlines future experiments, longitudinal studies, and digital versions of SIMA-Play to assess its long-term effects on learning and engagement.

2604.04903 2026-04-07 physics.optics physics.comp-ph

Maximally localized modes of a multimode fiber

Nicolas Barré

English abstract

This article presents an optimization method to find the most spatially concentrated basis of a multimode fiber, obtained by minimizing the sum of the spatial spreads of the individual modes over all unitary transformations of a given orthonormal mode set. The resulting modes are the optical analogue of maximally localized Wannier functions in solid-state physics. We apply the method to the Laguerre-Gaussian basis of a graded-index fiber for mode counts ranging from 6 to 55. In all cases, the modes spontaneously organize into concentric rings without any geometric constraint being imposed. The spot sizes and ellipticities evolve from one ring to the next in ways that geometric packing approaches cannot predict. For large mode counts, the optimizer finds solutions where neither the number of spots per ring nor the spots within a given ring follow a regular pattern, indicating that the fully symmetric arrangement is no longer a minimum of the spread functional. A constrained variant of the method enables the optimizer to target any prescribed bundle geometry while quantifying its localization cost, opening a route to physically grounded photonic lantern design.

2604.04900 2026-04-07 math.CO

On Semisymmetric Height and a Multidimensional Generalization of Weighted Catalan Numbers

Ryota Inagaki, Dimana Pramatarova

Comments 36 pages, 4 figures, 6 tables

English abstract

Weighted Catalan numbers are a class of weighted sums over Dyck paths. Well-studied for their arithmetic properties and applications to enumerative combinatorics, these numbers were recently generalized to the setting of $k$-dimensional Catalan numbers for $k \geq 2$. In this paper, we introduce the $k$-dimensional semisymmetric weighted Catalan numbers ($k$-dimensional SSWCNs), an alternative $k$-dimensional generalization, along with their variant, the $k$-dimensional $u$-bounded semisymmetric weighted Catalan numbers ($k$-dimensional $u$-bounded SSWCNs). We define these two classes of numbers using the notion of semisymmetric height, a new statistic on points in $\mathbb{Z}^k_{\geq 0}$ motivated by geometric symmetries of $k$-dimensional analogs of Dyck paths and of the fundamental Weyl chamber of type $A_{k-1}$. For our main results, we prove the eventual periodicity of $k$-dimensional SSWCNs and their $u$-bounded variants modulo a suitable integer $m$, and we derive formulas for several classes of $k$-dimensional $u$-bounded SSWCNs. Additionally, using semisymmetric height, we derive novel analogs in the $k$-dimensional setting of the integer sequence counting Dyck paths by height and of the Narayana numbers. We conclude the paper with a future direction for generalizing weighted Catalan numbers to the $k$-dimensional setting.
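For orientation, the classical (one-parameter) weighted Catalan numbers can be computed with a short dynamic program. The sketch below assumes one common convention: given weights b(0), b(1), ..., a Dyck path is weighted by the product of b(h) over its up-steps starting at height h, and the n-th weighted Catalan number is the sum of these weights over all Dyck paths with 2n steps. It does not implement the semisymmetric $k$-dimensional variants introduced in the paper, and the function name is hypothetical.

```python
def weighted_catalan(n, b):
    """Sum of weights of Dyck paths with 2n steps.

    b is a function mapping a height h >= 0 to the weight of an
    up-step that starts at height h (assumed convention).
    """
    # ways[h] = total weight of partial paths currently at height h.
    ways = [0] * (n + 1)
    ways[0] = 1
    for _ in range(2 * n):
        nxt = [0] * (n + 1)
        for h in range(n + 1):
            if ways[h] == 0:
                continue
            if h + 1 <= n:
                # Up-step from height h, weighted by b(h). Heights above
                # n can never return to 0 within 2n steps, so they are
                # safely ignored.
                nxt[h + 1] += ways[h] * b(h)
            if h > 0:
                # Down-steps carry weight 1.
                nxt[h - 1] += ways[h]
        ways = nxt
    return ways[0]


# With all weights equal to 1 the ordinary Catalan numbers appear.
catalan = [weighted_catalan(n, lambda h: 1) for n in range(6)]

# Residues modulo m can be inspected by reducing each term.
residues_mod_4 = [weighted_catalan(n, lambda h: 2 * h + 1) % 4 for n in range(8)]
```

The second list is only meant as a way to experiment with residues modulo an integer; it makes no claim about the periodicity results proved in the paper.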

2604.04899 2026-04-07 quant-ph

Connection between the contextuality breaking and incompatibility breaking qubit channels

Swati Kumari, Sumit Mukherjee, R. Prabhu

Comments 10 pages, 4 figures

English abstract

Contextuality and measurement incompatibility are two fundamental aspects of nonclassicality, and their manifestations in observed quantum correlations are often deeply interconnected. Recently, measurement incompatibility has been studied in connection with nonlocality, particularly in terms of their robustness under various quantum channels. This line of investigation helps establish a connection between the channels that break nonlocality and those that break incompatibility. In this study, we focus on an asymmetric bipartite Bell scenario involving three and four inputs on Alice's and Bob's sides, respectively, with each of these inputs having dichotomous outcomes. Under the assumption of locality, the observed statistics in this asymmetric scenario obey the Elegant Bell inequality (EBI). Here, we use a different version of the EBI that relies on the assumption of preparation noncontextuality. By taking the violation of this noncontextual version of the EBI as a witness of preparation contextuality, we establish a connection between the channels that break contextuality and the channels that break triple-wise measurement incompatibility. Our results suggest that any channel which breaks EBI contextuality will also break Clauser-Horne-Shimony-Holt (CHSH) nonlocality; however, the reverse does not hold. We also show that a depolarising channel that breaks N-wise incompatibility can also break a certain form of contextuality, witnessed by a generalised inequality involving N measurements on one wing of a bipartite Bell scenario.