arXivDaily: arXiv daily academic digest, updated Monday through Friday
2605.07490 2026-05-11 cs.CR

Cross-Modal Backdoors in Multimodal Large Language Models

Runhe Wang, Li Bai, Haibo Hu, Songze Li

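AI summary: This paper studies cross-modal backdoor attacks that may exist in the lightweight connectors of multimodal large language models. The authors propose a new attack that poisons only the connector module, using a single seed sample from one modality and its augmented variants, so that the backdoor can later be activated by inputs from other modalities. Experiments show that the attack achieves high success rates and cross-modal transferability on several representative multimodal models, is highly stealthy, and is hard for existing defenses to counter, revealing a fundamental security vulnerability in multimodal alignment.

Abstract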

Developers increasingly construct multimodal large language models (MLLMs) by assembling pretrained components, introducing supply-chain attack surfaces. Existing security research primarily focuses on poisoning backbones such as encoders or large language models (LLMs), while the security risks of lightweight connectors remain unexplored. In this work, we propose a novel cross-modal backdoor attack that exploits this overlooked vulnerability. By poisoning only the connector using a single seed sample and several augmented variants from one modality, the adversary can subsequently activate the backdoor using inputs from other modalities. To achieve this, we first poison the connector to associate a compact latent region with a malicious target output. To activate the backdoor from other modalities, we further extract a malicious centroid from the poisoned latent representations and perform input-side optimization to steer inputs toward this latent anchor, without requiring repeated API queries or full-model access. Extensive evaluations on representative connector-based MLLM architectures, including PandaGPT and NExT-GPT, demonstrate both the effectiveness and cross-modal transferability of the proposed attack. The attack achieves up to 99.9% attack success rate (ASR) in same-modality settings, while most cross-modal settings exceed 95.0% ASR under bounded perturbations. Moreover, the attack remains highly stealthy, producing negligible leakage on clean inputs and maintaining weight-cosine similarity above 0.97 relative to benign connectors. We further show that existing defense strategies fail to effectively mitigate this threat without incurring substantial utility degradation. These findings reveal a fundamental vulnerability in multimodal alignment: a single compromised connector can establish a reusable latent-space backdoor pathway across modalities, highlighting the need for safer modular MLLM design.

2605.07486 2026-05-11 cs.CR

Spying Across Chiplets: Side-Channel Attacks in 2.5/3D Integrated Systems

Giorgio Di Natale, Christelle Rabache, Pierre-Louis Hellier, Florence Podevin, Sylvain Bourdel, Romain Siragusa, Paolo Maistri

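AI summary: This paper studies a new threat in 2.5D/3D integrated systems: side-channel attacks mounted through a chiplet's communication interface. The study proposes that a chiplet module originally intended for external communication can be exploited by an attacker as a platform for eavesdropping on the activity of neighboring chiplets. By building a realistic attack model and validating it experimentally, the authors show that such interfaces can leak information about the operation of neighboring chiplets, revealing new security risks in advanced packaging technology.

Abstract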

Advanced packaging and chiplet-based integration are increasingly adopted to build complex heterogeneous systems beyond the limits of monolithic scaling. While these architectures offer major benefits in terms of modularity, yield, and performance, they also introduce new physical attack surfaces. In this paper, we show that side-channel attacks can be mounted across chiplets within the same package or stack. Our key idea is that a communication-oriented chiplet, originally intended to interact with the external environment through an antenna, an RFID-like element, or another contactless coupling structure, can be repurposed as an internal observation platform. We formalize this threat through a realistic adversary model, describe the corresponding attack principle, and experimentally assess its feasibility. The obtained results demonstrate that signals captured through such a communication-oriented interface can reveal information correlated with the activity of a neighboring victim chiplet.

2605.07484 2026-05-11 math.NT

Anderson generating function of rank-one Drinfeld Module over rational function fields

Chuangqiang Hu, Xiao-Min Huang, Stephen S. -T. Yau

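AI summary: This paper reports a breakthrough in the arithmetic of rank-one Drinfeld modules over rational function fields by deriving explicit formulas over the integral domain $\A$, generalizing the classical polynomial-ring case to the projective line associated with an infinite place of degree $N \geqslant 2$. The study constructs the corresponding Anderson generating functions and links them to the Carlitz period via Pellarin's series, exponential torsion modules, and logarithmic deformations. It reveals a key distinction from Carlitz theory and introduces an exponential action to overcome the obstruction caused by the Galois group action, providing tools for studying such Drinfeld modules and their associated $L$-functions. The results also extend to arbitrary Dedekind domains, opening new research directions in Drinfeld module theory.

Abstract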

We establish a fundamental breakthrough in rank-one Drinfeld module arithmetic by deriving explicit formulas over the integral domain $\A = H^{0}(\mathbb{P}^1-P_ρ, \mathcal{O}_{\mathbb{P}^1})$, which generalizes the classical polynomial ring ($N=1$) to the projective line associated with an infinite place of degree $N \geqslant 2$. This fills a longstanding gap by developing a comprehensive parallel to Carlitz module theory, which is foundational in positive characteristic arithmetic, for the understudied case of infinite places of degree $>1$. We construct Anderson generating functions for these modules and link them to the Carlitz period via Pellarin's series, exponential torsion modules, and logarithmic deformations. These constructions provide powerful tools for studying such Drinfeld modules and their associated $L$-series, which are central to modern number theory. A key result reveals a critical distinction from Carlitz theory: the standard Anderson generating function residue formula fails due to the Galois group action. We resolve this obstruction by introducing an exponential action, enabling simultaneous study of all twisted exponential functions, a major methodological advance. We further show that Anderson generating function computation involves the dual of Drinfeld modules, leading to an appropriate modification of the residue formula. Notably, our natural approach generalizes to arbitrary Dedekind domains, extending our results beyond $\A$ and opening new avenues in Drinfeld module theory.

2605.07480 2026-05-11 physics.optics

Static SERS with near-minus-one-epsilon substrate

Alexey P. Vinogradov, Evgeny S. Andrianov

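AI summary: This paper proposes a new mechanism for further enhancing surface-enhanced Raman scattering (SERS) in the nanoparticle-on-mirror configuration, using a substrate material with near-minus-one permittivity. An oblate spheroidal plasmonic nanoparticle is placed above the substrate with a Raman-active molecule between them; the substrate's enhanced image of the nanoparticle dipole makes the two dipoles radiate simultaneously, raising the SERS signal intensity by a factor of ten thousand. The approach offers a new physical route to high-sensitivity static SERS detection.

Abstract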

A mechanism for additional enhancement of SERS in the nanoparticle-on-mirror scheme is proposed. This new mechanism is based on the use of a substrate made of a material with near-minus-one permittivity. The setup involves a plasmonic nanoparticle in the form of an oblate ellipsoid positioned above the substrate and a Raman-active molecule located between them. In the conventional nanoparticle-on-mirror scheme, the plasmonic dipole resonance frequency coincides with the Stokes frequency of the Raman-active molecule. Consequently, due to the Purcell effect, the molecule's near field mostly excites a dipole mode in the nanoparticle. This dipole moment is many times greater than the dipole moment of the molecule by itself. If the real part of the substrate permittivity is near minus one, the image of the nanoparticle dipole moment in the mirror-substrate is a dipole moment pointed in the same direction but approximately $1/{\rm Im}\,\varepsilon_{\rm ENZ}$ times larger in magnitude. The simultaneous radiation of these two dipoles additionally increases the SERS intensity by a factor of $10^4$.

2605.07479 2026-05-11 astro-ph.HE

Diffuse gamma-ray emissions around the stellar cluster Berkeley 59

Ziwei Ou, Xiaolong Yang, Songpeng Pei

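AI summary: Using Fermi satellite gamma-ray data, this paper presents a detailed analysis of extended gamma-ray emission around the young stellar cluster Berkeley 59. It finds significant extended GeV emission whose morphology is described by a radial disk model of radius 1.02 degrees, with an extension significance of 10.6σ. Combining the distributions of molecular, neutral, and ionized gas, the study analyzes the high-energy origin of the emission, attributes it to cosmic rays accelerated by stellar winds interacting with the surrounding gas, and derives the relationship between cosmic-ray acceleration efficiency and the diffusion coefficient.

Abstract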

We report a detailed analysis of the young stellar cluster Berkeley 59 using Fermi-LAT. Using an up-to-date source catalog and background models, we find significant extended GeV emission around Berkeley 59, which can be modeled by a radial disk of 1.02 degree radius with an extension significance of 10.6 sigma. We investigate the molecular, neutral, and ionized gas content and the hadronic origin. The gamma-ray spectrum of Berkeley 59 has a photon index of 2.88. The gas mass derived from H2 and HII around Berkeley 59 is about 289 solar masses. We derive the relationship between cosmic-ray acceleration efficiency and diffusion coefficient. Our results suggest that the extended gamma-ray emission originates from cosmic rays accelerated by cluster winds interacting with surrounding gas.

2605.07475 2026-05-11 cs.NE cs.ET eess.SP

Broken-symmetry shape discrimination on a driven Duffing ring

Kaspar Anton Schindler

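AI summary: This study examines broken-symmetry shape discrimination on a driven Duffing ring, analyzing how two elementary operations, bundling and binding, behave on a ring structure with a continuous symmetry. In the linear regime the system effectively extracts features of the input signal, while in the nonlinear Duffing regime a symmetry-constrained cubic mode-mixing operation generates shape-dependent harmonic content. The study proposes a single-number observable, $ϕ_0$, that characterizes how the bound response depends on input shape, and reveals its symmetry structure, providing an effective single-number representation for shape recognition.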

Comments ~34 pages, 6 figures. Code: https://github.com/KasparSchindler449/wavecomputing-paper1; archived snapshot: doi:10.5281/zenodo.20055785

Abstract

Distributed computational substrates rely on two elementary operations: bundling, the act of populating a shared physical medium with independently retrievable components, and binding, the act of composing components into outputs whose identity depends on their relations. We study these two primitives on the simplest closed substrate carrying a continuous symmetry, a cycle graph of N nodes, in two parameter regimes of a single master equation of motion. The linear regime sorts a temporal input across the substrate's U(1)-organised eigenmodes, providing a feature representation that matches a windowed-FFT baseline at high signal-to-noise ratio and modestly outperforms it for transient signals at low SNR. The Duffing regime activates a cubic mode-mixing operation constrained by the substrate's symmetry into a sparse selection rule on integer wavenumbers, generating shape-dependent harmonic content that the linear regime cannot produce. We identify a single-number observable, $ϕ_0$, that summarises the bound representation's response to input shape, and we analyse its symmetry structure: a $π$-periodicity in the shape parameter is exact, while a time-reversal symmetry that would render $ϕ_0$ degenerate is broken by the substrate's dissipation. The asymmetric status of these two symmetries is what licenses $ϕ_0$ as a meaningful single-number observable; its trajectory across the quotient domain encodes the joint response of binding and dissipation to the input shape. Numerical experiments confirm that $ϕ_0$ retains its information content under additive band-limited noise, with seed-averaged means staying clearly above the symmetric-attractor value down to 0 dB input SNR. The framework is developed on synthetic signals only; extensions to richer substrates, more elaborate drives, and real biological signals are open questions for the work that follows.
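The "U(1)-organised eigenmodes" of a cycle graph can be illustrated with a small check. The sketch below assumes nearest-neighbour coupling through the standard graph Laplacian, which the abstract does not state explicitly; the function names are hypothetical. It verifies that discrete Fourier modes are eigenvectors of the cycle-graph Laplacian, with eigenvalue $2 - 2\cos(2\pi k/N)$ for integer wavenumber $k$.

```python
import cmath
import math


def cycle_laplacian_apply(x):
    """Apply the cycle-graph Laplacian: (Lx)_j = 2*x_j - x_{j-1} - x_{j+1}, indices mod N."""
    n = len(x)
    return [2 * x[j] - x[(j - 1) % n] - x[(j + 1) % n] for j in range(n)]


def fourier_mode(n, k):
    """Discrete Fourier mode with integer wavenumber k on a ring of n nodes."""
    return [cmath.exp(2j * math.pi * k * j / n) for j in range(n)]


def mode_eigenvalue(n, k):
    """Laplacian eigenvalue associated with wavenumber k (assumed unit coupling)."""
    return 2.0 - 2.0 * math.cos(2.0 * math.pi * k / n)


def max_residual(n, k):
    """Largest entry of |L v - lambda v| for the k-th Fourier mode v."""
    v = fourier_mode(n, k)
    lv = cycle_laplacian_apply(v)
    lam = mode_eigenvalue(n, k)
    return max(abs(a - lam * b) for a, b in zip(lv, v))


if __name__ == "__main__":
    n = 8
    for k in range(n):
        print(k, round(mode_eigenvalue(n, k), 6), max_residual(n, k))
```

Wavenumbers $k$ and $N-k$ share an eigenvalue, so in this simplified linear picture a temporal drive is sorted by frequency onto pairs of counter-propagating modes. The cubic Duffing term and dissipation discussed in the abstract are not modelled here.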

2605.07469 2026-05-11 econ.TH

Coordination Mechanisms with Partially Specified Probabilities

Francesco Giordano

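AI summary: This paper studies which outcomes can be implemented when only partial statistics of the data-generating process, rather than its full distribution, are disclosed. Players form beliefs by maximum-entropy inference from the expected values of finitely many random variables. When message spaces are unrestricted, the implementable outcomes coincide with the jointly coherent outcomes, expanding the set of correlated equilibria; under canonical mechanisms, implementability reduces to a cross-entropy condition. Several examples and classes of games illustrate the framework's broad applicability.

Abstract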

We study which outcomes are implementable by disclosing coarse statistics of a data-generating process rather than its full distribution. Players observe data whose joint distribution is only partially known: they know the expectations of finitely many random variables and form beliefs by maximum-entropy inference. We obtain two characterizations. When message spaces are unrestricted, implementable outcomes coincide with jointly coherent outcomes, expanding the set of correlated equilibria. With canonical mechanisms, implementability reduces to a single cross-entropy condition: the target outcome must lie on the cross-entropy level set of some correlated equilibrium that passes through that equilibrium itself. Examples and several classes of games illustrate the reach of the framework.
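Maximum-entropy inference from a known expectation can be sketched in a few lines. The example below is a generic textbook construction, not the paper's mechanism: on a finite support, the entropy-maximizing distribution with a prescribed mean has the exponential form $p_i \propto \exp(\lambda x_i)$, and $\lambda$ can be found by bisection because the mean is increasing in $\lambda$. The function names, the bracket `[-50, 50]`, and the dice-like support are assumptions made for illustration.

```python
import math


def exponential_family(values, lam):
    """Distribution p_i proportional to exp(lam * x_i), computed stably."""
    exps = [lam * v for v in values]
    m = max(exps)
    weights = [math.exp(e - m) for e in exps]
    z = sum(weights)
    return [w / z for w in weights]


def maxent_given_mean(values, target_mean, lo=-50.0, hi=50.0, iters=200):
    """Max-entropy distribution on `values` with the given mean.

    Assumes min(values) < target_mean < max(values) and that the
    solution's lambda lies inside [lo, hi].
    """
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        p = exponential_family(values, mid)
        mean = sum(pi * v for pi, v in zip(p, values))
        if mean < target_mean:
            lo = mid
        else:
            hi = mid
    return exponential_family(values, 0.5 * (lo + hi))


def entropy(p):
    """Shannon entropy in nats, treating 0 * log(0) as 0."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)


if __name__ == "__main__":
    faces = [1, 2, 3, 4, 5, 6]
    belief = maxent_given_mean(faces, 4.5)
    print([round(b, 4) for b in belief], round(entropy(belief), 4))
```

With a target mean of 3.5 the result is the uniform distribution; a higher target tilts mass toward larger values while staying as spread out as the constraint allows.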

2605.07468 2026-05-11 cs.DM math.CO

Well-Quasi-Ordering Eulerian Digraphs: Bounded Carving Width

Dario Cavallaro, Ken-ichi Kawarabayashi, Stephan Kreutzer

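AI summary: This paper proves that every class of Eulerian directed graphs of bounded carving width (equivalently, bounded degree and treewidth) is well-quasi-ordered under strong immersion, and further shows that such classes remain well-quasi-ordered even when vertices carry labels from a well-quasi-order, incident edges have a fixed order, and additional restrictions are placed on immersion paths. The study also observes that Eulerian directed graphs of unbounded degree are not well-quasi-ordered under strong immersion even when treewidth is at most 2, and gives a dichotomy result between strong and weak immersion with respect to well-quasi-ordering.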

Comments Full Version of the respective paper appearing at ICALP 2026. arXiv admin note: text overlap with arXiv:2509.26260

Abstract

We prove that every class of Eulerian directed graphs of bounded carving width (equivalently of bounded degree and treewidth) is well-quasi-ordered by strong immersion. In fact, we prove a stronger result, namely that every class of Eulerian directed graphs of bounded carving width, where every vertex is additionally labeled from a well-quasi-order, fixes a linear order on its incident edges, and may impose further restrictions on how the immersion is allowed to route paths through it, is well-quasi-ordered by an adequate notion of strong immersion. To this end, we develop a framework seemingly suited to prove well-quasi-ordering for classes of Eulerian directed graphs by (strong) immersion and present a first meta theorem in that direction. We complement our results by observing that the class of Eulerian directed graphs of unbounded degree is not well-quasi-ordered by strong immersion, even if we assume the treewidth of the class to be at most two. We conclude with a dichotomy result, proving for a very restricted class of Eulerian directed graphs of unbounded degree that it is not well-quasi-ordered by strong immersion, but it is well-quasi-ordered by weak immersion.

2605.07464 2026-05-11 hep-th

Broken and restored: a holographic constraint for AdS vacua with orbifolds

Filippo Revello, Vincent Van Hemelryck

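AI summary: This paper studies the consistency of AdS vacua with orbifold compactifications under holographic duality, noting that in some cases certain cubic couplings must vanish for the dual to be consistent. It finds that type II AdS3 and AdS4 vacua based on various orbifold groups generally violate this constraint, but that enlarging the orbifold group to a non-abelian one can remove the scalar operators responsible for the conflict and restore consistency. The results indicate that consistency of the holographic dual places non-trivial restrictions on the compactification geometry, in particular on how O-planes may wrap cycles.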

Comments 24 pages + appendices

Abstract

It has been suggested that families of weakly-coupled AdS vacua with a large-$N$ holographic dual must satisfy non-trivial consistency requirements, which amount to the vanishing of certain cubic couplings, corresponding to (super-)extremal arrangements of scalar operators. While this constraint is known to hold in the simplest incarnation of the DGKT scenario in massive type IIA string theory, i.e. on the $\mathbb{Z}_3\times \mathbb{Z}_3$ orbifold, we find that it is generically violated for type II AdS$_3$ and AdS$_4$ vacua arising from $\mathbb{Z}_2 \times \mathbb{Z}_2 \times \mathbb{Z}_2$ and $\mathbb{Z}_2 \times \mathbb{Z}_2$ orbifolds respectively, including scale-separated solutions and DGKT-CFI-type models. In most cases, however, this can be cured by enlarging the orbifold group to a suitable (non-abelian) extension that projects out precisely those scalar operators that would otherwise participate in the constrained cubic couplings. Our results suggest that consistency of the putative holographic dual imposes a non-trivial restriction on the compactification geometry, indicating in particular that O-planes cannot wrap cycles in distinct homology classes.

2605.07459 2026-05-11 cs.CC

On the Complexity of Discounted Robust MDPs with $L_p$ Uncertainty Sets

Ali Asadi, Krishnendu Chatterjee, Alipasha Montaseri, Ali Shafiee

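AI summary: This paper studies the computational complexity of discounted robust Markov decision processes (RMDPs) with $L_p$ uncertainty sets. The authors analyze how efficiently $(s, a)$-rectangular RMDPs can be solved under a discounted cost criterion, present strongly polynomial-time results for the policy iteration algorithm, and give corresponding theoretical bounds for different $L_p$ uncertainty sets. They also prove hardness for integer $p$ with $1 < p < \infty$, and the experiments examine how quickly policy iteration converges in practice.

Abstract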

A basic model in sequential decision making is the Markov decision process (MDP), which is extended to Robust MDPs (RMDPs) by allowing uncertainty in transition probabilities and optimizing against the worst-case transition probabilities from the uncertainty sets. The class of $(s, a)$-rectangular RMDPs with $L_p$ uncertainty sets provides a flexible and expressive model for such problems. We study this class of RMDPs with a discounted-sum cost criterion and a constant discount factor. The existence of an efficient algorithm for this class is a fundamental theoretical question in optimization and sequential decision making. Previous results only establish a strongly polynomial-time algorithm for $L_\infty$ uncertainty sets. In this work, our main results are as follows: (a) we show that for any compact uncertainty set, the policy iteration algorithm for RMDPs is strongly polynomial with oracle access to solutions of Robust Markov chains (RMCs); (b) we present strongly polynomial-time bounds on the policy iteration algorithm for RMCs with $L_1$ and $L_\infty$ uncertainty sets; and (c) we establish hardness results for RMCs with $L_p$ uncertainty sets for integer $p$ satisfying $1<p<\infty$. Finally, motivated by our theoretical bounds, we present experimental results showing how fast policy iteration converges for RMDPs with $L_1$ and $L_\infty$ uncertainty sets.
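To make the worst-case optimization concrete, here is a minimal sketch for a robust Markov chain with $L_1$ uncertainty sets under a discounted cost criterion. It uses plain value iteration with the well-known greedy solution of the inner $L_1$ problem, not the policy iteration algorithm analyzed in the paper; the function names, the shared radius for all states, and the toy numbers are assumptions for illustration.

```python
def worst_case_l1(p0, v, radius):
    """Maximize sum(p * v) over distributions p with ||p - p0||_1 <= radius.

    Greedy rule: move up to radius / 2 of probability mass onto the
    highest-value state, taking it from the lowest-value states first.
    """
    p = list(p0)
    best = max(range(len(v)), key=lambda i: v[i])
    add = min(radius / 2.0, 1.0 - p[best])
    p[best] += add
    remaining = add
    for i in sorted(range(len(v)), key=lambda i: v[i]):
        if remaining <= 0:
            break
        if i == best:
            continue
        take = min(p[i], remaining)
        p[i] -= take
        remaining -= take
    return p


def robust_value_iteration(costs, nominal, radius, gamma, tol=1e-10):
    """Iterate V(s) = c(s) + gamma * max_{p in U(s)} p . V to a fixed point."""
    v = [0.0] * len(costs)
    while True:
        new_v = []
        for s in range(len(costs)):
            p = worst_case_l1(nominal[s], v, radius)
            new_v.append(costs[s] + gamma * sum(pi * vi for pi, vi in zip(p, v)))
        if max(abs(a - b) for a, b in zip(new_v, v)) < tol:
            return new_v
        v = new_v


if __name__ == "__main__":
    costs = [0.0, 1.0]
    nominal = [[0.9, 0.1], [0.5, 0.5]]
    print(robust_value_iteration(costs, nominal, radius=0.0, gamma=0.9))
    print(robust_value_iteration(costs, nominal, radius=0.2, gamma=0.9))
```

Because the adversary maximizes cost, the robust values are never below the nominal ones, and the gap grows with the radius of the uncertainty set.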

2605.07450 2026-05-11 cs.GR

LoBoFit: Flexible Garment Refitting via Local Bone Mapping Blending

Meng Zhang, Yu Xin, Feiya Guo, Kaizhang Kang, Mengyu Chu, Ruizhen Hu

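AI summary: This paper proposes LoBoFit, a garment refitting method that adapts a garment from a source avatar to a target avatar while preserving its original design features and fine wrinkles. By introducing the Local Bone Mapping Blending (LoBoMap Blending) representation, the method avoids the difficult optimization caused by deforming vertices in global coordinates, yielding more robust and efficient garment deformation. Experiments show that LoBoFit refits high-resolution single- and multi-layer garments with high quality across avatars of different shapes and topologies, preserves fine wrinkles and fit style, and outperforms existing state-of-the-art methods.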

Comments 14 pages including references

Abstract

Garment refitting, the task of adapting a garment from a source to a target avatar, must preserve the original design features and fine-scale wrinkles, a challenge exacerbated by significant shape variations and varying poses without registration to a shared canonical pose. Existing methods struggle to balance robustness, efficiency, and fidelity of detail: physics-based simulation is costly, data-driven approaches lack generalizability, and geometry optimization in the full vertex space is often ill-conditioned and prone to local minima with unsatisfactory quality. We identify that a fundamental limitation lies in the representation: deforming garments directly in global coordinates couples vertices non-locally, creating a complex and poorly-structured optimization landscape. Therefore, we introduce LoBoFit, a robust refitting method built upon a novel Local Bone Mapping Blending (LoBoMap Blending) representation. Instead of manipulating global vertex positions, LoBoMap Blending expresses garment geometry as a linear blend of its mappings into local bone coordinate frames. This representation is highly expressive and flexible: local bone mappings yield a pose-robust initialization and a well-conditioned parameterization, while blending weights smooth the optimization landscape and broaden the space of plausible solutions for stable convergence with fine-scale detail preservation. The subsequent refinement efficiently resolves collisions and preserves details by optimizing localized residuals, effectively decomposing the complex global deformation into manageable subproblems. Our experiments demonstrate that LoBoFit reliably refits high-resolution, single- and multi-layer garments across avatars with large shape and topological differences, while faithfully preserving intricate wrinkles and the intended fit style, outperforming state-of-the-art methods in robustness and output quality.

2605.07449 2026-05-11 quant-ph

Hybrid Qubit-Qutrit Quantum Battery: Nonclassicality and Energy Performance

G. Sharvan Prakash, R. Muthuganesan

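AI summary: This paper proposes and analyzes a hybrid quantum battery model based on a mixed spin-1/2 and spin-1 system, realized through anisotropic Heisenberg exchange coupling in a homogeneous magnetic field. Using nonclassicality measures such as quantum coherence and entanglement, the study evaluates the battery's energy-storage performance and finds that ergotropy and power oscillate while capacity remains constant. The study also shows that nonclassicality plays a key role in improving energy-storage efficiency, and connects the theoretical model to an experimentally realizable nickel-radical molecular complex, showing that quantum coherence, entanglement, and efficient energy storage persist at room temperature and offering a feasible path to hybrid quantum batteries on solid-state molecular platforms.

Abstract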

We propose and analyze a hybrid qubit-qutrit quantum battery (QB) based on a mixed spin-1/2 and spin-1 system interacting via an anisotropic Heisenberg exchange coupling in the presence of a homogeneous magnetic field. The nonclassical properties of the system are characterized using the l1-norm of coherence and negativity, which quantify quantum coherence and entanglement, respectively. The performance of the quantum battery is evaluated through key indicators such as ergotropy, power, and capacity. Our results reveal that both ergotropy and power exhibit oscillatory dynamics, while the capacity remains constant over time. We further investigate the influence of system parameters and magnetic field strength on both quantum correlations and battery performance, demonstrating that nonclassicality plays a crucial role in enhancing energy-storage efficiency. Importantly, we establish a connection between the theoretical model and an experimentally realizable nickel-radical molecular complex, showing that quantum coherence, entanglement, and efficient energy storage persist even at room temperature. These findings provide a realistic pathway toward the implementation of hybrid qubit-qutrit quantum batteries in solid-state molecular platforms.

2605.07448 2026-05-11 stat.ME stat.CO stat.ML

Robust Tensor Regression with Nonconvexity: Algorithmic and Statistical Theory

Zihao Song, Jicai Liu, Heng Lian, Weihua Zhao

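AI summary: This paper studies robust regression for high-dimensional tensor data in the presence of heavy-tailed noise and outliers, proposing a robust regression method based on a nonconvex relaxation of the tensor tubal rank. The method handles nonconvexity in both the loss and the penalty within a general optimization framework, and the authors develop an implementable estimation algorithm and prove its global convergence under mild conditions. The paper also establishes general statistical theory for stationary points, covering linear models, generalized linear models, and some nonconvex losses, and validates the method through simulations and real applications.

Abstract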

Tensor regression is an important tool for tensor data analysis, but existing works have not considered the impact of outliers, making them potentially sensitive to such data points. This paper proposes a low-tubal-rank robust regression method for analyzing high-dimensional tensor data with heavy-tailed random noise. The proposed method is based on a nonconvex relaxation of the tensor tubal rank within a general optimization framework, which allows for nonconvexity in both the loss and penalty functions. We develop an implementable estimation algorithm and establish its global convergence under some mild assumptions. Furthermore, we provide general statistical theory for stationary points, including rates of convergence and bounds on the prediction error. These theoretical results cover many important models, such as linear models, generalized linear models, and Huber regression, and even encompass some nonconvex losses such as correntropy and minimum distance criterion-induced losses. Supporting numerical evidence is provided through simulations and application studies.

2605.07445 2026-05-11 cond-mat.mtrl-sci

Dislocations in (011)-oriented vertical Bridgman $β$-Ga$_2$O$_3$ substrates

Yongzhao Yao, Daiki Katsube, Hirotaka Yamaguchi, Yukari Ishikawa

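AI summary: Using transmission X-ray topography and reticulography, this study systematically analyzes dislocations in (011)-oriented β-Ga₂O₃ substrates grown by the vertical Bridgman method. Dislocations lie mainly on the (001) plane and extend along [010], forming arrays associated with domain boundaries; different types of dislocations are also observed on the (011) plane. The study also reveals small misorientations at the domain boundaries and provides a reference for understanding defect formation and its impact on epitaxial growth and device performance.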

Comments 18 pages, 4 figures

Abstract

Dislocations in (011)-oriented $β$-Ga$_2$O$_3$ substrates grown by the vertical Bridgman method were investigated using X-ray topography (XRT) combined with X-ray reticulography. Transmission XRT reveals dislocations lying on the (001) plane and extending along [010], forming arrays associated with domain boundaries. Dislocations on the (011) plane were also identified but differ from those responsible for line-shaped pits on (001) epilayers. Reflection XRT shows good agreement with transmission XRT and enables classification of dislocation types based on contrast features. Reticulography confirms domain boundaries with misorientation on the order of $10^{-5}$ rad, providing insight into defect formation relevant to epi-growth and device performance.

2605.07443 2026-05-11 cs.DC

RcLLM: Accelerating Generative Recommendation via Beyond-Prefix KV Caching

Zhan Zhao, Yuxin Wang, Amelie Chi Zhou

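AI summary: This paper proposes RcLLM, a distributed inference system designed to accelerate long-prompt processing in generative recommendation. By introducing a beyond-prefix key-value caching mechanism, RcLLM decomposes user histories and item information into reusable blocks and combines a stratified storage design with a selective attention mechanism, substantially reducing redundant computation. Experiments show that RcLLM reduces time to first token by 1.31x to 9.51x while maintaining recommendation accuracy, supporting real-time recommendation serving.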

Comments Accepted by ICDCS 2026

Abstract

Large Language Models (LLMs) are transforming recommendation from ranking into a generative task, but industrial deployment remains limited by the high latency of processing long, personalized prompts. Standard prefix caching provides limited benefit because reuse in recommendation workloads is often non-contiguous across user histories and item contexts. We present RcLLM, a distributed inference system for generative recommendation with Beyond-Prefix KV Caching. RcLLM decomposes prompts into reusable blocks and supports large item catalogs with a stratified distributed storage design: compact user-history caches are replicated for zero-latency retrieval, while massive item caches are sharded using similarity-aware placement. To reduce redundant quadratic attention computation, RcLLM combines an affinity-based global scheduler that improves data locality with a selective attention mechanism that corrects approximation errors. Experiments on real-world datasets show that RcLLM reduces Time-To-First-Token (TTFT) by 1.31x-9.51x compared with state-of-the-art prefix caching systems, enabling real-time serving with negligible impact on recommendation accuracy.
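The difference between prefix reuse and block-level reuse can be illustrated with a toy count. The sketch below is not RcLLM's implementation: the block names, the exact-match lookup, and both function names are hypothetical, and it only counts which prompt blocks could be served from a cache. In a real system, reusing a block's KV entries out of its original context is approximate, which is presumably why the abstract pairs the cache with a selective attention mechanism that corrects approximation errors.

```python
def prefix_reuse(cached_prompt, prompt):
    """Number of leading blocks shared with one cached prompt (classic prefix caching)."""
    shared = 0
    for cached_block, block in zip(cached_prompt, prompt):
        if cached_block != block:
            break
        shared += 1
    return shared


def block_reuse(cached_blocks, prompt):
    """Number of prompt blocks found anywhere in the cache, regardless of position."""
    return sum(1 for block in prompt if block in cached_blocks)


if __name__ == "__main__":
    # Hypothetical prompts split into blocks: instructions, user history, candidate items.
    cached_prompt = ["instructions", "history_user_42", "item_A", "item_B"]
    new_prompt = ["instructions", "history_user_7", "item_A", "item_B"]

    print(prefix_reuse(cached_prompt, new_prompt))      # only the leading block matches
    print(block_reuse(set(cached_prompt), new_prompt))  # item blocks match as well
```

In this toy case a different user history breaks the prefix after the first block, while position-independent lookup still finds the two item blocks, which is the non-contiguous reuse pattern the abstract describes.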

2605.07441 2026-05-11 math.OC cs.SY eess.SY

Data-Driven Contextual-Aware Uncertainty Set for Robust Dispatch of Power Systems

Zhaojun Ruan, Yulin Liu, Le Fu, Libao Shi

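AI summary: This paper studies how to build context-aware uncertainty sets with a data-driven approach to improve robust dispatch of power systems. The authors propose a method based on a conditional Gaussian mixture model that uses covariates as side information to design uncertainty sets tailored to irregularly distributed historical data. The uncertainty set is represented as a union of subsets, the worst-case parameter realization is described through a mixed-integer linear formulation, and numerical experiments validate the method.

Abstract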

Both the level of conservativeness and the computational burden in robust optimization are critically influenced by uncertainty set design. However, contextual side information is rarely exploited in robust dispatch of power systems characterized by irregular data distributions, which hinders the explicit characterization of the relationship between covariates and uncertain parameters. To address this issue, a data-driven method for constructing a contextual-aware uncertainty set is proposed in this letter. Based on a conditional Gaussian mixture model, a set of covariates is leveraged as side information to design uncertainty sets tailored to historical data exhibiting irregular distributions. The resulting set is formulated as a union of subsets, and a mixed-integer linear reformulation is adopted to describe the worst-case realization across all subsets. Finally, the effectiveness of the proposed method is demonstrated through numerical experiments applied to robust unit commitment.

2605.07440 2026-05-11 physics.chem-ph math-ph math.MP physics.comp-ph quant-ph

On the single-Hessian Gaussian wavepacket dynamics

Davide Barbiero, Jiří J. L. Vaníček

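AI summary: This paper studies single-Hessian Gaussian wavepacket dynamics (GWD), which aims to reduce the computational cost of Heller's local harmonic GWD while retaining its accuracy for vibronic spectra. The authors present a symplectic derivation of the method, show that it conserves the non-canonical symplectic structure and avoids energy drift in smooth potentials, and develop high-order time-stepping geometric integrators to improve the efficiency and accuracy of numerical simulations. The results show that the method is far more efficient while keeping errors comparable to local harmonic GWD, and that it outperforms global harmonic models in several spectral calculations.

Abstract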

Single-Hessian Gaussian wavepacket dynamics (GWD) significantly reduces the computational burden of Heller's local harmonic GWD, while maintaining comparable accuracy in approximating vibronic spectra. Here, we provide a new, symplectic derivation of the equations of motion of single-Hessian GWD and show that, unlike the local harmonic version, this method conserves the non-canonical symplectic structure on the manifold of Gaussian wavepackets and, for bounded dynamics in smooth potentials, avoids the drift of energy. Our numerical results suggest that, despite being much more efficient than the local harmonic variant, the single-Hessian GWD exhibits the same $\mathcal{O}(\hbar)$ asymptotic error in averages of observables. To further accelerate numerical simulations, we implement high-order time-stepping geometric integrators that are time-reversible and conserve the norm and symplectic structure exactly, regardless of the time step. In addition, we present explicit expressions for the exact evolution of the width of a single-Hessian Gaussian wavepacket in a general potential, as well as for the exact evolution of the whole wavepacket in a global harmonic potential. Using on-the-fly ab initio Gaussian wavepacket dynamics on the first excited-state surface of ammonia, we numerically confirm the conservation of geometric properties by these integrators and demonstrate that high-order integrators can enhance both accuracy and computational efficiency. We also compute the photoelectron spectrum of the difluorocarbene anion and the absorption spectrum of methylamine, and find that, in comparison with experiment, single-Hessian GWD outperforms global harmonic models and matches the accuracy of local harmonic GWD. Finally, we identify which spectral features are sensitive to the choice of reference Hessian.

2605.07439 2026-05-11 q-bio.BM

CA-DEL: An Open Multi-Target, Multi-Modal Benchmark for Learning from DNA-Encoded Library Screens

Mutian He, Hanqun Cao, Cheng Tan, Zijun Gao, Xiaojun Yao, Chunbin Gu, Pheng-Ann Heng

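AI summary: This paper introduces CA-DEL, an open multi-target, multi-modal benchmark dataset for learning molecule-target relationships from DNA-encoded library screens. The dataset focuses on selectivity among homologous carbonic anhydrase isoforms (CAII, CAIX, CAXII) and, by integrating experimentally determined binding affinities ($K_i$), establishes a sim-to-real evaluation paradigm from noisy screening data to high-fidelity biophysical data, supporting the development of robust drug-discovery models.

Abstract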

The success of machine learning in drug discovery hinges on learning the relationship between a chemical structure and its biological activity. While DNA-Encoded Library (DEL) technology can generate the massive datasets required for this task, its primary signal -- sequencing read counts -- is an indirect and often noisy proxy for true molecular binding affinity. To address the scarcity of public benchmarks for developing robust models that can overcome this data challenge, we introduce CA-DEL, a multi-dimensional public benchmark featuring screens against three homologous carbonic anhydrase isoforms. While recent benchmarks like KinDEL have introduced 3D poses for kinase targets, CA-DEL distinguishes itself by focusing on the selectivity challenge among homologous Carbonic Anhydrase isoforms (CAII, CAIX, CAXII). Unlike benchmarks relying solely on noisy enrichment scores, CA-DEL integrates a rigorous validation set of experimentally determined binding affinities ($K_i$) from ChEMBL, establishing a critical Sim-to-Real evaluation paradigm: training on noisy DEL screens and testing on high-fidelity biophysical data.

2605.07438 2026-05-11 math.LO

Bounded depth in Hilbert algebras

Luca Carai, Miriam Kurtzhals, Tommaso Moraschini

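AI summary: This paper studies bounded depth in Hilbert algebras, proving that having depth at most n is an equational condition in Hilbert algebras. The result generalizes an analogous known result for Heyting algebras and provides a new algebraic tool for analyzing the structure of Hilbert algebras.

Abstract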

Hilbert algebras are the implicative subreducts of Heyting algebras. It is shown that having depth at most n is an equational condition in Hilbert algebras. This generalizes an analogous well-known result in the setting of Heyting algebras.

2605.07436 2026-05-11 math.AP

What is ... Robin harmonic measure?

Max Engelstein, Marcel Filoche, Svitlana Mayboroda

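AI summary: This paper introduces the Robin boundary condition and its significance in mathematics and physics. The authors discuss its definition and the contexts in which it is applied, and analyze its role in partial differential equations. The study helps clarify the condition's impact and meaning across disciplines.

Abstract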

We present here the Robin boundary condition and its significance in mathematics and physics.

2605.07435 2026-05-11 cond-mat.mes-hall

Point-gap topology of damped magnon excitations in skyrmion strings

Yusuke Koyama, Yuki Kawaguchi

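AI summary: This paper studies the non-Hermitian topology of magnons with finite lifetimes due to Gilbert damping in skyrmion-string structures. Using spin-wave theory and perturbation theory for the Landau-Lifshitz-Gilbert equation with nonlocal damping terms, the authors analytically compute the spectral winding number for point gaps, which indicates the non-Hermitian skin effect. They find that the skin effect can occur even without nonlocal damping, and that with nonlocal damping along one direction the winding number of an energy band is determined by the sign of the wave vector at the band minimum. These results are verified on a skyrmion-string lattice model and further supported by the propagation dynamics of spin waves excited by a magnetic-field pulse.

Abstract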

We theoretically study the non-Hermitian topology of magnons with finite lifetimes due to Gilbert damping. By incorporating the spin-wave theory and perturbation theory for the Landau-Lifshitz-Gilbert equation including nonlocal damping terms, we analytically evaluate the spectral winding number for point gaps, which indicates the existence of the non-Hermitian skin effect (NHSE). We find that the NHSE can occur even in the absence of nonlocal damping. In the presence of nonlocal damping along one direction, we show that the winding number for an energy band with a unique minimum is determined from the sign of the wave number at the band minimum. We demonstrate these results using a model that hosts a skyrmion-string lattice as a steady state. We further investigate spin-wave propagation dynamics excited by a magnetic-field pulse and show that the propagation direction changes drastically from band to band depending on the presence of local and nonlocal damping, consistent with the nontrivial winding numbers.

2605.07434 2026-05-11 stat.OT

Adaptive Subspace Signal Detection and Performance Analysis in Nonzero-Mean Clutter

Weijian Liu, Zhenyu Xu, Jun Liu, Hui Chen, Yongxiang Liu

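AI summary This paper studies the detection of subspace signals in nonzero-mean clutter and proposes adaptive detectors based on the generalized likelihood ratio test (GLRT), the Rao test, the Wald test, and other strategies. It analyzes expressions for the probability of detection and probability of false alarm of each detector and reveals performance losses in degrees of freedom and signal-to-clutter ratio under nonzero-mean clutter. Simulations and measured data verify the effectiveness of the proposed detectors and their practical value in real radar systems.

Details
English abstract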

To detect subspace signals in nonzero-mean clutter, we propose adaptive detectors based on the generalized likelihood ratio test (GLRT), Rao test, Wald test, gradient test, and Durbin test. The results show that the detectors based on the GLRT, Rao, and Wald tests are structurally consistent with the subspace detectors in zero-mean clutter. Analytic expressions for the probability of detection (PD) and probability of false alarm (PFA) of each detector are derived, and two major performance differences in the nonzero-mean clutter scenario are revealed. The first is a loss of degrees of freedom (DOF): the DOF is reduced by 1 compared with the zero-mean clutter scenario. The second is a loss of signal-to-clutter ratio (SCR). Simulations and measured data verify the effectiveness of the proposed detectors and demonstrate their practical value in real-world radar systems.

2605.07430 2026-05-11 cs.CR cs.MM

Forensic analysis of video data deletion and recovery in Honeywell surveillance file system

Jinhee Yoon, Sungjae Hwang

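AI summary This paper studies the undocumented proprietary file system used in Honeywell video surveillance devices, analyzes its video data deletion mechanisms, and examines the feasibility of recovering video after deletion. Using a binary diffing technique, it investigates how three deletion methods affect file system metadata and on-disk data structures, and verifies that video data recovery is possible. The study provides an important reference for forensic analysis of Honeywell surveillance products and lays a foundation for analyzing proprietary video recording file systems.

Comments The paper has been accepted by The 26th Annual Digital Forensics Research Conference USA (DFRWS USA 2026)

Details
English abstract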

Real-time video surveillance systems store recorded video using digital video recorders (DVRs) and network video recorders (NVRs). To support continuous high-volume video storage, these devices employ specialized, nonstandard file systems that are often proprietary and undocumented. This lack of documentation significantly increases the time and effort required for forensic analysis. In this study, we analyze an undocumented proprietary file system used by Honeywell video surveillance devices (one that, to the best of our knowledge, has not been examined in prior work), investigate its deletion mechanisms, and demonstrate the feasibility of video recovery after deletion. We perform a file system analysis using a binary diffing technique and evaluate three deletion methods supported by the target device: 1) formatting-based deletion, 2) data expiration, and 3) overwrite. For each method, we investigate changes in file system metadata and on-disk data structures and demonstrate the feasibility of video data recovery. Our findings aim to support more efficient and accurate forensic investigations of Honeywell surveillance products and provide foundational insights into the analysis of proprietary file systems used in video recording devices.

2605.07428 2026-05-11 math.DS

Non-intrusive spectral submanifold model reduction for geometrically nonlinear rotating structures with Coriolis and centrifugal forces

Hejun Gao, Yiliang Wang, Yan Qing Wang, Jie Yuan, Mingwu Li

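AI summary This study proposes a non-intrusive spectral submanifold (SSM) model reduction method for geometrically nonlinear rotating structures with Coriolis and centrifugal forces, aimed at capturing their complex nonlinear dynamics efficiently and accurately. Based on finite element models, it computes the nontrivial static equilibrium induced by the centrifugal force and builds a non-intrusive reduced-order model anchored there, enabling efficient extraction of backbone curves and forced response curves. The method is validated on several examples of increasing complexity, which highlight the significant effect of the Coriolis force on the nonlinear vibration of rotating structures.

Details
English abstract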

Rotating structures are widely used in engineering applications such as turbomachinery and wind turbines. These rotating structures, particularly blades made of lightweight materials, can undergo large deformations in operation and display complex nonlinear dynamics under the coupled interaction of geometric nonlinearity, the Coriolis effect, and centrifugal force. Finite element (FE) methods provide a powerful and accurate modeling approach for capturing the complex nonlinear dynamics of realistic rotating structures, yet their high dimensionality poses a significant challenge to efficient prediction of the nonlinear vibration. Here, we present a non-intrusive spectral submanifold (SSM) model reduction for these FE models of rotating structures. We use COMSOL to establish FE models and simulate these FE models to verify the accuracy of the SSM-based reduction. We first compute the nontrivial static equilibrium induced by the centrifugal force and then non-intrusively construct an SSM-based reduced-order model (ROM) anchored at the equilibrium. These SSM-based ROMs enable efficient and accurate extraction of backbone and forced response curves. We use a suite of examples with increasing complexity to demonstrate the effectiveness of the SSM reduction, including a rotating beam, a twisted plate, a rotor with two disks, and an internally resonant fan with three blades. The obtained results also highlight the significant effects of the Coriolis force on the nonlinear vibration of rotating structures.

2605.07427 2026-05-11 math.NA cs.NA

Kolmogorov $\varepsilon$-entropy of numerical solutions for scalar conservation laws with convex flux

Fabio Ancona, Alessio Basti, Fabio Camilli

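AI summary This paper studies the Kolmogorov ε-entropy of numerical solutions of scalar conservation laws with convex flux. Starting from an information-theoretic perspective, it establishes two-sided compactness estimates for numerical solutions under specific grid constraints. The authors prove that conservative, monotone difference schemes satisfying a discrete one-sided Lipschitz condition preserve the 1/ε entropy scaling of the exact entropy solution set, consistent with existing theoretical results. The study also shows that prototypical first-order methods are high-resolution in Lax's sense and proposes a general transfer principle for the lower bound, offering new ideas for information recovery and future research.

Details
English abstract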

Building on the information-theoretic perspective of P. D. Lax [Proc. Sympos., Math. Res. Center, Univ. Wisconsin, 1978], we establish a two-sided quantitative compactness estimate for numerical solutions of scalar conservation laws with a uniformly convex flux, expressed in terms of Kolmogorov $\varepsilon$-entropy. We prove that, under specific grid constraints, conservative, monotone finite-difference schemes satisfying a discrete one-sided Lipschitz condition (OSLC) preserve the $1/\varepsilon$ Kolmogorov entropy scaling of the corresponding exact entropy solution set, matching the bounds obtained by De Lellis and Golse [Comm. Pure Appl. Math. 58 (2005)] and by Ancona, Glass, and Nguyen [Comm. Pure Appl. Math. 65 (2012)]. Specifically, the upper bound follows from the discrete OSLC, while the lower bound relies on a uniform approximation argument on a bounded-variation precursor class. Our results show that prototypical first-order methods are high-resolution in Lax's sense. Finally, we abstract the lower bound mechanism into a general transfer principle, discuss implications for information recovery via post-processing, and indicate directions for future work.
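
For readers unfamiliar with the quantity in the title, the usual definition of Kolmogorov $\varepsilon$-entropy can be written as below. This is the common textbook convention (coverings by sets of diameter at most $2\varepsilon$, logarithm base 2). The paper's exact normalization and its choice of metric space $X$ (often $L^1$ in this literature) are assumptions here, as are the constant names $c_1$ and $c_2$.

```latex
% N_\varepsilon(K \mid X): minimal number of sets of diameter at most
% 2\varepsilon needed to cover the totally bounded set K in the metric space X.
\[
  H_\varepsilon(K \mid X) \;=\; \log_2 N_\varepsilon(K \mid X).
\]
% A "1/\varepsilon scaling" is a two-sided bound with constants that do not
% depend on \varepsilon, valid for all sufficiently small \varepsilon > 0:
\[
  \frac{c_1}{\varepsilon} \;\le\; H_\varepsilon(K \mid X) \;\le\; \frac{c_2}{\varepsilon},
  \qquad 0 < c_1 \le c_2 .
\]
```

Read this way, the abstract's claim is that the set of numerical solutions obeys the same kind of two-sided bound as the set of exact entropy solutions.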

2605.07426 2026-05-11 cs.IT math.IT

UMVUE-Type Estimators under Bregman Losses

Akira Kamatsuka, Shun Watanabe

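AI summary This paper studies unbiased estimation under Bregman loss functions and extends the classical theory of uniformly minimum variance unbiased estimators (UMVUEs). Using bias-variance decompositions of Bregman divergences, the authors analyze the notions of unbiasedness under two natural loss functions and show that one formulation agrees with the classical theory, while the other yields a new framework in the dual space induced by $\nabla\phi$. For the nontrivial case, they establish analogs of the Rao-Blackwell and Lehmann-Scheffé theorems and systematically construct type-I Bregman UMVUEs.

Details
English abstract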

We study unbiased estimation under Bregman losses and develop an extension of the classical theory of uniformly minimum variance unbiased estimators (UMVUEs). Exploiting bias-variance-type decompositions for Bregman divergences, we consider two natural loss functions, $D_\phi(\theta,\hat\theta)$ and $D_\phi(\hat\theta,\theta)$, and their corresponding notions of unbiasedness. We show that the latter formulation reduces to the classical setting, whereas the former yields a different framework in which unbiasedness is characterized in the dual space induced by $\nabla\phi$. For the nontrivial case, we establish analogs of the Rao-Blackwell and Lehmann-Scheffé theorems, providing a systematic construction of type-I Bregman UMVUEs.
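
The two loss functions differ only in argument order, which matters because Bregman divergences are generally asymmetric. The sketch below uses the standard definition $D_\phi(x,y) = \phi(x) - \phi(y) - \langle \nabla\phi(y), x - y \rangle$ with two textbook generators. It is not taken from the paper, and the function names and test vectors are illustrative assumptions.

```python
import numpy as np

def bregman_divergence(phi, grad_phi, x, y):
    """D_phi(x, y) = phi(x) - phi(y) - <grad_phi(y), x - y>."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return float(phi(x) - phi(y) - np.dot(grad_phi(y), x - y))

# Generator 1: squared Euclidean norm, which yields squared Euclidean distance.
def sq_norm(v):
    return float(np.dot(v, v))

def sq_norm_grad(v):
    return 2.0 * v

# Generator 2: negative Shannon entropy, which yields the KL divergence on
# probability vectors with strictly positive entries.
def neg_entropy(p):
    return float(np.sum(p * np.log(p)))

def neg_entropy_grad(p):
    return np.log(p) + 1.0

x = np.array([0.2, 0.3, 0.5])
y = np.array([0.4, 0.4, 0.2])

d_euclid = bregman_divergence(sq_norm, sq_norm_grad, x, y)
d_kl_xy = bregman_divergence(neg_entropy, neg_entropy_grad, x, y)
d_kl_yx = bregman_divergence(neg_entropy, neg_entropy_grad, y, x)
print(d_euclid, d_kl_xy, d_kl_yx)
```

The squared-norm generator gives a symmetric divergence, while the entropy generator gives `d_kl_xy != d_kl_yx`, so swapping the roles of $\theta$ and $\hat\theta$ defines a different loss.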

2605.07425 2026-05-11 eess.SP

Geometry-Aided Channel Deduction: A Robust Channel Acquisition Framework Utilizing Coarse Scenario Prompt

Hongning Ruan, Zhaoyang Zhang, Zirui Chen, Ziqing Xing, Zhaohui Yang

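AI summary This paper proposes geometry-aided channel deduction (GCD), a channel acquisition framework assisted by geometric information, to address the high pilot overhead and low estimation quality of conventional pilot-based channel estimation. The method uses the scenario geometry formed by the environmental map and base station position, obtains geometric channel features through ray tracing, performs a neighborhood search within a pre-extracted geometric feature set, generates pseudo channels as contextual prompts, and fuses them with the partial estimate to produce the complete channel. Experiments show that the method achieves leading channel estimation accuracy under sparse pilot conditions and shows good generalization and robustness in new scenarios and dynamic environments.

Details
English abstract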

Channel state information (CSI) is critical for multi-input multi-output (MIMO) orthogonal frequency division multiplexing (OFDM) systems. Pilot-based channel estimation methods suffer from high pilot overhead and low channel acquisition quality, while pilot-free approaches typically impose impractical demands on the precision of positional or environmental information. This paper proposes geometry-aided channel deduction (GCD), which leverages readily available geometric information to assist channel acquisition. The environmental map and base station position together constitute the scenario geometry, which can provide geometric channel features through ray tracing. To obtain the complete channel, the user first retrieves approximate geometric features by performing a neighborhood search within a pre-extracted geometric feature set, and then converts them into pseudo channels through an a priori designed feature alignment. These pseudo channels serve as contextual prompts, providing supplementary channel features beyond those derived from the pilot-based estimate. Finally, a neural network fuses these pseudo channels with the partial estimate to generate the complete channel. Comprehensive experiments validate the superiority of our method, which achieves leading accuracy in channel acquisition under sparse pilot conditions, demonstrates strong generalization capabilities in new scenarios and dynamic environments, and exhibits robust resilience against user position errors and non-ideal environmental information.

2605.07423 2026-05-11 physics.ins-det hep-ex

Characterization of a Two-Channel Optical and Near-infrared Transition Edge Sensor System for Rare-Event Searches

Manuel Meyer, Katharina-Sophie Isleif, Friederike Januschek, Axel Lindner, Gulden Othman, Elmeri Rivasto, Jose Alejandro Rubiera Gimeno, Christina Schwemmbauer

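AI summary This study presents a two-channel optical and near-infrared transition edge sensor (TES) system for rare-event searches, optimized for detecting photons at a wavelength of 1064 nm. The system achieves a system detection efficiency of 86%, an energy resolution better than 7%, and a background dark-count rate below 6 mHz, and its energy resolution enables highly sensitive detection of monochromatic signals. The results provide strong technical support for experimental searches for axions and axion-like particles and an important reference for TES-based rare-event searches.

Comments 22 pages, 12 Figures, 8 Tables. Submitted

Details
English abstract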

Transition edge sensors (TESs) are superconducting energy-resolving microcalorimeters that have demonstrated low background rates as well as quantum efficiencies close to unity for photons at optical and near-infrared wavelengths. This makes these detectors well suited for rare-event searches. We report on the comprehensive characterization of a two-channel detector module consisting of two tungsten TESs optimized for the detection of photons with a wavelength of 1064 nm. The devices achieve a system detection efficiency of $(86\pm1)$%, an energy resolution better than 7%, and a background dark-count rate of photon-like events below 6 mHz when coupled to an optical fiber. Using an unbinned likelihood framework, we find the dark count rate to be compatible with blackbody radiation from the room-temperature laboratory environment. Thanks to the energy resolution of the TESs, we show that it is possible to detect monochromatic signals at 1064 nm with photon rates $\geqslant 2.7_{-0.6}^{+0.8} \times 10^{-5}$ Hz, which corresponds to a power of $\geqslant (5.0_{-1.1}^{+1.4}) \times 10^{-24}$ W, within 20 days of measurement time at the $5\sigma$ confidence level. This makes our detectors well suited for searches for hypothetical axions and axion-like particles with experiments such as the Any Light Particle Search II (ALPS II) or axion interferometers. The developed methodologies are not only applicable to axion searches, but are also relevant for rare-event searches with TESs in general.
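
The conversion from photon rate to optical power quoted in the abstract follows from the photon energy $E = hc/\lambda$. The short check below reproduces the central value only. The function names are illustrative, and propagating the asymmetric uncertainties is left out because the unrounded inputs are not given in the abstract.

```python
PLANCK_J_S = 6.62607015e-34       # Planck constant, exact SI value, in J*s
LIGHT_SPEED_M_S = 299_792_458.0   # speed of light, exact SI value, in m/s

def photon_energy_joule(wavelength_m):
    """Energy of a single photon with the given vacuum wavelength."""
    return PLANCK_J_S * LIGHT_SPEED_M_S / wavelength_m

def optical_power_watt(photon_rate_hz, wavelength_m):
    """Power carried by a stream of photons arriving at photon_rate_hz."""
    return photon_rate_hz * photon_energy_joule(wavelength_m)

energy = photon_energy_joule(1064e-9)        # 1064 nm photons
power = optical_power_watt(2.7e-5, 1064e-9)  # central rate from the abstract
print(f"{energy:.3e} J per photon, {power:.2e} W")
```

Each 1064 nm photon carries roughly $1.87\times10^{-19}$ J, so a rate of $2.7\times10^{-5}$ Hz corresponds to about $5.0\times10^{-24}$ W, in line with the figure stated above.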

2605.07421 2026-05-11 stat.AP

There to care; not to kill: medical settings, statistics and wrongful convictions

Richard D. Gill

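AI summary This paper discusses the wrongful conviction of nurses in medical settings and analyzes the weak evidence common to such cases: no direct evidence, CCTV footage, or confession, with statistical association serving as the main basis for the charges. It notes that police investigations are often influenced by hospital consultants, that the prosecution may misread a nurse's everyday behaviour or private writings as evidence of a crime, and that motive is mostly speculation. The article emphasizes the key role of statistical evidence in wrongful convictions in medical settings and calls for more careful legal and medical assessment of such cases.

Comments Invited contribution to a volume on miscarriages of justice, in preparation

Details
English abstract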

This paper discusses wrongful convictions in a medical setting, focusing on nurses. A common feature is the lack of strong direct evidence: the nurse was never seen doing anything wrong. There is no DNA evidence of tampering with apparatus or medications by the nurse. There is no CCTV footage showing suspicious actions. Analysis of medical records at the time led coroners to issue certificates of natural death, and most events were not, at the time, thought suspicious by hospital staff. There is no confession and the nurse consistently asserts they are completely innocent. There is no evidence of earlier psychopathic behaviour. Instead, private writings (e.g., in a diary) are interpreted by the prosecution as a confession; mundane behaviour is given a sinister interpretation. Motive remains speculation. The main evidence is statistical: a spike in deaths or collapses and a statistical association with a particular nurse. There is forensic evidence which suggests one or two patients might have been harmed by administration of medication much used in the hospital, and even legitimately used earlier in the care of the alleged victims. Police investigations are driven by the hospital consultants who were clinically responsible for the patients allegedly killed or harmed by the nurse.

2605.07419 2026-05-11 cs.GT

Incentivizing User Data Contributions for LLM Improvement under Withdrawal Rights

Di Feng, Chenhao Zhang, Zhanzhan Zhao

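AI summary This paper studies how to incentivize users to contribute high-quality data for improving large language models when users hold withdrawal rights. Given the privacy costs and reversibility of data contributions, the authors propose a mechanism design that combines subsidies with withdrawal rights and overcomes coordination failure. Theoretical analysis shows that the mechanism avoids wasted subsidies while ensuring that data contribution is sustainable, and raises the probability of successful model improvement.

Comments 31 pages, 10 figures

Details
English abstract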

The continued improvement of large language models (LLMs) increasingly depends on eliciting high-quality, user-generated data, yet such data are costly to provide and often withheld due to privacy and effort concerns. This creates a fundamental design challenge: how to incentivize data contribution when model improvements require coordinated, threshold-level inputs, while contributions remain privately costly and partially reversible. We develop and theoretically analyze incentive mechanisms for user data contribution that explicitly account for threshold effects and reversibility, focusing on how subsidies and withdrawal rights can be jointly designed to overcome coordination failure. As a natural benchmark, we first consider subsidy-based incentives, under which users respond to posted payments with privately optimal floor contributions. These decentralized responses may fall below the improvement threshold, resulting in subsidy expenditure without model improvements. We then analyze mechanisms with withdrawal rights, in which users report costs, the provider centrally assigns contribution burdens, and users may withdraw before training. We prove that combining cost reporting with personalized assignment can eliminate inefficient provision by ensuring that data are collected only when improvement is sustainable, converting infeasible instances into a null outcome rather than subsidy leakage. Finally, we compare two withdrawal protocols. The simultaneous protocol can achieve lower total cost, while the small-first sequential protocol better incentivizes participation, encouraging greater data provision and thereby increasing the probability of crossing the improvement threshold.