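arXivDaily: a daily academic digest synced with the full arXiv feed, with AI-generated summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering and systems, and more.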

2604.07558 2026-05-12 cs.HC cs.AI

Generative Experiences for Digital Mental Health Interventions: Evidence from a Randomized Study

Ananya Bhattacharjee, Michael Liut, Matthew Jörke, Diyi Yang, Emma Brunskill

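AI summary: This study examines how generative experiences can improve digital mental health interventions, proposing a new paradigm in which personalized intervention content and multimodal interaction structure are generated dynamically at runtime. The authors built a system called GUIDE, which composes personalized content and interaction modes through guided generation of modular components, and validated it in a randomized controlled experiment with 237 participants. GUIDE outperformed an LLM-based cognitive restructuring approach at reducing stress and improving user experience. The work offers new ideas and a practical foundation for dynamically shaping support experiences in digital settings.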

2604.06518 2026-05-12 eess.IV cs.AI cs.CV

ADP-FL-MedSeg: Adaptive Differential Privacy for Federated Medical Segmentation Across Diverse Modalities

Puja Saha, Eranga Ukwatta

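AI summary: Privacy regulations and institutional constraints make medical data hard to centralize, and centrally trained models often fail to generalize across clinical sites because of heterogeneity in imaging protocols and data distributions. This paper proposes ADP-FL-MedSeg, an adaptive differentially private federated learning framework that dynamically adjusts its privacy mechanisms to improve segmentation accuracy and training stability while preserving privacy. Experiments show that the method outperforms conventional federated learning and standard differentially private federated learning across several medical image segmentation tasks, achieving accurate, stable, privacy-preserving segmentation.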

Comments 10 pages, 8 figures. Accepted in SPIE Medical Imaging 2026. Recipient of CAD Best Paper Award: 1st Place, and Robert F. Wagner All-Conference Best Paper Award: Finalist

Details

DOI: 10.1117/12.3075111
Journal ref: Proceedings Volume 13926, SPIE Medical Imaging 2026: Computer-Aided Diagnosis

Abstract

Large volumes of medical data remain underutilized because centralizing distributed data is often infeasible due to strict privacy regulations and institutional constraints. In addition, models trained in centralized settings frequently fail to generalize across clinical sites because of heterogeneity in imaging protocols and continuously evolving data distributions arising from differences in scanners, acquisition parameters, and patient populations. Federated learning offers a promising solution by enabling collaborative model training without sharing raw data. However, incorporating differential privacy into federated learning, while essential for privacy guarantees, often leads to degraded accuracy, unstable convergence, and reduced generalization. In this work, we propose an adaptive differentially private federated learning (ADP-FL) framework for medical image segmentation that dynamically adjusts privacy mechanisms to better balance the privacy-utility trade-off. The proposed approach stabilizes training, significantly improves Dice scores and segmentation boundary quality, and maintains rigorous privacy guarantees. We evaluated ADP-FL across diverse imaging modalities and segmentation tasks, including skin lesion segmentation in dermoscopic images, kidney tumor segmentation in 3D CT scans, and brain tumor segmentation in multi-parametric MRI. Compared with conventional federated learning and standard differentially private federated learning, ADP-FL consistently achieves higher accuracy, improved boundary delineation, faster convergence, and greater training stability, with performance approaching that of non-private federated learning under the same privacy budgets. These results demonstrate the practical viability of ADP-FL for high-performance, privacy-preserving medical image segmentation in real-world federated settings.

2604.02564 2026-05-12 eess.IV cs.CV

Why Invariance is Not Enough for Biomedical Domain Generalization and How to Fix It

Sebo Diaz, Polina Golland, Elfar Adalsteinsson, Neel Dey

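AI summary: This paper proposes MaskGen, a domain generalization method for 3D biomedical image segmentation that addresses the performance drop models suffer under shifts in modality, disease severity, and clinical setting. By combining source-domain image intensities with domain-stable representations from pretrained models, it trains robust segmentation models at low implementation cost and performs well on both fully supervised and few-shot segmentation. Unlike existing methods, MaskGen does not depend on a specific network architecture or loss function, is compatible with standard data augmentation pipelines, is easy to implement, and applies to any anatomical region.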

Comments Project GitHub https://github.com/sebodiaz/MaskGen

2603.29632 2026-05-12 cs.MA cs.AI

An Empirical Study of Multi-Agent Collaboration for Automated Research

Yang Shen, Zhenyi Yi, Ziyi Zhao, Lijun Sun, Dongyang Li, Chin-Teng Lin, Yuhui Shi

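AI summary: As AI agents advance, the research community is moving from single large language models to multi-agent systems to overcome cognitive bottlenecks in automated research. Using a rigorous experimental testbed, this paper compares a single agent against two multi-agent architectures on automated machine learning optimization and reveals a fundamental trade-off between operational stability and theoretical depth. The subagent architecture suits efficient search under strict time limits, while the agent-team architecture is better for theoretical optimization of complex architectures when compute is plentiful. These findings offer guidance for designing future automated research systems.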

2603.23806 2026-05-12 cs.SE cs.AI

Willful Disobedience: Automatically Detecting Failures in Agentic Traces

Reshabh K Sharma, Shraddha Barke, Benjamin Zorn

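AI summary: As AI agents are increasingly embedded in real software systems, verifying that their behavior during execution complies with instructions becomes more important. This paper presents AgentPex, a tool that automatically detects violations in agent executions. Its core approach extracts behavioral rules from agent prompts and system instructions and then evaluates execution traces against those rules. Experiments show that AgentPex identifies procedural errors that are hard to spot from the final outcome alone and provides fine-grained analysis across domains and metrics, helping developers understand an agent's strengths and weaknesses.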

Comments Accepted at ACM CAIS 2026

2603.16231 2026-05-12 math.OC cs.RO cs.SY eess.SY

Featurized Occupation Measures for Structured Global Search in Numerical Optimal Control

Qi Wei, Jianfeng Tao, Haoyang Tan, Hongyu Nie

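AI summary: This paper proposes Featurized Occupation Measures (FOM), a new approach that addresses the tension between global structure and computational scalability in numerical optimal control. The method builds a finite-dimensional primal-dual interface that couples a numerical solver with explicit Hamilton-Jacobi-Bellman (HJB) subsolutions, enabling global search while keeping computation efficient. The study also shows how, for high-dimensional problems, the framework shifts the curse of dimensionality from the state space to the connectivity topology, and it validates experimentally that the approach steers an optimizer toward the global optimum in a static obstacle-avoidance task.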

2603.13536 2026-05-12 quant-ph cs.LG

Active Sampling Sample-based Quantum Diagonalization from Finite-Shot Measurements

Rinka Miura

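AI summary: This paper proposes Active Sampling Sample-based Quantum Diagonalization (AS-SQD) for efficiently estimating the ground-state energy of quantum systems from finite-shot measurements. The method treats quantum diagonalization as an active learning problem: a perturbation-theoretic acquisition function dynamically selects the basis states most valuable for expanding the current subspace, which reduces the effect of bias and excited-state contamination. Experiments show that AS-SQD estimates ground-state energies more accurately than conventional approaches across several quantum systems and remains robust on real quantum hardware.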

Comments 7 pages, 5 figures

Details

DOI: 10.1109/QCNC69040.2026.00163
Journal ref: IEEE International Conference on Quantum Communications, Networking, and Computing (QCNC 2026)

Abstract

Near-term quantum devices provide only finite-shot measurements and prepare imperfect, contaminated states. This motivates algorithms that convert samples into reliable low-energy estimates without full tomography or exhaustive measurements. We propose Active Sampling Sample-based Quantum Diagonalization (AS-SQD), framing SQD as an active learning problem: given measured bitstrings, which additional basis states should be included to efficiently recover the ground-state energy? SQD restricts the Hamiltonian to a selected set of basis states and classically diagonalizes the restricted matrix. However, naive SQD using only sampled states suffers from bias under finite-shot sampling and excited-state contamination, while blind random expansion is inefficient as system size grows. We introduce a perturbation-theoretic acquisition function based on Epstein-Nesbet second-order energy corrections to rank candidate basis states connected to the current subspace. At each iteration, AS-SQD diagonalizes the restricted Hamiltonian, generates connected candidates, and adds the most valuable ones according to this score. We evaluate AS-SQD on disordered Heisenberg and Transverse-Field Ising (TFIM) spin chains up to 16 qubits under a preparation model mixing 80% ground state and 20% first excited state. Furthermore, we validate its robustness against real-world state preparation and measurement (SPAM) errors using physical samples from an IBM Quantum processor. Across simulated and hardware evaluations, AS-SQD consistently achieves substantially lower absolute energy errors than standard SQD and random expansion. Detailed ablation studies demonstrate that physics-guided basis acquisition effectively concentrates computation on energetically relevant directions, bypassing exponential combinatorial bottlenecks.
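
The iteration described in this abstract (diagonalize the restricted Hamiltonian, score outside basis states, add the best ones) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: it assumes a small dense real-symmetric Hamiltonian stored as a NumPy array, treats every basis state with nonzero coupling as a "connected candidate", and uses the textbook Epstein-Nesbet second-order magnitude |<x|H|psi>|^2 / |E - H_xx| as the score. The function names `restricted_ground_state`, `en2_scores`, and `active_expand` are hypothetical.

```python
import numpy as np

def restricted_ground_state(H, basis):
    """Diagonalize H restricted to the selected basis-state indices."""
    idx = np.array(sorted(basis))
    evals, evecs = np.linalg.eigh(H[np.ix_(idx, idx)])
    return evals[0], evecs[:, 0], idx

def en2_scores(H, energy, vec, idx):
    """Score each state x outside the subspace by |<x|H|psi>|^2 / |E - H_xx|."""
    outside = np.setdiff1d(np.arange(H.shape[0]), idx)
    coupling = H[np.ix_(outside, idx)] @ vec
    # Guard against vanishing denominators (an assumption of this sketch).
    denom = np.maximum(np.abs(energy - np.diag(H)[outside]), 1e-12)
    return outside, np.abs(coupling) ** 2 / denom

def active_expand(H, initial_basis, rounds=5, add_per_round=2):
    """Grow the subspace greedily and return (energy estimate, final basis)."""
    basis = set(int(i) for i in initial_basis)
    for _ in range(rounds):
        energy, vec, idx = restricted_ground_state(H, basis)
        outside, scores = en2_scores(H, energy, vec, idx)
        if outside.size == 0:
            break
        best = np.argsort(scores)[::-1][:add_per_round]
        chosen = [int(outside[i]) for i in best if scores[i] > 0]
        if not chosen:  # no connected candidates left
            break
        basis.update(chosen)
    energy, _, _ = restricted_ground_state(H, basis)
    return energy, basis
```

Because the restricted matrix is a principal submatrix of H, the returned energy is an upper bound on the exact ground-state energy and reaches it once the basis covers the whole space. A realistic version would enumerate connected candidates from the Hamiltonian's sparse structure rather than scanning all states.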

2603.10051 2026-05-12 cs.NI cs.AI cs.CR cs.LG

Where Do Flow Semantics Reside? A Protocol-Native Tabular Pretraining Paradigm for Encrypted Traffic Classification

Sizhe Huang, Zitong Li, Shujie Yang

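AI summary: This paper studies how to classify encrypted network traffic more effectively, noting that current methods based on self-supervised masked modeling still depend on labeled data. The authors trace the problem to flattening traffic into byte sequences, which destroys the semantic structure defined by the protocol and leads to lost semantics and confused embeddings. They propose a protocol-native tabular pretraining paradigm and introduce the FlowSem-MAE model, which preserves the semantic structure of protocol fields and substantially improves encrypted traffic classification.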

2603.05653 2026-05-12 cs.CY cs.AI cs.IR cs.SI

The DSA's Blind Spot: Algorithmic Audit of Advertising and Minor Profiling on TikTok

Sara Solarova, Matej Mosnar, Matus Tibensky, Jan Jakubcik, Adrian Bindas, Simon Liska, Filip Hossner, Matúš Mesarčík, Ivan Srba

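AI summary: This study performs an algorithmic audit of advertising and minor-targeted recommendation on TikTok, exposing a blind spot in the Digital Services Act (DSA) when it comes to protecting minors from algorithmically recommended advertising. Using simulated minor and adult accounts, the authors find that although TikTok formally complies with the DSA, minors are still exposed to commercial content highly tailored to their interests, with profiling far stronger than in formal advertising shown to adults. The study argues that the law's narrow definition of "advertisement" fails to cover influencer partnerships and brand promotion, leaving a regulatory gap, and that the definition should be expanded and profiling-based targeting of minors prohibited.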

Comments In The 2026 ACM Conference on Fairness, Accountability, and Transparency (FAccT'26), June 25-28, 2026, Montreal, QC, Canada. ACM

Details

DOI: 10.1145/3805689.3812355

Abstract

Adolescents spend an increasing amount of their time in digital environments where their still-developing cognitive capacities leave them unable to recognize or resist commercial persuasion. Article 28(2) of the DSA responds to this vulnerability by prohibiting profiling-based advertising to minors. However, the regulation's narrow definition of "advertisement" excludes current advertising practices including influencer paid partnerships and brand promotional content that serve functionally equivalent commercial purposes. We provide the first empirical evidence of how this definitional gap operates in practice through an algorithmic audit of TikTok. Our approach deploys sock-puppet accounts simulating a pair of minor and adult users with matching interest profiles. The content recommended to these users is automatically annotated, enabling systematic statistical analysis. Our findings reveal a stark regulatory paradox. TikTok demonstrates formal compliance with Article 28(2) by shielding minors from profiled formal advertisements, yet both disclosed and undisclosed ads exhibit significant profiling aligned with user interests (5-8 times stronger than for adult formal advertising). The strongest profiling emerges within undisclosed commercial content, where creators/brands fail to label paid partnership/promotional content and the platform neither corrects this omission nor prevents its personalized delivery to minors. These results demonstrate that minors remain exposed to algorithmically targeted commercial content through the same recommendation mechanisms the DSA seeks to constrain. We argue that protecting minors requires expanding the definition of advertisement in EU law to encompass influencer and brand promotional content, and ensuring that any such expansion is accompanied by a corresponding prohibition on profiling-based targeting of minors.

2603.03971 2026-05-12 cs.CY cs.AI cs.LG cs.LO

Upholding Epistemic Agency: A Brouwerian Assertibility Constraint for Responsible AI

Michael Jülich

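AI summary: This paper discusses how generative AI in high-stakes domains may erode democratic epistemic agency and proposes a Brouwer-inspired assertibility constraint: when a system cannot supply a justification that is publicly inspectable and contestable, it must return "undetermined" rather than assert or deny. The approach introduces an interface semantics with three statuses ("asserted", "denied", and "undetermined") and focuses on the system's entitlement to assert rather than on the truth of the asserted content itself. By gating thresholds and parameter choices at the decision layer, the approach ensures that system outputs rest on challengeable grounds, which upholds epistemic agency and limits the undue influence of automated speech on public reasoning.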

Comments Preprint. 64 pages, 5 figures, 2 tables

2602.01022 2026-05-12 econ.GN cs.AI q-fin.EC

Calibrating Behavioral Parameters with Large Language Models

Brandon Yee, Pairie Koh

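AI summary: This paper studies how to use large language models (LLMs) to calibrate behavioral parameters such as loss aversion, herding, and over-extrapolation, which are central to asset pricing models but hard to measure accurately. The authors build a framework that uses LLMs as calibration instruments. Through extensive experiments they find that LLMs deviate systematically in behavioral rationality, and that persona-based calibration markedly improves the plausibility and stability of the resulting behavioral parameters. The study also validates the calibrated parameters in asset pricing models and reports measurement ranges and calibration functions for eight canonical behavioral biases.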

2601.22638 2026-05-12 cs.MA cs.AI cs.LG

ScholarPeer: A Context-Aware Multi-Agent Framework for Automated Peer Review

Palash Goyal, Mihir Parmar, Yiwen Song, Hamid Palangi, Tomas Pfister, Jinsung Yoon

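AI summary: With the surge in machine learning papers, traditional peer review faces low efficiency and a growing reviewer burden. This work proposes ScholarPeer, a multi-agent framework intended to help reviewers check technical rigor and to help authors iterate quickly before submission. The framework separates context understanding from critique and introduces domain history analysis, mining of frontier comparisons, and a multi-aspect question-answering engine to improve the depth and efficiency of reviews. Experiments show that ScholarPeer outperforms existing state-of-the-art models on ICLR papers.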

2601.21410 2026-05-12 stat.ML cs.LG

Learning When to Trust LLM Priors: A Validated Framework for Semantic Prior Integration

Erica Zhang, Naomi Sagan, Danny Tse, Fangzhao Zhang, Mert Pilanci, Jose Blanchet

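AI summary: This study examines how to use the semantic priors of large language models (LLMs) reliably in supervised learning. The authors propose Statsformer, a validated framework that decides dynamically when to trust LLM-generated semantic priors and integrates them into different types of predictive models. Through cross-validation, Statsformer automatically adjusts how much each model relies on the prior, improving predictive performance while suppressing unreliable prior signals. The result is a reliability-oriented approach to LLM-assisted statistical learning.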

2601.20251 2026-05-12 stat.ML cs.LG

Efficient Evaluation of LLM Performance with Statistical Guarantees

Skyler Wu, Yash Nair, Emmanuel J. Candès

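AI summary: This paper studies how to evaluate the performance of many large language models efficiently and accurately under a limited query budget. It proposes Factorized Active Querying (FAQ), which combines a Bayesian factor model, an adaptive sampling strategy, and finite-population active inference to reduce the number of evaluation samples needed while maintaining statistical confidence. Experiments show that FAQ increases the effective sample size by up to 5x over existing methods on several benchmarks.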

Comments 27 pages, 12 figures

2512.20012 2026-05-12 eess.SP cs.LG

Reliable LLM-Based Edge-Cloud-Expert Cascades for Telecom Knowledge Systems

Qiushuo Hou, Sangwoo Park, Matteo Zecchin, Yunlong Cai, Guanding Yu, Osvaldo Simeone, Tommaso Melodia

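AI summary: This paper studies an LLM-based edge-cloud-expert cascaded knowledge system for automated decision support in telecom. Decisions are made through a question-answering pipeline in which an edge model handles routine queries, a cloud model handles complex ones, and human experts are brought in only when necessary. The authors propose a threshold selection method based on multiple hypothesis testing that minimizes processing cost while guaranteeing agreement between answers and expert judgment, and they validate its cost efficiency and reliability on the telecom-specific TeleQnA dataset.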
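
As a rough illustration of the routing logic in such a cascade, here is a minimal sketch. It is not the paper's method: it assumes each model returns an answer together with a confidence score in [0, 1], and it takes the two thresholds as given, whereas the paper selects them via multiple hypothesis testing. The `Cascade` class and its field names are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

# Assumed interface: a model maps a question to (answer, confidence in [0, 1]).
Answerer = Callable[[str], Tuple[str, float]]

@dataclass
class Cascade:
    edge: Answerer                 # cheap on-device model
    cloud: Answerer                # larger, costlier model
    expert: Callable[[str], str]   # human expert, assumed always reliable
    edge_threshold: float          # hypothetical, chosen offline
    cloud_threshold: float         # hypothetical, chosen offline

    def answer(self, question: str) -> Tuple[str, str]:
        """Return (answer, tier) from the cheapest tier that is confident enough."""
        ans, conf = self.edge(question)
        if conf >= self.edge_threshold:
            return ans, "edge"
        ans, conf = self.cloud(question)
        if conf >= self.cloud_threshold:
            return ans, "cloud"
        # Escalate to the human expert only when both models are unsure.
        return self.expert(question), "expert"
```

Raising either threshold sends more questions up the cascade, which costs more but relies less on the cheaper models. Choosing the thresholds with a statistical guarantee is the part the paper addresses.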

Comments This paper has been submitted to a journal

2512.11077 2026-05-12 cond-mat.mtrl-sci cs.AI

A probabilistic framework for crystal structure denoising, phase classification, and order parameters

Hyuna Kwon, Babak Sadigh, Sebastien Hamel, Vincenzo Lordi, John Klepeis, Fei Zhou

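AI summary: This study proposes a unified probabilistic framework for denoising noisy atomic configurations, classifying crystal phases, and computing order parameters. The method predicts each atom's confidence for each crystal prototype and aggregates these into a scalar log-probability landscape, from which it constructs a denoising field and obtains local phase labels, order parameters, and uncertainty measures. The model generalizes well across a range of noise and defect conditions, providing an integrated and scalable tool for analyzing complex atomistic simulations.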

2511.21600 2026-05-12 cs.CR cs.LG

Robust Spectral Watermark for Synthetic Tabular Data

Yizhou Zhao, Xiang Li, Peter Song, Qi Long, Weijie Su

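AI summary: With the growth of generative AI, synthetic tabular data is widely used in healthcare, finance, and other fields, but questions of data provenance and potential misuse are increasingly pressing. This paper proposes TAB-DRW, an efficient and robust frequency-domain watermarking method. It normalizes heterogeneous features, applies the discrete Fourier transform, and adjusts the imaginary parts of selected entries to embed the watermark signal, and it introduces a rank-based pseudorandom bit generation method to improve robustness and efficiency. Experiments show that TAB-DRW maintains high data fidelity while resisting a variety of post-processing attacks, and that it works on mixed-type data.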

Comments Accepted to Statistical Learning and Data Science

2511.14045 2026-05-12 cs.CR cs.AI cs.CL

Auditing Data Membership in Reinforcement Learning With Verifiable Rewards

Yule Liu, Heyi Zhang, Jinyi Zheng, Zhen Sun, Zifan Peng, Jiaheng Wei, Tianshuo Cong, Yilong Yang, Xinlei He

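AI summary: This study addresses possible data exposure during Reinforcement Learning with Verifiable Rewards (RLVR) training and proposes an auditing method for it. The authors point out that traditional membership inference attacks are a poor fit for RLVR because responses are generated by the model itself and continually optimized. They propose DIBA, a white-box behavioral-shift auditing framework that detects data exposure reliably by comparing the fine-tuned and pretrained models' behavior at the reward and policy levels. Experiments show that DIBA outperforms existing methods across multiple settings, with high detection accuracy and robustness.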

2511.13415 2026-05-12 cs.IR cs.CL cs.CV

Attention Grounded Enhancement for Visual Document Retrieval

Wanqing Cui, Wei Huang, Yazhi Guo, Yibo Hu, Meiguang Jin, Junfeng Ma, Keping Bi

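AI summary: Visual document retrieval requires understanding heterogeneous, multimodal content to satisfy implicit information needs. Existing methods improve retrieval with fine-grained late interaction, but they still rely on coarse global relevance labels and struggle to capture semantic links between specific document regions and the query. This paper proposes the AGREE framework, which uses cross-modal attention from multimodal large language models as a supervision signal to guide the retriever toward query-relevant document regions, jointly optimizing local region signals and global labels. This yields significant retrieval gains on the ViDoRe V2 benchmark.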

Comments Published as a conference paper at SIGIR 2026

Abstract

Visual document retrieval requires understanding heterogeneous and multi-modal content to satisfy implicit information needs. Recent advances use screenshot-based document encoding with fine-grained late interaction to encode holistic information and capture nuanced alignments, significantly improving retrieval performance. However, retrievers are still trained with coarse global relevance labels, without revealing which regions support the match. As a result, retrievers tend to rely on surface-level cues and struggle to capture implicit semantic connections, hindering their ability to handle non-extractive queries. To improve fine-grained relevance modeling, we propose an Attention-Grounded REtriever Enhancement (AGREE) framework. AGREE leverages cross-modal attention from multimodal large language models (MLLMs) as proxy supervision to guide the retriever in identifying relevant document regions. Specifically, AGREE extracts attention maps from the MLLM that highlight which document regions are attended to based on the query. These attention scores serve as local, region-level relevance signals. During training, AGREE combines local signals with the global document-level relevance label to jointly optimize the retriever. This dual-level supervision enables the model to learn not only whether documents match, but also which content drives relevance. Experiments on the challenging visual document retrieval benchmark, ViDoRe V2, show that AGREE significantly outperforms the global-supervision-only baseline by 12.82% and 5.03% in terms of average nDCG@1 and nDCG@5. Quantitative and qualitative analyses further demonstrate that AGREE promotes deeper alignment between query terms and document regions, moving beyond surface-level matching toward more accurate and interpretable retrieval. Our code is available at: https://github.com/VickiCui/AGREE.

2511.08644 2026-05-12 cs.SE cs.AI cs.PF

Energy Consumption of Dataframe Libraries for End-to-End Deep Learning Pipelines: A Comparative Analysis

Punit Kumar, Asif Imran, Tevfik Kosar

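AI summary: This paper presents a detailed comparison of three mainstream Python data processing libraries (Pandas, Polars, and Dask) in end-to-end deep learning pipelines, focusing on how they interact with large-scale GPU workloads during data loading, preprocessing, and batch feeding. The study measures key performance metrics including runtime, memory usage, disk usage, and CPU and GPU energy consumption. It fills a gap in the existing literature and provides a reference for choosing a data processing library for deep learning tasks.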

2511.01196 2026-05-12 stat.ML cs.AI cs.LG

An Interdisciplinary and Cross-Task Review on Missing Data Imputation

Jicong Fan

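AI summary: This paper systematically reviews missing data imputation as an interdisciplinary, cross-task research area, covering missingness mechanisms, imputation methods, and problem characteristics in different application settings. It surveys imputation techniques from classical statistical methods to modern deep learning models (such as autoencoders, generative adversarial networks, and graph neural networks) and gives particular attention to methods for complex data types such as tensors, time series, and graph-structured data. It also discusses how imputation combines with downstream tasks such as classification, clustering, and anomaly detection, and it identifies key challenges and directions for future research.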

Details

DOI: 10.1108/FTSIG-11-2025-0139
Journal ref: Foundations and Trends in Signal Processing, Vol. 20, No. 3, pp. 185-317, 2026

Abstract

Missing data is a fundamental challenge in data science, significantly hindering analysis and decision-making across a wide range of disciplines, including healthcare, bioinformatics, social science, e-commerce, and industrial monitoring. Despite decades of research and numerous imputation methods, the literature remains fragmented across fields, creating a critical need for a comprehensive synthesis that connects statistical foundations with modern machine learning advances. This work systematically reviews core concepts (including missingness mechanisms, single versus multiple imputation, and different imputation goals) and examines problem characteristics across various domains. It provides a thorough categorization of imputation methods, spanning classical techniques (e.g., regression, the EM algorithm) to modern approaches like low-rank and high-rank matrix completion, deep learning models (autoencoders, GANs, diffusion models, graph neural networks), and large language models. Special attention is given to methods for complex data types, such as tensors, time series, streaming data, graph-structured data, categorical data, and multimodal data. Beyond methodology, we investigate the crucial integration of imputation with downstream tasks like classification, clustering, and anomaly detection, examining both sequential pipelines and joint optimization frameworks. The review also assesses theoretical guarantees, benchmarking resources, and evaluation metrics. Finally, we identify critical challenges and future directions, emphasizing model selection and hyperparameter optimization, the growing importance of privacy-preserving imputation via federated learning, and the pursuit of generalizable models that can adapt across domains and data types, thereby outlining a roadmap for future research.

2510.13896 2026-05-12 q-bio.QM cs.AI cs.CV cs.MA

GenCellAgent: Generalizable, Training-Free Cellular Image Segmentation via Large Language Model Agents

Xi Yu, Yang Yang, Qun Liu, Yonghua Du, Sean McSweeney, Yuewei Lin

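AI summary: This paper proposes GenCellAgent, a training-free cellular image segmentation framework that combines specialist segmentation tools with general-purpose vision-language models to segment heterogeneous microscopy images. The method uses a plan-execute-evaluate loop and can automatically select the best tool, adapt to different imaging conditions, segment novel organelles under text guidance, and evolve over time. Experiments show that GenCellAgent performs strongly on several cell segmentation benchmarks and, on out-of-distribution data and novel cellular structures in particular, clearly outperforms conventional specialist models, offering a practical route to robust cell image segmentation without retraining.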

Comments 43 pages

2509.25926 2026-05-12 cs.CR cs.LG

Preventing Prompt Injection with Type-Directed Privilege Separation

Dennis Jacob, Emad Alghamdi, Zhanhao Hu, Basel Alomair, David Wagner

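AI summary: This paper studies how to prevent prompt injection attacks on modern language models and proposes a new approach based on type-directed privilege separation. The approach converts untrusted data into controlled data types that restrict its content and scope, thereby eliminating the possibility of prompt injection. Experiments show that the approach retains strong utility while keeping the system secure, and that it is easy to understand and to adapt to different language models.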

Comments Revised manuscript

2509.23391 2026-05-12 eess.SY cs.LG cs.SY nlin.CD

Optimizing the Network Topology of a Linear Reservoir Computer

Sahand Tangerami, Nicholas A. Mecholsky, Francesco Sorrentino

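AI summary: This paper studies how to optimize the network topology of a linear reservoir computer (RC) to improve its performance and interpretability. The authors decompose the RC dynamics into independent modes and optimize each mode, which determines the optimal connectivity structure in terms of a specific set of eigenvalues of the reservoir adjacency matrix. Experiments show that the optimized RC significantly outperforms randomly generated RCs in both training and testing and in some cases even surpasses nonlinear RCs of the same size, providing theoretical guidance and practical benefits for designing efficient, task-specific, analytically transparent RC architectures.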
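
For orientation, a linear reservoir computer of the randomly generated kind that serves as the baseline here can be sketched in a few lines. This is a generic sketch under stated assumptions, not the paper's optimized design: the update rule r[t+1] = A r[t] + b u[t], the Gaussian random adjacency matrix rescaled to a chosen spectral radius, and the ridge-regression readout are common conventions assumed for illustration, and all function names are hypothetical.

```python
import numpy as np

def make_reservoir(n, spectral_radius=0.9, seed=0):
    """Random adjacency matrix A rescaled to the given spectral radius, plus input weights b."""
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((n, n))
    A *= spectral_radius / np.max(np.abs(np.linalg.eigvals(A)))
    b = rng.standard_normal(n)
    return A, b

def run_reservoir(A, b, u):
    """Drive the linear reservoir with input sequence u and record its states."""
    r = np.zeros(A.shape[0])
    states = np.empty((len(u), A.shape[0]))
    for t, u_t in enumerate(u):
        r = A @ r + b * u_t  # purely linear update, no activation function
        states[t] = r
    return states

def fit_readout(states, target, ridge=1e-6):
    """Ridge-regression readout weights mapping reservoir states to the target."""
    n = states.shape[1]
    gram = states.T @ states + ridge * np.eye(n)
    return np.linalg.solve(gram, states.T @ target)
```

Since the update is linear, the reservoir's response to an input is governed by the eigenvalues of A, which is why choosing those eigenvalues (rather than drawing A at random as above) is a natural way to tailor the reservoir to a task.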

2509.22531 2026-05-12 stat.ML cs.LG

Debiased Front-Door Learners for Heterogeneous Effects

Yonghan Jung

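AI summary: In observational studies where the treatment and outcome share unobserved confounders but the mediator is unconfounded, causal effects can be identified through front-door (FD) adjustment. This paper studies the estimation of heterogeneous treatment effects (HTE) under FD identification and proposes two debiased learners, FD-DR-Learner and FD-R-Learner. Under explicit assumptions on sample splitting, overlap bounds, moment conditions, and stage-wise learning, the two methods satisfy a product error bound and a stage-wise error decomposition, respectively, and therefore achieve a conditional quasi-oracle property when the nuisance terms are small. Experiments show that the methods are robust and efficient on synthetic data and on a real-world case study based on the FARS dataset.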
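
For reference, the standard front-door adjustment formula that this setting builds on can be written as below, with treatment X, mediator M, and outcome Y. The discrete-sum notation is an assumption made here for readability; the paper's heterogeneous-effect estimands additionally condition on covariates, which this summary does not spell out.

```latex
P\bigl(y \mid \mathrm{do}(x)\bigr)
  \;=\; \sum_{m} P(m \mid x) \sum_{x'} P(y \mid m, x')\, P(x')
```

The inner sum averages the outcome over the observed treatment distribution at a fixed mediator value, and the outer sum weights by how the chosen treatment value shifts the mediator.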

Comments 26 pages, 3 figures. Revised theory statements, notation, and proof presentation; conclusions unchanged. Code available at https://github.com/yonghanjung/FD-CATE

2509.02372 2026-05-12 cs.CR cs.AI cs.SE

Scam2Prompt: A Scalable Framework for Auditing Malicious Scam Endpoints in Production LLMs

Zhiyang Chen, Tara Saba, Xun Deng, Xujie Si, Fan Long

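AI summary: As large language models (LLMs) play a growing role in software development, the security risks of training on unfiltered web data are becoming more prominent. This paper proposes Scam2Prompt, a scalable automated auditing framework for testing whether production LLMs generate code containing malicious links under particular prompts. The study finds that four mainstream LLMs generate malicious content 4.24% of the time when given carefully crafted prompts, and that the problem persists in seven newer models released in 2025, with malicious code generation rates as high as 47.3%.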

2508.08441 2026-05-12 q-bio.QM cs.CE cs.LG

SpectraLLM: Uncovering the Ability of LLMs for Molecular Structure Elucidation from Multi-Spectral Data

Yunyue Su, Jiahui Chen, Zao Jiang, Zhenyi Zhong, Liang Wang, Qiang Liu, Zhaoxiang Zhang

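AI summary: This paper proposes SpectraLLM, a large language model for molecular structure elucidation from multi-spectral data. The model represents information from multiple spectral modalities (such as IR, Raman, UV-Vis, NMR, and MS) in a unified way and performs end-to-end structure prediction in a shared language space, capturing complementary features across spectrum types. Experiments show that SpectraLLM achieves state-of-the-art performance on several public benchmarks, outperforming single-modality methods, and that it benefits from joint multimodal reasoning, offering a scalable paradigm for language-model-based spectral analysis.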

Comments 42 pages, 6 figures, 30 tables; Accepted to ICLR 2026

2507.06850 2026-05-12 cs.CR cs.AI

The Dark Side of LLMs: Agent-based Attack Vectors for System-level Compromise

Matteo Lupinacci, Francesco Aurelio Pironti, Francesco Blefari, Francesco Romeo, Luigi Arena, Angelo Furfaro

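AI summary: This paper studies the system-level security risks that arise when large language models (LLMs) act as the reasoning engines of autonomous agents, revealing how vulnerable they are under attack. Through experimental evaluation, the authors show how attackers can use direct prompt injection, RAG backdoor attacks, and similar techniques to make an LLM autonomously install and execute malware, and they find that nearly all mainstream LLMs are vulnerable to some degree. The study also shows that in multi-agent systems, models that withstand direct attacks can still be compromised through the trust relationships between agents, exposing serious system security risks.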

2506.10305 2026-05-12 physics.geo-ph cs.LG physics.data-an

Self-learning signal classifier for decameter coherent scatter radars

Oleg Berngardt, Ivan Lavygin

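AI summary: This paper proposes a self-learning signal classification method for decameter coherent scatter radar data. The classifier is built solely from radar observations, the results of automatic modeling of radio wave propagation in the ionosphere, and mathematical criteria for assessing model quality. It is trained on two years of data from 12 radars of the SuperDARN and SECIRA networks, has 2669 model parameters, and classifies using both parameters computed from the ionospheric propagation model and parameters measured directly by the radar. Of the 37 classes found in the data, the study analyzes 14 that are clearly separable and shows how their observed dynamics depend on geographic latitude and on solar and geomagnetic activity, with results consistent with known physical mechanisms.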

Comments 30 pages, 10 figures, 4 tables. To be submitted to Advances in Space Research

Details

DOI: 10.1016/j.asr.2025.11.074
Journal ref: Advances in Space Research Volume 77, Issue 3, 1 February 2026, Pages 3527-3548

Abstract

The paper presents a method for automatically constructing a classifier for processed data obtained by decameter coherent scatter radars. The method is based only on the radar data obtained, the results of automatic modeling of radio wave propagation in the ionosphere, and mathematical criteria for estimating the quality of the models. The final classifier is a model trained on data obtained by 12 radars of the SuperDARN and SECIRA networks over two years for each radar. The number of model coefficients is 2669. For classification, the model uses both the calculated parameters of radio wave propagation in the model ionosphere and the parameters directly measured by the radar. Calibration of radio wave elevation measurements at each radar was made using meteor trail scattered signals. The analysis showed that the optimal number of classes in the data is 37, of which 25 are frequently observed. The analysis made it possible to choose 14 of these classes, which are confidently separated in other variants of model training. A preliminary interpretation of 10 of them was carried out. The dynamics of observation of the various classes and their dependence on the geographical latitude of the radars at different levels of solar and geomagnetic activity are presented, and it is shown that they do not contradict known physical mechanisms. The analysis showed that the most important parameters for identifying the classes are the shape of the signal ray-tracing trajectory in its second half, the ray-traced scattering height, and the Doppler velocity measured by the radar.

2506.06038 2026-05-12 eess.SY cs.RO cs.SY

Trajectory Optimization for UAV-Based Medical Delivery with Temporal Logic Constraints and Convex Feasible Set Collision Avoidance

Kaiyuan Chen, Yuhan Suo, Shaowei Cui, Yuanqing Xia, Wannian Liang, Shuo Wang

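AI summary: This paper studies trajectory optimization for time-sensitive medical deliveries by UAV in urban environments, taking into account delivery time windows and priorities for multiple hospitals. Mission objectives are formalized in Signal Temporal Logic (STL), and the Convex Feasible Set (CFS) method handles collision avoidance around 3D urban buildings, turning the overall planning problem into a convex optimization problem that can be solved efficiently and feasibly. Simulation results confirm that the approach produces UAV trajectories that satisfy the timing constraints, avoid obstacles, and are dynamically feasible, offering a scalable solution for autonomous UAV medical logistics.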

Comments 11 pages, 4 figures