arXivDaily: arXiv daily academic digest, updated Monday to Friday
2406.16439 2026-06-18 cs.CV

Continual Test-Time Adaptation for Object Detection with Adaptive Monitoring and Randomized Restoration

Shilei Cao, Juepeng Zheng, Yan Liu, Baoquan Zhao, Ziqi Yuan, Weijia Li, Runmin Dong, Haohuan Fu

Affiliations: School of Artificial Intelligence, Sun Yat-Sen University; School of Information Science and Technology, University of Science and Technology of China; State Key Laboratory of Intelligent Technology and Systems, Department of Computer Science and Technology, Tsinghua University; Tsinghua Shenzhen International Graduate School, Tsinghua University; National Supercomputing Center in Shenzhen; Ministry of Education Key Laboratory for Earth System Modeling and the Department of Earth System Science, Tsinghua University

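AI summary: The paper proposes AMROD, which combines contrastive learning, adaptive monitoring, and randomized restoration to improve continual test-time adaptation for object detection. Experiments show it outperforms existing methods on several tasks, notably improving mAP by 3.2 and efficiency by 20% on the Cityscapes-to-Cityscapes-C task.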

Abstract

Real-world application models are commonly deployed in dynamic environments, where the target domain distribution undergoes temporal changes. Continual Test-Time Adaptation (CTTA) has recently emerged as a promising technique to gradually adapt a source-trained model to continually changing target domains. Despite recent advancements in addressing CTTA, two critical issues remain: 1) Fixed thresholds for pseudo-labeling in existing methodologies lead to low-quality pseudo-labels, as model confidence varies across categories and domains; 2) Stochastic parameter restoration methods for mitigating catastrophic forgetting fail to preserve critical information effectively, due to their intrinsic randomness. To tackle these challenges for detection models in CTTA scenarios, we present AMROD, featuring three core components. Firstly, the object-level contrastive learning module extracts object-level features for contrastive learning to refine the feature representation in the target domain. Secondly, the adaptive monitoring module dynamically skips unnecessary adaptation and updates the category-specific threshold based on predicted confidence scores to enable efficiency and improve the quality of pseudo-labels. Lastly, the adaptive randomized restoration mechanism selectively resets inactive parameters with higher probability, ensuring the retention of essential knowledge. We demonstrate the effectiveness of AMROD on four CTTA object detection tasks, where AMROD outperforms existing methods, especially achieving a 3.2 mAP improvement and a 20% increase in efficiency on the Cityscapes-to-Cityscapes-C CTTA task. The code of this work is available at https://github.com/ShileiCao/AMROD.
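
As a rough illustration of the adaptive monitoring idea, the sketch below keeps one pseudo-label threshold per category and decides whether a batch needs adaptation at all. It is not the AMROD implementation: the exponential-moving-average update, the `momentum` and `skip_above` parameters, and the rule of skipping when mean confidence is already high are all assumptions made for this example.

```python
import numpy as np

def update_thresholds(thresholds, scores, labels, momentum=0.9):
    """Move each category's threshold toward that category's mean confidence.

    Hypothetical EMA rule; categories absent from the batch keep their threshold.
    """
    new = thresholds.copy()
    for c in np.unique(labels):
        mean_conf = scores[labels == c].mean()
        new[c] = momentum * thresholds[c] + (1.0 - momentum) * mean_conf
    return new

def should_adapt(scores, skip_above=0.8):
    """Assumed skip rule: adapt only if the batch's mean confidence is below skip_above."""
    return scores.size > 0 and float(scores.mean()) < skip_above

# Toy batch: three detections in categories 0, 0, and 2 (category 1 is absent).
thresholds = np.full(3, 0.5)
scores = np.array([0.9, 0.7, 0.2])
labels = np.array([0, 0, 2])

new_thresholds = update_thresholds(thresholds, scores, labels)
keep = scores >= new_thresholds[labels]   # detections kept as pseudo-labels
adapt = should_adapt(scores)
```

Per-category thresholds let a confident category demand more of its pseudo-labels than a category on which the model is generally unsure, which is the motivation the abstract gives for abandoning a single fixed threshold.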

2502.15376 2026-06-18 cs.LG cond-mat.mes-hall

Learning Chern Numbers of Topological Insulators with Gauge Equivariant Neural Networks

Longde Huang, Oleksandr Balabanov, Hampus Linander, Mats Granath, Daniel Persson, Jan E. Gerken

Affiliations: Department of Mathematical Sciences, Chalmers University of Technology and University of Gothenburg; Department of Physics, Stockholm University, AlbaNova University Center; VERSES AI Research Lab, Los Angeles, USA; Department of Physics, University of Gothenburg

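AI summary: The paper uses gauge equivariant networks to predict Chern numbers of multiband topological insulators. It introduces a new gauge equivariant normalization layer, proves a universal approximation theorem, and shows that models trained only on trivial samples generalize to samples with non-trivial Chern number.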

Journal ref Advances in Neural Information Processing Systems 38 (NeurIPS 2025)

Abstract

Equivariant network architectures are a well-established tool for predicting invariant or equivariant quantities. However, almost all learning problems considered in this context feature a global symmetry, i.e. each point of the underlying space is transformed with the same group element, as opposed to a local "gauge" symmetry, where each point is transformed with a different group element, exponentially enlarging the size of the symmetry group. Gauge equivariant networks have so far mainly been applied to problems in quantum chromodynamics. Here, we introduce a novel application domain for gauge-equivariant networks in the theory of topological condensed matter physics. We use gauge equivariant networks to predict topological invariants (Chern numbers) of multiband topological insulators. The gauge symmetry of the network guarantees that the predicted quantity is a topological invariant. We introduce a novel gauge equivariant normalization layer to stabilize the training and prove a universal approximation theorem for our setup. We train on samples with trivial Chern number only but show that our models generalize to samples with non-trivial Chern number. We provide various ablations of our setup. Our code is available at https://github.com/sitronsea/GENet/tree/main.
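
To make the role of gauge invariance concrete, the sketch below computes a Chern number numerically and checks that it does not change when every eigenvector is multiplied by an independent random phase. It does not reproduce the paper's network. The two-band Qi-Wu-Zhang model, the plaquette-product (Fukui-Hatsugai-Suzuki) lattice formula, the grid size `n`, and all function names are choices made for this example, not details from the abstract.

```python
import numpy as np

def qwz_hamiltonian(kx, ky, m):
    """Two-band Qi-Wu-Zhang Bloch Hamiltonian (illustrative model, not from the paper)."""
    sx = np.array([[0, 1], [1, 0]], dtype=complex)
    sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
    sz = np.array([[1, 0], [0, -1]], dtype=complex)
    return np.sin(kx) * sx + np.sin(ky) * sy + (m + np.cos(kx) + np.cos(ky)) * sz

def chern_number(m, n=24, rng=None):
    """Chern number of the lower band from plaquette products on an n x n grid."""
    ks = 2 * np.pi * np.arange(n) / n
    u = np.empty((n, n, 2), dtype=complex)
    for i, kx in enumerate(ks):
        for j, ky in enumerate(ks):
            _, vecs = np.linalg.eigh(qwz_hamiltonian(kx, ky, m))
            vec = vecs[:, 0]                      # lower-band eigenvector
            if rng is not None:                   # independent local gauge change
                vec = vec * np.exp(1j * rng.uniform(0, 2 * np.pi))
            u[i, j] = vec

    def link(a, b):
        # Overlap <a|b> between eigenvectors at neighbouring grid points.
        return np.sum(np.conj(a) * b, axis=-1)

    ux = link(u, np.roll(u, -1, axis=0))          # link in the kx direction
    uy = link(u, np.roll(u, -1, axis=1))          # link in the ky direction
    # Product of links around each plaquette; its phase is gauge invariant.
    plaq = ux * np.roll(uy, -1, axis=0) * np.conj(np.roll(ux, -1, axis=1)) * np.conj(uy)
    return np.sum(np.angle(plaq)) / (2 * np.pi)

c_topological = chern_number(1.0)                 # gapped phase with |C| = 1
c_trivial = chern_number(3.0)                     # gapped phase with C = 0
c_regauged = chern_number(1.0, rng=np.random.default_rng(0))
```

The sign of the non-trivial value depends on orientation conventions, so only its magnitude is meaningful here. The random phases cancel around every plaquette, which is why `c_regauged` agrees with `c_topological`.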

2410.23503 2026-06-18 cs.LG

Development and Comparative Analysis of Machine Learning Models for Hypoxemia Severity Triage in CBRNE Emergency Scenarios Using Physiological and Demographic Data from Medical-Grade Devices

Santino Nanini, Mariem Abid, Yassir Mamouni, Arnaud Wiedemann, Philippe Jouvet, Stephane Bourassa

Affiliations: SADC-CDSS IA PEDIATRICS, CHU Sainte-Justine, Montreal, Canada; Solutions Applicare AI Inc., Montreal, Canada; Université de Montréal, Canada; MEDINT CBRNE Group, Montreal, Canada

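AI summary: The paper develops machine learning models that predict hypoxemia severity during emergency triage from physiological data. Gradient boosting models outperform sequential models in training speed and interpretability; future work will integrate multi-hospital data to improve generalizability.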

Comments 12 figures, 12 tables and 39 pages

Journal ref Diagnostics 14 (2024) 2763

Abstract

This paper presents the development of machine learning (ML) models to predict hypoxemia severity during emergency triage, especially in Chemical, Biological, Radiological, Nuclear, and Explosive (CBRNE) events, using physiological data from medical-grade sensors. Gradient Boosting Models (XGBoost, LightGBM, CatBoost) and sequential models (LSTM, GRU) were trained on physiological and demographic data from the MIMIC-III and IV datasets. A robust preprocessing pipeline addressed missing data and class imbalances, and incorporated synthetic data flagged with masks. Gradient Boosting Models (GBMs) outperformed sequential models in terms of training speed, interpretability, and reliability, making them well-suited for real-time decision-making. While their performance was comparable to that of sequential models, the GBMs used score features from six physiological variables derived from the enhanced National Early Warning Score (NEWS) 2, which we termed NEWS2+. This approach significantly improved prediction accuracy. While sequential models handled temporal data well, their performance gains did not justify the higher computational cost. A 5-minute prediction window was chosen for timely intervention, with minute-level interpolations standardizing the data. Feature importance analysis highlighted the significant role of mask and score features in enhancing both transparency and performance. Temporal dependencies proved to be less critical, as Gradient Boosting Models were able to capture key patterns effectively without relying on them. This study highlights ML's potential to improve triage and reduce alarm fatigue. Future work will integrate data from multiple hospitals to enhance model generalizability across clinical settings.

2410.03151 2026-06-18 cs.CL cs.SI

Media Framing through the Lens of Event-Centric Narratives

Rohan Das, Aditya Chandra, I-Ta Lee, Maria Leonor Pacheco

Affiliations: University of Colorado Boulder; Independent Researcher

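AI summary: The paper proposes extracting events and their relations and grouping them into high-level narratives to analyze framing in news coverage of immigration and gun control.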

Comments Accepted to the 6th Workshop on Narrative Understanding, co-located with EMNLP 2024

Abstract

From a communications perspective, a frame defines the packaging of the language used in such a way as to encourage certain interpretations and to discourage others. For example, a news article can frame immigration as either a boost or a drain on the economy, and thus communicate very different interpretations of the same phenomenon. In this work, we argue that to explain framing devices we have to look at the way narratives are constructed. As a first step in this direction, we propose a framework that extracts events and their relations to other events, and groups them into high-level narratives that help explain frames in news articles. We show that our framework can be used to analyze framing in U.S. news for two different domains: immigration and gun control.

1909.13203 2026-06-18 cs.LG stat.ML

Learning transport cost from subset correspondence

Ruishan Liu, Akshay Balsubramani, James Zou

Affiliations: Department of Electrical Engineering; Department of Genetics; Stanford University; Department of Biomedical Data Science

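AI summary: The paper proposes OT-SI, an algorithm that learns the transport cost function from subset correspondence, improving dataset alignment and outperforming existing methods on image, marriage-matching, and single-cell RNA-seq tasks.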

Journal ref International Conference on Learning Representations (ICLR 2020)

Abstract

Learning to align multiple datasets is an important problem with many applications, and it is especially useful when we need to integrate multiple experiments or correct for confounding. Optimal transport (OT) is a principled approach to align datasets, but a key challenge in applying OT is that we need to specify a transport cost function that accurately captures how the two datasets are related. Reliable cost functions are typically not available and practitioners often resort to using a hand-crafted or Euclidean cost even if it may not be appropriate. In this work, we investigate how to learn the cost function using a small amount of side information which is often available. The side information we consider captures subset correspondence -- i.e. certain subsets of points in the two datasets are known to be related. For example, we may have some images labeled as cars in both datasets; or we may have a common annotated cell type in single-cell data from two batches. We develop an end-to-end optimizer (OT-SI) that differentiates through the Sinkhorn algorithm and effectively learns the suitable cost function from side information. On systematic experiments in images, marriage-matching and single-cell RNA-seq, our method substantially outperforms state-of-the-art benchmarks.
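
For readers unfamiliar with the Sinkhorn algorithm mentioned above, here is a minimal NumPy sketch of the iterations for a fixed cost matrix. It shows only the inner transport solve, not OT-SI: learning the cost by differentiating through these iterations is omitted. The toy point sets, the squared-Euclidean cost, and the values of `eps` and `n_iters` are assumptions chosen for illustration.

```python
import numpy as np

def sinkhorn(cost, a, b, eps=0.5, n_iters=200):
    """Entropy-regularized OT plan between histograms a and b for a fixed cost matrix."""
    K = np.exp(-cost / eps)            # Gibbs kernel
    u = np.ones_like(a)
    v = np.ones_like(b)
    for _ in range(n_iters):
        u = a / (K @ v)                # rescale rows toward marginal a
        v = b / (K.T @ u)              # rescale columns toward marginal b
    return u[:, None] * K * v[None, :]

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 2))            # toy source points
y = rng.normal(size=(6, 2))            # toy target points
cost = ((x[:, None, :] - y[None, :, :]) ** 2).sum(axis=-1)
cost = cost / cost.max()               # normalize so exp(-cost / eps) stays well scaled
a = np.full(5, 1 / 5)                  # uniform source weights
b = np.full(6, 1 / 6)                  # uniform target weights

plan = sinkhorn(cost, a, b)            # plan[i, j]: mass moved from x[i] to y[j]
```

Every step is a smooth function of `cost`, which is what makes it possible in principle to backpropagate a loss on the plan into a parameterized cost function.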

2606.19157 2026-06-18 eess.AS cs.CL New submission

IndicContextEval: A Benchmark for Evaluating Context Utilisation in Audio Large Language Models Across 8 Indic Languages

Sakshi Joshi, Dhruv Subhash Rathi, Sanskar Singh, Eldho Ittan George, R J Hari, Kaushal Bhogale, Mitesh M. Khapra

Affiliations: AI4Bharat, Indian Institute of Technology Madras, India; Sarvam AI, India

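AI summary: Introduces IndicContextEval, a benchmark of 56 hours of natural speech from 555 speakers across 8 Indian languages, with a 7-level prompting framework for testing whether AudioLLMs genuinely use context rather than rely on parametric knowledge.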

Comments Accepted at Interspeech 2026

Abstract

AudioLLMs enable speech recognition conditioned on textual prompts such as domain descriptions or entity lists. However, it remains unclear whether these models genuinely utilise such context or rely on parametric knowledge learned during pretraining. Existing benchmarks cannot answer this question because they evaluate transcription under fixed prompting conditions and rarely include explicit contextual inputs. We introduce IndicContextEval, a 56-hour multilingual benchmark of natural speech from 555 speakers across 8 Indian languages and 23 professional domains. We design a 7-level prompting framework that progressively introduces contextual signals, including metadata, natural-language descriptions, entity lists in English and native script, and adversarial prompts with incorrect entities. Evaluating five models reveals substantial differences in context utilisation behaviour, highlighting the need for explicit evaluation of contextual grounding in AudioLLMs.

2606.19092 2026-06-18 stat.AP cs.LG New submission

Context-Aware Optimization of Follow-Up Intervals for Type 2 Diabetes Care Using Markov Decision Processes

Parisa Lotfibagha, Kristen Miller, William J. Gallagher, Elizabeth B. Selden, Muge Capan

Affiliations: University of Massachusetts Amherst; National Center for Human Factors in Healthcare; Georgetown University School of Medicine; Medstar Georgetown University Hospital

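AI summary: Proposes a Contextual Markov Decision Process model that uses electronic health record data to optimize personalized follow-up intervals for Type 2 Diabetes patients, identifies lower- and higher-risk subpopulations, and lowers expected cumulative cost relative to a fixed-interval policy.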

Abstract

Chronic disease management relies on regular patient-provider interactions to follow up on disease progression and control. For Type 2 Diabetes (T2D), current guidelines prescribe fixed time intervals between subsequent primary care visits for all patients, overlooking heterogeneity in clinical trajectories and patient characteristics. This study introduces a Contextual Markov Decision Process (CMDP) model to optimize subpopulation-specific follow-up interval decisions using Electronic Health Record (EHR) data from 22,154 T2D patients across 10 primary care clinics. Contexts are identified by: i) dimensionality reduction of variables representing the individual health trajectories utilizing Principal Component Analysis, and ii) assigning patients to contexts via principal components and additional patient-level features using clustering. Two distinct contexts emerged, representing a lower- and a higher-risk subpopulation. CMDP-derived policies recommend: (i) follow-up within 1 month if the lab value at the current visit is unmeasured; (ii) up to 3 months for elevated lab values or recent hospitalizations; and (iii) 6 to 12 months for sustained glycemic control, with shorter follow-up intervals for patients in the high-risk context. The optimal policies achieved lower expected cumulative cost than benchmarks (e.g., in the higher-comorbidity context, the CMDP policy reduced cost by about 34.8%, and in the lower-comorbidity context by about 6.4%, relative to an American Diabetes Association-like fixed interval follow-up policy). These findings demonstrate how context-aware approaches can inform adaptive follow-up strategies, and have the potential to advance chronic care management in primary care by synthesizing machine learning and probabilistic decision models.
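
The idea of solving one decision process per context can be sketched with plain value iteration. Everything numeric below is hypothetical: the three health states, the three follow-up intervals, the cost values, the transition probabilities, and the way the higher-risk context is derived from the lower-risk one are invented for illustration and are not taken from the study. Using a fixed discount per decision, regardless of interval length, is a further simplification.

```python
import numpy as np

def value_iteration(P, c, gamma=0.95, tol=1e-10, max_iter=10_000):
    """Minimize expected discounted cost. P has shape (A, S, S); c has shape (S, A)."""
    V = np.zeros(P.shape[1])
    for _ in range(max_iter):
        Q = c + gamma * np.einsum("ast,t->sa", P, V)   # cost-to-go per (state, action)
        V_new = Q.min(axis=1)
        done = np.max(np.abs(V_new - V)) < tol
        V = V_new
        if done:
            break
    Q = c + gamma * np.einsum("ast,t->sa", P, V)
    return V, Q.argmin(axis=1)

intervals = [1, 3, 6]                        # follow-up options in months (hypothetical)
state_cost = np.array([0.0, 2.0, 6.0])       # controlled, elevated, recently hospitalized
visit_cost = np.array([3.0, 1.0, 0.5])       # shorter intervals mean more visits
c = state_cost[:, None] + visit_cost[None, :]

# Hypothetical transitions for the lower-risk context, indexed [action, state, next_state].
P_low = np.array([
    [[0.90, 0.09, 0.01], [0.50, 0.45, 0.05], [0.30, 0.50, 0.20]],   # 1 month
    [[0.85, 0.13, 0.02], [0.35, 0.55, 0.10], [0.20, 0.50, 0.30]],   # 3 months
    [[0.80, 0.17, 0.03], [0.20, 0.60, 0.20], [0.10, 0.45, 0.45]],   # 6 months
])
# Higher-risk context: shift 30% of the probability mass toward the worst state.
shift = np.zeros_like(P_low)
shift[:, :, 2] = 1.0
P_high = 0.7 * P_low + 0.3 * shift

values, policies = {}, {}
for name, P in {"low_risk": P_low, "high_risk": P_high}.items():
    V, pi = value_iteration(P, c)
    values[name] = V
    policies[name] = [intervals[a] for a in pi]   # recommended interval per state
```

Each context gets its own transition model and therefore its own policy, which is how a contextual formulation can recommend different intervals for patients in the same clinical state.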

2606.19057 2026-06-18 stat.ML cs.LG stat.CO stat.ME New submission

Quantifying and Auditing LLM Evaluation via Positive-Unlabeled Learning

Zilong Zhang, Yi-Ting Hung, Lei Ding, Chi-Kuang Yeh

Affiliations: Department of Mathematics and Statistics; Georgia State University; Department of Statistics; University of Manitoba

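AI summary: To address systematic biases of LLM judges such as verbosity bias, the paper proposes a geometric auditing framework based on Partial Optimal Transport that uses a small set of human-verified positives to correct biased judges without retraining, improving alignment with human preferences.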

Abstract

Large Language Models (LLMs) are increasingly used as judges for scalable evaluation, yet such LLM-as-a-Judge systems exhibit systematic biases that are decoupled from semantic quality, most notably verbosity bias. Meanwhile, human supervision is costly and typically selective, yielding reliable positive judgments but leaving most outputs unlabelled and potentially mixed in quality. We formulate LLM evaluation under selective human supervision as a positive-unlabelled learning problem and propose a geometric auditing framework based on Partial Optimal Transport. By aligning a small set of human-verified positives with a reliable subset of unlabelled outputs in a fixed embedding space, our method identifies human-consistent preferences and corrects biased judges without retraining. Experiments demonstrate improved alignment with human preferences, increased robustness to presentation biases, and interpretable confidence estimates, offering a scalable and statistically grounded alternative to existing LLM-as-a-judge pipelines.

2606.18734 2026-06-18 eess.SP cs.LG New submission

Point-Cloud-Assistant Localized Statistical Channel Prediction by Tangent Gaussian Splatting

Ye Xue, Yiheng Wang, Xinhua Shao, Qi Yan, Shutao Zhang, Tsung-Hui Chang

Affiliations: China Telecom

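AI summary: Proposes point-cloud-assisted tangent Gaussian splatting (PC-TGS), a framework that fuses sparse radio measurements with dense LiDAR geometry to extrapolate the angular power spectrum to unmeasured grids, enabling efficient channel prediction in large-scale wireless digital twins.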

Abstract

Accurate, site-specific channel information is crucial for optimizing next-generation wireless networks. Among various approaches, localized statistical channel modeling (LSCM), which models the channel multipath angular power spectrum (APS) from the reference signal received power (RSRP) measurement, has emerged as a state-of-the-art method tailored for efficient network optimization. However, despite its effectiveness, LSCM cannot predict APS at the vast majority of locations where no measurements are available, which significantly restricts its applicability in large-scale, real-world scenarios. To address this challenge, we present point-cloud-assisted tangent Gaussian splatting (PC-TGS), the first framework to extrapolate APS to unmeasured outdoor grids by integrating sparse radio measurements with dense LiDAR-based geometry. PC-TGS represents environmental scatterers as anisotropic 3D Gaussians, initialized and refined through a relaxed-mean reparameterization of the raw point cloud. A tangent-plane projection accurately maps each Gaussian into the local angular domain, while a depth-aware electromagnetic splatting process aggregates their contributions. To ensure practical deployment, we derive a closed-form Gaussian-weighted average (GWA) for APS bin integration and provide a provable error bound. Evaluations on a LiDAR-scanned city-scale dataset (5M points, 6,310 RSRP samples) demonstrate that PC-TGS achieves better APS and RSRP prediction performance compared to state-of-the-art baselines and faster inference time for the APS extrapolation task. These results highlight the potential of PC-TGS to enable geometry-aware and data-efficient channel prediction in large-scale wireless digital twins.

2606.18729 2026-06-18 stat.ML cs.LG New submission

TimeLAVA: Learning-Agnostic Data Valuation for Time Series

Wenqin Liu, Weizhi Quan, Aoqi Zuo, Erdun Gao, Vu Nguyen, Dino Sejdinovic, Howard Bondell, Mingming Gong

Affiliations: School of Mathematics and Statistics, The University of Melbourne; Statistics, The University of Melbourne; Statistics, University of Sydney; Responsible AI Research Centre, Australian Institute for Machine Learning; Amazon; School of Mathematical Sciences, Adelaide University; Department of Machine Learning, MBZUAI

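AI summary: Proposes TimeLAVA, a learning-agnostic framework that uses wavelet transforms and optimal transport to value time-series segments by their marginal contribution to distributional discrepancy, without model training; it outperforms existing methods on anomaly detection, data pruning, and label noise detection.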

Comments 34 pages

Journal ref ICML 2026

Abstract

Data valuation quantifies the intrinsic quality of individual samples to enable principled data curation, quality control, and robust learning. For time series in critical domains such as healthcare, finance, and industrial monitoring, effective valuation methods are essential yet fundamentally lacking. Existing approaches are either model-dependent, limiting their generalizability, or designed for i.i.d. data and thus fail to capture temporal dependencies, multi-scale patterns, and non-stationary dynamics inherent to sequential data. We introduce TimeLAVA, a learning-agnostic framework that values temporal segments by their marginal contribution to minimizing distributional discrepancy between evaluated and reference data. At its core is a novel Selective Wavelet-based Wasserstein discrepancy combining multi-scale wavelet transforms for temporal localization with unbalanced optimal transport for robustness to distributional shifts. Segment values are efficiently computed via sensitivity analysis without requiring model training and aggregated into point-wise scores. We provide theoretical guarantees linking valuation to model-agnostic generalization and prove bounded sensitivity to outlier contamination. Extensive experiments across anomaly detection, data pruning, and label noise detection demonstrate that TimeLAVA produces significantly more informative value scores than existing methods on diverse real-world datasets.

2606.18645 2026-06-18 eess.AS cs.AI New submission

Augmenting Dysarthric Speech Severity Assessment with MOS Supervision

Kaimeng Jia, Minzhu Tu, Zengrui Jin, Siyin Wang, Chao Zhang

Affiliations: Tsinghua University; Beijing University of Posts and Telecommunications

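AI summary: Proposes augmenting dysarthric speech assessment with speech synthesis evaluation data (MOS labels from the QualiSpeech corpus). Fine-tuning improves intelligibility and naturalness prediction, joint training mainly improves naturalness, and reliance on clinical annotations is reduced.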

Comments arXiv admin note: substantial text overlap with arXiv:2511.02270

Abstract

Dysarthria is a speech disorder marked by reduced intelligibility and communicative effectiveness. Automatic utterance-level assessment of dysarthric speech can support scalable speech monitoring and therapy-related analysis. Yet training such systems is bottlenecked by the scarcity of clinically annotated dysarthric speech. This work proposes to augment dysarthric speech assessment using data from speech synthesis evaluations, specifically human-annotated utterances with Mean Opinion Score (MOS) labels from the QualiSpeech corpus. Experiments show that fine-tuning on speech synthesis assessment data consistently improves performance on both intelligibility and naturalness prediction, while joint training yields gains primarily on naturalness. These results suggest that synthesis artifacts and dysarthric speech share perceptual commonalities, and speech synthesis evaluation corpora offer a practical augmentation source that reduces reliance on scarce clinical annotations.

2606.18567 2026-06-18 stat.ML cs.LG stat.AP stat.ME New submission

Bridging Data Gaps in Structural Fragility Modeling through Transfer Learning: Methodology and Case Studies

Narges Saeednejad, Jamie Ellen Padgett

Affiliations: Department of Civil and Environmental Engineering, Rice University; Ken Kennedy Institute, Rice University

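AI summary: Proposes a methodology-centered transfer learning framework that addresses domain shift, class imbalance, and scarce target labels, and validates through three case studies that it improves failure detection and predictive stability in low-data regimes.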

Comments 24 pages, 12 figures

Abstract

This paper presents a methodology-centered transfer learning framework for fragility adaptation under domain shift, class imbalance, and scarce target labels while preserving engineering interpretability and supporting decision-making under uncertainty. Four transfer learning strategies (instance-based, parameter-based, hierarchical Bayesian, and multi-source) are demonstrated through three complementary case studies: (i) instance-based transfer learning via importance weighting, demonstrated on coastal bridge fragility using Hurricane Katrina observations; (ii) parameter-based transfer learning together with hierarchical Bayesian transfer learning, enabling partial pooling across strata and posterior uncertainty quantification, demonstrated on residential building fragility using Hurricane Ian observations; and (iii) multi-source transfer learning that fuses multiple analytical fragility models with learned source weights and regularized target-domain adaptation, demonstrated on seismic bridge fragility using observations from the 2001 Nisqually earthquake. Across these case studies, direct transfer of source models (i.e. using existing state-of-the-art models) fails under domain shift and severe class imbalance, while targeted adaptation substantially improves failure detection and predictive stability in low-data regimes. These findings highlight the need for systematic guidance on diagnostics, strategy selection, and uncertainty reporting when developing and adapting fragility models.

2606.18531 2026-06-18 stat.ML cs.LG New submission

When Does Trajectory-Level Supervision Permit Efficient Offline Reinforcement Learning?

Xuanfei Ren, Tengyang Xie

Affiliations: University of Wisconsin-Madison

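AI summary: The paper develops a statistical theory of offline reinforcement learning when policy optimization uses only trajectory-level outcomes (such as cumulative returns or preferences). It proposes the OPAC algorithm, proves its sample complexity, and identifies statistical barriers under nonlinear aggregation objectives.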

Comments 69 pages

Abstract

Offline reinforcement learning is typically analyzed under process-level reward supervision, yet many sequential decision datasets record only trajectory-level outcomes. We develop a statistical theory for offline policy optimization from such outcome-level supervision. We first study the canonical setting where the target remains the expected cumulative reward, but each offline trajectory provides only a scalar label whose conditional mean is the cumulative return. We propose OPAC, a pessimistic actor-critic algorithm that learns a latent reward model and optimizes a policy from trajectory-level labels. We prove a high-probability guarantee of order $\widetilde O(H^2\sqrt{C_{sa}(\pi^\star)/n})$ and a matching lower bound, characterizing the sharp statistical cost of replacing process-level rewards with one trajectory-level label. We then extend the principle to preference-based feedback, preserving the leading horizon and concentrability dependence up to preference-model constants. Finally, we study generalized outcome-based offline RL, where both the supervision and the objective are trajectory-level quantities induced by a nonlinear aggregation of latent per-step rewards. This problem is not learnable in general: for all-success objectives, any offline learner may require $\Omega(2^H)$ trajectories even with deterministic transitions and constant concentrability. We then identify a tractable regime through two structural coefficients, $\kappa_\mu(\sigma)$ and $\chi_\mu(\sigma)$, capturing information loss in outcome aggregation and generalized Bellman updates, under which generalized OPAC achieves polynomial sample complexity. Together, our results delineate when outcome-level supervision enables sample-efficient offline control and when missing process-level rewards create fundamental statistical barriers.
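
The supervision model of the canonical setting can be written out explicitly. This is a notational sketch only: the symbols $\tau$, $y$, $r_h$, $\hat\pi$, and $J$ are assumed here for illustration and may differ from the paper's notation, and reading the stated guarantee as a bound on the suboptimality gap is likewise an assumption.

```latex
% A trajectory of horizon H and its single scalar outcome label y
\tau = (s_1, a_1, \dots, s_H, a_H), \qquad
\mathbb{E}\left[\, y \mid \tau \,\right] = \sum_{h=1}^{H} r_h(s_h, a_h).

% Stated rate for OPAC from n offline trajectories (with high probability),
% read here as a bound on the suboptimality of the learned policy
J(\pi^\star) - J(\hat\pi) \;\le\;
\widetilde O\!\left( H^2 \sqrt{\frac{C_{sa}(\pi^\star)}{n}} \right).
```

The per-step rewards $r_h$ are never observed individually; only their sum enters the label, which is the information loss that the matching lower bound quantifies.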

2606.18527 2026-06-18 stat.ML cs.LG New submission

Toward Simultaneously Optimal Regret in U-Calibration

Rafael Frongillo, Haipeng Luo, Nishant A. Mehta, Jon Schneider

Affiliations: University of Colorado Boulder; University of Southern California; Google Research

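AI summary: Proposes an FTPL variant based on self-concordant noise that achieves optimal $\tilde O(\sqrt{T})$ regret for all bounded proper losses and logarithmic regret for smooth losses.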

Comments 30 pages; to appear at COLT 2026

Abstract

U-calibration studies online forecasting algorithms whose predictions can be consumed by any unknown downstream agent, guaranteeing sublinear regret simultaneously for all proper loss functions. Existing U-calibration algorithms achieve worst-case optimal $O(\sqrt{T})$ regret for every bounded proper loss, but they fail to adapt to easier losses: as we show, even for smooth losses such as squared loss, they incur $\Omega(\sqrt{T})$ regret instead of the optimal $O(\log T)$ regret. In this work, we show that this limitation is not inherent. Specifically, we design a single forecast algorithm that simultaneously achieves $\tilde O(\sqrt{T})$ regret for every bounded proper loss and $O(\log T)$ regret for every bounded smooth proper loss. More generally, our algorithm also attains logarithmic regret for losses that are smooth relative to the log-barrier, which include several non-Lipschitz examples. Our approach is based on a novel variant of Follow-the-Perturbed-Leader (FTPL) in which perturbations are applied directly in the prediction space using self-concordant noise. The resulting analysis also departs substantially from prior FTPL analyses due to the complex nature of this noise and may be of independent interest.

2606.18523 2026-06-18 q-bio.QM cs.CV 新提交

DART: A design-aware microfluidic chip paradigm for real-time live-cell image analysis

DART: 一种设计感知的微流控芯片范式用于实时活细胞图像分析

Johannes Seiffarth, Matthias Pesch, Lukas Scholtes, Dietrich Kohlheyer, Hanno Scharr, Katharina Nöh

发表机构 * Institute for Bio- and Geosciences, IBG-1: Biotechnology(生物与地球科学研究所,IBG-1:生物技术) Computational Systems Biotechnology (AVT.CSB), RWTH Aachen University(计算系统生物技术(AVT.CSB),亚琛工业大学) Institute for Advanced Simulation, IAS-8: Data Analytics and Machine Learning(先进模拟研究所,IAS-8:数据分析与机器学习)

AI总结 提出DART范式,通过嵌入式标记和深度学习检测对齐CAD蓝图与物理芯片,实现高通量微流控芯片中所有感兴趣区域的快速定位和全自动图像处理,支持实时分析。

详情
AI中文摘要

高通量微流控活细胞成像产生丰富的单细胞数据。然而,用于定位每个包含一个细胞群体的感兴趣区域(RoI)并从记录图像中移除周围微流控结构的半自动化流程随RoI数量扩展,这阻碍了实时图像分析并将洞察时间延迟数小时至数天。我们提出了用于微流控培养芯片的设计感知和实时能力(DART)范式,该范式将CAD蓝图与物理芯片对齐,从而实现了对所有RoI的通量无关定位以及跨不同RoI几何形状和芯片布局的全自动图像处理。DART通过嵌入式基准标记和基于深度学习的标记检测建立这种对齐。我们使用瑞士军刀芯片验证DART,该芯片在1164个RoI位置上组合了八种结构不同的RoI设计。DART在五分钟内定位所有RoI,在40毫秒内从原始显微镜图像中移除微流控结构,并在每张图像1.1秒内执行全自动图像分析,包括细胞分割。这些能力共同使DART成为一个端到端的硬件-软件范式,具有实时分析能力,为闭环和结果驱动的智能显微镜铺平了道路。

英文摘要

High-throughput microfluidic live-cell imaging generates rich single-cell data. Yet semi-automated procedures for locating regions of interest (RoIs), each containing one cell population, and removing surrounding microfluidic structures from recorded images, scale with the number of RoIs. This prevents real-time image analysis and delays time-to-insight by hours to days. We introduce the Design-Aware and Real-Time capable (DART) paradigm for microfluidic cultivation chips, which aligns the CAD blueprint with the physical chip and thereby enables throughput-independent localization of all RoIs and fully automated image processing across diverse RoI geometries and chip layouts. DART establishes this alignment through embedded fiducial markers and deep-learning-based marker detection. We validate DART using the Swiss Army Knife chip, which combines eight structurally distinct RoI designs across 1164 RoI locations. DART localizes all RoIs in five minutes, removes microfluidic structures from raw microscopy images in 40 ms, and performs fully automated image analysis, including cell segmentation, in under 1.1 s per image. Together, these capabilities establish DART as an end-to-end hardware-software paradigm with real-time-capable analysis that paves the way toward closed-loop and outcome-driven smart microscopy.

2606.18520 2026-06-18 stat.ML cs.CG cs.CL cs.DS cs.IR cs.LG 新提交

Compact Geometric Representations of Hierarchies

层次结构的紧凑几何表示

Prashant Gokhale, Piotr Indyk, Yuhao Liu, Sandeep Silwal, Tony Chang Wang, Haike Xu

发表机构 * UW-Madison(威斯康星大学麦迪逊分校) MIT(麻省理工学院)

AI总结 研究如何用低维几何嵌入表示有向无环图中的祖先-后代关系,提出基于树宽等结构参数的维度上界和下界,并在真实数据集上验证了紧凑性。

Comments Published at the 39th Annual Conference on Learning Theory (COLT) 2026. 22 Pages

详情
AI中文摘要

计算数据的几何表示是现代机器学习的基石,通常通过训练双编码器将查询和文档映射到共享嵌入空间来实现。You等人[NeurIPS '25]的最新工作将这种方法扩展到层次检索,其中相关性由有向无环图(DAG)中的祖先-后代关系决定。虽然先前的工作表明当后代数量较少时存在有效嵌入,但这些界限对于深层层次结构会严重退化,所需维度与节点总数相当。在本文中,我们研究了更一般图类的紧凑可达性嵌入,并提供了使用维度依赖于结构图参数的嵌入来表示层次结构的理论保证。我们证明,对于任何有向树,存在常数维度3的可达性嵌入,与树的大小或深度无关。我们将这一结果推广到以树宽$t$为特征的图,构造了维度为$O(t \log n)$的嵌入,其中$n$是节点数。作为这些上界的补充,我们提供了匹配或接近匹配的下界,表明对于一般DAG,维度$\Omega(n)$是必要的,而对于树宽为$t$的图,需要$\Omega(t/\log(n/t))$的维度。我们还获得了由DAG中交叉边数量参数化的上界和下界。此外,我们展示了我们的嵌入可以在真实世界数据集上构建,并且与先前具有理论保证的嵌入相比,在高召回率情况下维度小得多。

英文摘要

Computing geometric representations of data is a cornerstone of modern machine learning, typically achieved by training dual encoders which map queries and documents into a shared embedding space. Recent work of You et al. [NeurIPS '25] has extended this approach to hierarchical retrieval, where relevance is determined by the ancestor-descendant relationships in a Directed Acyclic Graph (DAG). While previous work has shown that valid embeddings exist when the number of descendants is small, these bounds degrade significantly for deep hierarchies, requiring dimensions as large as the total number of nodes. In this paper, we investigate compact reachability embeddings for more general graph classes and provide theoretical guarantees for representing hierarchies using embeddings whose dimension depends on structural graph parameters. We prove that for any directed tree, there exists a reachability embedding in constant dimension 3, independent of the tree's size or depth. We generalize this result to graphs characterized by treewidth $t$, constructing embeddings of dimension $O(t \log n)$, where $n$ is the number of nodes. Complementing these upper bounds, we provide matching or near-matching lower bounds, showing that dimension $\Omega(n)$ is necessary for general DAGs and $\Omega(t/\log(n/t))$ is required for graphs of treewidth $t$. We also obtain upper and lower bounds parameterized by the number of cross-edges in the DAG. We additionally show that our embeddings can be constructed on real world datasets, and that they give much smaller dimensions in high recall regimes compared to prior embeddings with theoretical guarantees.
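
To make the idea of encoding ancestor-descendant relations with a constant number of coordinates per node concrete, here is a minimal sketch of the classical DFS entry/exit interval labeling for rooted trees. It is a textbook technique shown only for intuition, not the paper's dimension-3 construction; the function names `interval_labels` and `is_ancestor` are hypothetical.

```python
from collections import defaultdict

def interval_labels(edges, root):
    """Assign each node an (entry, exit) pair using an iterative DFS.

    edges: iterable of (parent, child) pairs describing a rooted tree.
    """
    children = defaultdict(list)
    for parent, child in edges:
        children[parent].append(child)

    labels, entry, clock = {}, {}, 0
    stack = [(root, False)]
    while stack:
        node, finished = stack.pop()
        if finished:
            # All descendants have been visited: close the interval.
            labels[node] = (entry[node], clock)
            clock += 1
            continue
        entry[node] = clock
        clock += 1
        stack.append((node, True))
        for child in reversed(children[node]):
            stack.append((child, False))
    return labels

def is_ancestor(labels, u, v):
    """True if u is an ancestor of v (or u == v): u's interval contains v's."""
    u_in, u_out = labels[u]
    v_in, v_out = labels[v]
    return u_in <= v_in and v_out <= u_out
```

Each node carries two numbers regardless of tree size or depth, and reachability reduces to an interval-containment check. This works for trees only; general DAGs need more, which is consistent with the $\Omega(n)$ lower bound stated in the abstract.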

2606.18480 2026-06-18 eess.AS cs.SD 新提交

Generalised Transcoding Framework for Arbitrary Spatial Audio Capture and Playback Formats

任意空间音频采集与回放格式的通用转码框架

Archontis Politis, Janani Fernandez, Leo McCormack

发表机构 * Faculty of Information Technology and Communication Sciences, Tampere University(信息技术与通信科学学院,坦佩雷大学) Department of Information and Communications Engineering, Aalto University(信息与通信工程系,阿尔托大学)

AI总结 提出一种统一框架,通过估计时频域空间元数据(包括数量可变的主声源分量和具有自身角功率分布的环境分量),实现从Ambisonic或原始麦克风阵列信号到任意目标回放格式的转码,支持采集与回放设置的独立旋转,听力测试结果显示其相对现有参数化渲染器具有感知优势。

Comments This work has been submitted to IEEE/ACM Transactions on Audio, Speech, and Language Processing for possible publication

详情
AI中文摘要

本文介绍了一种统一框架,用于对以Ambisonic信号或原始麦克风阵列信号形式捕获的空间声场景进行参数化分析和再现。所提出的方法估计时频相关的空间元数据,该元数据表征可变数量的主源分量和具有自身角功率分布的环境分量,其参数拟合捕获信号的观测空间协方差。该元数据用于构建目标回放格式的空间协方差,然后用于推导最优混合矩阵,以将场景转码用于目标再现系统上的回放。该方法还独立处理采集和回放设置的旋转。在听力测试中,使用来自Ambisonic、球形和头戴式阵列的模拟场景,比较了该方法的实时实现和其他现有的最先进参数化渲染器。结果突出了所提出框架在多种内容和接收器配置下的感知优势,特别是对于低阶和几何约束的麦克风阵列。

英文摘要

This article introduces a unified framework for the parametric analysis and reproduction of spatial sound scenes captured either as Ambisonic signals or as raw microphone array signals. The proposed method estimates time-frequency-dependent spatial metadata that characterises a variable number of primary source components and an ambience component with its own angular power distribution, whose parameters fit the observed spatial covariances of the captured signals. This metadata is used to construct spatial covariances of the target playback formats, which are then used to derive optimal mixing matrices for transcoding the scene for playback over the target reproduction system. The method additionally handles independent rotations of both capture and playback setups. Real-time implementations of the method and other existing state-of-the-art parametric renderers are compared in a listening test using simulated scenes from Ambisonic, spherical, and head-worn arrays. The results highlight perceptual benefits of the proposed framework across a diverse range of content and receiver configurations, particularly for lower-order and geometrically constrained microphone arrays.

2606.18467 2026-06-18 stat.ML cs.LG 新提交

ToolChain-CRC: Conformal Risk Control for Agentic AI Under Retrieval and Tool-Use Drift

ToolChain-CRC: 检索与工具使用漂移下代理型AI的共形风险控制

Jeffery Opoku, David Banahene

发表机构 * The University of Texas Rio Grande Valley(德克萨斯大学里奥格兰德谷分校) Florida International University(佛罗里达国际大学)

AI总结 针对检索增强和工具使用代理在漂移下的风险控制问题,提出ToolChain-CRC方法,通过构建轨迹级风险评分并校准接受或干预规则,实现可证明的轨迹级风险控制。

Comments 26 pages, 11 figures

详情
AI中文摘要

现代AI代理检索文档、调用工具、检查中间信息,然后产生最终答案或行动。这产生了一个仅从最终答案无法察觉的风险控制问题。即使检索薄弱、工具输出错误或早期步骤缺乏支持,最终响应也可能看起来可接受。我们提出ToolChain-CRC,一种针对漂移下检索增强和工具使用代理的共形风险控制方法。该方法将每次代理运行视为动作、观察和最终输出的完整轨迹。它构建步骤级风险评分,将其组合成轨迹风险评分,校准接受或干预规则,并添加一个随时报警,可在最终答案前停止风险运行。我们在可交换校准运行下证明了轨迹级风险控制,给出了具有可审计常数的漂移感知扩展,并通过超鞅构造证明了随时升级规则。实验涵盖合成工具链漂移、RAG/工具使用压力测试、基于SQuAD的公共检索任务、无API代理问答案例研究、消融实验、目标风险敏感性检查、20种子鲁棒性检查、漂移边界审计以及实时RAG/工具使用代理基准。在这些设置中,仅基于最终答案的校准可能遗漏检索和工具故障,而轨迹级校准将接受轨迹的风险保持在目标之下。

英文摘要

Modern AI agents retrieve documents, call tools, check intermediate information, and then produce a final answer or action. This creates a risk-control problem that is not visible from the final answer alone. A final response may look acceptable even when the retrieval was weak, a tool output was wrong, or an earlier step was unsupported. We propose ToolChain-CRC, a conformal risk-control method for retrieval-augmented and tool-using agents under drift. The method treats each agent run as a full trajectory of actions, observations, and final output. It builds step-level risk scores, combines them into a trajectory risk score, calibrates an accept-or-intervene rule, and adds an anytime alarm that can stop risky runs before the final answer. We prove trajectory-level risk control under exchangeable calibration runs, give a drift-aware extension with auditable constants, and prove an anytime escalation rule through a supermartingale construction. Experiments cover synthetic tool-chain drift, RAG/tool-use stress tests, public SQuAD-derived retrieval tasks, an API-free agentic QA case study, ablations, target-risk sensitivity checks, 20-seed robustness checks, a drift-margin audit, and a live RAG/tool-use agent benchmark. Across these settings, final-answer-only calibration can miss retrieval and tool failures, while trajectory-level calibration keeps accepted-trajectory risk below the target.
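
The calibration step described above can be illustrated with a generic conformal risk control threshold search. This is a minimal sketch of the standard finite-sample rule for a loss bounded by 1, not the paper's ToolChain-CRC procedure; the function name `calibrate_threshold`, the 0/1 loss ("an accepted run failed") and the candidate grid are assumptions made for illustration.

```python
def calibrate_threshold(scores, failures, alpha, candidates):
    """Pick the largest threshold tau whose conformal bound stays <= alpha.

    scores:     trajectory risk scores from calibration runs
    failures:   1/True if the run's outcome was bad, else 0/False
    alpha:      target risk level
    candidates: thresholds to try

    A run is accepted when score <= tau. The loss is 1 if an accepted run
    failed, else 0, so it is bounded by B = 1 and non-decreasing in tau.
    The bound is (n * empirical_risk + B) / (n + 1) = (bad + 1) / (n + 1).
    Returns None if no candidate satisfies the bound.
    """
    n = len(scores)
    best = None
    for tau in sorted(candidates):
        bad = sum(1 for s, f in zip(scores, failures) if s <= tau and f)
        if (bad + 1) / (n + 1) <= alpha:
            best = tau
        else:
            break  # loss is monotone in tau, so larger thresholds also fail
    return best
```

Runs scoring above the returned threshold would be routed to intervention rather than accepted. The guarantee behind this rule relies on exchangeability between calibration and test runs, which is exactly what drift can break and what the paper's drift-aware extension addresses.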

2606.18402 2026-06-18 eess.SP cs.AI cs.AR cs.SY eess.SY 新提交

Deep-Learning-Based Pixelated Microwave Filter Design and Characterization using Electro-Optical Electric-Field Measurements

基于深度学习的像素化微波滤波器设计与表征:利用电光电场测量

Han Zhou, Richard Bannister, Caspar Pierce, Haojie Chang, David Widen, Ludvig Fornstedt, Gabriel Melin, Alexander Bohlin, Pontus Lindeberg Fredriksson, Dilbagh Singh, Christian Fager, Koen Buisman

发表机构 * Chalmers University of Technology(查尔姆斯理工大学) Advanced Technology Institute, University of Surrey(萨里大学先进科技研究所) National Physical Laboratory(国家物理实验室)

AI总结 提出结合卷积神经网络与遗传算法的深度学习方法,自动合成像素化微波滤波器,通过S参数和空间电场测量实验验证,实现7 GHz通带和9.5 GHz以上超过20 dB抑制,首次用电光测量揭示AI生成设计的电场模式。

详情
AI中文摘要

传统微波滤波器设计通常依赖迭代参数调整和预定义拓扑,这限制了设计空间并增加了开发时间。本研究采用深度学习方法,结合卷积神经网络与遗传算法,自动合成像素化微波滤波器。为实验验证该方法,分析了S参数和空间电场测量。合成的低通滤波器在仿真与实测性能之间表现出极好的一致性,实现了7 GHz通带,并在9.5 GHz以上具有超过20 dB的抑制。电光测量首次揭示了类似于耦合传输线或短截线结构的电场模式,为AI生成设计的涌现特性提供了见解。

英文摘要

Traditional microwave filter design typically relies on iterative parameter tuning and predefined topologies, which limits design space and increases development time. This study uses a deep learning approach combining convolutional neural networks with genetic algorithms to automate pixelated microwave filter synthesis. To validate the approach experimentally, both S-parameter and spatial electric-field measurements were analyzed. The synthesized low-pass filter demonstrated excellent agreement between simulated and measured performance, achieving a 7 GHz passband with over 20 dB suppression beyond 9.5 GHz. Electro-optical measurements, for the first time, revealed electric field patterns that resemble coupled transmission-lines or stub structures, providing insight into the emergent characteristics of AI-generated designs.

2606.18395 2026-06-18 eess.SP cs.AI cs.AR cs.SY eess.SY 新提交

Deep Learning-Driven Inverse Design of Doherty Power Amplifiers Using Pixelated Combiners and Dual-State Impedance Synthesis

基于深度学习的Doherty功率放大器逆向设计:使用像素化合成器和双态阻抗合成

Han Zhou, Haojie Chang, David Widen, Christian Fager

发表机构 * Tampere University(坦佩雷大学) Chalmers University of Technology(查尔姆斯理工大学)

AI总结 提出一种结合深度卷积神经网络、像素化布局和遗传算法的三端口Doherty合成器设计方法,实现峰值和回退功率条件下的双态阻抗合成,在2.6-2.8 GHz频段内饱和输出功率>44.2 dBm,峰值漏极效率>71.2%。

详情
AI中文摘要

Doherty功率放大器(PA)的输出合成器将负载调制、阻抗匹配和相位补偿集成在一个网络中,使其设计和合成极具挑战性。本文提出了一种三端口Doherty合成器设计方法,结合深度卷积神经网络(CNN)、像素化布局表示和遗传算法(GA)与双态阻抗合成,以同时处理峰值和回退功率条件。作为概念验证,设计并制作了两款采用三端口像素化合成器的GaN HEMT Doherty PA原型。两款原型在2.6-2.8 GHz范围内均实现了超过44.2 dBm的实测饱和输出功率,峰值漏极效率高于71.2%。此外,在6-dB回退水平下测得的漏极效率高达64%。应用数字预失真后,每个原型的邻道泄漏比(ACLR)优于-51.3 dBc。

英文摘要

The output combiner of a Doherty power amplifier (PA) integrates load modulation, impedance matching, and phase compensation within a single network, making its design and synthesis highly challenging. In this paper, we propose a three-port Doherty combiner design methodology that combines deep convolutional neural networks (CNNs), pixelated layout representations, and genetic algorithms (GA) with dual-state impedance synthesis to address both peak and back-off power conditions. As a proof of concept, two GaN HEMT Doherty PA prototypes incorporating three-port pixelated combiners are designed and fabricated. Both prototypes achieve a measured saturated output power exceeding 44.2 dBm with peak drain efficiency above 71.2% within 2.6-2.8 GHz. Furthermore, a drain efficiency as high as 64% is measured at the 6-dB back-off level. After applying digital predistortion, each prototype achieves an adjacent channel leakage ratio (ACLR) better than -51.3 dBc.

2606.18354 2026-06-18 eess.IV cs.LG 新提交

Structural MRI Synthesis for Alzheimer's Disease via Conditional Diffusion on Anatomical Masks

基于解剖掩膜条件扩散的阿尔茨海默病结构MRI合成

Muge Zhang, Muhammad Ali Khaliq, Jamal Alsakran, Byeong Kil Lee, Jeeho Ryoo

发表机构 * Fairleigh Dickinson University(费尔利·迪金森大学) University of Colorado at Colorado Springs(科罗拉多大学科罗拉多斯普林斯分校)

AI总结 针对阿尔茨海默病结构MRI合成中细微解剖变化难以捕捉的问题,本文扩展Med-DDPM条件扩散模型,以解剖分割掩膜为条件生成3D结构MRI,实验表明合成数据训练的模型Dice分数与真实数据相当,混合数据训练则显著提升性能。

Journal ref 2025 IEEE 8th International Conference on Multimedia Information Processing and Retrieval (MIPR)

详情
AI中文摘要

生成式机器学习模型的最新进展显著改善了医学成像,为数据增强、隐私保护和模型泛化提供了有前景的解决方案。然而,由于神经退行性病变相关的细微、区域特异性和渐进性解剖变化,合成阿尔茨海默病(AD)的高质量结构MRI数据仍然具有挑战性。在本文中,我们将最初为脑肿瘤合成设计的Med-DDPM条件扩散模型扩展,以生成专门针对AD的3D结构MRI。我们采用Med-DDPM,因为与其他生成模型相比,它具有公认的稳定性和结构保真度,特别适合捕捉AD特有的细微解剖变化。我们的方法以来自ADNI数据集的解剖分割掩膜为条件,将关键的AD相关脑结构纳入生成过程。我们通过在真实、合成和混合数据集上训练分割模型,系统评估了合成图像的质量和实用性。实验结果表明,仅在合成数据上训练的分割模型达到了与真实数据训练(0.6513)相当的Dice分数(0.6532),同时召回率显著提高。值得注意的是,在混合数据集(混合真实和合成图像)上训练的模型优于真实和纯合成基线,Dice分数达到0.7244。这些发现强调了条件扩散模型在生成解剖准确、AD特异性合成MRI方面的成功应用,并突出了它们在增强训练数据可用性、提高诊断准确性和促进神经影像研究可重复性方面的潜力。

英文摘要

Recent advances in generative machine learning models have significantly improved medical imaging, offering promising solutions for data augmentation, privacy preservation, and improved model generalization. However, synthesizing high-quality structural MRI data for Alzheimer's Disease (AD) remains challenging due to the subtle, region-specific, and progressive anatomical changes associated with neurodegeneration. In this paper, we extend the Med-DDPM conditional diffusion model -- originally designed for brain tumor synthesis -- to generate 3D structural MRIs specifically tailored to AD. We adopted Med-DDPM due to its established stability and structural fidelity compared to other generative models, which makes it particularly suitable for capturing the subtle anatomical changes characteristic of AD. Our approach conditions the diffusion process on anatomical segmentation masks derived from the ADNI dataset, incorporating key AD-relevant brain structures into the generation process. We systematically evaluate the quality and utility of the synthetic images by training segmentation models on real, synthetic, and hybrid (mixed) datasets. Experimental results demonstrate that segmentation models trained exclusively on synthetic data achieve comparable Dice scores (0.6532) to those trained on real data (0.6513), while exhibiting significantly enhanced recall. Notably, models trained on hybrid datasets (mixing real and synthetic images) outperform both real and synthetic-only baselines, achieving a Dice score of 0.7244. These findings underscore the successful use of conditional diffusion models for generating anatomically accurate, AD-specific synthetic MRIs, and highlight their potential for enhancing training data availability, improving diagnostic accuracy, and promoting research reproducibility in neuroimaging studies.
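
The Dice scores quoted above measure the overlap between a predicted segmentation mask and the reference mask. A minimal sketch of the standard formula for binary masks follows; the function name `dice_score`, the flat 0/1 input format, and the convention of returning 1.0 when both masks are empty are assumptions for illustration, and the paper's exact evaluation protocol (per-structure averaging, 3D handling) is not specified here.

```python
def dice_score(pred, target):
    """Dice = 2|A ∩ B| / (|A| + |B|) for two same-sized binary masks.

    pred, target: flat sequences of 0/1 voxel labels.
    """
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    # Convention chosen here: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total
```

For example, `dice_score([1, 1, 0, 0], [1, 0, 1, 0])` gives 0.5, since one voxel overlaps out of two foreground voxels in each mask.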

2606.18302 2026-06-18 q-bio.OT cs.LG 新提交

Protein-Based Fish Species Identification: Dataset, Models, and Insights from Native Bangladeshi Fish

基于蛋白质的鱼类物种识别:孟加拉本土鱼类的数据集、模型与见解

Md Nasiat Hasan Fahim, Md. Abid Ullah Muhib, Mohammad Shahidur Rahman

发表机构 * Shahjalal University of Science

AI总结 本研究构建了首个孟加拉本土鱼类蛋白质序列数据集,并系统评估了七种架构,提出了一种轻量级混合模型MotifCNN-Transformer+TA-PE,在资源受限场景下优于大型蛋白质语言模型ProtBERT。

Comments Published in 2026 IEEE 2nd International Conference on Quantum Photonics, Artificial Intelligence & Networking (QPAIN). © 2026 IEEE. Personal use of this material is permitted

Journal ref 2026 IEEE 2nd International Conference on Quantum Photonics, Artificial Intelligence & Networking (QPAIN)

详情
AI中文摘要

在孟加拉国,正确识别鱼类物种对于粮食安全、经济发展和气候适应性至关重要。蛋白质序列直接反映功能和进化约束,对物种认证和生物多样性监测具有重要意义。然而,目前尚无针对孟加拉本土鱼类物种的蛋白质序列识别基准。本研究通过引入首个包含9种孟加拉本土鱼类2845条高质量蛋白质序列的精选数据集来填补这一空白。我们还通过对七种架构范式进行系统基准测试,建立了该领域首个蛋白质序列分类基线。此外,我们提出了一种实用的新型混合架构——MotifCNN与具有末端感知位置编码的Transformer(MotifCNN-Transformer+TA-PE)。该新架构实现了79.80%的准确率和0.80的宏F1分数。最高准确率83.04%由微调的蛋白质语言模型ProtBERT取得,该模型有4.2亿参数,需要双16GB GPU进行推理。根据McNemar检验,ProtBERT相比我们的MotifCNN-Transformer+TA-PE的3.24%准确率提升在统计上不显著(p = 0.1120)。在九类中的六类上,我们的新架构在每类识别中优于ProtBERT。此外,我们的MotifCNN-Transformer+TA-PE比ProtBERT快约5倍,小42倍,支持16倍更大的批处理大小,且无需GPU推理,使其在资源受限地区(如孟加拉农村)部署更为实用。除此之外,我们的基础性工作展示了系统发育关系对序列相似性的影响,并为南亚蛋白质依赖型经济中的渔业管理、食品认证和生物多样性保护建立了途径。

英文摘要

Correct identification of fish species is highly significant for food security, economic development, and climate resilience in Bangladesh. Protein sequences directly reflect functional and evolutionary constraints which are important for species authentication and biodiversity monitoring. Yet there exists no benchmark for native Bangladeshi fish species identification from protein sequence. In this study, we addressed this gap by introducing the first curated dataset for nine native Bangladeshi fish species of 2845 high quality protein sequences. We also established the first protein sequence classification baseline for this domain through a systematic benchmarking of seven architectural paradigms. Moreover, we propose a realistic deployable novel hybrid architecture of MotifCNN and Transformer with Terminal-Aware Positional-Encoding (MotifCNN-Transformer+TA-PE). Our novel architecture achieves 79.80% accuracy with macro-F1 of 0.80. The highest 83.04% accuracy is achieved by finetuned protein language model ProtBERT that has 420M parameters and requires dual 16GB GPUs for inference. According to McNemar's test, ProtBERT's 3.24% accuracy gain over our MotifCNN-Transformer+TA-PE is statistically insignificant (p = 0.1120). Our novel architecture beats it among six of the nine classes in per class identification. Also our MotifCNN-Transformer+TA-PE is approximately 5x faster, 42x smaller, and supports 16x larger batch size than ProtBERT and has GPU free inference, making it more practical for deployment in resources constrained areas such as rural Bangladesh. Beyond this, our foundational work shows effects of phylogenetic relationships on sequence similarity and establishes pathways for fisheries management, food authentication and biodiversity conservation in South Asia's protein dependent economy.
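
McNemar's test, cited above, compares two classifiers on the same test set using only the samples where they disagree. The sketch below implements the exact (binomial) two-sided variant; whether the authors used the exact or the chi-squared form is not stated in the abstract, and the function name `mcnemar_exact` is hypothetical.

```python
from math import comb

def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value.

    b: samples classifier A got right and classifier B got wrong
    c: samples classifier B got right and classifier A got wrong
    Under the null hypothesis the discordant pairs split as Binomial(b + c, 0.5).
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, so no evidence of a difference
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2.0 * tail)
```

A large p-value, such as the 0.1120 reported above, means the disagreements between the two models are split too evenly to attribute the accuracy gap to anything other than chance at the usual significance levels.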

2606.18288 2026-06-18 econ.GN cs.AI econ.TH q-fin.EC 新提交

A Knowledge Theory of Capital: The Value of Natural and Artificial Intelligence

资本的知识理论:自然与人工智能的价值

Jeffrey Gardiner

发表机构 * Morgan Stanley(摩根士丹利)

AI总结 提出资本的知识理论,将知识视为资本的核心形式,分析其生成、转化、治理与测量,区分五种知识形态,并引入新概念解释现代财富来源。

Comments 458 pages, 8 figures. Theory-building monograph developing a conditional framework for knowledge-bearing capitalism, with formal concepts, mechanisms, measurement apparatus, and falsification conditions

详情
AI中文摘要

本卷为生产能力日益存在于软件、数据、模型、常规、专业知识、平台、组织、公共资源和公共认知基础设施的经济体,发展了一种资本的知识理论。从亚当·斯密的劳动、资本、专业化和市场范围理论出发,探讨当知识变得像资本一样可积累、可跨形式流动、可扩展、可治理、可重组且在会计中不完全可见时,会发生什么变化。本书将知识承载资本作为核心对象,分析其如何生成、转化为可治理形式、部署、通过反馈改进、封闭或共享、衡量、减值以及用作未来生产的投入。它区分了具身、非具身、制度化、公共资源和公共知识形式,并发展了诸如首次转化、认知封闭、反馈捕获、暗资本和预期知识损失等概念。该论证是有条件且可检验的:现代财富不仅取决于资本积累,还取决于生产性知识如何被治理。

英文摘要

This volume develops a knowledge theory of capital for economies in which productive capacity increasingly resides in software, data, models, routines, expertise, platforms, organizations, commons, and public epistemic infrastructure. Beginning from Adam Smith's theory of labour, stock, specialization, and market extent, it asks what changes when knowledge becomes stock-like, mobile across forms, scalable, governable, recombinable, and imperfectly visible in accounting. The book introduces knowledge-bearing stock as the central object and analyses how it is generated, converted into governable form, deployed, improved through feedback, enclosed or shared, measured, impaired, and used as input to future production. It distinguishes embodied, disembodied, institutionalized, commons, and public knowledge forms and develops concepts such as first conversion, cognitive enclosure, feedback capture, dark capital, and expected knowledge loss. The argument is conditional and testable: modern wealth depends not only on capital accumulation, but on how productive knowledge is governed.

2606.18281 2026-06-18 stat.AP cs.LG stat.ML 新提交

A Guide to Estimating Conditional Average Treatment Effects in Competing Risks Settings

竞争风险背景下条件平均处理效应估计指南

Daniel Klippert, Sarah Friedrich, Markus Pauly

发表机构 * Department of Statistics, TU Dortmund University(多特蒙德工业大学统计学系) Research Center Trustworthy Data Science and Security, University Alliance Ruhr (UA Ruhr)(鲁尔大学联盟可信数据科学与安全研究中心) Institute for Mathematics, University of Augsburg(奥格斯堡大学数学研究所)

AI总结 针对竞争风险生存数据,比较六种元学习器估计条件平均处理效应,提供R包crsurvlearners指导模型选择。

详情
AI中文摘要

条件平均处理效应(CATE)是个性化医疗中治疗决策的核心。在竞争风险背景下,从生存数据估计CATE允许对特定感兴趣事件的治疗效果进行患者特异性评估,同时适当考虑替代事件类型。在存在合并症的情况下,这种区分至关重要,因为竞争死亡原因可能混淆治疗效果。本文聚焦于右删失生存时间和二元治疗,研究CATE定义为在固定时间点上感兴趣事件绝对风险的协变量条件差异。为此,我们研究了元学习器,这些学习器将机器学习算法适应于竞争风险场景中的CATE估计。我们系统比较了六种元学习器,结合Cox回归或随机生存森林进行风险建模,以及弹性网回归或随机森林进行直接CATE建模。为提供模型选择的实践指导,我们在多种模拟设置中评估其性能,这些设置在风险复杂性、治疗异质性、治疗分配、事件类型分布和删失方面有所不同。为促进应用,我们提供R包crsurvlearners,实现了所有考虑的方法。

英文摘要

Conditional average treatment effects (CATEs) are central to treatment decision-making in personalized medicine. In competing risks settings, estimating CATEs from survival data allows for patient-specific assessments of treatment effectiveness for a specific event of interest while properly accounting for alternative event types. This distinction is essential in the presence of comorbidities, where competing causes of death may otherwise confound the therapeutic benefit. Focusing on right-censored survival times with binary treatment, we examine CATEs defined as covariate-conditional differences in the absolute risk for the event of interest at a fixed time. To this end, we study meta-learners which adapt machine learning algorithms for CATE estimation in competing risks scenarios. We systematically compare six meta-learners, combining Cox regression or random survival forests for risk modeling with elastic net regression or random forests for direct CATE modeling. To provide practical guidance on model selection, we evaluate their performance in multiple simulation settings, that differ in hazard complexity, treatment heterogeneity, treatment assignment, event type distribution and censoring. To facilitate applied use, we provide the R package, crsurvlearners, which implements all considered approaches.
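
The estimand described above, a covariate-conditional difference in the absolute risk of the event of interest at a fixed time, can be written in potential-outcome notation as follows. The symbols $T(a)$, $\varepsilon(a)$, $X$ and $\tau$ are notation assumed here for illustration; the paper may use different symbols.

```latex
% T(a), \varepsilon(a): potential event time and event type under treatment a \in \{0, 1\}
% event type 1 is the event of interest; t is the fixed time horizon
\tau(x; t) = \Pr\bigl(T(1) \le t,\ \varepsilon(1) = 1 \mid X = x\bigr)
           - \Pr\bigl(T(0) \le t,\ \varepsilon(0) = 1 \mid X = x\bigr)
```

Each term is a cumulative incidence for the event of interest, so a patient who first experiences a competing event never counts toward the risk of the event of interest.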

2606.18280 2026-06-18 stat.AP cs.AI 新提交

IOAH3: Importance-Driven Adaptive Spatial Partitioning

IOAH3: 重要性驱动的自适应空间划分

Ehsaneddin Jalilian

发表机构 * Interdisciplinary Transformation University Austria(奥地利跨学科转型大学)

AI总结 提出IOAH3方法,通过多源特征提取、马尔可夫随机场图割优化和数据驱动层次细化,构建自适应空间划分,解决可修改面积单元问题。

详情
AI中文摘要

我们提出IOAH3(重要性导向的自适应H3划分),一种用于构建地理参考观测域的数据驱动空间划分的计算方法。标准的空间聚合方法采用固定面积单元,例如行政边界或单一分辨率的均匀六边形网格,而不考虑每个区域中底层观测的信息内容。这导致了著名的可修改面积单元问题:统计和推断结果依赖于划分的任意选择,空间集中的现象在粗网格中被平均化,从而掩盖了精细尺度的结构。IOAH3通过三个阶段构建自适应划分来解决这一问题:多源特征提取和重要性评分,通过主成分分析对道路密度、POI密度、建筑密度和地形粗糙度信号进行,人口和洪水灾害数据作为辅助输入用于单元过滤和空间平滑;通过马尔可夫随机场图割优化进行空间单元选择,该优化在强制空间连续性的同时联合最大化每个单元的重要性;以及数据驱动的高重要性区域层次细化到更精细的H3分辨率级别,并通过邻居传播支持以避免孤立的精细分辨率孤岛。所得划分作为空间推断流程的输入,并在任何建模步骤之前提供了对划分敏感性问题的原则性解决方案。

英文摘要

We present IOAH3 (Importance-Oriented Adaptive H3 partitioning), a computational method for constructing data-driven spatial partitions of geo-referenced observation domains. Standard approaches to spatial aggregation adopt fixed areal units, such as administrative boundaries or uniform hexagonal grids at a single resolution, without regard to the informational content of the underlying observations in each region. This leads to the well-known modifiable areal unit problem: statistical and inferential results depend on the arbitrary choice of partition, and spatially concentrated phenomena are averaged out in coarse cells that obscure fine-scale structure. IOAH3 addresses this by constructing an adaptive partition in three stages: multi-source feature extraction and importance scoring via principal component analysis over road density, POI density, building density, and terrain roughness signals, with population and flood-hazard data entering as auxiliary inputs to cell filtering and spatial smoothness; spatial cell selection via Markov Random Field graph-cut optimisation, which jointly maximises per-cell importance while enforcing spatial contiguity; and data-driven hierarchical refinement of high-importance regions to finer H3 resolution levels, with neighbour-propagated support to avoid isolated fine-resolution islands. The resulting partitions serve as input to spatial inference pipelines and provide a principled resolution of the partition-sensitivity problem prior to any modelling step.

2606.19319 2026-06-18 cs.MA cs.AI cs.DB 新提交

Data Intelligence Agents: Interpreting, Modeling, and Querying Enterprise Data via Autonomous Coding Agents

数据智能代理:通过自主编码代理解释、建模和查询企业数据

Anoushka Vyas, Aarushi Dhanuka, Sina Khoshfetrat Pakazad, Henrik Ohlsson

发表机构 * C3 AI

AI总结 提出Data Intelligence Agents (DIA)系统,由三个自主编码代理组成,通过执行、验证和修复工件来压缩数据集成工作流,在七个SQL基准测试中达到或超越最佳结果。

详情
AI中文摘要

生产数据集成受限于数据所有者、工程师和分析师之间重复且有损的手动交接,他们必须协作发现、构建和查询企业数据。我们提出数据智能代理(DIA),一个由三个代理(数据解释器、模式创建器和查询生成器)组成的系统,通过将自主编码代理(ACA)作为一等抽象来压缩这一工作流:代理不是生成文本,而是生成、执行、验证和修复具体工件,利用共享内存进行经验重用,并将每个工件呈现给领域专家审查。DIA已部署在生产环境中供企业客户使用。我们深入研究了查询生成器,并在完全自主模式下跨七个SQL基准测试(涵盖四个任务类别和四种方言)进行评估。它在所有七个基准测试中达到或超越了最佳已发表结果,表明基于执行、构建在ACA和共享内存之上的架构能够泛化到数据智能工作负载,且适应仅限于自然语言指令。

英文摘要

Production data integration is bottlenecked by repeated, lossy handoffs between data owners, engineers, and analysts who must collaboratively discover, structure, and query enterprise data. We present Data Intelligence Agents (DIA), a system of three agents (Data Interpreter, Schema Creator, and Query Generator) that compresses this workflow by treating autonomous coding agents (ACAs) as a first-class abstraction: rather than emitting text, the agents generate, execute, validate, and repair concrete artifacts, draw on a shared memory for experience reuse, and surface each for review by domain experts. DIA is deployed in production for enterprise customers. We study the Query Generator in depth and evaluate it in fully autonomous mode across seven SQL benchmarks spanning four task categories and four dialects. It matches or surpasses the best published results on all seven, demonstrating that an architecture grounded in execution, built on ACAs and a shared memory, generalizes across the data intelligence workload with adaptation confined to natural-language instructions.

2606.19286 2026-06-18 cs.HC cs.AI cs.CY 新提交

Correct Yourself, Keep My Trust: How Self-Correction and Social Connection Shape Credibility in Social Chatbots

纠正自己,保持信任:自我纠正和社会联系如何塑造社交聊天机器人的可信度

Biswadeep Sen, Yi-Chieh Lee

发表机构 * School of Computing, National University of Singapore, Singapore(新加坡国立大学计算学院) Computer Science, National University of Singapore, Singapore(新加坡国立大学计算机科学) National University of Singapore(新加坡国立大学)

AI总结 通过实验比较三种错误纠正策略,发现自我纠正不损害聊天机器人可信度,且用户社会联系强度仅在自我纠正时显著预测信念改变。

详情
AI中文摘要

当社交聊天机器人犯错时——它们确实会犯错——它们的恢复方式决定了用户是否会再次信任它们。社交聊天机器人正日益融入日常生活,但它们仍然容易生成令人信服但不准确的信息。它们与用户建立的社会联系使得此类错误尤其具有后果性。我们进行了一项受试者间实验(N=120),比较了三种错误纠正策略:网页撤回、同一社交聊天机器人的自我纠正以及专家聊天机器人的纠正。我们的结果揭示了两个关键发现。首先,所有三种策略都能同样好地纠正错误,但只有自我纠正不会损害聊天机器人的可信度:参与者对自我纠正的聊天机器人在可信度和感知专业性上的评分显著高于其错误由外部来源纠正的聊天机器人。其次,通过社会吸引力和自我披露测量的用户与聊天机器人的社会联系强度,仅在聊天机器人自我纠正时显著预测信念改变的大小。将纠正外包给外部来源完全切断了这种联系。这些发现表明,社交聊天机器人应该纠正自己的错误,而不是外包纠正,并且投资于社会联系是一种功能性机制,能增强纠正效果,而不仅仅是一种设计特征。我们讨论了设计能够保持长期可信度同时有效处理自身错误的聊天机器人的启示。

英文摘要

When social chatbots make mistakes, and they do, how they recover determines whether users trust them again. Social chatbots are increasingly integrated into everyday life, yet they remain prone to generating convincing but inaccurate information. The social connection they build with users makes such errors particularly consequential. We conducted a between-subjects experiment (N=120) comparing three error correction strategies: a webpage retraction, self-correction by the same social chatbot, and correction by an expert chatbot. Our results reveal two key findings. First, all three strategies corrected the error equally well, but only self-correction did so without damaging the chatbot's credibility: participants rated self-correcting chatbots significantly higher in both trustworthiness and perceived expertise than chatbots whose errors were corrected by external sources. Second, the strength of the user's social connection with the chatbot, measured through social attraction and self-disclosure, significantly predicted the magnitude of belief change, but only when the chatbot corrected itself. Outsourcing corrections to an external source severed this link entirely. These findings suggest that social chatbots should correct their own mistakes rather than outsource corrections, and that investing in social connection is a functional mechanism that amplifies correction effectiveness, not merely a design feature. We discuss implications for designing chatbots that maintain long-term credibility while effectively addressing their own errors.

2606.19247 2026-06-18 cs.HC cs.AI cs.CY 新提交

A Taxonomy of Mental Health and Technology Needs for Alzheimer's and Dementia Caregivers

阿尔茨海默病和痴呆症护理人员的心理健康与技术需求分类

Keran Wang, Drishti Goel, Jiayue Melissa Shi, Violeta J. Rodriguez, Daniel S. Brown, Dong Whi Yoo, Ravi Karkar, Koustuv Saha

发表机构 * Siebel School of Computing and Data Science(Siebel计算与数据科学学院) University of Illinois Urbana-Champaign(伊利诺伊大学厄巴纳-香槟分校) Department of Psychology(心理学系) Illinois Neurological Institute(伊利诺伊神经病学研究所) Department of Human-Centered Computing(以人为中心计算系) Manning College of Information and Computer Sciences(曼宁信息与计算机科学学院)

AI总结 本研究提出护理人员心理健康与技术分类法,系统关联AD/ADRD护理人员需求与技术干预类别,识别护理优先事项与现有技术支持的错配,并强调关系紧张和同情疲劳等未充分服务的领域。

详情
AI中文摘要

照顾阿尔茨海默病及相关痴呆症(AD/ADRD)患者的家庭成员构成了全球长期护理的基础。2023年,超过1100万美国亲友贡献了180亿小时的无偿护理,往往以牺牲自身身心健康为代价。这些非正式护理人员——也被称为“隐形第二患者”——经历着更高的心理健康问题发生率。然而,研究通常将其复杂的心理社会经历简化为单一的护理负担概念,掩盖了哪些具体需求未得到满足或得到有效支持。与此同时,数字和人工智能技术正在迅速扩展,从智能手机应用和视频会议到传感器平台和AI聊天机器人。然而,医学、心理学和技术研究之间缺乏共享框架,限制了累积进展。本研究引入了一个护理人员心理健康与技术分类法,系统地将AD/ADRD护理人员的需求与相应的技术干预类别联系起来。基于跨学科文献综述和两项针对护理人员的定性研究,该分类法识别了护理优先事项与现有技术支持之间的错配,突出了关系紧张和同情疲劳等未充分服务的领域,并提出了自适应、响应式系统的设计方向。该框架提供了一个共享词汇,以指导临床医生、研究人员和技术设计师在痴呆症护理中开发更以人为中心和临床基础的创新。

英文摘要

Family members caring for individuals with Alzheimer's disease and related dementias (AD/ADRD) provide the foundation of long-term care worldwide. In 2023, more than 11 million U.S. family and friends contributed 18 billion hours of unpaid care, often at the cost of their own physical and mental health. These informal caregivers -- also referred to as the "invisible second patients" -- experience elevated rates of mental health problems. Yet research commonly reduces their complex psychosocial experiences to a single construct of caregiver burden, obscuring which specific needs are unmet or effectively supported. At the same time, digital and AI-enabled technologies are rapidly expanding, from smartphone apps and videoconferencing to sensor platforms and AI chatbots. However, the absence of shared frameworks across medicine, psychology, and technology research limits cumulative progress. This study introduces a Caregiver Mental Health and Technology Taxonomy that systematically links AD/ADRD caregiver needs with corresponding classes of technology-based interventions. Drawing from an interdisciplinary literature review and two qualitative studies with caregivers, the taxonomy identifies mismatches between caregiver priorities and existing technological support, highlights under-served domains such as relational strain and compassion fatigue, and proposes design directions for adaptive, responsive systems. The framework offers a shared vocabulary to guide clinicians, researchers, and technology designers in developing more person-centered and clinically grounded innovation in dementia care.

2606.19197 2026-06-18 cs.LO cs.AI new submission

The More the Merrier: Combining Properties for ABox Abduction under Repair Semantics for EL_bot

越多越好:EL_bot 修复语义下结合属性的 ABox 溯因

Anselm Haak, Patrick Koopmann, Yasir Mahmood, Anni-Yasmin Turhan

Affiliations * Knowledge Representation Group, Paderborn University, Germany; Knowledge in Artificial Intelligence, Vrije Universiteit Amsterdam, The Netherlands; Data Science Group, Paderborn University, Germany

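AI summary: Studies ABox abduction for EL_bot under brave and AR semantics, for hypotheses that satisfy several properties or optimality criteria, and finds that requiring additional properties usually does not increase complexity.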

Details
AI Chinese abstract

溯因是一种通过提供假设来解释知识库中缺失蕴含的核心方法,该假设若添加到知识库中,将使缺失的蕴含变为真。最近,修复语义下的溯因被详细研究,其中考虑了若干理想属性和最优准则,如签名限制、大小最小化和引入冲突的最小化。自然地,满足多个这些属性或将属性与最优准则相结合的假设在应用中更受欢迎。迄今为止,文献中尚未研究此类假设。在本文中,我们考虑 EL_bot 在勇敢和 AR 语义下,满足多个属性或额外最优准则的 ABox 溯因问题。我们的主要观察是,通常对假设要求额外属性不会导致复杂度增加。

English abstract

Abduction is a central approach to explaining missing entailments from a knowledge base by providing a hypothesis that, if added to the knowledge base, would make the missing entailment true. Abduction under repair semantics has recently been investigated in detail, where several desirable properties and optimality criteria were considered, such as signature restrictions and minimality in size and of introduced conflicts. Naturally, hypotheses that satisfy more than one of these properties, or that combine a property with an optimality criterion, would be even more desirable for applications. So far, such hypotheses have not been investigated in the literature. In the present paper, we consider the ABox abduction problem for hypotheses satisfying more than one property or additional optimality criteria, for EL_bot under brave and AR semantics. Our main observation is that requiring additional properties for hypotheses often does not lead to an increase in complexity.

2606.19174 2026-06-18 cs.HC cs.AI new submission

A Clinician-Centered Pipeline for Annotation and Evaluation in Ultrasound AI Studies

面向临床医生的超声AI研究注释与评估流程

Fangyijie Wang, Jianjun Yu, Wentao Shi, Haixia Huang, Ran Shi, Guénolé Silvestre, Kathleen M. Curran

Affiliations * Research Ireland Centre for Research Training in Machine Learning(爱尔兰研究机器学习研究中心); School of Medicine, University College Dublin, Dublin, Ireland(都柏林大学医学院); The Third People's Hospital of Zhenjiang City, Zhenjiang, China(镇江市第三人民医院); Zhenjiang Maternal and Child Health Hospital, Zhenjiang, China(镇江市妇幼保健院); The Fifth People's Hospital of Zhenjiang City, Zhenjiang, China(镇江市第五人民医院); School of Computer Science, University College Dublin, Dublin, Ireland(都柏林大学计算机科学学院)

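AI summary: Proposes a clinician-centered pipeline built on a central server and lightweight browser interfaces that supports remote annotation, blinded evaluation, and multi-rater participation; a fetal ultrasound segmentation study validates its reproducibility and statistical consistency.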

Comments Accepted to MIUA 2026

Details
AI Chinese abstract

临床医生中心的评估对于验证医学AI系统至关重要,尤其是在超声成像中,定量指标并不总能捕捉临床可用性。现有的医学图像平台主要关注数据集标注,缺乏对盲法模型比较和可重复评估工作流的集成支持。我们提出了一个面向临床医生的超声AI研究远程注释与评估流程。该流程使用中央服务器和轻量级浏览器界面,使临床医生无需下载本地数据集即可进行注释、盲法排序和审查。该流程还支持多评分者参与、集中结果聚合和自动统计分析。我们在一个胎儿超声分割研究中验证了该流程,涉及六名评分者,涵盖专家、全科医生和非专家经验水平。系统自动生成了Spearman相关性、Kendall's τ和top-1选择统计量。结果显示专家与其他组之间存在中等到强的一致性。盲法评估结果表明,后期主动学习模型更受青睐。这些结果表明,该流程可以支持超声成像中临床医生中心的注释和可重复的人机AI评估研究。该流程可在GitHub上获取。

English abstract

Clinician-centered evaluation is critical for validating medical AI systems, especially in ultrasound imaging where quantitative metrics do not always capture clinical usability. Existing medical image platforms primarily focus on dataset labeling. They lack integrated support for blinded model comparison and reproducible evaluation workflows. We present a clinician-centered pipeline for remote annotation and evaluation in ultrasound AI studies. The proposed pipeline uses a centralized server and lightweight browser interfaces to enable clinicians to perform annotation, blinded ranking, and review without local dataset downloads. The pipeline also supports multi-rater participation, centralized result aggregation, and automated statistical analysis. We validate the pipeline in a fetal ultrasound segmentation study with six raters spanning expert, generalist, and non-expert experience levels. The system automatically generated Spearman correlation, Kendall's τ, and top-1 selection statistics. Results indicated moderate to strong agreement between experts and other groups. The blinded evaluation results showed a tendency for later active learning models to be preferred. These outcomes suggest that the pipeline can support clinician-centered annotation and reproducible human-AI evaluation studies in ultrasound imaging. The proposed pipeline is available on GitHub at https://github.com/13204942/SonoRate.
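The abstract names three rater-agreement statistics: Spearman correlation, Kendall's τ, and top-1 selection counts. As a rough illustration of what these measure, the sketch below computes them for blinded model rankings in plain Python. It is not taken from the SonoRate code. The function names, the rater labels, and the rank data are all hypothetical, and the formulas assume tie-free rankings.

```python
from itertools import combinations


def spearman_rho(rank_a, rank_b):
    """Spearman correlation between two tie-free rankings of the same items."""
    n = len(rank_a)
    d2 = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
    return 1 - 6 * d2 / (n * (n * n - 1))


def kendall_tau(rank_a, rank_b):
    """Kendall's tau-a: (concordant - discordant) / number of pairs (no ties)."""
    pairs = list(combinations(range(len(rank_a)), 2))
    score = 0
    for i, j in pairs:
        # A pair is concordant when both raters order items i and j the same way.
        same_order = (rank_a[i] - rank_a[j]) * (rank_b[i] - rank_b[j]) > 0
        score += 1 if same_order else -1
    return score / len(pairs)


def top1_counts(rankings):
    """Count how often each model index receives rank 1 across raters."""
    counts = {}
    for ranks in rankings.values():
        best = ranks.index(1)
        counts[best] = counts.get(best, 0) + 1
    return counts


# Hypothetical data: ranks (1 = best) given by three raters to four models,
# listed from the earliest (index 0) to the latest (index 3) model.
rankings = {
    "expert": [4, 3, 2, 1],
    "generalist": [4, 2, 3, 1],
    "non_expert": [3, 4, 1, 2],
}

rho = spearman_rho(rankings["expert"], rankings["generalist"])
tau = kendall_tau(rankings["expert"], rankings["generalist"])
print(f"Spearman rho = {rho:.3f}, Kendall tau = {tau:.3f}")
print("Top-1 counts by model index:", top1_counts(rankings))
```

Real ranking data often contains ties, in which case tie-corrected variants (for example Kendall's tau-b) would be the safer choice.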