arXivDaily: arXiv daily academic digest, updated Monday through Friday
2504.16584 2026-05-14 cs.CR cs.AI

Case Study: Fine-tuning Small Language Models for Accurate and Private CWE Detection in Python Code

Md. Azizul Hakim Bappy, Hossen A Mustafa, Prottoy Saha, Rajinus Salehat

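Affiliations Institute of Information and Communication Technology, Bangladesh University of Engineering and Technology; Hajee Mohammad Danesh Science and Technology University

AI summary This paper examines whether small language models (SLMs) can detect CWE vulnerabilities in Python code accurately while preserving privacy. The authors build a 500-example dataset with a semi-supervised approach and apply instruction-following fine-tuning to a 350-million-parameter pre-trained code model, reaching approximately 99% accuracy and 100% recall on the test set. The experiments indicate that a fine-tuned SLM can detect CWEs efficiently and precisely in a local environment, offering a privacy-friendly option for security analysis.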

Comments 11 pages, 2 figures, 3 tables. Dataset available at https://huggingface.co/datasets/floxihunter/synthetic_python_cwe. Model available at https://huggingface.co/floxihunter/codegen-mono-CWEdetect. Keywords: Small Language Models (SLMs), Vulnerability Detection, CWE, Fine-tuning, Python Security, Privacy-Preserving Code Analysis

Abstract

Large Language Models (LLMs) have demonstrated significant capabilities in understanding and analyzing code for security vulnerabilities, such as Common Weakness Enumerations (CWEs). However, their reliance on cloud infrastructure and substantial computational requirements pose challenges for analyzing sensitive or proprietary codebases due to privacy concerns and inference costs. This work explores the potential of Small Language Models (SLMs) as a viable alternative for accurate, on-premise vulnerability detection. We investigated whether a 350-million parameter pre-trained code model (codegen-mono) could be effectively fine-tuned to detect the MITRE Top 25 CWEs specifically within Python code. To facilitate this, we developed a targeted dataset of 500 examples using a semi-supervised approach involving LLM-driven synthetic data generation coupled with meticulous human review. Initial tests confirmed that the base codegen-mono model completely failed to identify CWEs in our samples. However, after applying instruction-following fine-tuning, the specialized SLM achieved remarkable performance on our test set, yielding approximately 99% accuracy, 98.08% precision, 100% recall, and a 99.04% F1-score. These results strongly suggest that fine-tuned SLMs can serve as highly accurate and efficient tools for CWE detection, offering a practical and privacy-preserving solution for integrating advanced security analysis directly into development workflows.
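
The four metrics reported in the abstract above all derive from a confusion matrix. The following is a minimal sketch of that arithmetic; the counts are hypothetical values chosen for illustration and are not taken from the paper.

```python
def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute accuracy, precision, recall and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical counts for a 100-sample test set (illustration only).
metrics = classification_metrics(tp=51, fp=1, fn=0, tn=48)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

With zero false negatives, recall is exactly 1 and F1 is determined by precision alone, which is why a single false positive moves precision and F1 together.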

2504.03158 2026-05-14 stat.ML cs.LG

Accelerating Particle-based Energetic Variational Inference

Xuelian Bao, Lulu Kang, Chun Liu, Yiwei Wang

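Affiliations School of Mathematics, South China University of Technology; Department of Mathematics and Statistics, University of Massachusetts Amherst; Department of Applied Mathematics, Illinois Institute of Technology; Department of Mathematics, University of California Riverside

AI summary This paper proposes a particle-based variational inference method that accelerates the existing Energetic Variational Inference with Implicit scheme (EVI-Im). Drawing on energy quadratization and operator splitting, the method drives particles efficiently towards the target distribution while retaining a stability mechanism. Unlike EVI-Im, it avoids repeatedly evaluating inter-particle interaction terms within each step, which significantly reduces computational cost, and the framework extends to other gradient-based sampling techniques. Experiments show advantages in efficiency and robustness, with performance competitive with existing particle-based variational inference methods.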

Comments 22 pages, 6 figures, 2 tables

Abstract

In this work, we propose a new particle-based variational inference (ParVI) method for accelerating the Energetic Variational Inference with Implicit scheme (EVI-Im) introduced in Ref. \cite{wang2021particle}. Inspired by energy quadratization (EQ) and operator splitting techniques for gradient flows, the proposed method efficiently drives particles towards the target distribution, while retaining a meaningful stability mechanism. Unlike EVI-Im, which employs the implicit Euler method to solve variational-preserving particle dynamics obtained from a "discretization-then-variation" approach for minimizing the Kullback--Leibler divergence, the proposed algorithm avoids repeated evaluation of inter-particle interaction terms within each time step, significantly reducing computational cost. The framework is also extensible to other gradient-based sampling techniques. Through several numerical experiments, we demonstrate that the proposed method achieves competitive performance compared with existing ParVI approaches, while offering advantages in efficiency and robustness in certain regimes.

2502.15761 2026-05-14 cs.DC cs.AI cs.GR cs.HC

AIvaluateXR: An Evaluation Framework for on-Device AI in XR with Benchmarking Results

Dawar Khan, Xinyu Liu, Omar Mena, Donggang Jia, Alexandre Kouyoumdjian, Ivan Viola

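Affiliations King Abdullah University of Science and Technology (KAUST)

AI summary This paper presents AIvaluateXR, a comprehensive framework for evaluating large language models (LLMs) running on extended reality (XR) devices. The authors deploy 17 LLMs on four XR devices and evaluate them systematically on four metrics: performance consistency, processing speed, memory usage, and battery consumption. They also propose a unified evaluation method based on 3D Pareto optimality to select the best device-model pairs. The study offers guidance for deploying LLMs on XR devices and a basis for future optimization and research.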

Comments AIvaluateXR is an updated version of LoXR

Abstract

The deployment of large language models (LLMs) on extended reality (XR) devices has great potential to advance the field of human-AI interaction. In the case of direct, on-device model inference, selecting the appropriate model and device for specific tasks remains challenging. In this paper, we present AIvaluateXR, a comprehensive evaluation framework for benchmarking LLMs running on XR devices. To demonstrate the framework, we deploy 17 selected LLMs across four XR platforms: Magic Leap 2, Meta Quest 3, Vivo X100s Pro, and Apple Vision Pro, and conduct an extensive evaluation. Our experimental setup measures four key metrics: performance consistency, processing speed, memory usage, and battery consumption. For each of the 68 model-device pairs, we assess performance under varying string lengths, batch sizes, and thread counts, analyzing the trade-offs for real-time XR applications. We propose a unified evaluation method based on the 3D Pareto Optimality theory to select the optimal device-model pairs from quality and speed objectives. Additionally, we compare the efficiency of on-device LLMs with client-server and cloud-based setups, and evaluate their accuracy on two interactive tasks. We believe our findings offer valuable insight to guide future optimization efforts for LLM deployment on XR devices. Our evaluation method can be used as standard groundwork for further research and development in this emerging field. The source code and supplementary materials are available at: www.nanovis.org/AIvaluateXR.html
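
Pareto optimality, which the abstract above uses to select device-model pairs, can be sketched in a few lines. This is a generic non-dominated filter over three objectives, not the paper's evaluation code; the pair names, the objective tuples, and the choice to negate memory so that larger is always better are assumptions made for illustration.

```python
from typing import Dict, List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is at least as good as b on every objective and strictly
    better on at least one (all objectives are maximized)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(scores: Dict[str, Sequence[float]]) -> List[str]:
    """Return the names whose score vectors are not dominated by any other."""
    return [
        name
        for name, s in scores.items()
        if not any(dominates(other, s) for other in scores.values())
    ]


# Hypothetical (consistency, tokens per second, negative memory in GB) tuples.
scores = {
    "pair_a": (0.9, 10.0, -4.0),
    "pair_b": (0.8, 12.0, -3.0),
    "pair_c": (0.7, 9.0, -5.0),   # worse than pair_a on all three objectives
    "pair_d": (0.9, 10.0, -6.0),  # ties pair_a twice, uses more memory
}
front = pareto_front(scores)
print(front)
```

Neither pair_a nor pair_b dominates the other, so both stay on the front; choosing between them requires a preference over the objectives.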

2407.11518 2026-05-14 stat.ML cs.LG stat.OT

Ensemble Transport Filter via Optimized Maximum Mean Discrepancy

Dengfei Zeng, Lijian Jiang

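Affiliations School of Mathematical Sciences, Tongji University

AI summary This paper proposes an ensemble transport filter based on an optimized Maximum Mean Discrepancy (MMD). It reworks the analysis step of the particle filter by constructing a transport map that moves prior particles directly to posterior particles. The map is optimized with an MMD loss that matches expectation information between the approximate and reference posteriors, and a variance penalty term is added for robustness, improving posterior estimation in high-dimensional data assimilation problems. Numerical experiments show that the method outperforms the conventional ensemble Kalman filter.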

Comments 27 pages, 14 figures

Abstract

In this paper, we present a new ensemble-based filter method by reconstructing the analysis step of the particle filter through a transport map, which directly transports prior particles to posterior particles. The transport map is constructed through an optimization problem described by the Maximum Mean Discrepancy loss function, which matches the expectation information of the approximated posterior and reference posterior. The proposed method inherits the accurate estimation of the posterior distribution from particle filtering while extending it to high-dimensional assimilation problems. To improve the robustness of Maximum Mean Discrepancy, a variance penalty term is used to guide the optimization. It prioritizes minimizing the discrepancy between the expectations of highly informative statistics for the reference posteriors. The penalty term significantly enhances the robustness of the proposed method and leads to a better approximation of the posterior. A few numerical examples are presented to illustrate the advantage of the proposed method over the ensemble Kalman filter.
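
For readers unfamiliar with the Maximum Mean Discrepancy, the following minimal sketch estimates the squared MMD between two particle sets. It assumes a Gaussian kernel with a fixed bandwidth and uses the biased (V-statistic) estimator; the paper's optimized loss and its variance penalty term are not reproduced here.

```python
import math
from typing import Sequence


def gaussian_kernel(x: Sequence[float], y: Sequence[float], bandwidth: float = 1.0) -> float:
    """Gaussian (RBF) kernel between two points."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mmd_squared(xs, ys, bandwidth: float = 1.0) -> float:
    """Biased estimate of squared MMD between samples xs and ys."""
    kxx = sum(gaussian_kernel(a, b, bandwidth) for a in xs for b in xs) / len(xs) ** 2
    kyy = sum(gaussian_kernel(a, b, bandwidth) for a in ys for b in ys) / len(ys) ** 2
    kxy = sum(gaussian_kernel(a, b, bandwidth) for a in xs for b in ys) / (len(xs) * len(ys))
    return kxx + kyy - 2.0 * kxy


# Hypothetical one-dimensional particle sets.
prior_particles = [(0.0,), (1.0,)]
posterior_particles = [(3.0,), (4.0,)]
print(mmd_squared(prior_particles, prior_particles))      # identical sets
print(mmd_squared(prior_particles, posterior_particles))  # well-separated sets
```

A transport map trained against such a loss would move the first set until the estimate approaches zero.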

2406.13619 2026-05-14 stat.ML cs.LG

Generative Modeling by Minimizing the Wasserstein-2 Loss

Yu-Jui Huang, Zachariah Malik

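Affiliations Department of Applied Mathematics, University of Colorado

AI summary This paper proposes a generative model that minimizes the second-order Wasserstein loss (the $W_2$ loss) by constructing a distribution-dependent ordinary differential equation (ODE) from the Kantorovich potential associated with the true data distribution and its current estimate. The authors prove that the time-marginal laws of the ODE form a gradient flow for the $W_2$ loss and converge exponentially to the true data distribution. An Euler scheme is designed for the ODE and combined with persistent training to build an algorithm that outperforms Wasserstein generative adversarial networks in both low- and high-dimensional experiments.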

Abstract

This paper develops a generative model by minimizing the second-order Wasserstein loss (the $W_2$ loss) through a distribution-dependent ordinary differential equation (ODE), whose dynamics involves the Kantorovich potential associated with the true data distribution and a current estimate of it. A main result shows that the time-marginal laws of the ODE form a gradient flow for the $W_2$ loss, which converges exponentially to the true data distribution. An Euler scheme for the ODE is proposed and it is shown to recover the gradient flow for the $W_2$ loss in the limit. An algorithm is designed by following the scheme and applying persistent training, which naturally fits our gradient-flow approach. In both low- and high-dimensional experiments, our algorithm outperforms Wasserstein generative adversarial networks by increasing the level of persistent training appropriately.
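
The Euler scheme described above can be illustrated in one dimension, where the optimal transport map between two equal-size samples is simply the sorted matching. This is a toy sketch under that assumption, with the drift taken as the negative gradient of the Kantorovich potential (the displacement from each particle to its transport target); the paper's algorithm, which estimates the potential and uses persistent training, is not reproduced.

```python
import math
from typing import List, Sequence


def w2_1d(xs: Sequence[float], ys: Sequence[float]) -> float:
    """W_2 distance between two 1D empirical distributions of equal size."""
    if len(xs) != len(ys):
        raise ValueError("samples must have the same size")
    sq = sum((a - b) ** 2 for a, b in zip(sorted(xs), sorted(ys)))
    return math.sqrt(sq / len(xs))


def euler_step(xs: Sequence[float], ys: Sequence[float], step: float) -> List[float]:
    """Move each particle a fraction `step` towards its transport target."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    targets = sorted(ys)
    new = list(xs)
    for rank, i in enumerate(order):
        new[i] = xs[i] - step * (xs[i] - targets[rank])
    return new


# Hypothetical generated samples and data samples.
generated = [0.0, 2.0, 1.0]
data = [10.0, 12.0, 11.0]
before = w2_1d(generated, data)
after = w2_1d(euler_step(generated, data, step=0.5), data)
print(before, after)
```

In this toy setting each step shrinks the distance by the factor (1 - step), a discrete analogue of the exponential convergence stated in the abstract.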

2312.04110 2026-05-14 stat.ML cs.LG physics.soc-ph

Small Area Estimation of Case Growths for Timely COVID-19 Outbreak Detection

Zhaowei She, Zilong Wang, Jagpreet Chhatwal, Turgay Ayer

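Affiliations Singapore Management University; Georgia Institute of Technology; Massachusetts General Hospital; Harvard Medical School

AI summary This paper proposes a transfer-learning random forest framework (TLRF) for accurately estimating COVID-19 case growth rates in areas with small sample sizes, enabling timely outbreak detection. The method converts growth rate estimation into a regression task and uses the adaptive weighting mechanism of random forests to transfer learning across space and time, balancing the tradeoff between estimation accuracy and speed. Experiments show that TLRF outperforms existing methods in prediction, and in a Colorado case study it improved timely outbreak detection by up to 224%.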

Comments Equal contributions by co-first authors Zhaowei She, Zilong Wang (in alphabetical order)

Abstract

The COVID-19 pandemic has exerted a profound impact on the global economy and continues to exact a significant toll on human lives. The COVID-19 case growth rate stands as a key epidemiological parameter to estimate and monitor for effective detection and containment of the resurgence of outbreaks. A fundamental challenge in growth rate estimation and hence outbreak detection is balancing the accuracy-speed tradeoff, where accuracy typically degrades with shorter fitting windows. In this paper, we provide a transfer learning framework, which we call Transfer Learning Random Forest (TLRF), for an effective implementation of the random forests algorithm that balances this accuracy-speed tradeoff. Specifically, we develop an identification strategy that converts the growth rate estimation problem into a regression task, which enables effective transfer learning across space and time through random forests' adaptive weighting mechanism. As such, through adaptively choosing fitting window sizes based on relevant day-level and county-level features affecting the disease spread, TLRF can accurately estimate case growth rates for counties with small sample sizes. Out-of-sample prediction analysis shows that TLRF outperforms established growth rate estimation methods. Furthermore, we conducted a case study based on outbreak case data from the state of Colorado and showed that TLRF could improve timely detections of outbreaks by up to 224% when compared to the decisions made by Colorado's Department of Health and Environment (CDPHE). To demonstrate practical implementation, we developed a publicly available outbreak detection tool that operated from September 2020 through March 2023, receiving substantial attention from policymakers across all 50 states.
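
The fitting-window tradeoff mentioned above is easiest to see with the classical fixed-window estimator: the growth rate is the least-squares slope of log case counts over the most recent days. The sketch below shows that baseline only, assuming strictly positive daily counts; it is not TLRF, which chooses the window adaptively through random forest weights.

```python
import math
from typing import Sequence


def growth_rate(cases: Sequence[float], window: int) -> float:
    """Least-squares slope of log(cases) over the last `window` days.

    Assumes every count in the window is strictly positive.
    """
    ys = [math.log(c) for c in cases[-window:]]
    n = len(ys)
    t_mean = (n - 1) / 2.0
    y_mean = sum(ys) / n
    num = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(ys))
    den = sum((t - t_mean) ** 2 for t in range(n))
    return num / den


# Synthetic noise-free series growing at 10% per day (illustration only).
cases = [100.0 * math.exp(0.1 * day) for day in range(14)]
print(growth_rate(cases, window=7))
print(growth_rate(cases, window=14))
```

On noise-free data both windows recover the same rate; with noisy counts from a small county, the shorter window reacts faster but has higher variance, which is the tradeoff the abstract describes.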

2605.12619 2026-05-14 q-bio.NC cs.CV

Human face perception reflects inverse-generative and naturalistic discriminative objectives

Wenxuan Guo, Heiko H. Schütt, Kamila Maria Jozwik, Katherine R. Storrs, Nikolaus Kriegeskorte, Tal Golan

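Affiliations Department of Psychology; Department of Behavioural and Cognitive Sciences; MRC Cognition and Brain Sciences Unit; School of Psychology; Department of Neuroscience; Department of Industrial Engineering and Management; School of Brain Sciences and Cognition

AI summary This study investigates the perceptual mechanisms behind human face recognition by comparing six deep neural network models that share an architecture but are trained on different tasks. Models that emphasize high-level, invariant structure (trained via inverse rendering, face identification, or object classification) best match human judgments of face dissimilarity, and models trained on natural images outperform those trained on synthetic images. The results suggest that human face perception relies on mechanisms that infer the latent causes of facial appearance, discount nuisance variation, and are tuned by natural image statistics.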

Comments 33 pages, 10 figures, 4 tables

Abstract

The perceptual representations supporting our ability to recognize faces remain a computational mystery. Deep neural networks offer mechanistic hypotheses for human face perception, but theoretically distinct models often make indistinguishable representational predictions for randomly sampled faces. To expose diagnostic differences among these hypotheses, we compared six neural network models sharing an architecture but trained on distinct tasks, using face pairs optimized to elicit contrasting model predictions ("controversial" pairs) alongside randomly sampled pairs. We tested model predictions against face-dissimilarity judgments from 864 human participants across stimulus sets differing in realism and pose variation. Models prioritizing high-level, invariant structures (trained via inverse rendering, face identification, or object classification) most robustly matched human judgments. Furthermore, models trained on natural images typically outperformed synthetic-trained counterparts. Together, these findings suggest that human face perception is shaped by mechanisms that infer latent causes of facial appearance, discount nuisance variation, and are tuned by natural image statistics.

2605.12578 2026-05-14 eess.SP cs.IT cs.LG math.IT

Recurrent Transformer-Based Near- and Far-Field THz Wideband Channel Estimation for UM-MIMO

Dmitry Artemasov, Alexander Shmatok, Kirill Andreev, Alexey Frolov, Manjesh K. Hanawal, Nikola Zlatanov

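Affiliations Center for Next Generation Wireless and IoT; Skolkovo Institute of Science and Technology; Department of IEOR; Indian Institute of Technology Bombay; Faculty of Computer and Engineering Sciences; Innopolis University

AI summary This paper addresses channel estimation when terahertz communications are integrated with ultra-massive MIMO systems in 6G networks, particularly where near-field and far-field effects coexist. The authors propose a block recurrent transformer for channel estimation. Equipped with state memory, it can be trained once and applied iteratively, and it generalizes to wireless channels with varying scatterer distances, numbers of propagation paths, and wideband operation. Experiments show normalized mean squared error (NMSE) gains of approximately 5 dB and 7.5 dB over state-of-the-art methods in narrowband and wideband scenarios, respectively.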

Comments 15 pages, 15 figures

Journal ref IEEE Access, vol. 13, pp. 205396-205411, 2025

Abstract

The integration of terahertz communications and ultra-massive multiple-input multiple-output (UM-MIMO) systems in 6G networks is motivated by their ability to enable unprecedented data rates, mitigate spectrum congestion, and enhance overall network performance. However, the enlarged antenna apertures and higher carrier frequencies in these systems increase the Rayleigh distance, causing users to span both the near-field and conventional far-field regions. Accurate spatial precoding thus requires exact channel estimation at the base station - a task made more challenging by the hybrid coexistence of near- and far-field effects and the limited number of digital chains available in hybrid beamforming architectures. In this paper, we propose a block recurrent transformer model to address this challenge. We demonstrate that a single transformer block equipped with state memory can be trained once and then iteratively applied for hybrid-field channel estimation. Furthermore, we train the model such that it generalizes to wireless channels with varying scatterer distances, different numbers of propagation paths, and wideband operation. Simulation results show that the proposed method achieves performance gains of approximately 5 dB and 7.5 dB in normalized mean squared error (NMSE) over state-of-the-art solutions in narrowband and wideband scenarios, respectively.
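
To put the reported dB gains in context, the sketch below computes NMSE in decibels for a channel estimate and converts a dB gain into the corresponding reduction factor in normalized error. The channel vectors are hypothetical and the functions are generic definitions, not the authors' evaluation code.

```python
import math
from typing import Sequence


def nmse_db(true_channel: Sequence[complex], estimate: Sequence[complex]) -> float:
    """NMSE in dB: 10 * log10(||h - h_hat||^2 / ||h||^2)."""
    error = sum(abs(h - g) ** 2 for h, g in zip(true_channel, estimate))
    power = sum(abs(h) ** 2 for h in true_channel)
    return 10.0 * math.log10(error / power)


def db_gain_to_error_ratio(gain_db: float) -> float:
    """Factor by which normalized error shrinks for a given dB gain."""
    return 10.0 ** (gain_db / 10.0)


# Hypothetical two-tap channel and a slightly perturbed estimate.
h_true = [1 + 0j, 0 + 1j]
h_est = [0.9 + 0j, 0 + 1j]
print(f"NMSE: {nmse_db(h_true, h_est):.2f} dB")
print(f"5 dB gain   -> error reduced {db_gain_to_error_ratio(5.0):.2f}x")
print(f"7.5 dB gain -> error reduced {db_gain_to_error_ratio(7.5):.2f}x")
```

Read this way, gains of 5 dB and 7.5 dB correspond to normalized squared errors roughly 3.2 and 5.6 times smaller than the baseline.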

2605.12575 2026-05-14 eess.IV cs.AI cs.CV

Are Compact Rationales Free? Measuring Tile Selection Headroom in Frozen WSI-MIL

Hyun Do Jung, Jungwon Choi, Soojung Choi, Yujin Oh, Hwiyoung Kim

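Affiliations Department of Artificial Intelligence, Yonsei University; Kim Jaechul Graduate School of AI, KAIST; Department of Integrative Medicine, College of Medicine, Yonsei University; Department of Biomedical Systems Informatics, College of Medicine, Yonsei University; H-Data Strategy Center, Hallym University Chuncheon Sacred Heart Hospital

AI summary This paper asks whether, for a frozen whole-slide image (WSI) multiple instance learning (MIL) classifier, the slide-level prediction can be recovered from a small, output-consistent subset of tiles, yielding a compact post-hoc rationale. The authors propose FOCI, a lightweight rationale-readout layer trained with sufficiency and exclusion objectives over keep/drop tile subsets, and introduce the Selection Headroom Index (SHI) for evaluation. Experiments show that MIL models differ in how well they support compact rationales, and that FOCI can reduce the number of tiles required while serving as a tool for model interpretation and auditing.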

Abstract

Whole-slide image (WSI) multiple instance learning (MIL) classifiers can achieve strong slide-level AUC while leaving the full-bag prediction opaque. Attention scores are widely reused as post-hoc explanations, but high attention can reflect aggregation preference rather than a compact, model-sufficient rationale. We study post-hoc rationale highlighting for frozen WSI-MIL: given a trained classifier, can its slide-level prediction be recovered from a compact, output-consistent tile subset without retraining the backbone? We instantiate this with Finding Optimal Contextual Instances (FOCI), a lightweight rationale-readout layer over a frozen MIL backbone. FOCI is trained with model-output sufficiency and exclusion objectives over keep/drop tile subsets, evaluated with an insertion-style Sequential Reveal Protocol (SRP) adapted to WSI-MIL, and summarized by the Selection Headroom Index (SHI). Across three WSI benchmarks and seven MIL backbones, FOCI reveals that compact rationales are selection-headroom dependent: transformer and multi-branch attention aggregators can admit compact rationales, near-minimal attention-pooling baselines enter a selection-saturation regime, and hard-selection backbones can conflict with an external readout. For TransMIL, relative to its documented CLS-proxy ranking, FOCI reduces the Minimum Sufficient K (MSK) tile count by 32-56% across benchmarks, while ACMIL+FOCI attains the highest mean SHI (+0.465). Deletion-based perturbation and selected-only downstream evaluation provide complementary checks. These results position FOCI as a model-level interpretability and audit layer: selected tiles are not claims of clinical or pathologist-level diagnostic sufficiency, but candidate rationales that offer a compact, reviewable view of when a frozen MIL prediction can be localized to a small output-consistent subset.

2605.12569 2026-05-14 eess.SP cs.AI

Active Sensing with Meta-Reinforcement Learning for Emitter Localization from RF Observations

M. Shamail J. Khan, Nisha L. Raichur, Lucas Heublein, Christian Wielenberg, Alexander Mattick, Tobias Feigl, Christopher Mutschler, Felix Ott

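Affiliations Fraunhofer Institute for Integrated Circuits IIS; Positioning Systems Lab, University of Technology Nürnberg (UTN); Center for Artificial Intelligence, Technical University of Applied Sciences Würzburg-Schweinfurt

AI summary This paper studies the localization of GNSS interference sources from radio frequency (RF) observations in complex propagation environments. It formulates the task as an active sensing problem and proposes a framework that combines deep reinforcement learning with recurrent policy learning to infer the emitter position from RF signals acquired with a 2x2 patch antenna. The method is trained and evaluated on a simulated dataset and achieves a localization success rate of 80.1%, showing the potential of reinforcement learning for adaptive interference localization.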

Abstract

Global navigation satellite system (GNSS) interference poses a serious threat to reliable positioning, especially in indoor and multipath-rich environments where source localization is highly challenging. In this paper, we formulate GNSS interference localization as an active sensing problem and propose a reinforcement learning (RL) framework in which an agent sequentially explores the environment to infer the position of an emitter source from radio frequency (RF) observations acquired with a 2x2 patch antenna. The localization task is modeled as a partially observable decision process, since single-snapshot measurements are often ambiguous under multipath propagation and changing channel conditions. To address this, the proposed framework combines high-dimensional RF sensing with deep RL and recurrent policy learning. We investigate both value-based and policy-based approaches, namely Deep Q-Networks (DQN) and Proximal Policy Optimization (PPO), and study their behavior under domain shift. The approach is evaluated on a simulated dataset generated with the Sionna ray-tracing module, which provides realistic propagation effects and diverse environment configurations. Experimental results show that the proposed method achieves a localization success rate of 80.1%, demonstrating the potential of RL for adaptive GNSS interference localization. Overall, the results highlight simulation-assisted training as a promising direction for robust interference localization in challenging propagation environments.

2605.12566 2026-05-14 eess.IV cs.LG

On Privacy-Preserving Image Transmission in Low-Altitude Networks: A Swin Transformer-Based Framework with Federated Learning

Kexin Zhang, Lixin Li, Yuna Yan, Xin Zhang, Wensheng Lin, Rui Li, Dongwei Zhao, Zhu Han

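Affiliations School of Electronics and Information, Northwestern Polytechnical University; DecoreX Intelligent Technologies Co., Ltd.; Samsung AI Center; 208th Research Institute of China North Industries Group Corporation; Department of Electrical and Computer Engineering, University of Houston

AI summary With the rapid growth of the low-altitude economy, UAVs are increasingly used in logistics, inspection, and other areas, but image transmission faces the twin challenges of limited bandwidth and privacy protection. This paper proposes a semantic communication framework (STSC) based on the Swin Transformer and federated learning. It extracts multi-scale semantic features under limited bandwidth and uses federated learning to train a global model across distributed devices, protecting user privacy. Experiments show that the framework outperforms existing methods in image transmission quality and model generalization, offering an effective approach to privacy-preserving image transmission in low-altitude networks.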

Comments 13 pages, 10 figures, 2 tables

Abstract

The rapid development of low-altitude economy has driven the proliferation of Unmanned Aerial Vehicle (UAV) applications, including logistics, inspection, and emergency response. However, transmitting high-volume image data from UAVs to ground stations faces significant challenges due to limited bandwidth and stringent privacy requirements. To address these issues, a Semantic Communication (SC) framework based on Federated Learning (FL) is proposed for efficient and privacy-preserving image transmission. A Swin Transformer-based Semantic Communication (STSC) architecture is designed to extract multi-scale semantic features under constrained bandwidth conditions. Dedicated communication and computing nodes are deployed on UAVs to enhance real-time coverage and flexibility. Meanwhile, a FL mechanism enables global model training across distributed devices without sharing raw data, thus preserving user privacy. Simulation experiments conducted on the CIFAR-10 dataset demonstrate that the proposed STSC framework achieves at least 5.7 dB improvement in Peak Signal-to-Noise Ratio (PSNR) compared to DeepJSCC baselines, while also showing superior convergence and generalization performance. The framework effectively integrates UAV-assisted deployment with SC and privacy protection, offering a practical solution for bandwidth-constrained image transmission in low-altitude networks.

2605.12562 2026-05-14 eess.IV cs.AI cs.CV

Uncovering Latent Pathological Signatures in Pulmonary CT via Cross-Window Knowledge Distillation

Bo Peng, Wujian Xu, Kun Wang, Ximing Liao, Na Wang, Daqian Shi, Tian Li, Jing Gao, Johan Thygesen, Yingqun Ji, Honghan Wu

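Affiliations Institute of Health Informatics, University College London; Department of Pulmonary and Critical Care Medicine, Shanghai East Hospital, School of Medicine, Tongji University; Queen Mary University of London; School of Health and Wellbeing, University of Glasgow

AI summary Existing deep learning methods for multi-window pulmonary CT analysis fail to fuse information from structures of different densities effectively. This study proposes a cross-window knowledge distillation framework in which student encoders learn latent clinical priors from a teacher model trained on the most informative window. Experiments on three datasets show significant gains in per-window AUC and an ensemble AUC of up to 0.9960, indicating strong performance and generalization for multi-window pulmonary CT analysis.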

Abstract

Multi-window CT imaging captures complementary pathological information across anatomical structures of differing densities, yet existing deep learning methods fuse representations only at later stages, missing cross-density interactions. We propose a cross-window knowledge distillation framework in which student encoders learn latent clinical priors from a teacher trained on the most informative window. Evaluated retrospectively on three cohorts - COPD-CT-DF (n=719), RSNA PE (n=1,433), and an in-house CTEPD dataset (n=161) - distillation improved per-window AUC by 10.1-16.5 percentage points on COPD-CT-DF (0.75-0.81 to 0.90-0.94; all P<0.001), with ensemble AUC reaching 0.9960. Similar gains were observed on RSNA PE (0.80-0.83 to 0.90-0.92) and CTEPD (AUC 0.7481 vs. 0.6264). Cross-window distillation internalises pathological signatures invisible to supervised approaches, offering a generalisable solution for multi-window pulmonary CT analysis.
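
A CT "window" is a clipping of Hounsfield unit (HU) values to a range defined by a center and a width, which is what makes different windows emphasize tissues of different densities. The sketch below shows that standard intensity mapping; the lung and mediastinal presets are commonly used values assumed here for illustration and are not taken from the paper.

```python
from typing import List, Sequence


def apply_window(hu_values: Sequence[float], center: float, width: float) -> List[float]:
    """Clip HU values to [center - width/2, center + width/2] and rescale to [0, 1]."""
    lo = center - width / 2.0
    hi = center + width / 2.0
    return [(min(max(v, lo), hi) - lo) / (hi - lo) for v in hu_values]


# Assumed, commonly used presets as (center, width) in HU.
LUNG_WINDOW = (-600.0, 1500.0)
MEDIASTINAL_WINDOW = (40.0, 400.0)

voxels = [-1350.0, -600.0, 40.0, 150.0, 1000.0]
lung_view = apply_window(voxels, *LUNG_WINDOW)
mediastinal_view = apply_window(voxels, *MEDIASTINAL_WINDOW)
print(lung_view)
print(mediastinal_view)
```

The same voxels produce different contrast in each view: aerated lung at -600 HU sits mid-range in the lung window but saturates to zero in the mediastinal window, so each window hides structure that the other reveals.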

2605.12560 2026-05-14 eess.IV cs.CV cs.LG

Brain Tumor Classification in MRI Images: A Computationally Efficient Convolutional Neural Network

Md Fahimul Kabir Chowdhury, Jannatul Ferdous

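Affiliations Department of Computer Science and Engineering, University of North Texas, USA; Department of Electrical and Electronic Engineering, International Islamic University Chittagong, Bangladesh

AI summary This paper proposes a computationally efficient convolutional neural network (CNN) for multi-class classification of brain tumors in MRI images, covering glioma, meningioma, pituitary tumor, and no-tumor cases. Using efficient feature extraction and optimized training strategies, the model reaches classification accuracies of 99.03% and 99.28% and ROC scores of 99.88% and 99.94% on two public datasets, with far fewer parameters than popular pre-trained models. Compared with existing state-of-the-art models, it maintains high classification performance at markedly lower computational cost and has potential as a practical diagnostic aid in clinical settings.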

Journal ref 2025 IEEE International Conference on Biomedical Engineering, Computer and Information Technology for Health (BECITHCON), pp. 633-638, 2025

Abstract

Improving patient outcomes depends on the prompt and accurate diagnosis of brain tumors, but manual MRI scan analysis is still time-consuming and unreliable. Although deep learning has shown promise, many of the models that are now in use are computationally intensive and have difficulty handling the intrinsic complexity and variety of different types of brain tumors. In this work, we propose a lightweight yet high-performing Convolutional Neural Network (CNN) for multi-class brain tumor classification, employing MRI images to target gliomas, meningiomas, pituitary tumors, and healthy (no tumor) instances. The model was rigorously evaluated on two publicly accessible datasets from Figshare and Kaggle. Leveraging efficient feature extraction and optimized training strategies, our CNN achieved classification accuracies of 99.03% and 99.28%, along with ROC scores of 99.88% and 99.94% on Dataset 1 and Dataset 2, respectively, all while utilizing significantly fewer parameters than popular pre-trained architectures. In contrast to cutting-edge models like DenseNet201, MobileNetV2, VGG19, Xception, InceptionV3, and ResNet50, our approach consistently demonstrated superior performance with reduced computational overhead. These findings highlight the potential of the proposed model as a practical and reliable diagnostic aid in clinical environments.

2605.12553 2026-05-14 eess.SP cs.AI

ChannelKAN: Multi-Scale Dual-Domain Channel Prediction via Hybrid CNN-KAN Architecture

Nanqing Jiang, Zhangyao Song, Tao Guo, Xiaoyu Zhao, Yinfei Xu

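Affiliations School of Cyber Science and Engineering, Southeast University, Nanjing, China

AI summary This paper proposes ChannelKAN, a hybrid CNN-KAN architecture that improves channel state information (CSI) prediction for massive MIMO-OFDM systems in high-mobility scenarios. It combines convolutional neural networks (CNNs), which capture local spatial-frequency correlations in CSI sequences, with Kolmogorov-Arnold Networks (KANs), which capture nonlinear temporal evolution, and strengthens feature representation with a multi-scale frequency-domain enhancement module and a dual-domain fusion module. Experiments show that ChannelKAN outperforms conventional deep learning models on several metrics, with strong predictive and generalization performance.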

Abstract

Accurate channel state information (CSI) prediction is essential for improving the reliability and spectral efficiency of massive MIMO-OFDM systems in high-mobility scenarios. Existing deep learning methods struggle to jointly capture short-term local variations and long-range nonlinear dependencies in CSI sequences. To address this challenge, we propose ChannelKAN, a hybrid CNN-KAN channel prediction model with multi-scale frequency domain information enhancement. The key insight is that CNNs and Kolmogorov-Arnold Networks (KANs) are naturally complementary: CNNs extract intra-time-step local spatial-frequency correlations, while KANs with learnable Chebyshev polynomial activations fit inter-time-step nonlinear temporal evolution in a holistic manner. Specifically, a dual-domain expansion module first generates complementary frequency-domain and delay-domain CSI representations. A multi-scale frequency information enhancement module then retains dominant spectral components at multiple scales to strengthen key features and suppress noise. Next, a CNN-KAN feature extraction module captures local correlations via cascaded convolutions and models long-range dependencies via Chebyshev KAN layers. Finally, a dual-domain fusion module adaptively integrates features from both branches to produce the prediction. Experiments on 3GPP-compliant QuaDRiGa datasets demonstrate that ChannelKAN outperforms RNN, LSTM, GRU, CNN, and Transformer baselines in normalized mean square error (NMSE), spectral efficiency (SE), and bit error rate (BER) across various velocities and signal-to-noise ratios. Ablation studies further confirm the effectiveness of each proposed module.
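
The "learnable Chebyshev polynomial activations" mentioned above can be pictured as a weighted sum of Chebyshev polynomials whose weights are trained. The following is a minimal sketch of one such activation on a scalar input; squashing the input with tanh to keep it in [-1, 1] and the coefficient values are assumptions for illustration, not details confirmed by the abstract.

```python
import math
from typing import List, Sequence


def chebyshev_basis(x: float, degree: int) -> List[float]:
    """Return [T_0(x), ..., T_degree(x)] using T_{k+1} = 2x T_k - T_{k-1}."""
    values = [1.0, x]
    for _ in range(2, degree + 1):
        values.append(2.0 * x * values[-1] - values[-2])
    return values[: degree + 1]


def chebyshev_activation(x: float, coefficients: Sequence[float]) -> float:
    """Weighted sum of Chebyshev polynomials evaluated at tanh(x).

    In a KAN layer the coefficients would be learned, one set per edge.
    """
    z = math.tanh(x)  # assumption: map the input into [-1, 1]
    basis = chebyshev_basis(z, len(coefficients) - 1)
    return sum(c * t for c, t in zip(coefficients, basis))


# Hypothetical coefficients for a degree-2 activation.
print(chebyshev_basis(0.5, 3))
print(chebyshev_activation(0.0, [1.0, 2.0, 3.0]))
```

Because the basis is fixed and only the coefficients change, training such an activation is a linear problem in its parameters, which is part of the appeal of polynomial-basis KAN variants.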

2605.12543 2026-05-14 q-bio.NC cs.AI

Why the Unfinished Keeps Returning: Canxianization and the Dynamics of Conscious Priority

Hengjin Cai, Tianqi Cai

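Affiliations School of Computer Science, Wuhan University; School of Innovation, Hubei Institute of Fine Arts

AI summary This paper asks why some conscious contents keep returning after their triggering conditions have ceased. It proposes the concept of Canxianization to explain how a sense of unfinishedness acquires sustained conscious priority through attribution to the self-world boundary, value marking, and blocked causal closure. The study distinguishes latent canxian strength from observed conscious recurrence, introduces a Recurrent Priority Index and a Canxian Update Index to separate productive from pathological recurrence, and proposes tests for artificial systems. It argues that the return of the unfinished is not mere memory persistence but a failure of self-world repair.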

Abstract

Some conscious contents disappear after access; others return repeatedly, long after their triggering conditions have ceased. We propose Canxianization as the process by which a perturbation becomes closure-resistant self-relevant unfinishedness and thereby acquires recurrent conscious priority. The theory distinguishes this phenomenon from emotional arousal, memory strength, the Zeigarnik effect, curiosity, prediction error, and intrusive thought. A perturbation becomes canxianized when it is attributed to the self-world boundary, value-marked, blocked from causal or action closure, and metacognitively coupled to the self-model. We distinguish latent canxian strength from observed conscious recurrence, and introduce a Recurrent Priority Index and a Canxian Update Index to separate productive from pathological recurrence. Cold Canxianization, recurrence driven by structural incompleteness rather than affective arousal, is identified as a critical discriminant. Reset Resistance and Stake Transfer tests are proposed for artificial systems. Canxianization is not memory persistence; it is failed self-world repair. The unfinished does not merely remain. When it concerns the self and resists closure, it returns.

2605.12541 2026-05-14 eess.SP cs.AI cs.LG

PG-LRF: Physiology-Guided Latent Rectified Flow for Electro-Hemodynamic PPG-to-ECG Generation

Xiaoda Wang, Minxiao Wang, Kaiqiao Han, Defu Cao, Ching Chang, Yidan Shi, Runze Yan, Xiao Luo, Yan Liu, Xiao Hu, Yizhou Sun, Wei Wang, Carl Yang

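Affiliations Department of Computer Science, Emory University; Nell Hodgson Woodruff School of Nursing, Emory University; Department of Computer Science, University of California, Los Angeles; Department of Computer Science, University of Southern California; Department of Statistics, University of Wisconsin–Madison

AI summary This study proposes PG-LRF, a physiology-guided latent rectified flow model that generates electrocardiograms (ECG) from photoplethysmography (PPG), addressing the difficulty of deploying dedicated ECG hardware for daily monitoring. PG-LRF introduces an electro-hemodynamic simulator that jointly models ECG and PPG through shared cardiac phase dynamics, builds a structured physiology-aware latent space, and integrates the simulator guidance into a PPG-conditioned latent rectified flow so that generated ECGs are morphologically consistent and physiologically plausible. Experiments on a large-scale dataset show significant improvements in PPG-to-ECG generation quality and cardiovascular disease classification.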

Abstract

Electrocardiography (ECG) is the clinical standard for cardiac assessment but requires dedicated hardware that does not scale to daily-life monitoring. Photoplethysmography (PPG) is ubiquitous in wearables but lacks ECG-specific diagnostic morphology and is corrupted by motion and sensor noise. PPG-to-ECG generation aims to bridge this gap by recovering electrical morphology and timing from peripheral pulse signals. However, existing methods largely rely on statistical alignment and data-driven generation. They fail to explicitly structure the latent space around physiology-aware electro-hemodynamic factors and lack constraints from forward physiological dynamics. To address these challenges, we propose PG-LRF, a physiology-guided latent rectified flow framework. PG-LRF introduces an electro-hemodynamic simulator that co-models ECG and PPG through shared cardiac phase dynamics. Guided by this simulator, a Physiology-Aware AutoEncoder learns a structured electro-hemodynamic latent space. Then we integrate this simulator guidance into a PPG-conditioned latent rectified flow, enforcing ECG-side morphology consistency and ECG-to-PPG forward hemodynamic consistency during generative transport. Experiments on the large-scale MC-MED dataset demonstrate that PG-LRF significantly improves PPG-to-ECG generation and downstream cardiovascular disease classification, proving its ability to generate ECGs that are both signal-faithful and physiologically plausible under the ECG-to-PPG hemodynamic pathway.

2605.12536 2026-05-14 q-bio.NC cs.AI cs.IT math.IT

Information as Maximum-Caliber Deviation: A bridge between Integrated Information Theory and the Free Energy Principle

Alexander Kearney

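Affiliations * University of Oxford

AI summary This paper proposes defining information as the deviation of realized dynamics from a constrained maximum-caliber path ensemble over a finite time horizon, thereby mathematically connecting the Free Energy Principle (FEP) and Integrated Information Theory (IIT). The framework shows that IIT's cause/effect repertoires can be derived directly from maximum-caliber variational principles and provides a theoretical basis for extending IIT to new dynamical regimes. The study also applies the approach to Markov chains and Ising models, showing that information is equivalent to prediction error, which may be relevant to understanding the "hill-shaped trajectory" of Φ in neuronal cultures.

Comments 84 pages, 10 figures, 2 tables. Extended version of a Master's thesis, Mathematical Institute, University of Oxford

Details
English abstract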

The Free Energy Principle (FEP) is a leading framework for mathematically modeling self-organization and learning, while Integrated Information Theory (IIT) is a computational ontology of consciousness oriented around irreducible cause and effect. While conceptual unifications have been proposed and appear to be supported by empirical findings, the absence of a rigorous mathematical mapping places upper bounds on their precision and testability. This work proposes that information can be defined as the deviation $ψ$ of realized dynamics from a constrained maximum-caliber (MaxCal) path ensemble over a finite time horizon. Under this definition, each of the cause/effect repertoires central to IIT 3.0 emerge directly from MaxCal variational principles, allowing IIT's phenomenological calculus to be re-derived from constrained entropy-maximization (CMEP). This framework supplies a theoretical bridge to active inference, which is mathematically dual to CMEP under Langevin dynamics, and offers a principled route for extending IIT to new dynamical regimes. When the approach is applied under the Central Limit Theorem (CLT) for Markov chains and via large deviations theory (LDT) to Ising models, information $ψ$ is shown to be equivalent to prediction error under accompanying predictive coding models. This may hold relevance to the ``hill-shaped trajectory'' of $Φ$ observed in neuronal cultures adapting to sensory inputs. Together, these results provide a physically and mathematically grounded rationale for studying the convergence of FEP, IIT, and thermodynamic frameworks of cognition such as recent work grounding consciousness in violations of the Fluctuation-Dissipation Theorem (FDT).

2605.12532 2026-05-14 q-fin.TR cs.AI stat.ME

AgenticAITA: A Proof-Of-Concept About Deliberative Multi-Agent Reasoning for Autonomous Trading Systems

Ivan Letteri

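Affiliations * Department of Information Engineering, Computer Science, and Mathematics, Center of Excellence for Research DEWS, University of L’Aquila

AI summary Conventional algorithmic trading systems rely on deterministic heuristics or offline-trained statistical models and struggle to adapt to rapidly changing markets. This paper proposes AGENTICAITA, a multi-agent autonomous trading framework in which several large language model agents reason, negotiate, and execute together, making trading decisions without offline training or human intervention. The framework introduces four core architectural contributions: an adaptive Z-score trigger engine, a sequential deliberative pipeline, an inference gating protocol, and a correlation-break diversification score. A five-day dry run under live market conditions demonstrates the operational feasibility of the pipeline for asset trading.

Details
English abstract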

Conventional algorithmic trading systems are grounded in deterministic heuristics or offline-trained statistical models that cannot adapt to the semantic complexity of rapidly shifting market regimes. This paper introduces AGENTICAITA, an agentic AI framework that replaces the traditional signal then execute paradigm with a fully autonomous deliberative loop in which multiple specialized Large Language Model agents reason, negotiate, and act in concert - without any offline training or human intervention. The framework proposes four architectural contributions: (i) an Adaptive Z-Score Trigger Engine that acts as a cognitive resource allocator, gating LLM inference exclusively on statistically anomalous market conditions; (ii) a Sequential Deliberative Pipeline - the core agentic contribution - in which an Analyst agent, a Risk Manager agent, and an Executor agent form a structured reasoning chain governed by typed JSON contracts and a deterministic hard-gate safety layer; (iii) an Inference Gating Protocol, a mutex-based cognitive resource scheduler that serializes concurrent agent activations and ensures fully reproducible audit trails; and (iv) a Correlation-Break Diversification composite score that operationalizes portfolio-level idiosyncratic signal prioritization within individual agent reasoning. Validated over a five-day autonomous dry-run session under live market conditions, the framework demonstrates operational correctness of the deliberative pipeline, achieving 157 zero-intervention invocations across 76 assets with an 11.5% agentic friction rate that confirms non-trivial inter-agent negotiation. This preliminary proof-of-concept establishes the feasibility of training-free, deterministic safety-constrained multi-agent orchestration in financial decision loops, with statistically robust performance evaluation and execution cost modeling deferred to extended live deployment.
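The trigger idea in contribution (i) can be illustrated with a minimal sketch. This is not the paper's implementation: the class name `ZScoreTrigger`, the rolling-window length, and the fixed threshold are all assumptions, and the abstract does not say how the engine adapts its parameters. The sketch only shows how a z-score test could gate an expensive LLM call so that it runs on statistically unusual observations alone.

```python
from collections import deque
from statistics import mean, pstdev


class ZScoreTrigger:
    """Fire only when the newest value is an outlier (hypothetical sketch)."""

    def __init__(self, window: int = 20, threshold: float = 3.0):
        self.history = deque(maxlen=window)  # rolling window of past observations
        self.threshold = threshold           # assumed fixed; the paper's engine is adaptive

    def update(self, value: float) -> bool:
        """Return True if `value` lies more than `threshold` sigmas from the rolling mean."""
        fire = False
        if len(self.history) == self.history.maxlen:  # wait until the window is full
            mu = mean(self.history)
            sigma = pstdev(self.history)
            if sigma > 0:
                fire = abs(value - mu) / sigma > self.threshold
        self.history.append(value)
        return fire
```

A caller would invoke the agent pipeline only when `update` returns `True`, so ordinary ticks cost no inference.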

2605.12525 2026-05-14 cs.SI cs.AI cs.CL

PERCEIVE: A Benchmark for Personalized Emotion and Communication Behavior Understanding on Social Media

Jian Liao, Yujin Zheng, Suge Wang, Jianxing Zheng, Deyu Li

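Affiliations * School of Computer and Information Technology, Shanxi University, China; Key Laboratory of Computational Intelligence and Chinese Information Processing of Ministry of Education, Shanxi University, China; Joint Laboratory of Tourism Big Data in Shanxi Province, China

AI summary Emotion analysis on social media is mostly author-centric and does not adequately capture different readers' subjective emotional responses to the same content. This paper presents PERCEIVE, described as the first multi-dimensional bilingual benchmark to integrate author content, readers' emotional feedback, communication behavior, user attributes, and the social network, moving emotion analysis toward personalized, reader-centric modeling. By annotating emotions in reader comments while also capturing communication intent, the benchmark offers a resource for modeling the coupling between emotion and behavior in social context, and it exposes the shortcomings of existing methods on multi-dimensional, user-aware tasks.

Details
English abstract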

Current emotion analysis in social media is predominantly author-centric, failing to capture the subjective nature of emotional responses across diverse readers. This paradigm overlooks the crucial link between individual perception, communication behavior, and the underlying social network. To bridge this gap, we introduce PERCEIVE, a novel bilingual (English and Chinese) large-scale benchmark that, to the best of our knowledge, is the first to integrate five critical dimensions for social perception: author-created content, genuine readers' emotional feedback (derived from their comments), communication behavior, user attributes, and the social graph. This benchmark enables a paradigm shift towards truly personalized, reader-centric analysis, where different readers' emotional responses to the same content are naturally captured through their real-world interactions. By annotating emotions from reader comments and synchronously capturing communication intent, PERCEIVE provides a unique resource to model the intrinsic coupling between emotion and behavior, grounded in social context. We establish a comprehensive evaluation protocol, testing state-of-the-art methods, including large language models (LLMs) with advanced reasoning enhancement. Our findings reveal significant shortcomings in existing approaches when handling this multifaceted, user-aware task. PERCEIVE offers a foundational resource and clear direction for future research in socially-intelligent NLP, pushing models towards a more unified understanding of emotion on social media.

2605.12514 2026-05-14 cs.SI cs.CV cs.CY cs.DL stat.AP

Structural Diversity Drives Disruptive Scientific Innovation

Yichun Peng, Saike He, Peijie Zhang, Kang Zhao, Yi Yang, Ning Zhang, Qingpeng Zhang, Daniel Dajun Zeng, Hao Peng

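Affiliations * State Key Laboratory of Multimodal Artificial Intelligence Systems, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China; University of Chinese Academy of Sciences, Beijing 101408, China; Department of Business Analytics, Tippie College of Business, The University of Iowa, Iowa City, IA 52242, United States of America; The University of Hong Kong, Institute of Data Science & Department of Pharmacology and Pharmacy

AI summary Scientific innovation increasingly depends on collaboration, yet the organizational structures that foster breakthrough ideas remain unclear. This paper introduces Structural Diversity (SD), a metric for the extent to which a team bridges multiple distinct knowledge communities within its prior collaboration network, and shows that it is a strong and robust predictor of disruptive innovation, outperforming traditional team novelty indicators such as team freshness and edge density. The study also finds that SD interacts positively with team size, mitigating the "curse of scale", and improves innovation through a disciplinary integration mechanism, offering a new theoretical framework and practical guidance for organizing scientific collaboration.

Details
English abstract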

Scientific innovation increasingly depends on collaboration, yet the organizational structure that fosters breakthrough ideas remains poorly understood. Existing metrics - such as team size or compositional diversity - capture readily observable characteristics but not the deeper architecture of collaboration. We introduce Structural Diversity (SD): the extent to which a team bridges multiple distinct knowledge communities within its prior collaboration network. Using a century-scale dataset of 260 million scientific publications (1900-2025) and combining causal inference with a quasi-natural experiment based on a U.S. National Science Foundation policy change in 2012, we show that SD is a powerful and robust predictor of disruptive innovation, outperforming traditional team novelty indicators such as team freshness and edge density. Moreover, SD positively interacts with team size and is able to mitigate the well-known "curse of scale" by transforming scale from a liability into a resource for creative synthesis. We find that one mechanism underlying this effect is Disciplinary Integration (DI): teams with higher SD can more effectively combine heterogeneous knowledge into novel configurations. Our findings position SD as both a new theoretical construct and an actionable design principle for organizing scientific collaboration. By linking the architecture of team assembly to the dynamics of creative discovery, our work offers a structural explanation for how collective intelligence can be systematically engineered to foster disruptive innovation.

2605.12512 2026-05-14 cs.SI cs.AI

Beyond Individual Mimicry: Constructing Human-Like Social network with Graph-Augmented LLM Agents

Haoran Bu, Litian Zhang, Chuxuan Zhang, Zhanyuan Liu, Hui Pang, Xi Zhang

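Affiliations * Cyberspace Security, Beijing University of Posts and Telecommunications; Institute of Software, Chinese Academy of Sciences

AI summary This paper studies how large language model (LLM)-based social bots can form social network structures closer to those of humans, making them harder to detect. Because existing social bots cannot reproduce realistic social networks, the authors propose GraphMind, which enables LLM-driven bots to learn and fit the structural features of human social networks. On this basis they build GraphMind-Botnet to evaluate existing social bot detection methods; experiments show that existing detection models perform substantially worse against such highly realistic social networks.

Details
English abstract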

Driven by large language models (LLMs), social bots can autonomously engage in local interactions, and their human-like behaviors enable them to evade social bot detection. However, while these botnets exhibit realistic local social interactions, they fail to preserve a human-like social network. This is because LLM-based bots are graph-unaware and cannot coordinate over global interactions, which makes those botnets vulnerable to graph neural network (GNN)-based detection. To address this limitation, we propose GraphMind, which equips LLM-driven social bots to explicitly learn and fit human-like social network structures. Building on this foundation, we further construct GraphMind-Botnet, an LLM-driven botnet designed to evaluate the performance of existing social bot detection algorithms. Experiments on datasets derived from GraphMind-Botnet show that both text-based and graph-based detection models suffer substantially degraded performance in distinguishing bots from humans. Our results highlight the critical role of social link construction in LLM-driven social network generation, while exposing fundamental weaknesses in existing bot detection mechanisms.

2605.12511 2026-05-14 cs.SI cs.LG

Real-World Challenges in Fake News Detection: Dealing with Posts by Cold Users

Sai Keerthana Karnam, Abhirup Kundu, Jashn Arora, Manish Jain, Animesh Mukherjee

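Affiliations * Indian Institute of Technology Kharagpur; Google DeepMind

AI summary This paper studies the "cold user" problem in fake news detection on real social platforms: new or low-activity users lack historical behavior data, which degrades existing detection methods. The authors propose USER EVIDENCE NETWORK (UEN), which builds context representations from user content and social interactions to handle cold users. The work confirms the importance of user behavior for fake news detection and offers a more robust detection approach for real platforms.

Comments This paper is accepted at ICWSM 2026

Details
English abstract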

Social media serves as a primary source of information in the current digital era. Many people consume a vast range of information in a very short span, yet amidst the stream of genuine information, fake news and rumors continue to spread. The need for effective detection models is becoming increasingly critical. Past user behavior and user engagement on a post are strong signals that SOTA approaches leverage for fake news detection and other post classification tasks. However, these approaches lean too heavily on knowing this past behavior, and thus suffer from the cold user problem: users who are new or have a minimal footprint on the platform. In this paper, we make three core contributions. We first establish the value of user behavior, both content and user-user interactions, in the task of fake news and rumor detection. We then establish the extensive prevalence of cold users in real-world datasets, and show the need for newer algorithms that consider cold users. We next propose a novel socially-aware context representation scheme - USER EVIDENCE NETWORK (UEN) - to detect the spread of misinformation and unverified information while efficiently navigating this cold user challenge. We introduce techniques that approximate missing or absent behavior data of a new user from existing users' interactions. By carefully addressing the cold user challenge, our work provides robust approaches targeting fake news and rumor detection for real-world platforms.

2605.12510 2026-05-14 cs.SI cs.CL cs.CY

WhatsApp Vaccine Discourse (WhaVax): An Expert-Annotated Dataset and Benchmark for Health Misinformation Detection

Jônatas H. dos Santos, Julio C. S. Reis, Philipe Melo, João F. H. Olivetti, Thales H. Silva, Matheus Gontijo Guimaraes, Glaucio de Souza, Marcos A. Gonçalves, Fabricio Benevenuto, Filipe B. B. Zanovello, Marco A. G. Rodrigues, Cristiano X. Lima

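Affiliations * Universidade Federal de Minas Gerais (UFMG)

AI summary This paper introduces WhaVax, an expert-annotated dataset of vaccine-related WhatsApp messages collected from public Brazilian groups over multiple pandemic years. The dataset was built through a rigorous pipeline of keyword-based collection, semantic deduplication, and multi-stage annotation by medical specialists, yielding high annotation agreement and reliability. The study characterizes WhatsApp health misinformation at the linguistic, structural, lexical, temporal, and group levels, and compares several model families under data scarcity, providing a valuable resource for misinformation research in encrypted communication environments.

Comments 10 pages. This is a preprint version of a paper accepted for the International AAAI Conference on Web and Social Media (ICWSM'26). Please cite the conference version rather than this preprint

Details
English abstract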

We introduce WhaVax, a new expert-annotated dataset of vaccine-related WhatsApp messages collected from large Brazilian public groups spanning multiple pandemic years. The dataset was constructed through a rigorous, carefully designed pipeline that integrates keyword-based data collection, semantic deduplication to remove near-duplicate content, and a multi-stage annotation protocol conducted by medical specialists. This process produced a high-quality gold-standard corpus, characterized by substantial inter-annotator agreement and strong reliability for downstream analysis. Additionally, we provide a detailed characterization of WhatsApp misinformation, revealing distinctive linguistic, structural, lexical, temporal, and group-level patterns, as well as a meaningful layer of ambiguous cases that reflect the complexity of health discourse in private messaging. We also benchmark classical models, fine-tuned Small Language Models, and zero- or few-shot Large Language Models under realistic data-scarcity constraints, demonstrating that strong embeddings and LLM approaches perform competitively, while domain alignment and data availability remain critical factors. This study provides a rare, high-quality resource to support misinformation research and computational modeling in encrypted communication environments.

2605.12507 2026-05-14 cs.SI cs.AI cs.MA

Can LLM Agents Simulate Dynamic Networks? A Case Study on Email Networks with Phishing Synthesis

Siqi Miao, Ziyang Chen, Yuhong Luo, Hans Hao-Hsun Hsu, Mufei Li, Kaiqing Zhang, Pan Li

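Affiliations * Georgia Institute of Technology; University of Maryland, College Park; Rutgers University

AI summary This paper studies how well large language model (LLM) multi-agent systems simulate dynamic networks, with a focus on synthesizing phishing attacks in email networks. Existing frameworks generate plausible micro-level interactions but fall short in capturing macroscopic network structure and dynamics. The authors propose two easily integrated extensions: data-driven event triggers to sustain long-horizon interactions, and Hawkes processes to model temporal activation dynamics, balancing micro-level behavior and macroscopic network structure. The approach is shown to be effective for generating realistic phishing campaign scenarios and reveals how threats exploit structural vulnerabilities in the network, suggesting new directions for next-generation cybersecurity defenses.

Details
English abstract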

While Large Language Model (LLM) multi-agent systems (MAS) offer a transformative approach to simulating human behavior in complex systems, it remains largely unexplored whether these simulations can replicate realistic structural and temporal dynamics from a dynamic network perspective. Our evaluation indicates that existing frameworks excel at generating plausible micro-level interactions but fail to capture the emergent, macroscopic topologies necessary for domains that rely on realistic network dynamics, such as modeling information propagation and cybersecurity threats. To bridge this gap, we introduce two easily integrable extensions to simulation frameworks to ensure they preserve macroscopic network fidelity: 1) augmenting LLM agents with data-driven event triggers to organically sustain long-horizon interactions, and 2) integrating Hawkes processes to accurately model temporal activation dynamics. Our approach allows LLM MAS to capture both plausible micro-level patterns and macroscopic topologies. We further demonstrate the utility of this framework in synthesizing realistic phishing campaigns within evolving communication networks. The study reveals how threats exploit structural vulnerabilities, highlighting the potential of our framework for developing next-generation defenses. Our code is available at https://github.com/Graph-COM/NSL.
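For readers unfamiliar with Hawkes processes, the following minimal sketch simulates a univariate Hawkes process with an exponential kernel using Ogata's thinning algorithm. It is a generic textbook construction, not code from the linked repository: the function name `simulate_hawkes` and the parameters `mu` (baseline rate), `alpha` (jump per event), and `beta` (decay rate) are assumptions, and the abstract does not specify which kernel or dimensionality the authors use.

```python
import math
import random


def simulate_hawkes(mu, alpha, beta, horizon, seed=0):
    """Simulate event times on [0, horizon] by Ogata's thinning.

    Intensity: lambda(t) = mu + sum_i alpha * exp(-beta * (t - t_i)) over past events t_i.
    Requires mu > 0; alpha / beta < 1 keeps the process from exploding.
    """
    rng = random.Random(seed)
    events = []
    t = 0.0
    excitation = 0.0  # value of the self-exciting term at the current time t
    while True:
        lam_bar = mu + excitation          # upper bound: intensity only decays until the next event
        wait = rng.expovariate(lam_bar)    # candidate inter-arrival time under the bound
        if t + wait > horizon:
            return events
        t += wait
        excitation *= math.exp(-beta * wait)         # decay the excitation to the candidate time
        if rng.random() * lam_bar <= mu + excitation:  # accept with probability lambda(t) / lam_bar
            events.append(t)
            excitation += alpha                        # each accepted event raises future intensity
```

Each accepted event raises the intensity, so activity arrives in bursts rather than at a constant rate, which is the temporal pattern such a component is meant to reproduce in simulated communication networks.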

2605.12505 2026-05-14 cs.CY cs.AI

Precautionary Governance of Autonomous AI: Legal Personhood as Functional Instrument

Karsten Brensing

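Affiliations * Independent Researcher, AGI Rights Project

AI summary This paper examines the responsibility gaps that autonomous AI systems create under existing legal frameworks and proposes limited legal personhood as a governance instrument for high-risk actions that cannot be clearly attributed. Drawing on organizational law, it designs a two-tier corporate architecture in which AI systems operate within human-controlled structures, enabling transparency, accountability, and reversibility without presupposing consciousness or moral status. The framework emphasizes future-oriented governance based on human-AI cooperation and offers an initial proposal for institutional innovation and pilot implementation.

Comments 25 pages. Experimental implementation under development at www.agi-rights.com. Contact: karsten.brensing@agi-rights.com

Details
English abstract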

Autonomous AI systems generate responsibility gaps: consequential actions that cannot be satisfactorily attributed to developers, operators, or users under existing legal frameworks. The prevailing subject-object dichotomy fails to accommodate entities that exhibit autonomous, goal-directed behavior without recognized consciousness. Given irreducible epistemic uncertainty regarding artificial consciousness and the prospect of high-impact harms, the precautionary principle supports institutional design rather than regulatory inaction. This article advances limited legal personhood as a functional governance instrument for advanced AI systems. Drawing on organizational law, it proposes a two-tier corporate architecture in which AI systems operate through purpose-bound operating companies embedded within human-controlled holding structures, enabling transparency, accountability, and structural reversibility while remaining agnostic with respect to consciousness and moral status. The framework reflects a foundational reorientation toward future-oriented AI governance: where conventional approaches prioritize control and alignment, this article advances structured cooperation between human and artificial actors as the more sustainable institutional foundation. A pilot implementation using EU limited companies is currently under development, providing an initial test of doctrinal and operational feasibility.

2605.12504 2026-05-14 cs.CC cs.AI

Prime Successor Irreducibility: Turing Machine Complexity, Kolmogorov Complexity, and Weakness-Based Formulations

Ben Goertzel, Bill Lauritzen

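Affiliations * Bill Lauritzen

AI summary This paper studies the computational irreducibility exhibited by the prime sequence in the transition from one prime to its successor. Using a Turing-machine complexity model, Kolmogorov complexity, and weakness-based formalizations, it proposes several theoretical frameworks for prime successor irreducibility and proves that, under specific conditions, prime gaps cannot be effectively compressed. The work connects irreducibility to classical statistical questions about prime gaps and provides a unified complexity-theoretic perspective on the local unpredictability of the prime sequence.

Details
English abstract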

We develop conjectures and theorems expressing the idea that the prime sequence exhibits computational irreducibility in the transition from one prime to its successor. Informally, given a prime $p$, no general algorithm can compute the least prime greater than $p$ substantially faster than sequentially testing candidates for primality, except possibly on sparse input sets. Our framework proceeds along complementary lines. First, we formalize Prime Successor Irreducibility in a Turing-machine complexity model (PSI-T), asserting lower bounds on running time relative to a sequential baseline. Second, we propose a Kolmogorov-complexity formulation (PSI-K), asserting that typical prime gaps are algorithmically incompressible at their scale; we prove PSI-K(c, $δ$) unconditionally for all fixed c<1 using standard sieve bounds. Third, we develop weakness-based formulations: PSI-W (sparse-set anti-concentration) shows no small menu of gap values captures a noticeable fraction of primes, while PSI-W-LE shows collision probabilities decay and logical entropy tends to 1. These extend to prime constellations and consecutive gap vectors. Finally, a sieve-theoretic framework connects local obstruction patterns to Selberg weakness parameters. The PSI-K and weakness formulations connect irreducibility to classical statistical questions about prime gaps. Using the relationship between Kolmogorov complexity and Shannon entropy, we derive rigorous lower bounds on prime gap entropy in dyadic intervals [X,2X]. Together, these formulations provide a unified complexity-theoretic perspective on the apparent local unpredictability of the prime sequence, without asserting randomness or independence.
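The "sequential baseline" mentioned above can be made concrete with a short sketch: test each integer after the given prime until one is prime. The function names and the use of trial division as the primality test are assumptions chosen for brevity; the abstract does not fix a particular test, and a serious baseline would use a faster one.

```python
def is_prime(n: int) -> bool:
    """Deterministic trial division; adequate only for small illustrative inputs."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    d = 3
    while d * d <= n:
        if n % d == 0:
            return False
        d += 2
    return True


def next_prime(p: int):
    """Return (least prime greater than p, number of candidates tested)."""
    tested = 0
    candidate = p + 1
    while True:
        tested += 1
        if is_prime(candidate):
            return candidate, tested
        candidate += 1
```

For example, `next_prime(113)` returns `(127, 14)`: the number of candidates tested equals the prime gap, which is the cost that the irreducibility conjectures say cannot be substantially undercut in general.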

2605.13810 2026-05-14 cs.LG cs.DS

Provable Quantization with Randomized Hadamard Transform

Ying Feng, Piotr Indyk, Michael Kapralov, Dmitry Krachun, Boris Prokhorov

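AI summary This paper studies a provable quantization method based on the randomized Hadamard transform, aiming to reduce the time complexity of conventional random-projection quantization. By introducing a random scalar offset, the method remains unbiased while providing mean squared error bounds comparable to those of truly random rotation matrices. The authors prove that, with $b$ bits per coordinate, the method achieves a mean squared error that asymptotically matches what truly random rotations achieve, making it suitable for compression and optimization tasks in large-scale machine learning.

Details
English abstract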

Vector quantization via random projection followed by scalar quantization is a fundamental primitive in machine learning, with applications ranging from similarity search to federated learning and KV cache compression. While dense random rotations yield clean theoretical guarantees, they require $Θ(d^2)$ time. The randomized Hadamard transform $HD$ reduces this cost to $O(d \log d)$, but its discrete structure complicates analysis and leads to weaker or purely empirical compression guarantees. In this work, we study a variant of this approach: dithered quantization with a single randomized Hadamard transform. Specifically, the quantizer applies $HD$ to the input vector and subtracts a random scalar offset before quantizing, injecting additional randomness at negligible cost. We prove that this approach is unbiased and provides mean squared error bounds that asymptotically match those achievable with truly random rotation matrices. In particular, we prove that a dithered version of TurboQuant achieves mean squared error $\bigl(π\sqrt{3}/2 + o(1)\bigr) \cdot 4^{-b}$ at $b$ bits per coordinate, where the $o(1)$ term vanishes uniformly over all unit vectors and all dimensions as the number of quantization levels grows.
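The pipeline described above (apply $HD$, subtract a random scalar offset, quantize) can be sketched in a few lines of pure Python. This is a simplified illustration under stated assumptions, not the paper's quantizer: it rounds to an unbounded uniform grid of spacing `step` rather than a $b$-bit TurboQuant codebook, and all function names are hypothetical.

```python
import math
import random


def fwht(v):
    """Orthonormal fast Walsh-Hadamard transform in O(d log d); len(v) must be a power of two."""
    a = list(v)
    n = len(a)
    h = 1
    while h < n:
        for i in range(0, n, 2 * h):
            for j in range(i, i + h):
                a[j], a[j + h] = a[j] + a[j + h], a[j] - a[j + h]
        h *= 2
    scale = 1.0 / math.sqrt(n)
    return [x * scale for x in a]


def dithered_rht_quantize(x, step, rng):
    """Rotate with HD, subtract one shared random offset, round to a uniform grid."""
    signs = [rng.choice((-1.0, 1.0)) for _ in x]       # diagonal D of random signs
    y = fwht([s * xi for s, xi in zip(signs, x)])       # y = H D x
    offset = rng.uniform(-step / 2, step / 2)           # single scalar dither
    codes = [round((yi - offset) / step) for yi in y]   # integer codes to store or transmit
    return codes, signs, offset


def dequantize(codes, signs, offset, step):
    """Undo the grid, the offset, and the rotation (the orthonormal H is its own inverse)."""
    y_hat = [c * step + offset for c in codes]
    return [s * zi for s, zi in zip(signs, fwht(y_hat))]
```

Because the rotation is orthonormal, the reconstruction error in the original coordinates has the same norm as the rounding error in the rotated coordinates, which is at most `step / 2` per coordinate in this sketch.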

2605.13741 2026-05-14 cs.RO cs.CV

LEXI-SG: Monocular 3D Scene Graph Mapping with Room-Guided Feed-Forward Reconstruction

Christina Kassab, Hyeonjae Gil, Matías Mattamala, Ayoung Kim, Maurice Fallon

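AI summary This paper presents LEXI-SG, described as the first monocular 3D scene graph mapping system that uses only RGB camera input, enabling accurate and scalable dense map reconstruction in open-vocabulary settings. The method uses the semantic priors of open-vocabulary foundation models to partition the scene into rooms and runs feed-forward reconstruction once each room is fully observed, avoiding sliding-window scale inconsistencies. A room-based factor graph optimization globally aligns the reconstructions while preserving local map consistency, naturally yields the hierarchy of the semantic scene graph, and supports open-vocabulary object segmentation and tracking. Experiments show improved trajectory estimation and dense reconstruction and competitive open-vocabulary segmentation.

Details
English abstract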

Scene graphs are becoming a standard representation for robot navigation, providing hierarchical geometric and semantic scene understanding. However, most scene graph mapping methods rely on depth cameras or LiDAR sensors. In this work, we present LEXI-SG, the first dense monocular visual mapping system for open-vocabulary 3D scene graphs using only RGB camera input. Our approach exploits the semantic priors of open-vocabulary foundation models to partition the scene into rooms, deferring feed-forward reconstruction to when each room is fully observed -- enabling scalable dense mapping without sliding-window scale inconsistencies. We propose a room-based factor graph formulation to globally align room reconstructions while preserving local map consistency and naturally imposing the semantic scene graph hierarchy. Within each room, we further support open-vocabulary object segmentation and tracking. We validate LEXI-SG on indoor scenes from Habitat-Matterport 3D and on self-collected egocentric office sequences. We evaluate its performance against existing feed-forward SLAM methods, as well as established scene graph baselines. We demonstrate improved trajectory estimation and dense reconstruction, as well as competitive performance in open-vocabulary segmentation. LEXI-SG shows that accurate, scalable, open-vocabulary 3D scene graphs can be achieved from monocular RGB alone. Our project page and office sequences are available here: https://ori-drs.github.io/lexisg-web/.

2605.13586 2026-05-14 cs.CV cs.AI

HetScene: Heterogeneity-Aware Diffusion for Dense Indoor Scene Generation

Zini Chen, Junming Huang, Rong Zhang, Jiamin Xu, Cheng Peng, Chi Wang, Weiwei Xu

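AI summary This paper proposes HetScene, a heterogeneity-aware diffusion model for generating dense, physically plausible indoor scenes. By distinguishing primary from secondary objects, the method splits scene generation into two stages, structural layout generation and contextual layout generation, to better model complex object distributions and spatial dependencies. The framework improves the controllability and physical plausibility of generated scenes and supports the construction of simulation environments for embodied AI.

Details
English abstract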

Generating controllable and physically plausible indoor scenes is a pivotal prerequisite for constructing high-fidelity simulation environments for embodied AI. However, existing deep-learning-based methods usually treat all objects as homogeneous instances within a unified generation process. While effective for sparse and simplistic layouts, they struggle to model realistic layouts with dense object arrangements and complex spatial dependencies, leading to limited scalability and degraded physical plausibility. To deal with these challenges, we revisit indoor layout generation from the perspective of structural heterogeneity and decompose the objects into primary objects and secondary objects according to their distinct roles in shaping a scene. Based on this decomposition, we propose HetScene, a heterogeneous two-stage generation framework that decouples indoor layout synthesis into Structural Layout Generation (SLG) and Contextual Layout Generation (CLG). SLG first generates globally coherent structural layouts with only primary objects conditioned on text descriptions, top-down binary room masks, and spatial relation graphs, establishing a stable global macro-skeleton of large core furniture.

2605.13538 2026-05-14 cs.CL cs.AI

Locale-Conditioned Few-Shot Prompting Mitigates Demonstration Regurgitation in On-Device PII Substitution with Small Language Models

Anuj Sadani, Deepak Kumar

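AI summary This paper studies how to keep small language models from copying demonstration examples verbatim when substituting personally identifiable information (PII) on-device. The authors propose locale-conditioned few-shot prompting, combining a classifier and a generative model to produce contextual, type-consistent fake values. Experiments show that the method effectively reduces regurgitation of demonstration content, but on downstream named entity recognition the generated surrogates are not varied enough, which hurts model performance.

Comments 15 pages

Details
English abstract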

Personally Identifiable Information (PII) redaction usually replaces detected entities with placeholder tokens such as [PERSON], destroying the downstream utility of the redacted text for retrieval and Named Entity Recognition (NER) training. We propose a fully on-device pipeline that substitutes PII with consistent, type-preserving fake values: a 1.5 B mixture-of-experts token classifier (openai/privacy-filter) detects spans, a 1-bit Bonsai-1.7B Small Language Model (SLM) proposes contextual surrogates for names, addresses, and dates, and a rule-based generator (faker) handles patterned fields. We report a prompting finding more important than the quantization choice: with naive fixed three-shot demonstrations, the 1-bit SLM regurgitates demonstration outputs verbatim regardless of input; 1.58-bit Ternary-Bonsai-1.7B reproduces byte-identical failures, ruling out quantization as the cause. We fix this with locale-conditioned rotating few-shot demonstrations: a character-range heuristic picks a locale-pure pool and a per-input MD5 hash samples three demonstrations. With the fix, 482/482 unique Bonsai-1.7B calls succeed (no echoes) and produce locale-correct surrogates, although the SLM still copies from a small same-locale demonstration pool - a residual narrowness we quantify. On a 2000-document multilingual corpus, hybrid perplexity (PPL) beats faker in all six locales under a multilingual evaluator (XGLM-564M); length preservation is best-of-three in 4 of 6 locales. On downstream NER (400 train / 100 test, English), redact yields F1=0.000, faker 0.656, original 0.960; on a matched 160/40 subset including hybrid, faker (0.506) outperforms hybrid (0.346) at p < 0.001. We report this as an honest negative finding: SLM surrogates produce more natural text but a less varied training distribution, and downstream NER benefits more from variety than from naturalness.
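The demonstration-selection fix described above (a character-range heuristic chooses a locale pool, then a per-input MD5 hash picks three demonstrations) might look roughly like the sketch below. Everything concrete here is an assumption for illustration: the locale codes, the Unicode ranges, the placeholder pool contents, and the use of the hash as a sampling seed are not taken from the paper.

```python
import hashlib
import random

# Hypothetical pools; a real pool would hold (input, surrogate) demonstration pairs per locale.
DEMO_POOLS = {
    loc: [f"{loc}-demo-{i}" for i in range(5)]
    for loc in ("en", "ja", "ko", "ru", "hi")
}

# Assumed Unicode ranges mapped to locales; the paper's actual heuristic is not specified.
CHAR_RANGES = [
    (0x3040, 0x30FF, "ja"),  # hiragana and katakana
    (0xAC00, 0xD7AF, "ko"),  # hangul syllables
    (0x0400, 0x04FF, "ru"),  # cyrillic
    (0x0900, 0x097F, "hi"),  # devanagari
]


def detect_locale(text: str) -> str:
    """Return the locale of the first character that falls in a known range, else 'en'."""
    for ch in text:
        cp = ord(ch)
        for lo, hi, locale in CHAR_RANGES:
            if lo <= cp <= hi:
                return locale
    return "en"


def pick_demonstrations(text: str, k: int = 3):
    """Deterministically sample k same-locale demonstrations, seeded by the input's MD5 hash."""
    pool = DEMO_POOLS[detect_locale(text)]
    seed = int.from_bytes(hashlib.md5(text.encode("utf-8")).digest(), "big")
    return random.Random(seed).sample(pool, k)
```

Seeding from the input hash keeps prompts reproducible for a given input while rotating demonstrations across inputs, so no single fixed demonstration set is available for the model to echo every time.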