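arXivDaily: a daily academic digest that syncs the full arXiv dataset and provides AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and other fields.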

2402.06922 2026-04-22 cs.CR cs.LG

Whispers in the Machine: Confidentiality in Agentic Systems

Jonathan Evertz, Merlin Chlosta, Lea Schönherr, Thorsten Eisenhofer

Comments Accepted at Conference on Detection of Intrusions and Malware & Vulnerability Assessment (DIMVA) 2026

2308.13983 2026-04-22 physics.ao-ph cs.LG

Interpolation of mountain weather forecasts by machine learning

Kazuma Iwase, Tomoyuki Takenawa

Comments 9 pages

1902.07683 2026-04-22 cs.HC cs.AI

Modelling and Analysing Behaviours and Emotions via Complex User Interactions

Mohamed Mostafa

Comments 176 pages; PhD thesis accepted at Cardiff Metropolitan University, UK (February 2019)

Details

English abstract

Over the past 15 years, the volume, richness and quality of data collected from the combined social networking platforms have increased beyond all expectation, giving researchers from a variety of disciplines the opportunity to use it in their research. Perhaps more impactfully, it has provided the foundation for a range of new products and services, transforming industries such as advertising and marketing, as well as bringing the challenges of sharing personal data into the public consciousness. But how can we make sense of the ever-increasing volume of big social data so that we can better understand and improve the user experience in increasingly complex, data-driven digital systems? This link with usability and the user experience of data-driven systems bridges into the wider field of HCI, attracting interdisciplinary researchers as we see the demand for consumer technologies, software and systems, as well as the integration of social networks into our everyday lives. The fact that the data posted on social networks tends to be largely textual provides a further link to linguistics, psychology and psycholinguistics to better understand the relationship between human behaviours offline and online. In this thesis, we present a novel conceptual framework based on a complex digital system using collected longitudinal datasets to predict system status based on the personality traits and emotions extracted from text posted by users. The system framework was built using a dataset collected from an online scholarship system in which 2000 students had their digital behaviour and social network behaviour collected for this study. We contextualise this research project with a wider review and critical analysis of the current psycholinguistics, artificial intelligence and human-computer interaction literature, which reveals a gap in mapping and understanding digital profiling against system status.


2604.19309 2026-04-22 cs.HC cs.AI

Co-Refine: AI-Powered Tool Supporting Qualitative Analysis

Athikash Jeyaganthan, Kai Xu, Franziska Becker, Steffen Koch

Comments 7 pages, 4 figures. Includes details on system architecture, a three-stage audit pipeline, and a formative user study

2604.19281 2026-04-22 cs.HC cs.AI cs.CL cs.LG

Beyond Semantic Similarity: A Component-Wise Evaluation Framework for Medical Question Answering Systems with Health Equity Implications

Abu Noman Md Sakib, Md. Main Oddin Chisty, Zijie Zhang

Comments Accepted in the Ninth Annual ACM Conference on Fairness, Accountability, and Transparency (ACM FAccT) 2026

Details

English abstract

The use of Large Language Models (LLMs) to support patients in addressing medical questions is becoming increasingly prevalent. However, most of the measures currently used to evaluate the performance of these models in this context only measure how closely a model's answers match semantically, and therefore do not provide a true indication of the model's medical accuracy or of the health equity risks associated with it. To address these shortcomings, we present a new evaluation framework for medical question answering called VB-Score (Verification-Based Score) that provides a separate evaluation of the four components of entity recognition, semantic similarity, factual consistency, and structured information completeness for medical question-answering models. We perform rigorous reviews of the performance of three well-known and widely used LLMs on 48 public health-related topics taken from high-quality, authoritative information sources. Based on our analyses, we discover a major discrepancy between the models' semantic and entity accuracy. Our assessments of the performance of all three models show that each of them has almost uniformly severe performance failures when evaluated against our criteria. Our findings indicate alarming performance disparities across various public health topics, with most of the models exhibiting 13.8% lower performance (compared to an overall average) for all the public health topics that relate to chronic conditions that occur in older and minority populations, which indicates the existence of what's known as condition-based algorithmic discrimination. Our findings also demonstrate that prompt engineering alone does not compensate for basic architectural limitations on how these models perform in extracting medical entities and raise the question of whether semantic evaluation alone is a sufficient measure of medical AI safety.


2604.19251 2026-04-22 cs.LO cs.AI

Streamliners for Answer Set Programming

Florentina Voboril, Martin Gebser, Stefan Szeider, Alice Tarzariol

Comments To appear in Technical Communications of the 42nd International Conference on Logic Programming (ICLP 2026)

2604.19204 2026-04-22 cs.CY cs.LG

Auditing LLMs for Algorithmic Fairness in Casenote-Augmented Tabular Prediction

Xiao Qi Lee, Ezinne Nwankwo, Angela Zhou

2604.19202 2026-04-22 cs.GR cs.CV

SketchFaceGS: Real-Time Sketch-Driven Face Editing and Generation with Gaussian Splatting

Bo Li, Jiahao Kang, Yubo Ma, Feng-Lin Liu, Bin Liu, Fang-Lue Zhang, Lin Gao

2604.19176 2026-04-22 eess.IV cs.LG math.OC

Deep Image Prior for photoacoustic tomography can mitigate limited-view artifacts

Hanna Pulkkinen, Jenni Poimala, Leonid Kunyansky, Janek Gröhl, Andreas Hauptmann

2604.19165 2026-04-22 stat.ML cs.LG cs.NA math.NA

Analytical Extraction of Conditional Sobol' Indices via Basis Decomposition of Polynomial Chaos Expansions

Shijie Zhong, Jiangfeng Fu

Comments 11 pages, 2 figures

2604.19118 2026-04-22 cs.CR cs.AI

DP-FlogTinyLLM: Differentially private federated log anomaly detection using Tiny LLMs

Isaiah Thompson, Tanmay Sen, Ritwik Bhattacharya

2604.19113 2026-04-22 cs.IR cs.AI

Think Before Writing: Feature-Level Multi-Objective Optimization for Generative Citation Visibility

Zikang Liu, Peilan Xu

Comments 14 pages, 5 figures

2604.19106 2026-04-22 cs.AR cs.AI cs.LG

Design Rules for Extreme-Edge Scientific Computing on AI Engines

Zhenghua Ma, G Abarajithan, Dimitrios Danopoulos, Olivia Weng, Francesco Restuccia, Ryan Kastner

2604.19099 2026-04-22 cs.HC cs.AI

Relational AI in Education: Reciprocity, Participatory Design, and Indigenous Worldviews

Roberto Martinez-Maldonado, Vanessa Echeverria, Jenna Hawes, YJ Kim, Zara Maddigan, Mikaela Milesi, Todd Nelson, Yi-Shan Tsai

Comments Accepted

2604.19091 2026-04-22 stat.ML cs.LG

Fast estimation of Gaussian mixture components via centering and singular value thresholding

Huan Qing

Comments 28 pages, 7 figures, 1 table

2604.19083 2026-04-22 cs.CR cs.AI

ProjLens: Unveiling the Role of Projectors in Multimodal Model Safety

Kun Wang, Cheng Qian, Miao Yu, Lilan Peng, Liang Lin, Jiaming Zhang, Tianyu Zhang, Yu Cheng, Yang Wang

Comments 18 pages, 15 figures

2604.19079 2026-04-22 eess.AS cs.AI cs.CL cs.HC

Reducing the Offline-Streaming Gap for Unified ASR Transducer with Consistency Regularization

Andrei Andrusenko, Vladimir Bataev, Lilit Grigoryan, Nune Tadevosyan, Vitaly Lavrukhin, Boris Ginsburg

2604.19049 2026-04-22 cs.CR cs.AI cs.SE

Refute-or-Promote: An Adversarial Stage-Gated Multi-Agent Review Methodology for High-Precision LLM-Assisted Defect Discovery

Abhinav Agarwal

Comments 10 pages, 3 tables. Artifacts: https://github.com/abhinavagarwal07/refute-or-promote (Zenodo DOI: 10.5281/zenodo.19668799)

2604.18973 2026-04-22 stat.AP cs.LG

Ground-Level Near Real-Time Modeling for PM2.5 Pollution Prediction

Zachary R. Fox, Janet O. Agbaje, Dakotah Maguire, Javier E. Santos, Jeremy Logan, Maggie Davis, Rima Habre, Jim VanDerslice, Heidi A. Hanson

Details

English abstract

Air pollution is a worldwide public health threat that can cause or exacerbate many illnesses, including respiratory disease, cardiovascular disease, and some cancers. However, epidemiological studies and public health decision-making are stymied by the inability to assess pollution exposure impacts in near real time. To address this, developing accurate digital twins of environmental pollutants will enable timely data-driven analytics - a crucial step in modernizing health policy and decision-making. Although other models predict and analyze fine particulate matter exposure, they often rely on modeled input data sources and data streams that are not regularly updated. Another challenge stems from current models relying on predefined grids. In contrast, our deep-learning approach interpolates surface-level PM2.5 concentrations between sparsely distributed US EPA monitoring stations in a grid-free manner. By incorporating additional, readily available datasets - including topographic, meteorological, and land-use data - we improve its ability to predict pollutant concentrations with high spatial and temporal resolution. This enables model querying at any spatial location for rapid predictions without computing over the entire grid. To ensure robustness, we randomize spatial sampling during training to enable our model to perform well in both densely and sparsely monitored regions. This model is well suited for near real-time deployment because its lightweight architecture allows for fast updates in response to streaming data. Moreover, model flexibility and scalability allow it to be adapted to various geographical contexts and scales, making it a practical tool for delivering accurate and timely air quality assessments. Its capacity to rapidly evaluate multiple scenarios can be especially valuable for decision-making during public health crises.


2604.18918 2026-04-22 cs.SE cs.LG

From Particles to Perils: SVGD-Based Hazardous Scenario Generation for Autonomous Driving Systems Testing

Linfeng Liang, Xiao Cheng, Tsong Yueh Chen, Xi Zheng

2604.18893 2026-04-22 cs.CY cs.AI cs.ET

Regulating Artificial Intimacy: From Locks and Blocks to Relational Accountability

Henry Fraser, Jessica M. Szczuka, Raffaele F. Ciriello

Details

DOI: 10.1145/3805689.3806790
Journal ref: 2026 ACM Conference on Fairness, Accountability, and Transparency (FAccT26), Montreal, Canada

English abstract

A series of high-profile tragedies involving companion chatbots has triggered an unusually rapid regulatory response. Several jurisdictions, including Australia, California, and New York, have introduced enforceable regulation, while regulators elsewhere have signaled growing concern about risks posed by companion chatbots, particularly to children. In parallel, leading providers, notably OpenAI, appear to have strengthened their self-regulatory approaches. Drawing on legal textual analysis and insights from regulatory theory, psychology, and information systems research, this paper critically examines these recent interventions. We examine what is regulated and who is regulated, identifying regulatory targets, scope, and modalities. We classify interventions by method and priority, showing how emerging regimes combine "locks and blocks", such as access gating and content moderation, with measures addressing toxic relationship features and process-based accountability requirements. We argue that effective regulation of companion chatbots must integrate all three dimensions. More, however, is required. Current regimes tend to focus on discrete harms, narrow conceptions of vulnerability, or highly specified accountability processes, while failing to confront deeper power asymmetries between providers and users. Providers of companion chatbots increasingly control artificial intimacy at scale, creating unprecedented opportunities for control through intimacy. We suggest that a general, open-ended duty of care would be an important first step toward constraining that power and addressing a fundamental source of chatbot risk. The paper contributes to debates on companion chatbot regulation and is relevant to regulators, platform providers, and scholars concerned with digital intimacy, law and technology, and fairness, accountability, and transparency in sociotechnical systems.


2604.18883 2026-04-22 cs.HC cs.AI cs.SE

Choose Your Own Adventure: Non-Linear AI-Assisted Programming with EvoGraph

Vassilios Exarhakos, Jinghui Cheng, Jin L. C. Guo

2604.18862 2026-04-22 cs.SE cs.AI

Human-Machine Co-Boosted Bug Report Identification with Mutualistic Neural Active Learning

Guoming Long, Shihai Wang, Hui Fang, Tao Chen

Comments Accepted by TOSEM

Details

English abstract

Bug reports, encompassing a wide range of bug types, are crucial for maintaining software quality. However, the increasing complexity and volume of bug reports pose a significant challenge for purely manual identification and assignment to the appropriate teams for resolution, as dealing with all the reports is time-consuming and resource-intensive. In this paper, we introduce a cross-project framework, dubbed Mutualistic Neural Active Learning (MNAL), designed for automated and more effective identification of bug reports from GitHub repositories boosted by human-machine collaboration. MNAL utilizes a neural language model that learns and generalizes reports across different projects, coupled with active learning to form neural active learning. A distinctive feature of MNAL is the purposely crafted mutualistic relation between the machine learners (neural language model) and human labelers (developers) when enriching the knowledge learned. That is, the most informative human-labeled reports and their corresponding pseudo-labeled ones are used to update the model while those reports that need to be labeled by developers are more readable and identifiable, thereby enhancing the human-machine teaming therein. We evaluate MNAL using a large-scale dataset against the SOTA approaches, baselines, and different variants. The results indicate that MNAL achieves up to 95.8% and 196.0% effort reduction in terms of readability and identifiability during human labeling, respectively, while resulting in a better performance in bug report identification. Additionally, our MNAL is model-agnostic since it is capable of improving the model performance with various underlying neural language models. To further verify the efficacy of our approach, we conducted a qualitative case study involving 10 human participants, who rated MNAL as being more effective while saving more time and monetary resources.


2604.18860 2026-04-22 cs.CR cs.AI

Temporal UI State Inconsistency in Desktop GUI Agents: Formalizing and Defending Against TOCTOU Attacks on Computer-Use Agents

Wenpeng Xu

2604.18850 2026-04-22 cs.HC cs.AI cs.SI

The Triadic Loop: A Framework for Negotiating Alignment in AI Co-hosted Livestreaming

Katherine Wang, Nadia Berthouze, Aneesha Singh

Comments 6 pages, 1 figure, Proceedings of the Human-AI Interaction Alignment Workshop at CHI 2026 (CHI26 BiAlign Workshop)

2604.18846 2026-04-22 quant-ph cs.LG

Trainability Beyond Linearity in Variational Quantum Objectives

Gordon Ma, Xiufan Li

Comments 28 pages, 6 figures

Details

English abstract

Barren-plateau results have established exponential gradient suppression as a widely cited obstacle to the scalability of variational quantum algorithms. When and whether these results extend to a given objective has been addressed through loss-specific arguments, but a general structural characterization has remained open. We show that the objective itself admits a fixed-observable representation if and only if the loss is affine in the measured statistics, thereby identifying the exact boundary of the standard concentration-based proof template. Existing transfer results for non-affine losses achieve this reduction under additional assumptions; our characterization implies that such a reduction is not structurally available for a class of non-affine objectives, placing them outside the automatic reach of the existing proof template. Beyond the affine regime, a chain-rule decomposition reveals three governing factors -- model responsivity, loss-side signal, and transmittance -- and induces a loss-class dichotomy: bounded-gradient losses inherit suppression, while amplification-capable losses can in principle counteract it. In the exponentially wide setting, both classes fail, but for different structural reasons. When the interface is instead designed at polynomial width -- exposing coarse-grained statistics rather than individual bitstring probabilities -- the exponential-dimensional obstruction is relaxed and the dichotomy plays a genuine role. In a numerical demonstration on a charge-conserving quantum system, the amplification-capable objective produces resolved gradients several orders of magnitude larger than affine and inheriting baselines at comparable shot budgets. Over the tested interval, its scaling trend is statistically distinguished from the exponential trend of both alternatives. The boundary is affine; what lies beyond it is a representation-design problem.


2604.18837 2026-04-22 quant-ph cs.LG

Benchmarking Quantum Kernel Support Vector Machines Against Classical Baselines on Tabular Data: A Rigorous Empirical Study with Hardware Validation

Siavash Kakavand, Christoph Strohmeyer, Michael Schlotter

Comments Code and data: https://doi.org/10.5281/zenodo.19197916

2604.18727 2026-04-22 physics.ao-ph cs.AI nlin.CD

Skillful Global Ocean Emulation and the Role of Correlation-Aware Loss

Niraj Agarwal, Timothy A. Smith, Sergey Frolov, Laura C. Slivinski

Comments 13 pages, 4 figures

2604.18721 2026-04-22 eess.IV cs.CV

A Controlled Benchmark of Visual State-Space Backbones with Domain-Shift and Boundary Analysis for Remote-Sensing Segmentation

Nichula Wasalathilaka, Dineth Perera, Oshadha Samarakoon, Buddhi Wijenayake, Roshan Godaliyadda, Vijitha Herath, Parakrama Ekanayake

Comments 5 pages, 3 figures, Accepted for publication at IEEE IGARSS 2026

2604.18718 2026-04-22 cs.CR cs.AI

Towards Optimal Agentic Architectures for Offensive Security Tasks

Isaac David, Arthur Gervais

Comments 18 pages, 4 figures, supplementary appendix and benchmark artifacts