arXivDaily: arXiv daily academic digest, updated Monday through Friday
2603.20262 2026-03-24 q-bio.BM cs.AI cs.LG

Deciphering Scientific Reasoning Steps from Outcome Data for Molecule Optimization

Zequn Liu, Kehan Wu, Shufang Xie, Zekun Guo, Wei Zhang, Tao Qin, Renhe Liu, Yingce Xia

Comments Work in progress, 37 pages


Emerging reasoning models hold promise for automating scientific discovery. However, their training is hindered by a critical supervision gap: experimental outcomes are abundant, whereas intermediate reasoning steps are rarely documented at scale. To bridge this gap, we propose DESRO, a framework for deciphering scientific reasoning from outcomes. By analyzing shared patterns and key differences within grouped data, a large language model (LLM) can recover the underlying logic. We instantiate this framework in molecule optimization, a pivotal stage in drug discovery that traditionally relies on the iterative reasoning of medicinal chemists. Across 2.3 million molecular property records, our framework infers optimization rationales by grouping molecules with shared fragments, then using an LLM to analyze how structural variations correlate with property differences. Based on the derived data, we train a model that conducts molecule optimization through an interpretable reasoning process. DESRO achieves the highest success rates on 15 out of 18 tasks, spanning both single- and multi-property optimization of bioactivity and ADMET properties. The reasoning process enables robust generalization to out-of-distribution scenarios, including novel property combinations, unseen biological targets, and unseen properties defined solely by natural language descriptions. In retrospective case studies under strict temporal splits, the model autonomously reconstructs expert-level lead optimization trajectories. Additionally, our framework extends beyond molecule optimization to reaction ligand selection. Our results establish deciphering reasoning steps from outcome data as a viable paradigm for enabling scientific reasoning, providing a scalable approach to accelerate scientific discovery.

2603.20258 2026-03-24 eess.SP cs.AI cs.LG

The Deep-Match Framework for Event-Related Potential Detection in EEG

Marek Zylinski, Bartosz Tomasz Smigielski, Gerard Cybulski


Reliable detection of event-related potentials (ERPs) at the single-trial level remains a major challenge due to the low signal-to-noise ratio of EEG recordings. In this work, we investigate whether incorporating prior knowledge about ERP templates into deep learning models can improve detection performance. We employ the Deep-Match framework for ERP detection using multi-channel EEG signals. The model is trained in two stages. First, an encoder-decoder architecture is trained to reconstruct input EEG signals, enabling the network to learn compact signal representations. In the second stage, the decoder is replaced with a detection module, and the network is fine-tuned for ERP identification. Two model variants are evaluated: a standard model with randomly initialized filters and a Deep-MF model in which input kernels are initialized using ERP templates. Model performance is assessed on a single-trial ERP detection task using leave-one-subject-out validation. The proposed Deep-MF model slightly outperforms the detector with standard kernel initialization for most held-out subjects. Despite substantial inter-subject variability, Deep-MF achieves a higher average F1-score (0.37) compared to the standard network (0.34), indicating improved robustness to cross-subject differences. The best performance obtained by Deep-MF reaches an F1-score of 0.71, exceeding the maximum score achieved by the standard model (0.59). These results demonstrate that ERP-informed kernel initialization can provide consistent improvements in subject-independent single-trial ERP detection. Overall, the findings highlight the potential of integrating domain knowledge with deep learning architectures for EEG analysis. The proposed approach represents a step toward practical wearable EEG and passive brain-computer interface systems capable of real-time monitoring of cognitive processes.
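
As a rough illustration of what a template-initialized kernel computes before any fine-tuning, the sketch below cross-correlates a single-channel signal with a template (a classical matched filter). The template values, the toy signal, and the function name `matched_filter` are hypothetical and are not taken from the paper.

```python
def matched_filter(signal, template):
    """Slide the template over the signal and return the correlation at each offset."""
    n, m = len(signal), len(template)
    return [
        sum(signal[i + k] * template[k] for k in range(m))
        for i in range(n - m + 1)
    ]

# Hypothetical ERP-like template in arbitrary units (an assumption for illustration).
template = [0.0, 1.0, 2.0, 1.0, 0.0]

# Toy single-channel recording: flat baseline with the template embedded at offset 4.
signal = [0.0] * 4 + template + [0.0] * 4

scores = matched_filter(signal, template)

# The offset with the largest correlation is the most likely ERP onset.
best_offset = max(range(len(scores)), key=scores.__getitem__)
```

In a convolutional network the same operation is the first-layer convolution, so starting from template-shaped kernels gives the detector a matched-filter response at initialization, which training can then adapt.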

2603.20254 2026-03-24 cs.CY cs.AI stat.OT

AI Detectors Fail Diverse Student Populations: A Mathematical Framing of Structural Detection Limits

Nathan Garland


Student experiences and empirical studies report that "black box" AI text detectors produce high false positive rates with disproportionate errors against certain student populations, yet theoretical analyses typically model detection as a test between two known distributions for human and AI prose. This framing omits the structural feature of university assessment whereby an assessor generally does not know the individual student's writing distribution, making the null hypothesis composite. Applying the standard variational characterisation of total variation distance to this composite null yields trade-off bounds showing that any text-only, one-shot detector with useful power must produce false accusations at a rate governed by the distributional overlap between student writing and AI output. This is a constraint arising from population diversity that is logically independent of AI model quality and cannot be overcome by better detector engineering or technology. A subgroup mixture bound connects these quantities to observable demographic groups, providing a theoretical basis for the disparate impact patterns documented empirically. We propose suggestions to improve policy and practice, and argue that detection scores should not serve as sole evidence in misconduct proceedings.
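
The trade-off described above can be sketched numerically. For any rejection region A, the variational characterisation gives Q(A) - P(A) <= TV(P, Q), so a detector with power Q(A) on AI text has a false positive rate P(A) of at least Q(A) - TV(P, Q) on a given student. The distributions, student names, and target power below are hypothetical, and the paper's exact bounds may take a different form.

```python
def total_variation(p, q):
    """Total variation distance between two discrete distributions on the same support."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))

def min_false_positive_rate(power, tv):
    """Lower bound on the false positive rate of any test with the given power.

    Follows from Q(A) - P(A) <= TV(P, Q) for every rejection region A.
    """
    return max(0.0, power - tv)

# Hypothetical distributions over three coarse "style" bins (illustrative only).
ai_output = [0.7, 0.2, 0.1]
students = {
    "student_a": [0.1, 0.3, 0.6],  # writes very differently from the AI
    "student_b": [0.6, 0.3, 0.1],  # writes similarly to the AI
}

power = 0.8  # assumed detection rate on AI text

bounds = {
    name: min_false_positive_rate(power, total_variation(dist, ai_output))
    for name, dist in students.items()
}

# With a composite null, the assessor must account for the worst-affected student.
worst_case = max(bounds.values())
```

The student whose writing overlaps most with AI output carries the largest unavoidable false accusation rate, regardless of how the detector is built.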

2603.20250 2026-03-24 physics.ao-ph cs.AI cs.LG physics.data-an

Developing Machine Learning-Based Watch-to-Warning Severe Weather Guidance from the Warn-on-Forecast System

Montgomery Flora, Samuel Varga, Corey Potvin, Noah Lang

Comments 28 pages, 7 figures


While machine learning (ML) post-processing of convection-allowing model (CAM) output for severe weather hazards (large hail, damaging winds, and/or tornadoes) has shown promise for very short lead times (0-3 hours), its application to slightly longer forecast windows remains relatively underexplored. In this study, we develop and evaluate a grid-based ML framework to predict the probability of severe weather hazards over the next 2-6 hours using forecast output from the Warn-on-Forecast System (WoFS). Our dataset includes WoFS ensemble forecasts valid every 5 minutes out to 6 hours from 108 days during the 2019-2023 NOAA Hazardous Weather Testbed Spring Forecasting Experiments. We train ML models to generate probabilistic forecasts of severe weather akin to Storm Prediction Center outlooks (i.e., likelihood of a tornado, severe wind, or severe hail event within 36 km of each point). We compare a histogram gradient-boosted tree (HGBT) model and a deep learning U-Net approach against a carefully calibrated baseline generated from 2-5 km updraft helicity. Results indicate that the HGBT and U-Net outperform the baseline, particularly at higher probability thresholds. The HGBT achieves the best performance metrics, but predicted probabilities cap at 60% while the U-Net forecasts extend to 100%. Similar to previous studies, the U-Net produces spatially smoother guidance than the tree-based method. These findings add to the growing evidence of the effectiveness of ML-based CAM post-processing for providing short-term severe weather guidance.

2603.20248 2026-03-24 cs.CY cs.AI cs.HC cs.MA

Stability of AI Governance Systems: A Coupled Dynamics Model of Public Trust and Social Disruptions

Jiaqi Lai, Hou Liang, Weihong Huang

Comments 15 pages, 8 figures. Equal contribution by Jiaqi Lai and Hou Liang


As artificial intelligence (AI) is increasingly deployed in high-stakes public decision-making (from resource allocation to welfare distribution), public trust in these systems has become a critical determinant of their legitimacy and sustainability. Yet existing AI governance research remains largely qualitative, lacking formal mathematical frameworks to characterize the precise conditions under which public trust collapses. This paper addresses that gap by proposing a rigorous coupled dynamics model that integrates a discrete-time Hawkes process -- capturing the self-exciting generation of AI controversy events such as perceived algorithmic unfairness or accountability failures -- with a Friedkin-Johnsen opinion dynamics model that governs the evolution of institutional trust across social networks. A key innovation is the bidirectional feedback mechanism: declining trust amplifies the intensity of subsequent controversy events, which in turn further erode trust, forming a self-reinforcing collapse loop. We derive closed-form equilibrium solutions and perform formal stability analysis, establishing the critical spectral condition $\rho(J_{2nt}) < 1$ that delineates the boundary between trust resilience and systemic collapse. Numerical experiments further reveal how echo chamber network structures and media amplification accelerate governance failure. Our core contribution to the AI governance field is a baseline collapse model: a formal stability analysis framework demonstrating that, absent strong institutional intervention, even minor algorithmic biases can propagate through social networks to trigger irreversible trust breakdown in AI governance systems.
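
To make the opinion-dynamics half of the model concrete, here is a minimal sketch of the standard Friedkin-Johnsen update, x(t+1) = Λ W x(t) + (I - Λ) x(0), together with a simple sufficient check for convergence. It does not include the Hawkes coupling or the paper's Jacobian; the network weights, susceptibilities, and initial trust values are assumptions made for illustration. The check uses the fact that the maximum absolute row sum of a matrix is an upper bound on its spectral radius.

```python
def fj_step(x, x0, W, lam):
    """One Friedkin-Johnsen update:
    x_i <- lam_i * sum_j W[i][j] * x_j + (1 - lam_i) * x0_i
    """
    n = len(x)
    return [
        lam[i] * sum(W[i][j] * x[j] for j in range(n)) + (1 - lam[i]) * x0[i]
        for i in range(n)
    ]

def inf_norm_of_lam_W(W, lam):
    """Max absolute row sum of diag(lam) @ W, an upper bound on its spectral radius."""
    return max(lam[i] * sum(abs(w) for w in W[i]) for i in range(len(W)))

# Hypothetical 3-agent network (row-stochastic weights) and susceptibilities.
W = [[0.5, 0.5, 0.0],
     [0.25, 0.5, 0.25],
     [0.0, 0.5, 0.5]]
lam = [0.8, 0.5, 0.9]
x0 = [1.0, 0.5, 0.0]  # initial trust levels in [0, 1]

# A bound below 1 guarantees the iteration is a contraction and has a unique fixed point.
bound = inf_norm_of_lam_W(W, lam)

x = list(x0)
for _ in range(500):
    x = fj_step(x, x0, W, lam)

# Distance between x and its next iterate; near zero once the equilibrium is reached.
residual = max(abs(a - b) for a, b in zip(fj_step(x, x0, W, lam), x))
```

In the coupled model the relevant matrix is the Jacobian of the joint trust and controversy dynamics rather than Λ W alone, but the stability logic is analogous: a spectral radius below 1 means perturbations decay instead of compounding.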

2603.20235 2026-03-24 cs.CY cs.AI cs.HC

Writing literature reviews with AI: principles, hurdles and some lessons learned

Saadi Lahlou, Annabelle Gouttebroze, Atrina Oraee, Julian Madera

Comments 31 pages and 193 pages of Appendices, including 6 different versions of the literature review, and complete chat with the LLM


We qualitatively compared literature reviews produced with varying degrees of AI assistance. The same LLM, given the same corpus of 280 papers but different selections, produced dramatically different reviews, from mainstream and politically neutral to critical and post-colonial, though neither orientation was intended. LLM outputs always appear at first glance to be well written, well informed and thought out, but closer reading reveals gaps, biases and lack of depth. Our comparison of six versions shows a series of pitfalls and suggests precautions necessary when using AI assistance to write a literature review. The main issues are: (1) The bias of ignorance (you do not know what you do not get) in the selection of relevant papers. (2) Alignment and digital sycophancy: commercial AI models slavishly take you further in the direction they understand you give them, reinforcing biases. (3) Mainstreaming: because of their statistical nature, LLM productions tend to favor mainstream perspectives and content; in our case there was only 20% overlap between paper selections by humans and the LLM. (4) Limited capacity for creative restructuring, with vague and ambiguous statements. (5) Lack of critical perspective, coming from distant reading and political correctness. Most pitfalls can be addressed by prompting, but only if the user knows the domain well enough to detect them. There is a paradox: producing a good AI-assisted review requires expertise that comes from reading the literature, which is precisely what AI was meant to reduce. Overall, AI can improve the span and quality of the review, but the time saved is not as large as one would expect, and a press-button strategy leaving AI to do the work is a recipe for disaster. We conclude with recommendations for those who write, or assess, such LLM-augmented reviews.

2603.20229 2026-03-24 cs.CY cs.AI

Characterizing the ability of LLMs to recapitulate Americans' distributional responses to public opinion polling questions across political issues

Eric Gong, Nathan E. Sanders, Bruce Schneier


Traditional survey-based political issue polling is becoming less tractable due to increasing costs and risk of bias associated with growing non-response rates and declining coverage of key demographic groups. With researchers and pollsters seeking alternatives, Large Language Models have drawn attention for their potential to augment human population studies in polling contexts. We propose and implement a new framework for anticipating human responses on multiple-choice political issue polling questions by directly prompting an LLM to predict a distribution of responses. By comparison to a large and high quality issue poll of the US population, the Cooperative Election Study, we evaluate how the accuracy of this framework varies across a range of demographics and questions on a variety of topics, as well as how this framework compares to previously proposed frameworks where LLMs are repeatedly queried to simulate individual respondents. We find the proposed framework consistently exhibits more accurate predictions than individual querying at significantly lower cost. In addition, we find the performance of the proposed framework varies much more systematically and predictably across demographics and questions, making it possible for those performing AI polling to better anticipate model performance using only information available before a query is issued.

2603.20225 2026-03-24 cs.CY cs.AI

The Arrival of AGI? When Expert Personas Exceed Expert Benchmarks

Drake Mullens, Stella Shen


Do expert personas improve language model performance? The Wharton Generative AI Lab reports that they do not, broadcasting to millions via social media the recommendation that practitioners abandon a technique recommended by Anthropic, Google, and OpenAI. We demonstrate that this null finding was structurally predictable. Five core mechanisms precluded detection before data collection began: baseline contamination elevating the starting point to near-ceiling, system prompt hierarchy subordinating experimental manipulation, impossible expert specifications collapsing to generic competence, format constraints suppressing reasoning processes, and provider exclusion limiting generalizability. Controlled trials correcting these limitations reveal what the original design obscured. To test this, we selected the hardest GPQA Diamond questions to prevent baseline pattern matching, forcing reliance on genuine expert reasoning. On items with valid key answers, expert personas achieve ceiling accuracy. They eliminated all baseline errors through confidence amplification. Furthermore, forensic examination of model divergence identified that half of the hardest GPQA items contain chemically or logically indefensible answers. The model's CoT revealed reasoning away from impossible answers, yielding penalization for accurate chemistry. These findings recontextualize the original null results. Methodologically sound persona research faces measurement constraints imposed by benchmark validity limitations. Answering the persona question requires evaluation infrastructure the field does not yet possess.

2603.20223 2026-03-24 cs.CY cs.AI cs.LG

Inference Energy and Latency in AI-Mediated Education: A Learning-per-Watt Analysis of Edge and Cloud Models

Kushal Khemani


Immediate feedback is a foundational requirement of effective AI-mediated learning, yet the energy and latency costs of delivering it remain largely unexamined. This study investigates the latency-energy-learning trade-off in AI tutoring through an empirical comparison of two on-device inference configurations of Microsoft Phi-3 Mini (4k-instruct) on an NVIDIA T4 GPU: full-precision FP16 and 4-bit NormalFloat (NF4) quantisation. Both were evaluated under KV-cache-enabled inference across 500 educational prompts spanning five secondary school subject domains. Pedagogical quality was assessed for each of the 1000 generated responses by a hybrid panel of 10 Cambridge International teachers and three frontier AI systems using a four-dimension rubric. We introduce Learning-per-Watt (LpW), a novel metric quantifying pedagogical value per unit of energy over the learner's waiting window. Under realistic deployment, NF4 achieves lower per-inference energy than FP16 (329 J vs. 369 J) but higher latency (13.4 s vs. 9.2 s), yielding a modest FP16 advantage in LpW of 1.33x at a quality difference of 0.19 points. Under cache-disabled inference -- used in offline evaluation but absent from real deployments -- the gap widens to 7.4x, overstating the FP16 advantage by more than fivefold. Quantisation efficiency is hardware-dependent and inference-regime dependent, with significant implications for equitable AI tutoring deployment in low-resource settings.
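
The figures quoted above can be cross-checked with simple arithmetic. The abstract does not state the formula for Learning-per-Watt, so this sketch does not attempt to compute LpW itself; it only derives average power, the relative energy saving, the latency penalty, and the ratio between the two reported LpW gaps from the published numbers. Variable names are illustrative.

```python
# Figures quoted in the abstract (KV-cache-enabled inference).
energy_j = {"FP16": 369.0, "NF4": 329.0}   # joules per inference
latency_s = {"FP16": 9.2, "NF4": 13.4}     # seconds per inference

# Average power drawn during one inference, in watts (energy / time).
avg_power_w = {k: energy_j[k] / latency_s[k] for k in energy_j}

# Relative energy saving and latency penalty of NF4 versus FP16.
nf4_energy_saving = 1 - energy_j["NF4"] / energy_j["FP16"]
nf4_latency_ratio = latency_s["NF4"] / latency_s["FP16"]

# Reported FP16 advantage in LpW with and without the KV cache.
lpw_gap_cached, lpw_gap_uncached = 1.33, 7.4

# How much the cache-disabled setting inflates the apparent FP16 advantage.
overstatement = lpw_gap_uncached / lpw_gap_cached
```

NF4 saves roughly a tenth of the energy per response but keeps the learner waiting about 1.5 times as long, and the ratio of the two reported gaps is consistent with the "more than fivefold" overstatement.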

2603.20214 2026-03-24 cs.CY cs.AI cs.HC

Beyond Detection: Governing GenAI in Academic Peer Review as a Sociotechnical Challenge

Tatiana Chakravorti, Pranav Narayanan Venkit, Sourojit Ghosh, Sarah Rajtmajer


Generative AI tools are increasingly entering academic peer review workflows, raising questions about fairness, accountability, and the legitimacy of evaluative judgment. While these systems promise efficiency gains amid growing reviewer overload, their use introduces new sociotechnical risks. This paper presents a convergent mixed-method study combining discourse analysis of 448 social media posts with interviews with 14 area chairs and program chairs from leading AI and HCI conferences to examine how GenAI is discussed and experienced in peer review. Across both datasets, we find broad agreement that GenAI may be acceptable for limited supportive tasks, such as improving clarity or structuring feedback, but that core evaluative judgments, assessing novelty, contribution, and acceptance, should remain human responsibilities. At the same time, participants highlight concerns about epistemic harm, over-standardization, unclear responsibility, and adversarial risks such as prompt injection. User interviews reveal how structural strain and institutional policy ambiguity shift interpretive and enforcement burdens onto individual scholars, disproportionately affecting junior authors and reviewers. By triangulating public governance discourse with lived review practices, this work reframes AI-mediated peer review as a sociotechnical governance challenge and offers recommendations for preserving accountability, trust, and meaningful human oversight. Overall, we argue that AI-assisted peer review is best governed not by blanket bans or detection alone, but by explicitly reserving evaluative judgment for humans while instituting enforceable, role-specific controls that preserve accountability. We conclude with role-specific recommendations that formalize the support-judgment boundary.

2603.20211 2026-03-24 cs.CY cs.AI

Exploring Teacher-Chatbot Interaction and Affect in Block-Based Programming

Bahare Riahi, Ally Limke, Xiaoyi Tian, Viktoriia Storozhevykh, Sayali Patukale, Tahreem Yasir, Khushbu Singh, Jennifer Chiu, Nicholas Lytle, Tiffany Barnes, Veronica Catete

Comments 19 pages, 9 figures, CHI26


AI-based chatbots have the potential to accelerate learning and teaching, but may also have counterproductive consequences without thoughtful design and scaffolding. To better understand teachers' perspectives on large language model (LLM)-based chatbots, we conducted a study with 11 teams of middle school teachers using chatbots for a science and computational thinking activity within a block-based programming environment. Based on a qualitative analysis of audio transcripts and chatbot interactions, we propose three profiles: explorer, frustrated, and mixed, that reflect diverse scaffolding needs. In their discussions, we found that teachers perceived chatbot benefits such as building prompting skills and self-confidence alongside risks including potential declines in learning and critical thinking. Key design recommendations include scaffolding the introduction to chatbots, facilitating teacher control of chatbot features, and suggesting when and how chatbots should be used. Our contribution informs the design of chatbots to support teachers and learners in middle school coding activities.

2603.20204 2026-03-24 cs.CY cs.AI

Measuring Research Convergence in Interdisciplinary Teams Using Large Language Models and Graph Analytics

Wenwen Li, Yuanyuan Tian, Sizhe Wang, Amber Wutich, Paul Westerhoff, Sarah Porter, Anais Roque, Jobayer Hossain, Patrick Thomson, Rhett Larson, Michael Hanemann


Understanding how interdisciplinary research teams converge on shared knowledge is a persistent challenge. This paper presents a novel, multi-layer, AI-driven analytical framework for mapping research convergence in interdisciplinary teams. The framework integrates large language models (LLMs), graph-based visualization and analytics, and human-in-the-loop evaluation to examine how research viewpoints are shared, influenced, and integrated over time. LLMs are used to extract structured viewpoints aligned with the Needs-Approach-Benefits-Competition (NABC) framework and to infer potential viewpoint flows across presenters, forming a common semantic foundation for three complementary analyses: (1) similarity-based qualitative analysis to identify two key types of viewpoints, popular and unique, for building convergence, (2) quantitative cross-domain influence analysis using network centrality measures, and (3) temporal viewpoint flow analysis to capture convergence dynamics. To address uncertainty in LLM-based inference, the framework incorporates expert validation through structured surveys and cross-layer consistency checks. A case study on water insecurity in underserved communities as part of the Arizona Water Innovation Initiatives demonstrates increasing viewpoint convergence and domain-specific influence patterns, illustrating the value of the proposed AI-enabled approach for research convergence analysis.

2603.20201 2026-03-24 cs.MM cs.CV cs.CY

FIGURA: A Modular Prompt Engineering Method for Artistic Figure Photography in Safety-Filtered Text-to-Image Models

Luca Cazzaniga

Comments 10 pages, 6 tables. Preprint


Safety filters in commercial text-to-image (T2I) models systematically block legitimate artistic content involving the human figure, treating classical nude photography with the same restrictiveness as explicit material. While prior research has documented this problem extensively, no operational system exists that enables professional artists to generate artistic figure photography within the constraints of active safety filters. We present the FIGURA Method (Framework for Intelligent Generation of Unrestricted Artistic Results), a modular prompt engineering system comprising eight interconnected knowledge files, empirically validated through 200+ documented generation tests on FLUX 2 Pro (Cloud) with active safety filters at the default tolerance level. Our systematic testing reveals several previously undocumented findings: (1) safety filters primarily detect absence descriptions (references to missing clothing) rather than presence descriptions (references to body form), which we formalize as the Golden Rule; (2) artistic references to painters function simultaneously as aesthetic guides and as safety anchors that alter filter behavior; (3) spatial context operates as an independent filter variable, with documented success rate hierarchies; and (4) geometric vocabulary for body description bypasses pattern recognition in silhouette contexts. The system achieves documented success rates between 80% and 90% across five structured prompt templates, demonstrating that the artistic censorship problem identified in recent literature admits practical, systematic solutions that work with active safety mechanisms rather than circumventing them.

2603.20198 2026-03-24 cs.CR cs.CV cs.LG

Visual Exclusivity Attacks: Automatic Multimodal Red Teaming via Agentic Planning

Yunbei Zhang, Yingqiang Ge, Weijie Xu, Yuhui Xu, Jihun Hamm, Chandan K. Reddy


Current multimodal red teaming treats images as wrappers for malicious payloads via typography or adversarial noise. These attacks are structurally brittle, as standard defenses neutralize them once the payload is exposed. We introduce Visual Exclusivity (VE), a more resilient Image-as-Basis threat where harm emerges only through reasoning over visual content such as technical schematics. To systematically exploit VE, we propose Multimodal Multi-turn Agentic Planning (MM-Plan), a framework that reframes jailbreaking from turn-by-turn reaction to global plan synthesis. MM-Plan trains an attacker planner to synthesize comprehensive, multi-turn strategies, optimized via Group Relative Policy Optimization (GRPO), enabling self-discovery of effective strategies without human supervision. To rigorously benchmark this reasoning-dependent threat, we introduce VE-Safety, a human-curated dataset filling a critical gap in evaluating high-risk technical visual understanding. MM-Plan achieves 46.3% attack success rate against Claude 4.5 Sonnet and 13.8% against GPT-5, outperforming baselines by 2-5x where existing methods largely fail. These findings reveal that frontier models remain vulnerable to agentic multimodal attacks, exposing a critical gap in current safety alignment. Warning: This paper contains potentially harmful content.

2603.12018 2026-03-24 cs.HC cs.AI cs.ET

An Intent of Collaboration: On Agencies between Designers and Emerging (Intelligent) Technologies

Pei-Ying Lin, Julie Heij, Iris Borst, Britt Joosten, Kristina Andersen, Wijnand IJsselsteijn

Comments Accepted by IASDR Conference 2025, Taipei, Taiwan. 16 pages excluding references, 8 figures

Journal ref Proceedings of IASDR 2025: Design Next


Amidst the emergence of powerful intelligent technologies such as LLMs and text-to-image AIs that promise to enhance creative processes, designers face the challenges of remaining empowered and creative while working with these foreign digital partners. While generative AIs offer versatile, informative, and occasionally poetic outcomes, their lack of embodied knowledge presents an even greater challenge to designers in gaining fruitful outcomes, such as in the field of Digital Craftsmanship. In this project, three designers embarked on a three-month experimental journey with an intention to co-create with Google's LLM as a potential intelligent partner to investigate how it will influence the designers' creativity. We found that a power dynamic of agencies exists between the LLM and the designer, in which the designer can easily lose their creative agency. Regaining the designer's creative agency involves introspection into their own creative process, a structural understanding of the specific emerging technology involved, and deliberate adjustments to the dynamics of the human-technology relationship. We propose paying attention to the designer's inner world and parties of agencies when engaging with emerging intelligent technologies through three aspects: the sensitivity towards a creative process as cognitive activities; the active investigation into specific technology's capability; and the adjustment towards an appropriate working relationship between the designer and the emerging technology.

2602.08123 2026-03-24 cs.HC cs.RO

Adding More Value Than Work: Practical Guidelines for Integrating Robots into Intercultural Competence Learning

Zhennan Yi, Sophia Sakakibara Capello, Randy Gomez, Selma Šabanović


While social robots have demonstrated effectiveness in supporting students' intercultural competence development, it is unclear how they can effectively be adopted for integrated use in K-12 schools. We conducted two phases of design workshops with teachers, where they co-designed robot-mediated intercultural activities while considering student needs and school integration concerns. Using thematic analysis, we identify appropriate scenarios and roles for classroom robots, explore how robots could complement rather than replace teachers, and consider how to address ethical and compliance considerations. Our findings provide practical design guidelines for the HRI community to develop social robots that can effectively support intercultural education in K-12 schools.

2510.25259 2026-03-24 cs.IR cs.AI cs.LG

TV-Rec: Time-Variant Convolutional Filter for Sequential Recommendation

Yehjin Shin, Jeongwhan Choi, Seojin Kim, Noseong Park

Comments The 39th Conference on Neural Information Processing Systems (NeurIPS 2025)


Recently, convolutional filters have been increasingly adopted in sequential recommendation for their ability to capture local sequential patterns. However, most of these models complement convolutional filters with self-attention. This is because convolutional filters alone, generally fixed filters, struggle to capture global interactions necessary for accurate recommendation. We propose Time-Variant Convolutional Filters for Sequential Recommendation (TV-Rec), a model inspired by graph signal processing, where time-variant graph filters capture position-dependent temporal variations in user sequences. By replacing both fixed kernels and self-attention with time-variant filters, TV-Rec achieves higher expressive power and better captures complex interaction patterns in user behavior. This design not only eliminates the need for self-attention but also reduces computation while accelerating inference. Extensive experiments on six public benchmarks show that TV-Rec outperforms state-of-the-art baselines by an average of 7.49%.
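
The contrast between fixed and time-variant filters can be shown with a generic signal-processing sketch. A fixed filter applies the same taps at every position, whereas a time-variant filter uses taps that depend on the position in the sequence. This is a plain causal FIR illustration with hypothetical tap values, not the graph-filter formulation used in TV-Rec.

```python
def fixed_filter(x, taps):
    """Causal FIR filter: the same taps are applied at every position."""
    return [
        sum(taps[k] * x[t - k] for k in range(len(taps)) if t - k >= 0)
        for t in range(len(x))
    ]

def time_variant_filter(x, taps_per_position):
    """Causal filter whose taps depend on the position t in the sequence."""
    return [
        sum(h[k] * x[t - k] for k in range(len(h)) if t - k >= 0)
        for t, h in enumerate(taps_per_position)
    ]

# Toy sequence of item signals (illustrative values).
x = [1.0, 2.0, 3.0, 4.0]

# Fixed kernel: always average the current and previous element.
same = fixed_filter(x, [0.5, 0.5])

# Time-variant kernel: each position mixes current and previous elements differently.
varying = time_variant_filter(
    x, [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0], [0.25, 0.75]]
)
```

Because each position has its own taps, the filter can weight recent and older interactions differently along the sequence, which is the kind of position dependence the abstract attributes to time-variant filters.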

2509.01784 2026-03-24 physics.optics cs.LG

Modeling and benchmarking quantum optical neurons for efficient neural computation

Andrea Andrisani, Gennaro Vessio, Fabrizio Sgobba, Francesco Di Lena, Luigi Amato Santamaria, Giovanna Castellano

Journal ref PLoS One 21(3): e0341545


Quantum optical neurons (QONs) are emerging as promising computational units that leverage photonic interference to perform neural operations in an energy-efficient and physically grounded manner. Building on recent theoretical proposals, we introduce a family of QON architectures based on Hong-Ou-Mandel (HOM) and Mach-Zehnder (MZ) interferometers, incorporating different photon modulation strategies -- phase, amplitude, and intensity. These physical setups yield distinct pre-activation functions, which we implement as fully differentiable software modules. We evaluate these QONs both in isolation and as building blocks of multilayer networks, training them on binary and multiclass image classification tasks using the MNIST and FashionMNIST datasets. Each experiment is repeated over five independent runs and assessed under both ideal and non-ideal conditions to measure accuracy, convergence, and robustness. Across settings, MZ-based neurons exhibit consistently stable behavior -- including under noise -- while HOM amplitude modulation performs competitively in deeper architectures, in several cases approaching classical performance. In contrast, phase- and intensity-modulated HOM-based variants show reduced stability and greater sensitivity to perturbations. These results highlight the potential of QONs as efficient and scalable components for future quantum-inspired neural architectures and hybrid photonic-electronic systems. The code is publicly available at https://github.com/gvessio/quantum-optical-neurons.
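
For readers unfamiliar with the two interferometers, the sketch below encodes their textbook responses: a balanced Mach-Zehnder interferometer routes light to one output port with probability cos²(φ/2) for phase difference φ, and two photons meeting at a 50:50 beam splitter produce a coincidence with probability (1 - |⟨ψ1|ψ2⟩|²)/2. The function `mz_neuron_preactivation` is a hypothetical way of turning the first response into a neuron pre-activation; the paper's actual pre-activation functions and modulation schemes may differ, and the linked repository is the authoritative implementation.

```python
import math

def mz_output_probability(phase):
    """Probability of exiting the 'bright' port of a balanced Mach-Zehnder interferometer."""
    return math.cos(phase / 2) ** 2

def hom_coincidence_probability(overlap_sq):
    """Coincidence probability for two photons at a 50:50 beam splitter.

    overlap_sq is the squared overlap |<psi1|psi2>|^2 in [0, 1]:
    1 means indistinguishable photons (no coincidences),
    0 means fully distinguishable photons.
    """
    return 0.5 * (1.0 - overlap_sq)

def mz_neuron_preactivation(inputs, weights):
    """Hypothetical neuron: encode the weighted sum as a phase and read out the MZ port."""
    phase = sum(w * x for w, x in zip(weights, inputs))
    return mz_output_probability(phase)
```

Both responses are smooth functions of their parameters, which is what allows such units to be written as differentiable modules and trained with gradient descent.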

2506.02259 2026-03-24 cs.GT cs.AI

Stochastically Dominant Peer Prediction

Yichi Zhang, Shengwei Xu, David Pennock, Grant Schoenebeck

Comments 29 pages, 3 figures


Eliciting reliable human feedback is essential for many machine learning tasks, such as learning from noisy labels and aligning AI systems with human preferences. Peer prediction mechanisms incentivize truthful reporting without ground truth verification by scoring agents based on correlations with peers. Traditional mechanisms, which ensure that truth-telling maximizes the expected scores in equilibrium, can elicit honest information while assuming agents' utilities are linear functions of their scores. However, in practice, non-linear payment rules are usually preferred, or agents' utilities are inherently non-linear. We propose stochastically dominant truthfulness (SD-truthfulness) as a stronger guarantee: the score distribution of truth-telling stochastically dominates all other strategies, incentivizing truthful reporting for a wide range of monotone utility functions. Our first observation is that no existing peer prediction mechanism naturally satisfies this criterion without strong assumptions. A simple solution -- rounding scores into binary lotteries -- can enforce SD-truthfulness, but often degrades sensitivity, a key property related to fairness and statistical efficiency. We demonstrate how a more careful application of rounding can better preserve sensitivity. Furthermore, we introduce a new enforced agreement (EA) mechanism that is theoretically guaranteed to be SD-truthful in binary-signal settings under mild assumptions, and empirically achieves the highest sensitivity among all known SD-truthful mechanisms.
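
The "rounding scores into binary lotteries" idea can be sketched as follows. A score normalized to [0, 1] is replaced by a payment of 1 with probability equal to the score and 0 otherwise. The payment is then a Bernoulli variable whose success probability is the expected score, and one Bernoulli distribution first-order stochastically dominates another exactly when its success probability is at least as large. The expected scores and function names below are hypothetical, and this is the simple rounding described in the abstract rather than the more careful variant or the EA mechanism the authors propose.

```python
import random

def round_to_lottery(score, rng):
    """Pay 1 with probability `score` and 0 otherwise; the expected payment equals `score`."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be normalized to [0, 1]")
    return 1 if rng.random() < score else 0

def bernoulli_dominates(p, q):
    """First-order stochastic dominance between Bernoulli(p) and Bernoulli(q).

    With outcomes in {0, 1}, dominance holds exactly when P(payment = 1) is at least as large.
    """
    return p >= q

# Hypothetical expected scores under truthful and strategic reporting.
truthful_mean, strategic_mean = 0.62, 0.55

# Simulate many lottery payments for a truthful agent with a fixed seed.
rng = random.Random(0)
payments = [round_to_lottery(truthful_mean, rng) for _ in range(10_000)]
empirical_mean = sum(payments) / len(payments)
```

The cost of this construction is the one the abstract notes: collapsing a graded score to a single bit adds variance, which reduces how sharply payments separate careful agents from careless ones.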

2505.18323 2026-03-24 cs.CR cs.AI cs.LG

Architectural Backdoors for Within-Batch Data Stealing and Model Inference Manipulation

Nicolas Küchler, Ivan Petrov, Conrad Grobler, Ilia Shumailov

Comments This work has been accepted for publication at the IEEE Conference on Secure and Trustworthy Machine Learning (SaTML) 2026. The final version will be available on IEEE Xplore

Details
English abstract

For nearly a decade the academic community has investigated backdoors in neural networks, primarily focusing on classification tasks where adversaries manipulate the model prediction. While demonstrably malicious, the immediate real-world impact of such prediction-altering attacks has remained unclear. In this paper we introduce a novel and significantly more potent class of backdoors that builds upon recent advancements in architectural backdoors. We demonstrate how these backdoors can be specifically engineered to exploit batched inference, a common technique for hardware utilization, enabling large-scale user data manipulation and theft. By targeting the batching process, these architectural backdoors facilitate information leakage between concurrent user requests and allow attackers to fully control model responses directed at other users within the same batch. In other words, an attacker who can change the model architecture can set and steal model inputs and outputs of other users within the same batch. We show that such attacks are not only feasible but also alarmingly effective, can be readily injected into prevalent model architectures (e.g., Transformers), and represent a truly malicious threat to user privacy and system integrity. Critically, to counteract this new class of vulnerabilities, we propose a deterministic mitigation strategy that provides formal guarantees against this new attack vector, unlike prior work that relied on LLMs to find the backdoors. Our mitigation strategy employs a novel Information Flow Control mechanism that analyzes the model graph and proves non-interference between different user inputs within the same batch. Using our mitigation strategy, we perform a large-scale analysis of models hosted through Hugging Face and find over 200 models that introduce (unintended) information leakage between batch entries due to the use of dynamic quantization.
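
The non-interference property described above can be illustrated with a toy check. This is only a sketch under stated assumptions: it uses an empirical perturbation test, not the static model-graph analysis the abstract describes, and it assumes (for illustration) that the leakage from dynamic quantization comes from a scale computed over the whole batch. All function names are hypothetical.

```python
def quantize_per_row(batch, levels=127):
    """Dynamic quantization with one scale per batch entry (row)."""
    out = []
    for row in batch:
        scale = max(abs(v) for v in row) / levels or 1.0  # avoid a zero scale
        out.append([round(v / scale) * scale for v in row])
    return out


def quantize_per_tensor(batch, levels=127):
    """Dynamic quantization with a single scale shared by the whole batch."""
    scale = max(abs(v) for row in batch for v in row) / levels or 1.0
    return [[round(v / scale) * scale for v in row] for row in batch]


def leaks_across_batch(fn, batch, victim=0, other=1, probe=1000.0):
    """Empirical check: does changing entry `other` alter the output for `victim`?

    A False result is not a proof of non-interference; it only means this
    particular probe did not expose a dependency.
    """
    baseline = fn(batch)[victim]
    perturbed = [list(row) for row in batch]
    perturbed[other] = [probe] * len(perturbed[other])
    return fn(perturbed)[victim] != baseline


batch = [[0.11, -0.52, 0.93], [0.2, 0.4, -0.6]]
print(leaks_across_batch(quantize_per_row, batch))     # rows are independent
print(leaks_across_batch(quantize_per_tensor, batch))  # shared scale couples rows
```

In the per-tensor variant, one user's large input changes the rounding applied to every other user's row, which is the kind of cross-entry dependency a non-interference analysis is meant to rule out.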

2503.24075 2026-03-24 math.OC cs.LG

Optimization on the Oblique Manifold for Sparse Simplex Constraints via Multiplicative Updates

Flavia Esposito, Andersen Ang

Comments 19 pages, 3 figures, 2 tables

Details
English abstract

Low-rank optimization problems with sparse simplex constraints involve variables that must satisfy nonnegativity, sparsity, and sum-to-1 conditions, making their optimization particularly challenging due to the interplay between low-rank structures and constraints. These problems arise in various applications, including machine learning, signal processing, environmental fields, and computational biology. In this work, we propose a novel manifold optimization approach to efficiently tackle these problems. Our method leverages the geometry of oblique manifolds to reformulate the problem and introduces a new Riemannian optimization method based on Riemannian gradient descent that strictly maintains the simplex constraints. By exploiting the underlying manifold structure, our approach improves optimization efficiency. Experiments on synthetic and real datasets demonstrate the effectiveness of the proposed method compared to standard Euclidean and Riemannian methods, paving the way for broader applications.
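
One common way to connect unit-norm (sphere or oblique) geometry to the simplex is the elementwise-square parametrization x = y * y with ||y|| = 1, which makes x nonnegative with entries summing to 1. The sketch below assumes that parametrization and uses plain Riemannian gradient descent with a normalization retraction on a toy least-squares objective; it is not the multiplicative update proposed in the paper, and all names are hypothetical.

```python
import math


def to_simplex(y):
    """Square a unit-norm vector elementwise: result is nonnegative, sums to 1."""
    return [v * v for v in y]


def sphere_step(y, grad_x, lr=0.1):
    """One Riemannian gradient step on the unit sphere for f(y * y)."""
    # Chain rule: d f(y*y) / d y_i = 2 * y_i * (d f / d x_i).
    g = [2.0 * yi * gi for yi, gi in zip(y, grad_x)]
    # Project onto the tangent space at y by removing the radial component.
    radial = sum(gi * yi for gi, yi in zip(g, y))
    tangent = [gi - radial * yi for gi, yi in zip(g, y)]
    # Move, then retract back to the sphere by renormalizing.
    moved = [yi - lr * ti for yi, ti in zip(y, tangent)]
    norm = math.sqrt(sum(v * v for v in moved))
    return [v / norm for v in moved]


def fit(target, steps=500, lr=0.1):
    """Minimize 0.5 * ||x - target||^2 over the simplex via the sphere."""
    n = len(target)
    y = [1.0 / math.sqrt(n)] * n
    for _ in range(steps):
        x = to_simplex(y)
        grad_x = [xi - ti for xi, ti in zip(x, target)]
        y = sphere_step(y, grad_x, lr)
    return to_simplex(y)


x = fit([0.7, 0.2, 0.1])
print(x, sum(x))
```

Every iterate satisfies the simplex constraints by construction, so no projection onto the simplex is needed. Note that coordinates with y_i = 0 stay at zero under this update, which is one way such parametrizations interact with sparsity.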

2407.17489 2026-03-24 cs.HC cs.AI econ.GN q-fin.EC

Cognitive Spillover in Human-AI Teams

Christoph Riedl, Saiph Savage, Josie Zvelebilova

Journal ref Riedl, C., Savage, S., Zvelebilova, J. (2026). Cognitive Spillover in Human-AI Teams. ACM Transactions on Computer-Human Interaction (TOCHI)

Details
English abstract

AI is not merely a neutral tool in team settings; it influences the social and cognitive fabric of collaboration. Across two randomized experiments, we demonstrate that AI exposure produces causal spillover into human-human interaction -- affecting shared language, collective attention, shared mental models, and social cohesion. These spillover effects occur robustly across settings, modalities, tasks, and AI qualities, suggesting that mere exposure to AI drives the influence. AI functions as an implicit "social forcefield," influencing not only how people speak, but also how they think, what they attend to, and how they relate to each other. We argue for shifting the design paradigm from optimizing "AI as a tool" to understanding AI as a socially influential actor whose effects extend beyond the human-AI interface.

2603.22272 2026-03-24 math.PR

Itô perspective on variance renormalisation

Konstantinos Dareiotis, Máté Gerencsér

Comments 31 pages

Details
English abstract

We show that the Itô solutions of the nonlinear stochastic heat equation $$ \partial_t u^\varepsilon - \Delta u^\varepsilon = \varepsilon^{3/4} g(u^\varepsilon) \nabla \xi_\varepsilon, $$ where $\xi_\varepsilon$ denotes the mollification in space at scale $\varepsilon>0$ of a space-time white noise $\xi$, converge in law, as $\varepsilon\to 0$, to the solution of the stochastic heat equation with right-hand side $cg'g(u)\xi$ with a constant $c>0$. Since the noise $\nabla\xi$ is supercritical, it is not unexpected that a small prefactor is needed to obtain a limit, but the exponent $3/4$ is not predicted by naive scaling arguments. The case $g(u)=u$, modulo a Cole-Hopf transform, corresponds to the result of [Hai25] for the KPZ equation. Our argument is relatively short and relies solely on stochastic analytic techniques.

2603.22269 2026-03-24 q-bio.BM

Computational modeling of RNA-protein binding interactions under an external force

Danielle Wampler, Ralf Bundschuh

Details
English abstract

RNA binding proteins play a crucial role in post-transcriptional gene regulation by controlling the transport, processing, and translation of their target RNAs. Post-transcriptional gene regulation leads to the differential expression of genetic material, and loss of regulation or over-regulation is linked to a large range of cancers and other diseases, many of which have been directly associated with RNA binding proteins and their target RNAs. To understand RNA, RNA binding proteins, and how they function in gene expression, it is essential to characterize how RNA binding proteins interact with their target RNAs. Here, we aim to assess the potential for single-molecule force spectroscopy experiments to be used in the characterization of RNA-protein binding by investigating to what extent a change of extension due to RNA-protein binding is experimentally measurable and what aspects of the interaction can be deduced from such measurements. We predict the effect of protein binding on RNA force-extension measurements via the open-source ViennaRNA package, which we have modified to simultaneously consider an external force, protein binding, and RNA secondary structure. From this work, we see protein concentration-dependent responses to external forces, with discernible differences in predicted extensions around biologically relevant concentrations and a connection to protein binding domain geometry for several RNA binding proteins.

2603.22268 2026-03-24 physics.chem-ph

An Accurate Tensorial Model for Prediction of Full Zeolite NMR Spectra

Carlos Bornes, Chiheb Ben Mahmoud, Volker L. Deringer, Christopher J. Heard, Lukáš Grajciar

Details
English abstract

Solid-state nuclear magnetic resonance (ss-NMR) is one of the most sensitive and popular techniques for structure elucidation in geometrically complex crystalline materials, such as zeolites. Synergistic support from computational modelling is vital to interpret experimental spectra and to relate ss-NMR to atomistic models. Nevertheless, computational predictions are hindered by the high expense of calculating magnetic shielding (MS) and electric field gradient (EFG) tensors from first principles. In this work, we leverage a novel tensorial machine learning approach to train a general model for predicting complete NMR tensors. We demonstrate the utility of the approach for a diverse dataset of zeolitic materials and NMR-active nuclei ($^{27}$Al, $^{29}$Si, $^{17}$O, $^{23}$Na and $^{1}$H), predicting all NMR observables to a high degree of precision. These observables are then translated into predictions of the full $^{27}$Al and $^{29}$Si ss-NMR spectra for the exemplary zeolite RTH. Thus, this work opens a pathway to accurate, high-throughput NMR simulation for large-scale and realistic models of chemically complex zeolites.

2603.22266 2026-03-24 physics.chem-ph

Microscopic view of materials properties of liquids: An atomic scale perspective

Jaeyun Moon

Details
English abstract

Microscopic understanding of liquid properties is essential for advancing a wide range of applications, from energy technologies such as nuclear reactors and batteries to biomedical applications including drug delivery and microfluidics. However, intrinsic dynamic disorder and lack of structural periodicity in liquids have presented fundamental challenges in developing rigorous microscopic theories of their thermodynamic and dynamic behavior. Recent breakthroughs in computational power and experimental metrologies have driven significant progress in unraveling the complex atomic-scale dynamics of liquids. In this Review, we provide a brief historical context of liquid state physics and explore recent advances through theoretical, computational, and experimental approaches. For theoretical and computational approaches, instantaneous normal mode and velocity autocorrelation function calculations are discussed. For experiments, we focus on X-ray and neutron scattering techniques that probe liquid dynamics at the atomic level. Finally, we highlight emerging opportunities and future directions in the study of liquid atomic dynamics.
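
As an illustration of the velocity autocorrelation function mentioned in this abstract, here is a minimal sketch using the standard normalized definition C(t) = <v(0).v(t)> / <v(0).v(0)>, averaged over particles and time origins. The input layout and the function name are assumptions made for this example; the Review itself may use different conventions.

```python
def vacf(velocities, max_lag):
    """Normalized velocity autocorrelation for lags 0..max_lag.

    velocities[f][i] is the (vx, vy, vz) tuple of particle i at frame f
    (an assumed layout). Requires max_lag < number of frames.
    """
    n_frames = len(velocities)
    raw = []
    for lag in range(max_lag + 1):
        total, count = 0.0, 0
        # Average over every available time origin and every particle.
        for t0 in range(n_frames - lag):
            for v0, vt in zip(velocities[t0], velocities[t0 + lag]):
                total += sum(a * b for a, b in zip(v0, vt))
                count += 1
        raw.append(total / count)
    # Normalize so that the value at lag 0 is exactly 1.
    return [c / raw[0] for c in raw]


# Toy trajectory: a single particle whose velocity reverses every frame.
frames = [[(1.0, 0.0, 0.0)], [(-1.0, 0.0, 0.0)],
          [(1.0, 0.0, 0.0)], [(-1.0, 0.0, 0.0)]]
print(vacf(frames, 2))
```

Negative values in the output correspond to velocity reversals, which in a real liquid reflect the caging of an atom by its neighbors. The double loop is quadratic in the number of frames; production analyses typically use FFT-based correlation instead.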

2603.22265 2026-03-24 math.AP

Cohesive Membranes under determinant constraints

Nicola Pio Melillo, Dario Reggiani

Details
English abstract

This paper is devoted to the variational derivation of reduced models for elastic membranes with fracture under constraints on the determinant of the deformation gradient. We consider two physically relevant settings: the non-interpenetration regime, in which the deformation is required to be orientation-preserving ($\det \nabla u > 0$), and the incompressible regime, in which the deformation preserves volume ($\det \nabla u = 1$). In both cases, the surface energy density is allowed to depend on the jump amplitude, thus encompassing cohesive fracture models with activation threshold. The main technical contribution is the construction of recovery sequences that simultaneously satisfy the determinant constraint and optimize the surface energy. This is achieved through a combination of $C^\infty$ diffeomorphisms converging to the identity (which rotate the normal to the jump set so as to minimize the reduced surface energy), and a new smooth approximation result for $GSBV^p$ functions.

2603.22262 2026-03-24 cs.CG math.CO

Flip Distance of Non-Crossing Spanning Trees: NP-Hardness and Improved Bounds

Håvard Bakke Bjerkevik, Joseph Dorfer, Linda Kleist, Torsten Ueckerdt, Birgit Vogtenhuber

Details
English abstract

We consider the problem of reconfiguring non-crossing spanning trees on point sets. For a set $P$ of $n$ points in general position in the plane, the flip graph $F(P)$ has a vertex for each non-crossing spanning tree on $P$ and an edge between any two spanning trees that can be transformed into each other by the exchange of a single edge. This flip graph has been intensively studied, lately with an emphasis on determining its diameter $\mathrm{diam}(F(P))$ for sets $P$ of $n$ points in convex position. The current best bounds are $\frac{14}{9}n-O(1) \leq \mathrm{diam}(F(P)) < \frac{15}{9}n-3$ [Bjerkevik, Kleist, Ueckerdt, and Vogtenhuber; SODA 2025]. The crucial tool for both the upper and the lower bound is the so-called *conflict graph*, which the authors stated might be the key ingredient for determining the diameter (up to lower-order terms). In this paper, we pick up the concept of conflict graphs and show that this tool is even more versatile than previously hoped. As our first main result, we use conflict graphs to show that computing the flip distance between two non-crossing spanning trees is NP-hard, even for point sets in convex position. Interestingly, the result still holds for more constrained flip operations, concretely, compatible flips (where the removed and the added edge do not cross) and rotations (where the removed and the added edge share an endpoint). Extending the line of research from [BKUV SODA25], we present new insights on the diameter of the flip graph. Their lower bound is based on a constant-size pair of trees, one of which is *stacked*. We show that if one of the trees is stacked, then the lower bound is indeed optimal up to a constant term, that is, there exists a flip sequence of length at most $\frac{14}{9}(n-1)$ to any other tree. Lastly, we improve the lower bound on the diameter of the flip graph $F(P)$ for $n$ points in convex position to $\frac{11}{7}n-o(n)$.
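
The flip operations defined in this abstract can be made concrete for points in convex position. The sketch below assumes the points are labelled 0..n-1 in cyclic order along the convex hull, so two edges cross exactly when their endpoints are distinct and interleave. The function names are hypothetical, and the sketch only validates and classifies a single flip; it does not compute flip distances (which the abstract shows is NP-hard).

```python
def crosses(e, f):
    """Edges on convex-position points cross iff their endpoints interleave."""
    a, b = sorted(e)
    c, d = sorted(f)
    if len({a, b, c, d}) < 4:
        return False  # sharing an endpoint is not a crossing
    return (a < c < b) != (a < d < b)


def is_noncrossing_spanning_tree(n, edges):
    """Check that `edges` form a non-crossing spanning tree on points 0..n-1."""
    edges = list(edges)
    if len(edges) != n - 1:
        return False
    if any(crosses(e, f) for i, e in enumerate(edges) for f in edges[i + 1:]):
        return False
    # n - 1 edges plus connectivity implies a tree; check with union-find.
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v in edges:
        parent[find(u)] = find(v)
    return len({find(v) for v in range(n)}) == 1


def classify_flip(n, tree, removed, added):
    """Return None if the exchange is not a flip, else its most restrictive type."""
    def norm(e):
        return tuple(sorted(e))

    current = {norm(e) for e in tree}
    removed, added = norm(removed), norm(added)
    if removed not in current or added in current:
        return None
    if not is_noncrossing_spanning_tree(n, (current - {removed}) | {added}):
        return None
    if set(removed) & set(added):
        return "rotation"    # removed and added edge share an endpoint
    if not crosses(removed, added):
        return "compatible"  # removed and added edge do not cross
    return "general"


path = [(0, 1), (1, 2), (2, 3)]
print(classify_flip(4, path, (2, 3), (0, 3)))
print(classify_flip(4, path, (1, 2), (0, 3)))
```

Every rotation is also a compatible flip, since edges that share an endpoint cannot cross; the classifier reports the most restrictive label that applies.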

2603.22258 2026-03-24 eess.SP eess.AS

Semi-Blind Channel Estimation and Hybrid Receiver Beamforming in the Tera-Hertz Multi-User Massive MIMO Uplink

Abhisha Garg, Suraj Srivastava, Varsha Dubey, Aditya Jagannatham, Lajos Hanzo

Details
English abstract

We develop a pragmatic multi-user (MU) massive multiple-input multiple-output (MIMO) channel model tailored to the THz band, encompassing factors such as molecular absorption, reflection losses and multipath diffused ray components. Next, we propose a novel semi-blind channel state information (CSI) acquisition technique, MU whitening decorrelation semi-blind (MU-WD-SB), which exploits the second-order statistics of the unknown data symbols along with pilot vectors. A constrained Cramér-Rao lower bound (C-CRLB) is derived to bound the normalized mean square error (NMSE) performance of the proposed semi-blind learning technique. Our proposed scheme efficiently reduces the training overhead while enhancing the overall accuracy of the channel learning process. Furthermore, a novel hybrid receiver combiner framework is devised for MU THz massive MIMO systems, leveraging multiple measurement vector based sparse Bayesian learning (MMV-SBL), which uses the CSI estimated by our proposed semi-blind technique with low-resolution analog-to-digital converters (ADCs). Finally, we propose an optimal hybrid combiner based on MMV-SBL, which directly reduces the MU interference. Extensive simulations are conducted to evaluate the performance gain of the proposed MU-WD-SB scheme over conventional training-based and other semi-blind learning techniques for a practical THz channel obtained from the high-resolution transmission (HITRAN) database. The metrics considered for quantifying the improvements include the NMSE, bit error rate (BER) and spectral efficiency (SE).
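
For readers unfamiliar with the NMSE metric named above, the following sketch shows one common way to compute it for channel estimates: the squared Frobenius norm of the estimation error divided by that of the true channel, averaged over Monte Carlo trials. The exact normalization used in the paper is not stated in the abstract, so this definition and the function names are assumptions.

```python
import math


def nmse(h_true, h_est):
    """||H_est - H||_F^2 / ||H||_F^2 for one channel realization.

    Channels are nested lists of complex numbers (an assumed layout).
    """
    err = sum(abs(e - t) ** 2
              for row_t, row_e in zip(h_true, h_est)
              for t, e in zip(row_t, row_e))
    ref = sum(abs(t) ** 2 for row in h_true for t in row)
    return err / ref


def mean_nmse_db(pairs):
    """Average the per-trial NMSE over (h_true, h_est) pairs, reported in dB."""
    avg = sum(nmse(h, h_hat) for h, h_hat in pairs) / len(pairs)
    return 10.0 * math.log10(avg)


# Toy example: an estimate that is a uniformly scaled copy of the channel.
h = [[1 + 1j, 2 + 0j], [0j, 1j]]
h_hat = [[0.9 * v for v in row] for row in h]
print(nmse(h, h_hat), mean_nmse_db([(h, h_hat)]))
```

Some works instead report the ratio of the averaged error energy to the averaged channel energy; the two conventions coincide only when the channel energy is constant across trials.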

2603.22257 2026-03-24 astro-ph.GA

Mainly on the Plane: Observing the Extended, Ionized Disks of Milky Way Analogs in IllustrisTNG

Michael Messere, Kirill Tchernyshyov, Mary E. Putman, Greg L. Bryan, Jessica K. Werk, Yong Zheng, David Schiminovich

Comments 21 pages, 9 figures + appendices; Accepted to ApJ

Journal ref ApJ, 1000, 172 (2026)

Details
English abstract

This paper explores the extent to which the circumgalactic medium (CGM) of Milky Way-like galaxies is located in an extended, ionized, disklike structure. To test this hypothesis, we analyze the spatial and kinematic distributions of different ion species within a sample of MW-like systems in IllustrisTNG. We model commonly observed ions (HI, MgII, SiIV, CIV and OVI) and calculate (1) their angular momentum misalignment from the star-forming disk ($\theta$) and (2) the fraction of absorption consistent with galaxy rotation ($f_\mathrm{EWcorot}$). We find that 63% of MgII, 45% of SiIV, 38% of CIV, and 35% of OVI mass along the major axis have kinematics aligned with the galaxy angular momentum axis. We extend this to a mock absorption line survey and quantify $f_\mathrm{EWcorot}$. We find that $f_\mathrm{EWcorot}$(MgII) $\sim80\%$ and $f_\mathrm{EWcorot}$(OVI) $\sim60\%$ at $\sim0.5\ \mathrm{R_{200c}}$, in agreement with recent observational work. We find that in the typical MW analog, there is evidence of cool-warm material in an extended, corotating structure, regardless of whether the angular momentum or observational definition is used. Hence, we expect that the typical MW CGM, especially in the low ions, should be mainly on the plane.
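
The angular momentum misalignment described in this abstract can be sketched as the angle between two total angular momentum vectors. The example below is a minimal illustration under assumed inputs (per-particle masses, positions and velocities relative to the galaxy center); how the paper selects gas cells, weights them by ion mass, or defines the disk is not specified in the abstract, and the function names are hypothetical.

```python
import math


def angular_momentum(masses, positions, velocities):
    """Total angular momentum L = sum_i m_i (r_i x v_i) as an (x, y, z) tuple."""
    lx = ly = lz = 0.0
    for m, (rx, ry, rz), (vx, vy, vz) in zip(masses, positions, velocities):
        lx += m * (ry * vz - rz * vy)
        ly += m * (rz * vx - rx * vz)
        lz += m * (rx * vy - ry * vx)
    return (lx, ly, lz)


def misalignment_deg(l_a, l_b):
    """Angle in degrees between two angular momentum vectors."""
    dot = sum(a * b for a, b in zip(l_a, l_b))
    norm_a = math.sqrt(sum(a * a for a in l_a))
    norm_b = math.sqrt(sum(b * b for b in l_b))
    # Clamp to [-1, 1] to guard against floating-point round-off.
    cos_theta = max(-1.0, min(1.0, dot / (norm_a * norm_b)))
    return math.degrees(math.acos(cos_theta))


# Toy data: a "disk" particle orbiting in the x-y plane and a "gas" particle
# orbiting in a plane tilted by 90 degrees.
l_disk = angular_momentum([1.0], [(1.0, 0.0, 0.0)], [(0.0, 1.0, 0.0)])
l_gas = angular_momentum([2.0], [(0.0, 1.0, 0.0)], [(0.0, 0.0, 1.0)])
print(misalignment_deg(l_gas, l_disk))
```

A value near 0 degrees indicates gas corotating in the disk plane, while values near 180 degrees indicate counter-rotation.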