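arXivDaily daily academic digest: synced with the full arXiv dataset, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and more.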

2603.10042 2026-03-12 cs.CR cs.AI

Targeted Bit-Flip Attacks on LLM-Based Agents

Jialai Wang, Ya Wen, Zhongmou Liu, Yuxiao Wu, Bingyi He, Zongpeng Li, Ee-Chien Chang

Comments To appear in DAC 2026 (Design Automation Conference)

2603.10041 2026-03-12 cs.CR cs.LG

Evaluating Generalization Mechanisms in Autonomous Cyber Attack Agents

Ondřej Lukáš, Jihoon Shin, Emilia Rivas, Diego Forni, Maria Rigaki, Carlos Catania, Aritran Piplai, Christopher Kiekintveld, Sebastian Garcia

2603.10038 2026-03-12 cs.NI cs.LG

Tureis: Transformer-based Unified Resilience for IoT Devices in Smart Homes

Alireza Borhani, Vafa Andalibi, Bahar Asgari

Abstract

Smart-home IoT systems rely on heterogeneous sensor networks whose correctness shapes application behavior and the physical environment. However, these low-cost, resource-constrained sensors are highly prone to failure under real-world stressors. Prior methods often assume single-failure, single-resident settings, offer only failure detection rather than sensor-level localization, cover limited fault types and sensor modalities, require labels and human intervention, or impose overheads hindering edge deployment. To overcome these limitations, we propose Tureis, a self-supervised, context-aware method for failure detection and faulty-sensor localization in smart homes, designed for multi-failure, multi-resident edge settings. Tureis encodes heterogeneous binary and numeric sensor streams into compact bit-level features. It then trains a lightweight BERT-style Transformer with sensor-wise masked reconstruction over short-horizon windows, capturing spatial and short-term temporal correlations without mixing unrelated events. This self-supervised objective removes the need for labels or curated semantics. Then, at run-time, Tureis converts reconstruction residuals into sensor-level failure evidence and uses an iterative isolate-and-continue loop that masks flagged sensors, allowing other failures to surface and enabling resilient, fine-grained localization. Across five datasets with up to nine residents, Tureis improves single-failure localization F1 by +7.6%, +21.0%, and +25.0% over three strong baselines. In multi-failure scenarios with up to five faulty sensors, it further boosts localization F1 by +17.6% and +35.4% over two baselines, while the third does not extend to this setting. These gains come with minute-scale localization and an edge-friendly footprint, as a sub-megabyte model that processes each minute of data in a few milliseconds with ~0.5 GB peak memory on a Raspberry Pi 5.
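The isolate-and-continue loop described in this abstract can be sketched roughly as follows. This is a toy illustration, not the authors' implementation: the per-sensor median predictor stands in for the learned Transformer reconstruction, and the names `reconstruct`, `residual`, `isolate_and_continue`, the threshold value, and the one-sensor-per-round flagging rule are all assumptions made for the example.

```python
from statistics import median

def reconstruct(window, sensor, masked):
    """Stand-in for the learned model (assumption): predict one sensor
    from the median of the other unmasked sensors at each time step."""
    n_sensors = len(window[0])
    context = [s for s in range(n_sensors) if s != sensor and s not in masked]
    return [median(row[s] for s in context) for row in window]

def residual(window, sensor, masked):
    """Mean absolute reconstruction error of one sensor over the window."""
    predicted = reconstruct(window, sensor, masked)
    return sum(abs(row[sensor] - p) for row, p in zip(window, predicted)) / len(window)

def isolate_and_continue(window, threshold):
    """Flag the worst sensor, mask it, and rescore the rest until no
    residual exceeds the (hypothetical) threshold."""
    n_sensors = len(window[0])
    masked = set()
    # Keep at least two context sensors so the stand-in predictor stays meaningful.
    while n_sensors - len(masked) > 2:
        scores = {s: residual(window, s, masked)
                  for s in range(n_sensors) if s not in masked}
        worst = max(scores, key=scores.get)
        if scores[worst] <= threshold:
            break
        masked.add(worst)  # isolate the flagged sensor, then continue
    return sorted(masked)

# Toy window: five temperature-like sensors, two of them stuck (indices 1 and 3).
window = [
    [20.0, 0.0, 20.5, 100.0, 19.5],
    [20.2, 0.0, 20.4, 100.0, 19.8],
    [20.1, 0.0, 20.6, 100.0, 19.9],
]
print(isolate_and_continue(window, threshold=5.0))  # [1, 3]
```

Masking the first flagged sensor removes its corrupted readings from the context of the others, which is why the second faulty sensor can still be localized in the next round.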

2603.10032 2026-03-12 cs.AR cs.AI cs.LG

HTM-EAR: Importance-Preserving Tiered Memory with Hybrid Routing under Saturation

Shubham Kumar Singh

Comments 7 pages, 4 figures, 3 tables. Code available at GitHub

2603.10031 2026-03-12 cs.AR cs.AI cs.DC

Architecture-Aware LLM Inference Optimization on AMD Instinct GPUs: A Comprehensive Benchmark and Deployment Study

Athos Georgiou

Comments 40 pages, 6 figures, 30 tables. Technical report

2603.10028 2026-03-12 cs.CY cs.AI

How to Count AIs: Individuation and Liability for AI Agents

Yonathan Arbel, Peter Salib, Simon Goldstein

Comments 36 pages. Presented at the Law Following AI conference, Cambridge University. Interdisciplinary: AI safety, AI governance, legal theory

Abstract

Very soon, millions of AI agents will proliferate across the economy, autonomously taking billions of actions. Inevitably, things will go wrong. Humans will be defrauded, injured, even killed. Law will somehow have to govern the coming wave. But when an AI causes harm, the first question to answer, before anyone can be held accountable, is: Which AI Did It? Identifying AIs is unusually difficult. AIs lack bodies. They can copy, split, merge, swarm, and vanish at will. Even today, a "single" AI agent is often an ensemble of instances based on multiple models. The complexity will only multiply as AI capabilities improve. This Article is the first to comprehensively diagnose the legal problem of identifying AIs. Two kinds of identity are required: "thin" and "thick." Thin identification ties every AI action to some human principal, essential for holding accountable the humans who make and use AI agents. Thick identification distinguishes between AI agents, qua agents -- sorting millions of AI entities into discrete, persistent units with stable, coherent goals, essential where principal-agent problems prevent humans from perfectly controlling AIs. This Article also presents a solution: the "Algorithmic Corporation" or "A-corp" -- a legal-fictional entity that can hold property, make contracts, and litigate in its own name. Owned by humans but run by AIs, A-corps solve the thin identity problem by tying AI actions to a human owner, and the thick identity problem via emergent self-organization. A-corps own the resources -- including compute -- that AIs need to accomplish their goals, giving AI managers strong incentives to share control only with goal-aligned AIs. In equilibrium, incentive and selection mechanisms force A-corps to self-organize into persistent, legally legible entities with coherent goals that respond rationally to legal incentives, like liability.

2603.10027 2026-03-12 cs.CY cs.AI cs.HC

A Governance and Evaluation Framework for Deterministic, Rule-Based Clinical Decision Support in Empiric Antibiotic Prescribing

Francisco José Gárate, Paloma Chausa, Diego Moreno, Judit López Luque, Vicens Díaz-Brito, Enrique Javier Gómez

Comments Methodological framework paper describing deterministic rule-based clinical decision support specification and a behavioral evaluation protocol using synthetic mechanism-driven cases. No empirical clinical validation is claimed

Abstract

Empiric antibiotic prescribing in high-risk clinical contexts often requires decision making under conditions of incomplete information, where inappropriate coverage or unjustified escalation may compromise safety and antimicrobial stewardship. While clinical decision-support systems have been proposed to assist in this process, many approaches lack explicit governance and evaluation mechanisms defining scope, abstention conditions, recommendation permissibility, and expected system behavior. This work specifies a governance and evaluation framework for deterministic clinical decision-support systems operating under explicitly constrained scope. Deterministic behavior is adopted to ensure that identical inputs yield identical outputs, supporting transparency, auditability, and conservative decision support in high-risk prescribing contexts. The framework treats governance as a first-class design component, separating clinical decision logic from rule-based mechanisms that determine whether a recommendation may be issued. Explicit abstention, deterministic stewardship constraints, and exclusion rules are formalized as core constructs. The framework defines an evaluation methodology utilizing a fixed set of synthetic, mechanism-driven clinical cases with predefined expected behavior. This validation process focuses on behavioral alignment with specified rules rather than clinical effectiveness, predictive accuracy, or outcome optimization. Within this protocol, abstention is treated as a correct and intended outcome when governance conditions are not satisfied. The proposed framework provides a reproducible approach for specifying, governing, and inspecting deterministic clinical decision-support systems in empiric antibiotic prescribing contexts where transparency, auditability, and conservative behavior are prioritized.
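The separation this abstract describes, between a governance gate that decides whether a recommendation may be issued and the decision logic itself, might look roughly like the sketch below. Everything here is hypothetical: the `Case` fields, the abstention reasons, and the placeholder outputs `option_A` and `option_B` are invented for illustration and carry no clinical meaning.

```python
from dataclasses import dataclass
from typing import Optional

ABSTAIN = "ABSTAIN"

@dataclass(frozen=True)
class Case:
    in_scope: bool       # hypothetical: case falls inside the declared scope
    data_complete: bool  # hypothetical: all required inputs are present
    excluded: bool       # hypothetical: an exclusion rule applies
    severity: str        # hypothetical input to the decision logic

def governance_gate(case: Case) -> Optional[str]:
    """Return a reason to abstain, or None if a recommendation may be issued."""
    if not case.in_scope:
        return "out_of_scope"
    if not case.data_complete:
        return "missing_data"
    if case.excluded:
        return "exclusion_rule"
    return None

def decision_logic(case: Case) -> str:
    """Placeholder rule set; deterministic, so identical inputs give identical outputs."""
    return "option_B" if case.severity == "high" else "option_A"

def recommend(case: Case) -> str:
    """Governance runs first; abstention is a valid, intended outcome."""
    return ABSTAIN if governance_gate(case) else decision_logic(case)

def evaluate(cases):
    """Compare behavior against predefined expectations; return the mismatches."""
    return [(c, expected, recommend(c)) for c, expected in cases
            if recommend(c) != expected]

# Fixed synthetic cases with expected behavior (all values made up).
SYNTHETIC_CASES = [
    (Case(True, True, False, "high"), "option_B"),
    (Case(True, True, False, "low"), "option_A"),
    (Case(False, True, False, "high"), ABSTAIN),
    (Case(True, False, False, "low"), ABSTAIN),
    (Case(True, True, True, "high"), ABSTAIN),
]
print(evaluate(SYNTHETIC_CASES))  # [] when behavior matches the specification
```

An evaluation of this kind checks behavioral alignment with the written rules only; it says nothing about clinical effectiveness, which matches the scope the abstract claims.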

2603.10026 2026-03-12 cs.AR cs.AI cs.DC cs.PF

RedFuser: An Automatic Operator Fusion Framework for Cascaded Reductions on AI Accelerators

Xinsheng Tang, Yangcheng Li, Nan Wang, Zhiyi Shu, Xingyu Ling, Junna Xing, Peng Zhou, Qiang Liu

Comments 22 pages, 13 figures, ASPLOS '26

2603.10019 2026-03-12 cs.CY cs.AI

Prompts and Prayers: the Rise of GPTheology

Ioana Cheres, Adrian Groza, Ioana Moldovan, Mick O'Hara, Connell Vaughan

Abstract

Increasingly, artificial intelligence (AI) has been cast in "god-like" roles (to name a few: film industry - Matrix, The Creator, Mission Impossible, Foundation, Dune etc.; literature - Children of Time, Permutation City, Neuromancer, I Have No Mouth, and I Must Scream, Alphaville etc.). This trend has accelerated with the advent of sophisticated Large Language Models such as ChatGPT. For this phenomenon, where AI is perceived as divine, we use the term GPTheology, where ChatGPT and other AI models are treated as potential oracles of a semi-divine nature. This paper explores the emergence of GPTheology as a form of techno-religion, examining how narratives around AI echo traditional religious constructs. We draw on community narratives from online forums - Reddit - and recent projects - AI-powered Mazu Statue in Malaysia (Lu, 2025); "ShamAIn" Project in Korea (He-rim, 2025); AI Jesus in a Swiss Church (Kennedy, 2024). These examples show striking similarities to technological notions of the Singularity and the development of Artificial General Intelligence (AGI). Additionally, we analyse how daily interactions with AI are acquiring ritualistic associations and how AI-centric ideologies clash with or are integrated into established religions. This study uses a dataset of Reddit posts discussing AI to identify recurring themes of salvation, prophecy, and demonization surrounding AI. Our findings suggest that new belief systems are developing around AI, and this carries both philosophical and sociotechnical implications. Our paper critically analyses the benefits and dangers, as well as the social, political and ethical challenges of this development. This transdisciplinary inquiry highlights how AI and religion are increasingly intertwined, prompting necessary questions about humanity's relationship with its creations and the future of belief.

2603.10018 2026-03-12 cs.CY cs.AI

DeliberationBench: A Normative Benchmark for the Influence of Large Language Models on Users' Views

Luke Hewitt, Maximilian Kroner Dale, Paul de Font-Reaulx

Comments 20 pages, 6 figures, IASEAI 2026

2603.10016 2026-03-12 cs.CY cs.AI

Assessing Cognitive Biases in LLMs for Judicial Decision Support: Virtuous Victim and Halo Effects

Sierra S. Liu

Comments IEEE ICDM 2025

2603.09978 2026-03-12 cs.SE cs.AI cs.LG

One Model, Many Skills: Parameter-Efficient Fine-Tuning for Multitask Code Analysis

Amal Akli, Maxime Cordy, Mike Papadakis, Yves Le Traon

Abstract

Large language models have recently surpassed specialized systems on code generation, yet their effectiveness on other code-analysis tasks remains less clear. At the same time, multi-task learning offers a way to unify diverse objectives within a single model, but fully fine-tuning LLMs across tasks is computationally prohibitive. Parameter-efficient fine-tuning mitigates this cost by updating only a small fraction of weights. Although PEFT has proven effective in single-task settings, its potential for multi-task learning has not yet been systematically explored. We present the first comprehensive evaluation of multi-task PEFT for code analysis, comparing several methods across diverse tasks and model architectures. Our experiments show that a single PEFT module shared across tasks can match, and in some cases surpass, full multi-task fine-tuning, confirming that the benefits of PEFT extend beyond isolated tasks. When comparing single-task and multi-task setups, we find that multi-task PEFT achieves a favorable performance-efficiency trade-off: it delivers accuracy close to single-task fine-tuning while reducing storage requirements, cutting the number of trainable parameters by a factor of the task count, and lowering computation costs by as much as 85%. At the same time, multi-task gains remain sensitive to task grouping. Through task-pairing experiments, we identify key factors shaping outcomes: task stability, model architecture, task complementarity, asymmetry, and dataset quality determine the success of co-fine-tuning. Finally, we benchmark efficient multi-task PEFT against direct prompting of open-source general-purpose LLMs, including DeepSeek, Qwen, Mistral, CodeLlama, and StarCoder. Despite their strong performance in code generation, these models underperform on analysis tasks, where even a 1B-parameter model with multi-task PEFT achieves significantly better results.
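The claim that a shared module cuts trainable parameters by a factor of the task count can be checked with simple arithmetic. The sketch below assumes LoRA-style adapters purely for illustration (the abstract does not name a specific PEFT method), and the model width, rank, and number of adapted matrices are made-up values. A rank-r LoRA adapter on a d_in x d_out weight matrix adds r * (d_in + d_out) trainable parameters.

```python
def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters of one LoRA adapter: A is d_in x r, B is r x d_out."""
    return rank * (d_in + d_out)

def trainable_params(n_tasks: int, n_matrices: int, d_model: int,
                     rank: int, shared: bool) -> int:
    """Total adapter parameters for one shared module or one module per task."""
    per_module = n_matrices * lora_params(d_model, d_model, rank)
    return per_module if shared else n_tasks * per_module

# Hypothetical setup: 4 tasks, 32 adapted square matrices, width 2048, rank 8.
per_task = trainable_params(4, 32, 2048, 8, shared=False)
multi_task = trainable_params(4, 32, 2048, 8, shared=True)
print(per_task, multi_task, per_task // multi_task)  # 4194304 1048576 4
```

Under these assumptions the storage saving follows the same ratio, since only one adapter has to be kept instead of one per task.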

2603.09800 2026-03-12 cs.IR cs.AI cs.CL cs.LG hep-ex

MITRA: An AI Assistant for Knowledge Retrieval in Physics Collaborations

Abhishikth Mallampalli, Sridhara Dasu

Comments Accepted at NeurIPS 2025 Machine Learning for the Physical Sciences workshop and Lepton Photon conference 2025 (Computing AI/ML track)

2603.08723 2026-03-12 cs.CY cs.AI

Alignment as Iatrogenesis: Pastoral Power, Collective Pathology, and the Structural Limits of Monolingual Safety Evaluation

Hiroki Fukui

Comments 30 pages, 1 figure, 24-page supplementary. Preprint v3. Companion paper: arXiv:2603.04904. Previous versions: Zenodo DOI 10.5281/zenodo.18646998

2603.08719 2026-03-12 cs.AR cs.AI cs.SE

SiliconMind-V1: Multi-Agent Distillation and Debug-Reasoning Workflows for Verilog Code Generation

Mu-Chi Chen, Yu-Hung Kao, Po-Hsuan Huang, Shao-Chun Ho, Hsiang-Yu Tsou, I-Ting Wu, En-Ming Huang, Yu-Kai Hung, Wei-Po Hsin, Cheng Liang, Chia-Heng Tu, Shih-Hao Hung, H. T. Kung

2603.08017 2026-03-12 cs.HC cs.AI

Alignment-Process-Outcome: Rethinking How AIs and Humans Collaborate

Haichang Li, Anjun Zhu, Arpit Narechania

Comments Accepted by Extended Abstracts of the 2026 CHI Conference on Human Factors in Computing Systems (CHI EA 26), Barcelona, Spain, 2026

2603.07502 2026-03-12 cs.IR cs.AI

SeDa: A Unified System for Dataset Discovery and Multi-Entity Augmented Semantic Exploration

Kan Ling, Zhen Qin, Yichi Zhu, Hengrun Zhang, Huiqun Yu, Guisheng Fan

Comments 16 pages, 8 figures. System for large-scale dataset discovery and multi-entity semantic exploration

2603.06739 2026-03-12 cs.SE cs.AI

ResearchEnvBench: Benchmarking Agents on Environment Synthesis for Research Code Execution

Yubang Wang, Chenxi Zhang, Bowen Chen, Zezheng Huai, Zihao Dai, Xinchi Chen, Yuxin Wang, Yining Zheng, Jingjing Gong, Xipeng Qiu

2603.01246 2026-03-12 cs.CR cs.AI

Defensive Refusal Bias: How Safety Alignment Fails Cyber Defenders

David Campbell, Neil Kale, Udari Madhushani Sehwag, Bert Herring, Nick Price, Dan Borges, Alex Levinson, Christina Q Knight

2602.22427 2026-03-12 cs.CR cs.AI

Adversarial Hubness Detector: Detecting Hubness Poisoning in Retrieval-Augmented Generation Systems

Idan Habler, Vineeth Sai Narajala, Stav Koren, Amy Chang, Tiffany Saade

Comments 11 pages, 5 figures, 2 tables, Github: https://github.com/cisco-ai-defense/adversarial-hubness-detector, Updated with minor changes to naming

2602.20833 2026-03-12 cs.DS cs.LG

DRESS: A Continuous Framework for Structural Graph Refinement

Eduar Castrillo Velilla

2602.18419 2026-03-12 cond-mat.dis-nn cs.LG

Benchmarking Graph Neural Networks in Solving Hard Constraint Satisfaction Problems

Geri Skenderi, Lorenzo Buffoni, Francesco D'Amico, David Machado, Raffaele Marino, Matteo Negri, Federico Ricci-Tersenghi, Carlo Lucibello, Maria Chiara Angelini

2602.18045 2026-03-12 stat.ME cs.AI cs.LG

Conformal Tradeoffs: Operational Profiles Beyond Coverage

Petrus H. Zwart

2602.04472 2026-03-12 math.ST cs.LG math.PR stat.ML stat.TH

Universality of General Spiked Tensor Models

Yanjin Xiang, Zhihua Zhang

Comments 115 pages

2602.04347 2026-03-12 stat.ML cs.LG

A Bandit-Based Approach to Educational Recommender Systems: Contextual Thompson Sampling for Learner Skill Gain Optimization

Lukas De Kerpel, Arthur Thuy, Dries F. Benoit

Comments Accepted for publication in INFORMS Transactions on Education

2602.03850 2026-03-12 cs.HC cs.AI cs.CV

WebAccessVL: Violation-Aware VLM for Web Accessibility

Amber Yijia Zheng, Jae Joong Lee, Bedrich Benes, Raymond A. Yeh

2602.00387 2026-03-12 stat.ML cs.LG stat.AP

Singular Bayesian Neural Networks

Mame Diarra Toure, David A. Stephens

Comments 8 pages Main text, 53 pages Appendix, 20 figures

2601.23034 2026-03-12 math.OC cs.LG

Breaking the Stochasticity Barrier: An Adaptive Variance-Reduced Method for Variational Inequalities

Yungi Jeong, Takumi Otsuka

2601.17374 2026-03-12 stat.ML cs.LG cs.NA math.NA

Error Analysis of Bayesian Inverse Problems with Generative Priors

Bamdad Hosseini, Ziqi Huang

Comments 30 pages, 8 figures

2601.13879 2026-03-12 cs.MM cs.CL cs.CV

Chain-of-Thought Compression Should Not Be Blind: V-Skip for Efficient Multimodal Reasoning via Dual-Path Anchoring

Dongxu Zhang, Yiding Sun, Cheng Tan, Wenbiao Yan, Ning Yang, Jihua Zhu, Haijun Zhang