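arXivDaily: a daily academic digest synced with the full arXiv dataset, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering and systems science, and more.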

2412.01711 2026-03-09 cs.CL

Towards Resource Efficient and Interpretable Bias Mitigation in Large Language Models

Schrasing Tong, Eliott Zemour, Jessica Lu, Rawisara Lohanimit, Lalana Kagal

Comments 38th Conference on Neural Information Processing Systems (NeurIPS 2024) Safe Generative AI Workshop. Updated results in V2

2411.19509 2026-03-09 cs.CV cs.LG cs.SD eess.AS

Ditto: Motion-Space Diffusion for Controllable Realtime Talking Head Synthesis

Tianqi Li, Ruobing Zheng, Minghui Yang, Jingdong Chen, Ming Yang

Comments Project Page: https://digital-avatar.github.io/ai/Ditto/

2411.07019 2026-03-09 cs.CL cs.AI

UniHR: Hierarchical Representation Learning for Unified Knowledge Graph Link Prediction

Zhiqiang Liu, Yin Hua, Mingyang Chen, Yichi Zhang, Zhuo Chen, Lei Liang, Wen Zhang

Comments AAAI 2026 (oral)

2410.09864 2026-03-09 cs.CV

AuthFace: Towards Authentic Blind Face Restoration with Face-oriented Generative Diffusion Prior

Guoqiang Liang, Qingnan Fan, Bingtao Fu, Jinwei Chen, Hong Gu, Lin Wang

Comments ACM MM 25, Codes and datasets are available at https://github.com/EthanLiang99/AuthFace

2409.18300 2026-03-09 cs.CV cs.AI cs.LG cs.RO

FALCON: Future-Aware Learning with Contextual Object-Centric Pretraining for UAV Action Recognition

Ruiqi Xian, Xiyang Wu, Tianrui Guan, Xijun Wang, Boqing Gong, Dinesh Manocha

2409.17137 2026-03-09 cs.LG cs.CV

PACE: Marrying generalization in PArameter-efficient fine-tuning with Consistency rEgularization

Yao Ni, Shan Zhang, Piotr Koniusz

Comments Accepted by NeurIPS 2024 as a spotlight

2409.10328 2026-03-09 cs.CV

Fuse4Seg: Image Fusion for Multi-Modal Medical Segmentation via Bi-level Optimization

Yuchen Guo, Junli Gong, Hongmin Cai, Yiu-ming Cheung, Weifeng Su

2408.01285 2026-03-09 cs.CL cs.CY

Do Prevalent Bias Metrics Capture Allocational Harms from LLMs?

Hannah Cyberey, Yangfeng Ji, David Evans

Comments Accepted to Workshop on Insights from Negative Results in NLP (2025)

2407.10735 2026-03-09 cs.AI cs.CL cs.CY cs.LG

Transforming Agency. On the mode of existence of Large Language Models

Xabier E. Barandiaran, Lola S. Almendros

DOI: 10.1007/s11097-025-10094-3

Abstract

This paper investigates the ontological characterization of Large Language Models (LLMs) like ChatGPT. Between inflationary and deflationary accounts, we pay special attention to their status as agents. This requires explaining in detail the architecture, processing, and training procedures that enable LLMs to display their capacities, and the extensions used to turn LLMs into agent-like systems. After a systematic analysis we conclude that an LLM fails to meet necessary and sufficient conditions for autonomous agency in the light of embodied theories of mind: the individuality condition (it is not the product of its own activity, it is not even directly affected by it), the normativity condition (it does not generate its own norms or goals), and, partially, the interactional asymmetry condition (it is not the origin and sustained source of its interaction with the environment). If not agents, then ... what are LLMs? We argue that ChatGPT should be characterized as an interlocutor or linguistic automaton, a library-that-talks, devoid of (autonomous) agency, but capable of engaging performatively on non-purposeful yet purpose-structured and purpose-bounded tasks. When interacting with humans, a "ghostly" component of the human-machine interaction makes it possible to enact genuine conversational experiences with LLMs. Despite their lack of sensorimotor and biological embodiment, LLMs' textual embodiment (the training corpus) and resource-hungry computational embodiment significantly transform existing forms of human agency. Beyond assisted and extended agency, the LLM-human coupling can produce midtended forms of agency, closer to the production of intentional agency than to the extended instrumentality of any previous technologies.

2407.04117 2026-03-09 cs.LG cond-mat.dis-nn cs.AI cs.NE stat.ML

Predictive Coding Networks and Inference Learning: Tutorial and Survey

Björn van Zwol, Ro Jefferson, Egon L. van den Broek

Comments 47 pages, 11 figures, 9 tables

2403.15048 2026-03-09 cs.CV cs.AI cs.LG cs.MM

Make VLM Recognize Visual Hallucination on Cartoon Character Image with Pose Information

Bumsoo Kim, Wonseop Shin, Kyuchul Lee, Yonghoon Jung, Sanghyun Seo

Comments Accepted at WACV 2025, Project page: https://gh-bumsookim.github.io/Cartoon-Hallucinations-Detection/. (Fixed typos)

2402.10828 2026-03-09 cs.RO cs.AI

RAG-Driver: Generalisable Driving Explanations with Retrieval-Augmented In-Context Learning in Multi-Modal Large Language Model

Jianhao Yuan, Shuyang Sun, Daniel Omeiza, Bo Zhao, Paul Newman, Lars Kunze, Matthew Gadd

Comments 14 pages, 6 figures

2402.06204 2026-03-09 cs.CL cs.AI cs.HC

The Generative AI Paradox on Evaluation: What It Can Solve, It May Not Evaluate

Juhyun Oh, Eunsu Kim, Inha Cha, Alice Oh

2311.14886 2026-03-09 cs.LG cs.IT math.IT

A unified framework for learning with nonlinear model classes from arbitrary linear samples

Ben Adcock, Juan M. Cardenas, Nick Dexter

2310.00342 2026-03-09 cs.CV

RBF Weighted Hyper-Involution for RGB-D Object Detection

Mehfuz A Rahman, Khushal Das, Jiju Poovvancheri, Neil London, Dong Chen

Comments 33 pages, 15 figures

2309.12032 2026-03-09 cs.LG stat.ML

Expert-Aided Causal Discovery of Ancestral Graphs

Tiago da Silva, Bruna Bazaluk, Eliezer de Souza da Silva, António Góis, Salem Lahlou, Dominik Heider, Samuel Kaski, Diego Mesquita, Adèle Helena Ribeiro

2307.02518 2026-03-09 cs.CL cs.CY

Analyzing the Performance of ChatGPT in Cardiology and Vascular Pathologies

Walid Hariri

2304.14680 2026-03-09 cs.LG cs.SY eess.SY

Graph Neural Networks on Factor Graphs for Robust, Fast, and Scalable Linear State Estimation with PMUs

Ognjen Kundacina, Mirsad Cosovic, Dragisa Miskovic, Dejan Vukobratovic

Comments arXiv admin note: substantial text overlap with arXiv:2206.02731

2209.14007 2026-03-09 cs.RO cs.MA

OA-Bug: An Olfactory-Auditory Augmented Bug Algorithm for Swarm Robots in a Denied Environment

Siqi Tan, Xiaoya Zhang, Jingyao Li, Ruitao Jing, Mufan Zhao, Yang Liu, Quan Quan

Comments 7 pages, 6 figures, accepted by 2023 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)

2207.10783 2026-03-09 cs.AI cs.DM

Mean-based incomplete pairwise comparisons method with the reference values

Konrad Kułakowski, Anna Kędzior, Jacek Szybowski, Jiri Mazurek

Comments 36 pages

2201.07798 2026-03-09 cs.LG cs.AI

A Cognitive Explainer for Fetal ultrasound images classifier Based on Medical Concepts

Yingni Wang, Yunxiao Liu, Licong Dong, Xuzhou Wu, Huabin Zhang, Qiongyu Ye, Desheng Sun, Xiaobo Zhou, Kehong Yuan

Comments 9 pages, 5 figures

2603.06413 2026-03-09 cs.SE cs.AI cs.LG

A Reference Architecture of Reinforcement Learning Frameworks

Xiaoran Liu, Istvan David

2603.03992 2026-03-09 cs.CY cs.AI

Measuring AI R&D Automation

Alan Chan, Ranay Padarath, Joe Kwon, Hilary Greaves, Markus Anderljung

2602.16069 2026-03-09 cs.SE cs.LG

The Limits of Long-Context Reasoning in Automated Bug Fixing

Ravi Raju, Mengmeng Ji, Shubhangi Upasani, Bo Li, Urmish Thakker

Comments Accepted to ICLR 2026 ICBINB workshop

Abstract

Rapidly increasing context lengths have led to the assumption that large language models (LLMs) can directly reason over entire codebases. Concurrently, recent advances in LLMs have enabled strong performance on software engineering benchmarks, particularly when paired with agentic workflows. In this work, we systematically evaluate whether current LLMs can reliably perform long-context code debugging and patch generation. Using SWE-bench Verified as a controlled experimental setting, we first evaluate state-of-the-art models within an agentic harness (mini-SWE-agent), where performance improves substantially: GPT-5-nano achieves up to a 31% resolve rate on 100 samples, and open-source models such as Deepseek-R1-0528 obtain competitive results. However, token-level analysis shows that successful agentic trajectories typically remain under 20k-30k tokens, and that longer accumulated contexts correlate with lower success rates, indicating that agentic success primarily arises from task decomposition into short-context steps rather than effective long-context reasoning. To directly test long-context capability, we construct a data pipeline where we artificially inflate the context length of the input by placing the relevant files into the context (ensuring perfect retrieval recall); we then study single-shot patch generation under genuinely long contexts (64k tokens). Despite this setup, performance degrades sharply: Qwen3-Coder-30B-A3B achieves only a 7% resolve rate at 64k context, while GPT-5-nano solves none of the tasks. Qualitative analysis reveals systematic failure modes, including hallucinated diffs, incorrect file targets, and malformed patch headers. Overall, our findings highlight a significant gap between nominal context length and usable context capacity in current LLMs, and suggest that existing agentic coding benchmarks do not meaningfully evaluate long-context reasoning.

2602.10152 2026-03-09 q-bio.GN cs.LG

Validating Interpretability in siRNA Efficacy Prediction: A Perturbation-Based, Dataset-Aware Protocol

Zahra Khodagholi, Niloofar Yousefi

Comments Accepted at the Machine Learning for Genomics Explorations (MLGenX) Workshop at ICLR 2026

2510.20975 2026-03-09 cs.CR cs.AI

REx86: A Local Large Language Model for Assisting in x86 Assembly Reverse Engineering

Darrin Lea, James Ghawaly, Golden Richard, Aisha Ali-Gombe, Andrew Case

Comments Accepted in 2025 Annual Computer Security Applications Conference (ACSAC)

2509.14961 2026-03-09 stat.ML cond-mat.mtrl-sci cs.LG physics.chem-ph

Spectral/Spatial Tensor Atomic Cluster Expansion with Universal Embeddings in Cartesian Space

Zemin Xu, Wenbo Xie, P. Hu

2508.06490 2026-03-09 eess.IV cs.CV cs.LG eess.SP

Multivariate Fields of Experts for Convergent Image Reconstruction

Stanislas Ducotterd, Michael Unser

2505.13531 2026-03-09 cs.CY cs.AI cs.CL

AdAEM: An Adaptively and Automated Extensible Measurement of LLMs' Value Difference

Jing Yao, Shitong Duan, Xiaoyuan Yi, Dongkuan Xu, Peng Zhang, Tun Lu, Ning Gu, Zhicheng Dou, Xing Xie

Comments This paper is accepted by ICLR 2026 (Oral)

2505.02614 2026-03-09 math.OC cs.LG stat.ML

Entropic Mirror Descent for Linear Systems: Polyak's Stepsize and Implicit Bias

Yura Malitsky, Alexander Posch

Comments 20 pages, 2 figures