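arXivDaily daily academic digest: synced with the full arXiv dataset, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and more.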

RPG: A Repository Planning Graph for Unified and Scalable Codebase Generation

Jane Luo, Xin Zhang, Steven Liu, Jie Wu, Jianfeng Liu, Yiming Huang, Yangyu Huang, Chengyu Yin, Ying Xin, Yuefeng Zhan, Hao Sun, Qi Chen, Scarlett Li, Mao Yang

English abstract

Large language models excel at generating individual functions or single files of code, yet generating complete repositories from scratch remains a fundamental challenge. This capability is key to building coherent software systems from high-level specifications and realizing the full potential of automated code generation. The process requires planning at two levels: deciding what features and modules to build (proposal stage) and defining their implementation details (implementation stage). Current approaches rely on natural language planning, which often produces unclear specifications, misaligned components, and brittle designs due to its inherent ambiguity and lack of structure. To address these limitations, we introduce the Repository Planning Graph (RPG), a structured representation that encodes capabilities, file structures, data flows, and functions in a unified graph. By replacing free-form natural language with an explicit blueprint, RPG enables consistent long-horizon planning for repository generation. Building on RPG, we develop ZeroRepo, a graph-driven framework that operates in three stages: proposal-level planning, implementation-level construction, and graph-guided code generation with test validation. To evaluate, we construct RepoCraft, a benchmark of six real-world projects with 1,052 tasks. On RepoCraft, ZeroRepo produces nearly 36K Code Lines and 445K Code Tokens, on average 3.9× larger than the strongest baseline (Claude Code), and 68× larger than other baselines. It achieves 81.5% coverage and 69.7% test accuracy, improving over Claude Code by 27.3 and 35.8 points. Further analysis shows that RPG models complex dependencies, enables more sophisticated planning through near-linear scaling, and improves agent understanding of repositories, thus accelerating localization. Our data and code are available at https://github.com/microsoft/RPG-ZeroRepo.
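
The abstract describes RPG only at a high level, so the following is a minimal illustrative sketch and not the authors' implementation. It assumes a graph whose nodes are capabilities, files, and functions, and whose edges record data flows, and it uses a topological sort to pick an order in which functions could be generated. The names `Node`, `PlanningGraph`, `flow`, and `generation_order` are all hypothetical.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter
from typing import Optional


@dataclass
class Node:
    """One planning unit. `kind` is a hypothetical label:
    'capability', 'file', or 'function'."""
    name: str
    kind: str
    parent: Optional[str] = None  # containment, e.g. a function inside a file


@dataclass
class PlanningGraph:
    nodes: dict = field(default_factory=dict)
    # data_flows[consumer] = set of producers that must be built first
    data_flows: dict = field(default_factory=dict)

    def add(self, name, kind, parent=None):
        self.nodes[name] = Node(name, kind, parent)
        self.data_flows.setdefault(name, set())

    def flow(self, producer, consumer):
        """Record that `consumer` depends on data produced by `producer`."""
        self.data_flows[consumer].add(producer)

    def generation_order(self):
        """Return node names so that every producer precedes its consumers."""
        return list(TopologicalSorter(self.data_flows).static_order())


graph = PlanningGraph()
graph.add("data_pipeline", "capability")
graph.add("loader.py", "file", parent="data_pipeline")
graph.add("load_csv", "function", parent="loader.py")
graph.add("normalize", "function", parent="loader.py")
graph.add("train", "function", parent="loader.py")
graph.flow("load_csv", "normalize")
graph.flow("normalize", "train")

order = graph.generation_order()
print(order)
```

In this toy graph, `load_csv` is always scheduled before `normalize`, which is scheduled before `train`. The relative position of nodes with no data-flow edges is left to the sorter.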

2509.14832 2026-02-16 cs.LG cs.AI cs.SY eess.SY

Diffusion-Based Scenario Tree Generation for Multivariate Time Series Prediction and Multistage Stochastic Optimization

Stelios Zarifis, Ioannis Kordonis, Petros Maragos

Comments 5 pages, 2 figures, 1 table, and 1 algorithm. This version is submitted to the 34th EURASIP European Signal Processing Conference 2026 (EUSIPCO 2026), to be held in Bruges, Belgium, on August 31 - September 4, 2026

2509.03834 2026-02-16 cs.LG cs.AI cs.GT

From Leiden to Pleasure Island: The Constant Potts Model for Community Detection as a Hedonic Game

Lucas Lopes Felipe, Konstantin Avrachenkov, Daniel Sadoc Menasche

Comments Manuscript submitted to Physica A: Statistical Mechanics and its Applications

Journal ref Felipe, L. L., Avrachenkov, K., & Menasché, D. S. (2025). From Leiden to Pleasure Island: The Constant Potts Model for community detection as a hedonic game. Physica A: Statistical Mechanics and its Applications, 130989

2508.16371 2026-02-16 cs.CL

The Mediomatix Corpus: Parallel Data for Romansh Language Varieties via Comparable Schoolbooks

Zachary Hopton, Jannis Vamvas, Andrin Büchler, Anna Rutkiewicz, Rico Cathomas, Rico Sennrich

2508.12685 2026-02-16 cs.CL cs.AI cs.LG

ToolACE-MT: Non-Autoregressive Generation for Agentic Multi-Turn Interaction

Xingshan Zeng, Weiwen Liu, Lingzhi Wang, Liangyou Li, Fei Mi, Yasheng Wang, Lifeng Shang, Xin Jiang, Qun Liu

Comments Accepted by ICLR 2026

2508.11850 2026-02-16 cs.AI

EvoCut: Strengthening Integer Programs via Evolution-Guided Language Models

Milad Yazdani, Mahdi Mostajabdaveh, Samin Aref, Zirui Zhou

2508.10587 2026-02-16 cs.LG cs.NA eess.SP math.NA

Self-Supervised Temporal Super-Resolution of Energy Data using Generative Adversarial Transformer

Xuanhao Mu, Gökhan Demirel, Yuzhe Zhang, Jianlei Liu, Thorsten Schlachter, Veit Hagenmeyer

Comments The authors have identified a critical error in the experimental setup (data leakage in the training/validation split) that invalidates the self-supervised learning claims presented in this version. The results are therefore unreliable

2508.08236 2026-02-16 cs.CL cs.CY

Exploring Safety Alignment Evaluation of LLMs in Chinese Mental Health Dialogues via LLM-as-Judge

Yunna Cai, Fan Wang, Haowei Wang, Kun Wang, Kailai Yang, Sophia Ananiadou, Moyan Li, Mingming Fan

2508.07388 2026-02-16 cs.AI

Invert4TVG: A Temporal Video Grounding Framework with Inversion Tasks Preserving Action Understanding Ability

Zhaoyu Chen, Hongnan Lin, Yongwei Nie, Fei Ma, Xuemiao Xu, Fei Yu, Chengjiang Long

2507.19593 2026-02-16 cs.AI cs.MA

A Survey on Hypergame Theory: Modeling Misaligned Perceptions and Nested Beliefs for Multi-agent Systems

Vince Trencsenyi, Agnieszka Mensfelt, Kostas Stathis

English abstract

Classical game-theoretic models typically assume rational agents, complete information, and common knowledge of payoffs - assumptions that are often violated in real-world MAS characterized by uncertainty, misaligned perceptions, and nested beliefs. To overcome these limitations, researchers have proposed extensions that incorporate models of cognitive constraints, subjective beliefs, and heterogeneous reasoning. Among these, hypergame theory extends the classical paradigm by explicitly modeling agents' subjective perceptions of the strategic scenario, known as perceptual games, in which agents may hold divergent beliefs about the structure, payoffs, or available actions. We present a systematic review of agent-compatible applications of hypergame theory, examining how its descriptive capabilities have been adapted to dynamic and interactive MAS contexts. We analyze 44 selected studies from cybersecurity, robotics, social simulation, communications, and general game-theoretic modeling. Building on a formal introduction to hypergame theory and its two major extensions - hierarchical hypergames and HNF - we develop agent-compatibility criteria and an agent-based classification framework to assess integration patterns and practical applicability. Our analysis reveals prevailing tendencies, including the prevalence of hierarchical and graph-based models in deceptive reasoning and the simplification of extensive theoretical frameworks in practical applications. We identify structural gaps, including the limited adoption of HNF-based models, the lack of formal hypergame languages, and unexplored opportunities for modeling human-agent and agent-agent misalignment. By synthesizing trends, challenges, and open research directions, this review provides a new roadmap for applying hypergame theory to enhance the realism and effectiveness of strategic modeling in dynamic multi-agent environments.

2507.16696 2026-02-16 cs.LG cs.AI cs.MM cs.SD

FISHER: A Foundation Model for Multi-Modal Industrial Signal Comprehensive Representation

Pingyi Fan, Anbai Jiang, Shuwei Zhang, Zhiqiang Lv, Bing Han, Xinhu Zheng, Wenrui Liang, Junjie Li, Wei-Qiang Zhang, Yanmin Qian, Xie Chen, Cheng Lu, Jia Liu

Comments 11 pages, 6 figures. FISHER is open-sourced at https://github.com/jianganbai/FISHER; RMIS is open-sourced at https://github.com/jianganbai/RMIS

2507.03886 2026-02-16 cs.CV

ArmGS: Composite Gaussian Appearance Refinement for Modeling Dynamic Urban Environments

Guile Wu, Dongfeng Bai, Bingbing Liu

Comments ICRA 2026

2507.02310 2026-02-16 cs.LG cs.AI cs.CV

Holistic Continual Learning under Concept Drift with Adaptive Memory Realignment

Alif Ashrafee, Jedrzej Kozal, Michal Wozniak, Bartosz Krawczyk

Comments Published in Transactions on Machine Learning Research (TMLR), 01/2026. https://openreview.net/forum?id=1drDlt0CLM

English abstract

Traditional continual learning methods prioritize knowledge retention and focus primarily on mitigating catastrophic forgetting, implicitly assuming that the data distribution of previously learned tasks remains static. This overlooks the dynamic nature of real-world data streams, where concept drift permanently alters previously seen data and demands both stability and rapid adaptation. We introduce a holistic framework for continual learning under concept drift that simulates realistic scenarios by evolving task distributions. As a baseline, we consider Full Relearning (FR), in which the model is retrained from scratch on newly labeled samples from the drifted distribution. While effective, this approach incurs substantial annotation and computational overhead. To address these limitations, we propose Adaptive Memory Realignment (AMR), a lightweight alternative that equips rehearsal-based learners with a drift-aware adaptation mechanism. AMR selectively removes outdated samples of drifted classes from the replay buffer and repopulates it with a small number of up-to-date instances, effectively realigning memory with the new distribution. This targeted resampling matches the performance of FR while reducing the need for labeled data and computation by orders of magnitude. To enable reproducible evaluation, we introduce four concept drift variants of standard vision benchmarks, where previously seen classes reappear with shifted representations. Comprehensive experiments on these datasets using several rehearsal-based baselines show that AMR consistently counters concept drift, maintaining high accuracy with minimal overhead. These results position AMR as a scalable solution that reconciles stability and plasticity in non-stationary continual learning environments. Full implementation of our framework and benchmark datasets is available at: github.com/AlifAshrafee/CL-Under-Concept-Drift.
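
The buffer realignment step described in the abstract can be sketched as follows. This is a rough illustration under stated assumptions, not the authors' code: the replay buffer is modelled as a plain list of `(sample, label)` pairs, and the replacement instances are drawn uniformly at random, which the abstract does not specify. The function name `realign_buffer` and its parameters are hypothetical.

```python
import random


def realign_buffer(buffer, drifted_classes, fresh_samples, per_class, rng=None):
    """Drop buffered samples of drifted classes, then refill each drifted
    class with up to `per_class` newly labeled samples.

    Uniform random selection is an assumption made for this sketch.
    """
    rng = rng or random.Random(0)
    drifted = set(drifted_classes)

    # Keep everything that belongs to classes that did not drift.
    kept = [(x, y) for x, y in buffer if y not in drifted]

    # Repopulate each drifted class from the up-to-date samples.
    for label in sorted(drifted):
        candidates = [(x, y) for x, y in fresh_samples if y == label]
        kept.extend(rng.sample(candidates, min(per_class, len(candidates))))
    return kept


buffer = [("old_cat_1", "cat"), ("old_cat_2", "cat"), ("dog_1", "dog")]
fresh = [("new_cat_1", "cat"), ("new_cat_2", "cat"), ("new_cat_3", "cat")]

realigned = realign_buffer(buffer, {"cat"}, fresh, per_class=2)
print(realigned)
```

After the call, the `dog` sample is untouched, both outdated `cat` samples are gone, and two of the three fresh `cat` samples have taken their place.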

2506.22890 2026-02-16 cs.CV cs.CR

CP-uniGuard: A Unified, Probability-Agnostic, and Adaptive Framework for Malicious Agent Detection and Defense in Multi-Agent Embodied Perception Systems

Senkang Hu, Yihang Tao, Guowen Xu, Xinyuan Qian, Yiqin Deng, Xianhao Chen, Sam Tak Wu Kwong, Yuguang Fang

Comments Accepted by IEEE Transactions on Mobile Computing (TMC)

2506.13652 2026-02-16 cs.LG stat.ML

PeakWeather: MeteoSwiss Weather Station Measurements for Spatiotemporal Deep Learning

Daniele Zambon, Michele Cattaneo, Ivan Marisca, Jonas Bhend, Daniele Nerini, Cesare Alippi

2506.06977 2026-02-16 cs.LG cs.AI

Discovering Hierarchy-Grounded Domains with Adaptive Granularity for Clinical Domain Generalization

Pengfei Hu, Xiaoxue Han, Fei Wang, Yue Ning

2506.06027 2026-02-16 cs.CV cs.LG

Sample-Specific Noise Injection For Diffusion-Based Adversarial Purification

Yuhao Sun, Jiacheng Zhang, Zesheng Ye, Chaowei Xiao, Feng Liu

2506.05325 2026-02-16 cs.LG

Quasiparticle Interference Kernel Extraction with Variational Autoencoders via Latent Alignment

Yingshuai Ji, Haomin Zhuang, Matthew Toole, James McKenzie, Xiaolong Liu, Xiangliang Zhang

2505.23381 2026-02-16 cs.AI

AutoGPS: Automated Geometry Problem Solving via Multimodal Formalization and Deductive Reasoning

Bowen Ping, Minnan Luo, Zhuohang Dang, Chenxi Wang, Chengyou Jia

2505.22650 2026-02-16 cs.LG

On Learning Verifiers and Implications to Chain-of-Thought Reasoning

Maria-Florina Balcan, Avrim Blum, Zhiyuan Li, Dravyansh Sharma

Comments 26 pages, NeurIPS 2025

2505.16348 2026-02-16 cs.CL

Embodied Agents Meet Personalization: Investigating Challenges and Solutions Through the Lens of Memory Utilization

Taeyoon Kwon, Dongwook Choi, Hyojun Kim, Sunghwan Kim, Seungjun Moon, Beong-woo Kwak, Kuan-Hao Huang, Jinyoung Yeo

Comments Accepted at ICLR 2026

2505.14381 2026-02-16 cs.AI

SCAN: Semantic Document Layout Analysis for Textual and Visual Retrieval-Augmented Generation

Nobuhiro Ueda, Yuyang Dong, Krisztián Boros, Daiki Ito, Takuya Sera, Masafumi Oyamada

2505.08259 2026-02-16 cs.CV

CNN and ViT Efficiency Study on Tiny ImageNet and DermaMNIST Datasets

Aidar Amangeldi, Angsar Taigonyrov, Muhammad Huzaifa Jawad, Chinedu Emmanuel Mbonu

2504.17052 2026-02-16 cs.CL

PReSS: A Black-Box Framework for Evaluating Political Stance Stability in LLMs via Argumentative Pressure

Shariar Kabir, Kevin Esterling, Yue Dong

Comments 13 pages, 8 figures

2504.16585 2026-02-16 cs.LG stat.CO stat.ML

Leveraging Noisy Manual Labels as Useful Information: An Information Fusion Approach for Enhanced Variable Selection in Penalized Logistic Regression

Xiaofei Wu, Rongmei Liangse

2503.19140 2026-02-16 cs.RO cs.SY eess.SY

Dom, cars don't fly! -- Or do they? In-Air Vehicle Maneuver for High-Speed Off-Road Navigation

Anuj Pokhrel, Aniket Datar, Xuesu Xiao

Comments 8 Pages, 4 Figures

2503.03704 2026-02-16 cs.LG

Memory Injection Attacks on LLM Agents via Query-Only Interaction

Shen Dong, Shaochen Xu, Pengfei He, Yige Li, Jiliang Tang, Tianming Liu, Hui Liu, Zhen Xiang

Comments Code released

2503.01159 2026-02-16 cs.CL

Large Language Models for Healthcare Text Classification: A Systematic Review

Hajar Sakai, Sarah S. Lam

DOI: 10.2196/79202

English abstract

Large Language Models (LLMs) have fundamentally transformed approaches to Natural Language Processing (NLP) tasks across diverse domains. In healthcare, accurate and cost-efficient text classification is crucial, whether for clinical notes analysis, diagnosis coding, or any other task, and LLMs present promising potential. Text classification has always faced multiple challenges, including manual annotation for training, handling imbalanced data, and developing scalable approaches. With healthcare, additional challenges are added, particularly the critical need to preserve patients' data privacy and the complexity of the medical terminology. Numerous studies have been conducted to leverage LLMs for automated healthcare text classification and contrast the results with existing machine learning-based methods where embedding, annotation, and training are traditionally required. Existing systematic reviews about LLMs either do not specialize in text classification or do not focus on the healthcare domain. This research synthesizes and critically evaluates the current evidence found in the literature regarding the use of LLMs for text classification in a healthcare setting. Major databases (e.g., Google Scholar, Scopus, PubMed, Science Direct) and other resources were queried, which focused on the papers published between 2018 and 2024 within the framework of PRISMA guidelines, which resulted in 65 eligible research articles. These were categorized by text classification type (e.g., binary classification, multi-label classification), application (e.g., clinical decision support, public health and opinion analysis), methodology, type of healthcare text, and metrics used for evaluation and validation. This review reveals the existing gaps in the literature and suggests future research lines that can be investigated and explored.

2503.00736 2026-02-16 cs.CV

Unifying Multiple Foundation Models for Advanced Computational Pathology

Wenhui Lei, Yusheng Tan, Anqi Li, Hanyu Chen, Hengrui Tian, Ruiying Li, Zhengqun Jiang, Fang Yan, Xiaofan Zhang, Shaoting Zhang

Comments 50 pages, 5 main figures

2502.17822 2026-02-16 cs.CV

Easy-Poly: An Easy Polyhedral Framework For 3D Multi-Object Tracking

Peng Zhang, Xin Li, Xin Lin, Liang He

Comments 8 pages, 4 figures, 6 tables

RPG: A Repository Planning Graph for Unified and Scalable Codebase Generation
