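arXivDaily: a daily academic digest synced with the full arXiv feed, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and other fields.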

2603.18273 2026-03-20 cs.AI

EDM-ARS: A Domain-Specific Multi-Agent System for Automated Educational Data Mining Research

Chenguang Pan, Zhou Zhang, Weixuan Xiao, Chengyuan Yao

Abstract

In this technical report, we present the Educational Data Mining Automated Research System (EDM-ARS), a domain-specific multi-agent pipeline that automates end-to-end educational data mining (EDM) research. We conceptualize EDM-ARS as a general framework for domain-aware automated research pipelines, where educational expertise is embedded into each stage of the research lifecycle. As a first instantiation of this framework, we focus on predictive modeling tasks. Within this scope, EDM-ARS orchestrates five specialized LLM-powered agents (ProblemFormulator, DataEngineer, Analyst, Critic, and Writer) through a state-machine coordinator that supports revision loops, checkpoint-based recovery, and sandboxed code execution. Given a research prompt and a dataset, EDM-ARS produces a complete LaTeX manuscript with real Semantic Scholar citations, validated machine learning analyses, and automated methodological peer review. We also provide a detailed description of the system architecture, the three-tier data registry design that encodes educational domain expertise, the specification of each agent, the inter-agent communication protocol, and mechanisms for error handling and self-correction. Finally, we discuss current limitations, including single-dataset scope and formulaic paper output, and outline a phased roadmap toward causal inference, transfer learning, psychometrics, and multi-dataset generalization. EDM-ARS is released as an open-source project to support the educational research community.
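
The state-machine coordination with revision loops described in this abstract can be illustrated with a minimal sketch. Only the five agent names come from the abstract. The transition table, the choice to send rejected work back to the Analyst, the `max_revisions` cap, and the `critic_accepts` callback are all assumptions made for illustration, not the EDM-ARS implementation.

```python
from enum import Enum, auto


class Stage(Enum):
    PROBLEM_FORMULATOR = auto()
    DATA_ENGINEER = auto()
    ANALYST = auto()
    CRITIC = auto()
    WRITER = auto()
    DONE = auto()


# Assumed linear order of stages (hypothetical; the abstract does not give the transitions).
NEXT_STAGE = {
    Stage.PROBLEM_FORMULATOR: Stage.DATA_ENGINEER,
    Stage.DATA_ENGINEER: Stage.ANALYST,
    Stage.ANALYST: Stage.CRITIC,
    Stage.CRITIC: Stage.WRITER,
    Stage.WRITER: Stage.DONE,
}


def run_pipeline(critic_accepts, max_revisions=2):
    """Walk the stages and return (visited stage names, revisions used).

    critic_accepts(revision_count) -> bool is a stand-in for the Critic agent.
    """
    stage, revisions, trace = Stage.PROBLEM_FORMULATOR, 0, []
    while stage is not Stage.DONE:
        trace.append(stage.name)
        rejected = stage is Stage.CRITIC and not critic_accepts(revisions)
        if rejected and revisions < max_revisions:
            revisions += 1
            stage = Stage.ANALYST  # revision loop (assumed target stage)
            continue
        # Either accepted, or the revision budget is spent: move forward.
        stage = NEXT_STAGE[stage]
    return trace, revisions


# A Critic stand-in that rejects the first analysis and accepts the second.
trace, revisions = run_pipeline(lambda rev: rev >= 1)
print(trace, revisions)
```

Capping the number of revisions keeps the loop from running forever when the Critic stand-in never accepts; a real coordinator would also persist `stage` and `revisions` to support the checkpoint-based recovery the abstract mentions.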

2603.18272 2026-03-20 cs.AI cs.CL

Retrieval-Augmented LLM Agents: Learning to Learn from Experience

Thomas Palmeira Ferraz, Romain Deffayet, Vassilina Nikoulina, Hervé Déjean, Stéphane Clinchant

2603.18266 2026-03-20 cs.LG cs.AI

Enactor: From Traffic Simulators to Surrogate World Models

Yash Ranjan, Rahul Sengupta, Anand Rangarajan, Sanjay Ranka

Abstract

Traffic microsimulators are widely used to evaluate road network performance under various "what-if" conditions. However, the behavior models controlling the actions of the actors are overly simplistic and fail to capture realistic actor-actor interactions. Deep learning-based methods have been applied to model vehicles and pedestrians as "agents" responding to their surrounding "environment" (including lanes, signals, and neighboring agents). Although effective in learning actor-actor interaction, these approaches fail to generate physically consistent trajectories over long time periods, and they do not explicitly address the complex dynamics that arise at traffic intersections, which are critical locations in urban networks. Inspired by the World Model paradigm, we have developed an actor-centric generative model using a transformer-based architecture that captures actor-actor interaction while also understanding the geometry of the traffic intersection, so that it generates physically grounded trajectories based on learned behavior. Moreover, we test the model in a live "simulation-in-the-loop" setting, where we generate the initial conditions of the actors using SUMO and then let the model control the dynamics of the actors. We let the simulation run for 40000 timesteps (4000 seconds), testing the performance of the model over a long time horizon and evaluating the trajectories on traffic-engineering metrics. Experimental results demonstrate that the proposed framework effectively captures complex actor-actor interactions and generates long-horizon, physically consistent trajectories, while requiring significantly fewer training samples than traditional agent-centric generative approaches. Our model outperforms the baseline on traffic-related as well as aggregate metrics, beating the baseline by more than 10x on KL divergence.

2603.18261 2026-03-20 cs.CV cs.AI

LRConv-NeRV: Low Rank Convolution for Efficient Neural Video Compression

Tamer Shanableh

Comments This work has been submitted to the IEEE for possible publication

2603.18260 2026-03-20 cs.RO

Manufacturing Micro-Patterned Surfaces with Multi-Robot Systems

Annalisa T. Taylor, Malachi Landis, Ping Guo, Todd D. Murphey

2603.18258 2026-03-20 cs.LG cs.AI

Sharpness-Aware Minimization in Logit Space Efficiently Enhances Direct Preference Optimization

Haocheng Luo, Zehang Deng, Thanh-Toan Do, Mehrtash Harandi, Dinh Phung, Trung Le

Comments Accepted at ICLR 2026

2603.18247 2026-03-20 cs.LG

AGRI-Fidelity: Evaluating the Reliability of Listenable Explanations for Poultry Disease Detection

Sindhuja Madabushi, Arda Dogan, Jonathan Liu, Dian Chen, Dong S. Ha, Sook Shin, Sam H. Noh, Jin-Hee Cho

2603.18246 2026-03-20 cs.RO

Rapid Adaptation of Particle Dynamics for Generalized Deformable Object Mobile Manipulation

Bohan Wu, Roberto Martín-Martín, Li Fei-Fei

Comments 8 pages, ICRA 2026

Abstract

We address the challenge of learning to manipulate deformable objects with unknown dynamics. In non-rigid objects, the dynamics parameters define how they react to interactions -- how they stretch, bend, compress, and move -- and they are critical to determining the optimal actions to perform a manipulation task successfully. In other robotic domains, such as legged locomotion and in-hand rigid object manipulation, state-of-the-art approaches can handle unknown dynamics using Rapid Motor Adaptation (RMA). Through a supervised procedure in simulation that encodes each rigid object's dynamics, such as mass and position, these approaches learn a policy that conditions actions on a vector of latent dynamic parameters inferred from sequences of state-actions. However, in deformable object manipulation, the object's dynamics include not only its mass and position but also how the shape of the object changes. Our key insight is that the recent ground-truth particle positions of a deformable object in simulation capture changes in the object's shape, making it possible to extend RMA to deformable object manipulation. This key insight allows us to develop RAPiD, a two-phase method that learns to perform real-robot deformable object mobile manipulation by: 1) learning a visuomotor policy conditioned on the object's dynamics embedding, which is encoded from the object's privileged information in simulation, such as its mass and ground-truth particle positions, and 2) learning to infer this embedding using non-privileged information instead, such as robot visual observations and actions, so that the learned policy can transfer to the real world. On a mobile manipulator with 22 degrees of freedom, RAPiD achieves success rates above 80% across two vision-based deformable object mobile manipulation tasks in the real world, under various object dynamics, categories, and instances.

2603.18238 2026-03-20 cs.RO

ReDAG-RT: Global Rate-Priority Scheduling for Real-Time Multi-DAG Execution in ROS 2

Md. Mehedi Hasan, Rafid Mostafiz, Bikash Kumar Paul, Md. Abir Hossain, Ziaur Rahman

Comments 12 pages, 6 figures

Abstract

ROS 2 has become a dominant middleware for robotic systems, where perception, estimation, planning, and control pipelines are structured as directed acyclic graphs of callbacks executed under a shared executor. However, default ROS 2 executors use best-effort dispatch without cross-DAG priority enforcement, leading to callback contention, structural priority inversion, and deadline instability under concurrent workloads. These limitations restrict deployment in time-critical and safety-sensitive cyber-physical systems. This paper presents ReDAG-RT, a user-space global scheduling framework for deterministic multi-DAG execution in unmodified ROS 2. The framework introduces a Rate-Priority driven global ready queue that orders callbacks by activation rate, enforces per-DAG concurrency bounds, and mitigates cross-graph priority inversion without modifying the ROS 2 API, executor interface, or underlying operating system scheduler. We formalize a multi-DAG task model for ROS 2 callback pipelines and analyze cross-DAG interference under Rate-Priority scheduling. Response-time recurrences and schedulability conditions are derived within classical Rate-Monotonic theory. Experiments in a ROS 2 Humble environment compare ReDAG-RT against SingleThreadedExecutor and MultiThreadedExecutor using synthetic multi-DAG workloads. Results show up to 29.7 percent reduction in deadline miss rate, 42.9 percent reduction in 99th percentile response time, and 13.7 percent improvement over MultiThreadedExecutor under comparable utilization. Asymmetric per-DAG concurrency bounds further reduce interference by 40.8 percent. These results demonstrate that deterministic and analyzable multi-DAG scheduling can be achieved entirely in the ROS 2 user-space execution layer, providing a practical foundation for real-time robotic middleware in safety-critical systems.
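
The idea of a global ready queue that orders callbacks by activation rate and enforces per-DAG concurrency bounds can be sketched in a few lines. This is a toy model in plain Python, not the paper's code and not the ROS 2 executor API: the class `RatePriorityQueue`, its methods, the FIFO tie-break, and the example DAG names and rates are all hypothetical.

```python
import heapq
from itertools import count


class RatePriorityQueue:
    """Toy global ready queue: higher activation rate dispatches first."""

    def __init__(self, dag_bounds):
        self._heap = []
        self._seq = count()               # FIFO tie-breaker for equal rates (assumption)
        self._bounds = dict(dag_bounds)   # max concurrently running callbacks per DAG
        self._running = {dag: 0 for dag in self._bounds}

    def push(self, name, dag, rate_hz):
        # heapq is a min-heap, so negate the rate to pop the highest rate first.
        heapq.heappush(self._heap, (-rate_hz, next(self._seq), name, dag))

    def pop_dispatchable(self):
        """Return (name, dag) for the highest-rate callback whose DAG is under its bound, else None."""
        skipped, chosen = [], None
        while self._heap:
            entry = heapq.heappop(self._heap)
            _, _, name, dag = entry
            if self._running[dag] < self._bounds[dag]:
                self._running[dag] += 1
                chosen = (name, dag)
                break
            skipped.append(entry)         # DAG is at its bound; keep it queued
        for entry in skipped:
            heapq.heappush(self._heap, entry)
        return chosen

    def finish(self, dag):
        """Mark one running callback of `dag` as completed."""
        self._running[dag] -= 1


queue = RatePriorityQueue({"perception": 1, "control": 2})
queue.push("detect", "perception", 30.0)
queue.push("track", "perception", 30.0)
queue.push("pid", "control", 100.0)
queue.push("plan", "control", 10.0)

order = []
while (item := queue.pop_dispatchable()) is not None:
    order.append(item[0])
print(order)  # "track" stays queued until a perception callback finishes
```

In this toy run the 100 Hz control callback is dispatched ahead of everything else, and the second perception callback waits because its DAG is capped at one running callback, which is the kind of cross-DAG isolation the abstract attributes to per-DAG bounds.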

2603.18237 2026-03-20 cs.LG cs.AI

Gradient-Informed Temporal Sampling Improves Rollout Accuracy in PDE Surrogate Training

Wenshuo Wang, Fan Zhang

2603.18218 2026-03-20 cs.CV cs.RO

Semantic Segmentation and Depth Estimation for Real-Time Lunar Surface Mapping Using 3D Gaussian Splatting

Guillem Casadesus Vila, Adam Dai, Grace Gao

2603.18210 2026-03-20 cs.RO

GoalVLM: VLM-driven Object Goal Navigation for Multi-Agent System

MoniJesu James, Amir Atef Habel, Aleksey Fedoseev, Dzmitry Tsetserokou

Comments 8 pages, 5 figures

2603.18201 2026-03-20 cs.AI stat.CO

A Computationally Efficient Learning of Artificial Intelligence System Reliability Considering Error Propagation

Fenglian Pan, Yinwei Zhang, Yili Hong, Larry Head, Jian Liu

Comments 42 pages, 11 figures

2603.18197 2026-03-20 cs.AI cs.CR cs.NI

Access Controlled Website Interaction for Agentic AI with Delegated Critical Tasks

Sunyoung Kim, Hokeun Kim

2603.18192 2026-03-20 cs.CV

MicroVision: An Open Dataset and Benchmark Models for Detecting Vulnerable Road Users and Micromobility Vehicles

Alexander Rasch, Rahul Rajendra Pai

2603.18189 2026-03-20 cs.AI

TeachingCoach: A Fine-Tuned Scaffolding Chatbot for Instructional Guidance to Instructors

Isabel Molnar, Peiyu Li, Si Chen, Sugana Chawla, James Lang, Ronald Metoyer, Ting Hua, Nitesh V. Chawla

2603.18184 2026-03-20 cs.CL

CWoMP: Morpheme Representation Learning for Interlinear Glossing

Morris Alper, Enora Rice, Bhargav Shandilya, Alexis Palmer, Lori Levin

Comments Project page: http://cwomp.github.io

2603.18174 2026-03-20 cs.LG

Conflict-Free Policy Languages for Probabilistic ML Predicates: A Framework and Case Study with the Semantic Router DSL

Xunzhuo Liu, Hao Wu, Huamin Chen, Bowei He, Xue Liu

Comments Work in progress

2603.18173 2026-03-20 cs.CL

GRAFITE: Generative Regression Analysis Framework for Issue Tracking and Evaluation

Ja Young Lee, Mírian Silva, Mohamed Nasr, Shonda Witherspoon, Enzo Bozzani, Veronique Demers, Radha Ratnaparkhi, Hui Wu, Sara Rosenthal

Comments 7 pages, 2 figures

2603.18171 2026-03-20 cs.CL

Modeling the human lexicon under temperature variations: linguistic factors, diversity and typicality in LLM word associations

Maria Andueza Rodriguez, Marie Candito, Richard Huyghe

Comments 11 pages, 12 figures, to appear in LREC 2026

2603.18166 2026-03-20 cs.AI

Efficient Dense Crowd Trajectory Prediction Via Dynamic Clustering

Antonius Bima Murti Wijaya, Paul Henderson, Marwa Mahmoud

2603.18161 2026-03-20 cs.CL cs.AI

How LLMs Distort Our Written Language

Marwa Abdulhai, Isadora White, Yanming Wan, Ibrahim Qureshi, Joel Leibo, Max Kleiman-Weiner, Natasha Jaques

2603.18130 2026-03-20 cs.RO cs.AI

Final Report for the Workshop on Robotics & AI in Medicine

Juan P Wachs

Comments 51 pages, 5 figures

2603.18124 2026-03-20 cs.CL

Evaluating FrameNet-Based Semantic Modeling for Gender-Based Violence Detection in Clinical Records

Lívia Dutra, Arthur Lorenzi, Frederico Belcavello, Ely Matos, Marcelo Viridiano, Lorena Larré, Olívia Guaranha, Erik Santos, Sofia Reinach, Pedro de Paula, Tiago Torrent

Comments Paper accepted to the Lang4Health Workshop at PROPOR 2026

2603.18122 2026-03-20 cs.AI cs.HC cs.PL cs.SY eess.SY

Don't Vibe Code, Do Skele-Code: Interactive No-Code Notebooks for Subject Matter Experts to Build Lower-Cost Agentic Workflows

Sriram Gopalakrishnan

Comments Main paper 9 pages. Topics: Agentic Coding, HCI, LLMs, Workflows

2603.18118 2026-03-20 cs.CV cs.AI cs.LG

Insight-V++: Towards Advanced Long-Chain Visual Reasoning with Multimodal Large Language Models

Yuhao Dong, Zuyan Liu, Shulin Tian, Yongming Rao, Ziwei Liu

Comments arXiv admin note: text overlap with arXiv:2411.14432

Abstract

Large Language Models (LLMs) have achieved remarkable reliability and advanced capabilities through extended test-time reasoning. However, extending these capabilities to Multi-modal Large Language Models (MLLMs) remains a significant challenge due to a critical scarcity of high-quality, long-chain reasoning data and optimized training pipelines. To bridge this gap, we present a unified multi-agent visual reasoning framework that systematically evolves from our foundational image-centric model, Insight-V, into a generalized spatial-temporal architecture, Insight-V++. We first propose a scalable data generation pipeline equipped with multi-granularity assessment that autonomously synthesizes structured, complex reasoning trajectories across image and video domains without human intervention. Recognizing that directly supervising MLLMs with such intricate data yields sub-optimal results, we design a dual-agent architecture comprising a reasoning agent to execute extensive analytical chains, and a summary agent to critically evaluate and distill final outcomes. While our initial framework utilized Direct Preference Optimization (DPO), its off-policy nature fundamentally constrained reinforcement learning potential. To overcome these limitations, particularly for long-horizon video understanding, Insight-V++ introduces two novel algorithms, ST-GRPO and J-GRPO, which enhance spatial-temporal reasoning and improve evaluative robustness. Crucially, by leveraging reliable feedback from the summary agent, we guide an iterative reasoning path generation process, retraining the entire multi-agent system in a continuous, self-improving loop. Extensive experiments on base models like LLaVA-NeXT and Qwen2.5-VL demonstrate significant performance gains across challenging image and video reasoning benchmarks while preserving strong capabilities on traditional perception-focused tasks.

2603.18115 2026-03-20 cs.LG cs.AI

LLM-Augmented Computational Phenotyping of Long Covid

Jing Wang, Jie Shen, Amar Sra, Qiaomin Xie, Jeremy C Weiss

2603.18112 2026-03-20 cs.LG cs.AI

Tula: Optimizing Time, Cost, and Generalization in Distributed Large-Batch Training

Sahil Tyagi, Feiyi Wang

2603.18111 2026-03-20 cs.LG stat.ML

BoundAD: Boundary-Aware Negative Generation for Time Series Anomaly Detection

Xiancheng Wang, Lin Wang, Zhibo Zhang, Rui Wang, Minghang Zhao

2603.18108 2026-03-20 cs.CV

From Concepts to Judgments: Interpretable Image Aesthetic Assessment

Xiao-Chang Liu, Johan Wagemans

Comments 12 pages, 8 figures