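arXivDaily daily academic digest: synced with the full arXiv dataset, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and more.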

2603.19935 2026-03-23 cs.LG

Memori: A Persistent Memory Layer for Efficient, Context-Aware LLM Agents

Luiz C. Borro, Luiz A. B. Macarini, Gordon Tindall, Michael Montero, Adam B. Struck

Comments 9 pages; 2 figures; white paper

Details

English abstract

As large language models (LLMs) evolve into autonomous agents, persistent memory at the API layer is essential for enabling context-aware behavior across LLMs and multi-session interactions. Existing approaches force vendor lock-in and rely on injecting large volumes of raw conversation into prompts, leading to high token costs and degraded performance. We introduce Memori, an LLM-agnostic persistent memory layer that treats memory as a data structuring problem. Its Advanced Augmentation pipeline converts unstructured dialogue into compact semantic triples and conversation summaries, enabling precise retrieval and coherent reasoning. Evaluated on the LoCoMo benchmark, Memori achieves 81.95% accuracy, outperforming existing memory systems while using only 1,294 tokens per query (~5% of full context). This results in substantial cost reductions, including 67% fewer tokens than competing approaches and over 20x savings compared to full-context methods. These results show that effective memory in LLM agents depends on structured representations instead of larger context windows, enabling scalable and cost-efficient deployment.
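The token figures quoted in the abstract can be cross-checked with simple arithmetic. The sketch below is not from the paper: the helper name `implied_baseline` is hypothetical, and it assumes that "~5% of full context" and "67% fewer tokens" are both ratios of per-query token counts. Under that reading, 1,294 tokens at about 5% implies a full context of roughly 25,880 tokens, which agrees with the stated savings of about 20x.

```python
def implied_baseline(tokens_used: int, percent_of_baseline: float) -> float:
    """Baseline token count implied when tokens_used is the given percent of it."""
    return tokens_used * 100 / percent_of_baseline


# Figures taken from the abstract.
MEMORI_TOKENS = 1294          # tokens per query
PERCENT_OF_FULL_CONTEXT = 5   # "~5% of full context"
PERCENT_FEWER_THAN_RIVAL = 67  # "67% fewer tokens than competing approaches"

# Assumption: both percentages are ratios of per-query token counts.
full_context = implied_baseline(MEMORI_TOKENS, PERCENT_OF_FULL_CONTEXT)
competitor = implied_baseline(MEMORI_TOKENS, 100 - PERCENT_FEWER_THAN_RIVAL)
reduction_factor = full_context / MEMORI_TOKENS

print(f"implied full-context tokens per query: {full_context:.0f}")
print(f"implied competitor tokens per query:   {competitor:.0f}")
print(f"token reduction vs. full context:      {reduction_factor:.1f}x")
```

Because the abstract gives the 5% figure only approximately, the derived values are estimates, not numbers reported by the authors.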

2603.19931 2026-03-23 cs.CL

SAGE: Sustainable Agent-Guided Expert-tuning for Culturally Attuned Translation in Low-Resource Southeast Asia

Zhixiang Lu, Chong Zhang, Yulong Li, Angelos Stefanidis, Anh Nguyen, Imran Razzak, Jionglong Su, Zhengyong Jiang

Comments Accepted by WWW 2026

2603.19926 2026-03-23 cs.CV

SegVGGT: Joint 3D Reconstruction and Instance Segmentation from Multi-View Images

Jinyuan Qu, Hongyang Li, Lei Zhang

2603.19921 2026-03-23 cs.CL cs.AI

Span-Level Machine Translation Meta-Evaluation

Stefano Perrella, Eric Morales Agostinho, Hugo Zaragoza

Comments 18 pages, 4 figures

2603.19920 2026-03-23 cs.CV

PanORama: Multiview Consistent Panoptic Segmentation in Operating Rooms

Tuna Gürbüz, Ege Özsoy, Tony Danjun Wang, Nassir Navab

2603.19918 2026-03-23 cs.CV cs.AI

Learning Like Humans: Analogical Concept Learning for Generalized Category Discovery

Jizhou Han, Chenhao Ding, Yuhang He, Qiang Wang, Shaokun Wang, SongLin Dong, Yihong Gong

Comments Accepted by CVPR 2026

2603.19896 2026-03-23 cs.AI

Utility-Guided Agent Orchestration for Efficient LLM Tool Use

Boyan Liu, Gongming Zhao, Hongli Xu

2603.19888 2026-03-23 cs.LG cs.AI

Integrating Meta-Features with Knowledge Graph Embeddings for Meta-Learning

Antonis Klironomos, Ioannis Dasoulas, Francesco Periti, Mohamed Gad-Elrab, Heiko Paulheim, Anastasia Dimou, Evgeny Kharlamov

2603.19879 2026-03-23 cs.LG

Discovery of Decision Synchronization Patterns from Event Logs

Tijmen Kuijpers, Karolin Winter, Remco Dijkman

2603.19873 2026-03-23 cs.CV

SIMPLER: Efficient Foundation Model Adaptation via Similarity-Guided Layer Pruning for Earth Observation

Víctor Barreiro, Johannes Jakubik, Francisco Argüello, Dora B. Heras

2603.19865 2026-03-23 cs.LG

On the Dynamics & Transferability of Latent Generalization during Memorization

Simran Ketha, Venkatakrishnan Ramaswamy

Details

Journal ref: Transactions on Machine Learning Research 2026

English abstract

Deep networks have been known to have extraordinary generalization abilities, via mechanisms that aren't yet well understood. It is also known that upon shuffling labels in the training data to varying degrees, deep networks, trained with standard methods, can still achieve perfect or high accuracy on this corrupted training data. This phenomenon is called memorization, and typically comes at the cost of poorer generalization to true labels. Our recent work has demonstrated that the internal representations of such models retain significantly better latent generalization abilities than is directly apparent from the model. In particular, it has been shown that such latent generalization can be recovered via simple probes (called MASC probes) on the layer-wise representations of the model. However, the origin and training dynamics of this latent generalization during memorization are not well understood. Here, we track the training dynamics empirically and find that latent generalization abilities largely peak early in training, with model generalization. Next, we investigate to what extent the specific nature of the MASC probe is critical for our ability to extract latent generalization from the model's layer-wise outputs. To this end, we first examine the mathematical structure of the MASC probe and show that it is a quadratic classifier, i.e., it is non-linear. This brings up the question of the extent to which this latent generalization might be linearly decodable from layer-wise outputs. To investigate this, we designed a new linear probe for this setting. Next, we consider the question of whether it is possible to transfer latent generalization to model generalization by directly editing model weights. To this end, we devise a way to transfer the latent generalization present in last-layer representations to the model using the new linear probe.

2603.19864 2026-03-23 cs.LG cs.CR

NASimJax: GPU-Accelerated Policy Learning Framework for Penetration Testing

Raphael Simon, José Carrasquel, Wim Mees, Pieter Libin

Details

English abstract

Penetration testing, the practice of simulating cyberattacks to identify vulnerabilities, is a complex sequential decision-making task that is inherently partially observable and features large action spaces. Training reinforcement learning (RL) policies for this domain faces a fundamental bottleneck: existing simulators are too slow to train on realistic network scenarios at scale, resulting in policies that fail to generalize. We present NASimJax, a complete JAX-based reimplementation of the Network Attack Simulator (NASim), achieving up to 100x higher environment throughput than the original simulator. By running the entire training pipeline on hardware accelerators, NASimJax enables experimentation on larger networks under fixed compute budgets that were previously infeasible. We formulate automated penetration testing as a Contextual POMDP and introduce a network generation pipeline that produces structurally diverse and guaranteed-solvable scenarios. Together, these provide a principled basis for studying zero-shot policy generalization. We use the framework to investigate action-space scaling and generalization across networks of up to 40 hosts. We find that Prioritized Level Replay better handles dense training distributions than Domain Randomization, particularly at larger scales, and that training on sparser topologies yields an implicit curriculum that improves out-of-distribution generalization, even on topologies denser than those seen during training. To handle linearly growing action spaces, we propose a two-stage action decomposition (2SAS) that substantially outperforms flat action masking at scale. Finally, we identify a failure mode arising from the interaction between Prioritized Level Replay's episode-reset behaviour and 2SAS's credit assignment structure. NASimJax thus provides a fast, flexible, and realistic platform for advancing RL-based penetration testing.
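The abstract does not say how the two-stage action decomposition (2SAS) is structured, so the following is only a generic illustration of a two-stage choice, not the authors' method. It assumes, hypothetically, that stage one picks a target host and stage two picks an action on that host; all function names, scores, and masks are made up for the example. A flat policy with N hosts and A actions per host needs N × A outputs, while this two-stage form needs N + A.

```python
from typing import List, Tuple


def masked_argmax(scores: List[float], mask: List[bool]) -> int:
    """Index of the highest score among the entries allowed by the mask."""
    valid = [i for i, allowed in enumerate(mask) if allowed]
    if not valid:
        raise ValueError("no valid choice under mask")
    return max(valid, key=lambda i: scores[i])


def two_stage_select(
    host_scores: List[float],
    action_scores: List[List[float]],
    action_mask: List[List[bool]],
) -> Tuple[int, int]:
    """Greedy two-stage choice: first a host, then an action on that host."""
    # Assumption: a host is selectable only if some action is valid on it.
    host_mask = [any(row) for row in action_mask]
    host = masked_argmax(host_scores, host_mask)
    action = masked_argmax(action_scores[host], action_mask[host])
    return host, action


# Hypothetical scores for 3 hosts with 2 actions each.
host_scores = [0.1, 0.9, 0.5]
action_scores = [[0.2, 0.8], [0.9, 0.9], [0.3, 0.7]]
action_mask = [[True, False], [False, False], [True, True]]

choice = two_stage_select(host_scores, action_scores, action_mask)
print(choice)
```

Here host 1 has the highest score but no valid action, so the first stage skips it. A trained policy would sample from learned distributions instead of taking a greedy argmax over fixed scores.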

2603.19863 2026-03-23 cs.CV

MedQ-Engine: A Closed-Loop Data Engine for Evolving MLLMs in Medical Image Quality Assessment

Jiyao Liu, Junzhi Ning, Wanying Qu, Lihao Liu, Chenglong Ma, Junjun He, Ningsheng Xu

2603.19858 2026-03-23 cs.RO cs.MA

Beyond detection: cooperative multi-agent reasoning for rapid onboard EO crisis response

Alejandro D. Mousist, Pedro Delgado de Robles Martín, Raquel Lladró Climent, Julian Cobos Aparicio

Comments Accepted for presentation at the ESA's 4S Symposium 2026 Conference (see https://atpi.eventsair.com/4s-symposium-2026/)

2603.19852 2026-03-23 cs.CV cs.AI cs.LG

Failure Modes for Deep Learning-Based Online Mapping: How to Measure and Address Them

Michael Hubbertz, Qi Han, Tobias Meisen

Comments Accepted to CVPR 2026; the final camera-ready version is published there

2603.19849 2026-03-23 cs.CL cs.AI

Semantic Delta: An Interpretable Signal Differentiating Human and LLMs Dialogue

Riccardo Scantamburlo, Mauro Mezzanzana, Giacomo Buonanno, Francesco Bertolotti

2603.19838 2026-03-23 cs.RO

Multi-Agent Motion Planning on Industrial Magnetic Levitation Platforms: A Hybrid ADMM-HOCBF approach

Bavo Tistaert, Stan Servaes, Alejandro Gonzalez-Garcia, Ibrahim Ibrahim, Louis Callens, Jan Swevers, Wilm Decré

Comments 8 pages, 4 figures, accepted to the European Control Conference 2026

2603.19825 2026-03-23 cs.CL cs.AI

FrameNet Semantic Role Classification by Analogy

Van-Duy Ngo, Stergos Afantenos, Emiliano Lorini, Miguel Couceiro

Comments Paper to be presented at LREC 2026

2603.19822 2026-03-23 cs.CV

HUGE-Bench: A Benchmark for High-Level UAV Vision-Language-Action Tasks

Jingyu Guo, Ziye Chen, Ziwen Li, Zhengqing Gao, Jiaxin Huang, Hanlue Zhang, Fengming Huang, Yu Yao, Tongliang Liu, Mingming Gong

2603.19817 2026-03-23 cs.LG

GDEGAN: Gaussian Dynamic Equivariant Graph Attention Network for Ligand Binding Site Prediction

Animesh, Plaban Kumar Bhowmick, Pralay Mitra

2603.19807 2026-03-23 cs.CV cs.AI

Enhancing Alignment for Unified Multimodal Models via Semantically-Grounded Supervision

Jiyeong Kim, Yerim So, Hyesong Choi, Uiwon Hwang, Dongbo Min

2603.19805 2026-03-23 cs.LG

Quantifying Gate Contribution in Quantum Feature Maps for Scalable Circuit Optimization

F. Rodríguez-Díaz, D. Gutiérrez-Avilés, A. Troncoso, F. Martínez-Álvarez

2603.19802 2026-03-23 cs.CV

Evaluating Vision Foundation Models for Pixel and Object Classification in Microscopy

Carolin Teuber, Anwai Archit, Tobias Boothe, Peter Ditte, Jochen Rink, Constantin Pape

2603.19795 2026-03-23 cs.CV

Controllable Text-to-Motion Generation via Modular Body-Part Phase Control

Minyue Dai, Ke Fan, Anyi Rao, Jingbo Wang, Bo Dai

2603.19794 2026-03-23 cs.RO

Generalized Task-Driven Design of Soft Robots via Reduced-Order FEM-based Surrogate Modeling

Yao Yao, David Howard, Perla Maiolino

2603.19792 2026-03-23 cs.LG cs.DS stat.CO stat.ME stat.ML

Scalable Learning of Multivariate Distributions via Coresets

Zeyu Ding, Katja Ickstadt, Nadja Klein, Alexander Munteanu, Simon Omlor

Comments AISTATS 2026

2603.19788 2026-03-23 cs.CV cs.AI

Learning Hierarchical Orthogonal Prototypes for Generalized Few-Shot 3D Point Cloud Segmentation

Yifei Zhao, Fanyu Zhao, Zhongyuan Zhang, Shengtang Wu, Yixuan Lin, Yinsheng Li

Comments 6 pages, 6 figures, 2 tables, Accepted by ICME 2026

2603.19782 2026-03-23 cs.AI

Embodied Science: Closing the Discovery Loop with Agentic Embodied AI

Xiang Zhuang, Chenyi Zhou, Kehua Feng, Zhihui Zhu, Yunfan Gao, Yijie Zhong, Yichi Zhang, Junjie Huang, Keyan Ding, Lei Bai, Haofen Wang, Qiang Zhang, Huajun Chen

Comments Work in progress

2603.19780 2026-03-23 cs.CV

Decoupled Sensitivity-Consistency Learning for Weakly Supervised Video Anomaly Detection

Hantao Zheng, Ning Han, Yawen Zeng, Hao Chen

Comments 6 pages, 3 figures, 4 tables. Accepted by ICME 2026

2603.19779 2026-03-23 cs.CV

One Model, Two Minds: Task-Conditioned Reasoning for Unified Image Quality and Aesthetic Assessment

Wen Yin, Cencen Liu, Dingrui Liu, Bing Su, Yuan-Fang Li, Tao He

Comments 10 pages, 7 figures