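arXivDaily daily academic digest: synced with the full arXiv dataset, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and more.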

2601.22057 2026-03-19 cs.CV cs.AI

Unsupervised Decomposition and Recombination with Discriminator-Driven Diffusion Models

Archer Wang, Emile Anand, Yilun Du, Marin Soljačić

Comments 28 pages, 16 figures, 4 tables

Abstract

Decomposing complex data into factorized representations can reveal reusable components and enable synthesizing new samples via component recombination. We investigate this in the context of diffusion-based models that learn factorized latent spaces without factor-level supervision. In images, factors can capture background, illumination, and object attributes; in robotic videos, they can capture reusable motion components. To improve both latent factor discovery and quality of compositional generation, we introduce an adversarial training signal via a discriminator trained to distinguish between single-source samples and those generated by recombining factors across sources. By optimizing the generator to fool this discriminator, we encourage physical and semantic consistency in the resulting recombinations. Our method outperforms implementations of prior baselines on CelebA-HQ, Virtual KITTI, CLEVR, and Falcor3D, achieving lower FID scores and better disentanglement as measured by MIG and MCC. Furthermore, we demonstrate a novel application to robotic video trajectories: by recombining learned action components, we generate diverse sequences that significantly increase state-space coverage for exploration on the LIBERO benchmark.

2601.15674 2026-03-19 cs.CL

What Patients Really Ask: Exploring the Effect of False Assumptions in Patient Information Seeking

Raymond Xiong, Furong Jia, Lionel Wong, Monica Agrawal

2601.12535 2026-03-19 cs.CL cs.AI

Improving Low-Resource Machine Translation via Round-Trip Reinforcement Learning

Ahmed Attia, Alham Fikri Aji

2601.11896 2026-03-19 cs.CV

Digital FAST: An AI-Driven Multimodal Framework for Rapid and Early Stroke Screening

Ngoc-Khai Hoang, Thi-Nhu-Mai Nguyen, Huy-Hieu Pham

2601.11580 2026-03-19 cs.CL cs.AI

Speculative Decoding: Performance or Illusion?

Xiaoxuan Liu, Jiaxiang Yu, Jongseok Park, Ion Stoica, Alvin Cheung

2601.10029 2026-03-19 cs.AI

PaperScout: An Autonomous Agent for Academic Paper Search with Process-Aware Sequence-Level Policy Optimization

Tingyue Pan, Jie Ouyang, Mingyue Cheng, Qingchuan Li, Zirui Liu, Daoyu Wang, Mingfan Pan, Shuo Yu, Qi Liu

2601.05960 2026-03-19 cs.CL

Distilling Feedback into Memory-as-a-Tool

Víctor Gallego

Comments Code: https://github.com/vicgalle/feedback-memory-as-a-tool Data: https://huggingface.co/datasets/vicgalle/rubric-feedback-bench

2601.01904 2026-03-19 cs.LG cs.AI

Evaluating Feature Dependent Noise in Preference-based Reinforcement Learning

Yuxuan Li, Harshith Reddy Kethireddy, Srijita Das

2512.23562 2026-03-19 cs.LG cs.AI cs.CL

VL-RouterBench: A Benchmark for Vision-Language Model Routing

Zhehao Huang, Baijiong Lin, Jingyuan Zhang, Jingying Wang, Yuhang Liu, Ning Lu, Tao Li, Xiaolin Huang

Comments CVPR 2026 Accepted

2512.21852 2026-03-19 cs.LG cs.AI

A Comedy of Estimators: On KL Regularization in RL Training of LLMs

Vedant Shah, Johan Obando-Ceron, Vineet Jain, Brian Bartoldson, Bhavya Kailkhura, Sarthak Mittal, Glen Berseth, Pablo Samuel Castro, Yoshua Bengio, Nikolay Malkin, Moksh Jain, Siddarth Venkatraman, Aaron Courville

2512.15956 2026-03-19 cs.LG

Tracking Wildfire Assets with Commodity RFID and Gaussian Process Modeling

John Hateley, Sriram Narasimhan, Omid Abari

DOI: 10.1109/JRFID.2025.3643353
Journal ref: IEEE Journal of Radio Frequency Identification (2025)

Abstract

This paper presents a novel, cost-effective, and scalable approach to tracking numerous assets distributed in forested environments using commodity Radio Frequency Identification (RFID), targeting wildfire response applications. Commodity RFID systems suffer from poor tag localization when dispersed in forested environments due to signal attenuation, multi-path effects, and environmental variability. Current methods that address this issue via fingerprinting rely on dispersing tags at known locations a priori. In this paper, we address the case when it is not possible to tag known locations and show that it is possible to localize tags to accuracies comparable to global positioning systems (GPS) without such a constraint. For this, we propose Gaussian Processes to model various environments solely based on RF signal response signatures, without the aid of additional sensors such as GPS or cameras, and match an unknown RF signature to the closest match in a model dictionary. We utilize a new weighted log-likelihood method to associate an unknown environment with the closest environment in a dictionary of previously modeled environments, which is a crucial step in being able to use our approach. Our results show that it is possible to achieve localization accuracies on the order of GPS with passive commodity RFID, which allows dozens of wildfire assets within the vicinity of mobile readers to be tracked simultaneously, does not require known positions to be tagged a priori, and achieves localization at a fraction of the cost of GPS.

2512.15662 2026-03-19 cs.AI

Stepwise Think-Critique: A Unified Framework for Robust and Interpretable LLM Reasoning

Jiaqi Xu, Cuiling Lan, Xuejin Chen, Yan Lu

Comments Under Review

2512.02458 2026-03-19 cs.CV

Vision to Geometry: 3D Spatial Memory for Sequential Embodied MLLM Reasoning and Exploration

Zhongyi Cai, Yi Du, Chen Wang, Yu Kong

Comments Computer Vision

2512.02341 2026-03-19 cs.CV

TALO: Pushing 3D Vision Foundation Models Towards Globally Consistent Online Reconstruction

Fengyi Zhang, Tianjun Zhang, Kasra Khosoussi, Zheng Zhang, Zi Huang, Yadan Luo

Comments CVPR 2026

2512.00479 2026-03-19 cs.AI

Aligning Probabilistic Beliefs under Informative Missingness: LLM Steerability in Clinical Reasoning

Yuta Kobayashi, Vincent Jeanselme, Shalmali Joshi

Comments Under review

2511.21750 2026-03-19 cs.CV cs.AI cs.CL cs.RO

SO-Bench: A Structural Output Evaluation of Multimodal LLMs

Di Feng, Kaixin Ma, Feng Nan, Haofeng Chen, Bohan Zhai, David Griffiths, Mingfei Gao, Zhe Gan, Eshan Verma, Yinfei Yang, Zhifeng Chen, Afshin Dehghan

Comments v3 preprint. Added the link to the public benchmark

2511.20095 2026-03-19 cs.CV

WPT: World-to-Policy Transfer via Online World Model Distillation

Guangfeng Jiang, Yueru Luo, Jun Liu, Yi Huang, Yiyao Zhu, Zhan Qu, Dave Zhenyu Chen, Bingbing Liu, Xu Yan

Comments CVPR2026 Accepted

2511.17777 2026-03-19 cs.RO

See, Plan, Cut: MPC-Based Autonomous Volumetric Robotic Laser Surgery with OCT Guidance

Ravi Prakash, Vincent Y. Wang, Arpit Mishra, Devi Yuliarti, Pei Zhong, Ryan P. McNabb, Patrick J. Codd, Leila J. Bridgeman

Comments 9 pages, 8 figures

2511.17354 2026-03-19 cs.CV

DSeq-JEPA: Discriminative Sequential Joint-Embedding Predictive Architecture

Xiangteng He, Shunsuke Sakai, Shivam Chandhok, Sara Beery, Kun Yuan, Nicolas Padoy, Tatsuhito Hasegawa, Leonid Sigal

Comments Project page: https://github.com/SkyShunsuke/DSeq-JEPA

2511.16955 2026-03-19 cs.CV cs.LG eess.IV

Neighbor GRPO: Contrastive ODE Policy Optimization Aligns Flow Models

Dailan He, Guanlin Feng, Xingtong Ge, Yazhe Niu, Yi Zhang, Bingqi Ma, Guanglu Song, Yu Liu, Hongsheng Li

Comments CVPR 2026

2511.16830 2026-03-19 cs.CL

PEPPER: Perception-Guided Perturbation for Robust Backdoor Defense in Text-to-Image Diffusion Models

Oscar Chew, Po-Yi Lu, Jayden Lin, Kuan-Hao Huang, Hsuan-Tien Lin

2511.12797 2026-03-19 cs.LG cs.AI q-bio.GN

Genomic Next-Token Predictors are In-Context Learners

Nathan Breslow, Aayush Mishra, Mahler Revsine, Michael C. Schatz, Anqi Liu, Daniel Khashabi

2511.11005 2026-03-19 cs.CV

Draft and Refine with Visual Experts

Sungheon Jeong, Ryozo Masukawa, Jihong Park, Sanggeon Yun, Wenjun Huang, Hanning Chen, Mahdi Imani, Mohsen Imani

Comments Accepted to CVPR 2026

2511.08120 2026-03-19 cs.LG cs.AI

A robust methodology for long-term sustainability evaluation of Machine Learning models

Jorge Paz-Ruza, João Gama, Amparo Alonso-Betanzos, Bertha Guijarro-Berdiñas

2511.07842 2026-03-19 cs.AI

Safety-Preserving PTQ via Contrastive Alignment Loss

Sunghyun Wee, Suyoung Kim, Hyeonjin Kim, Kyomin Hwang, Nojun Kwak

Comments 9 pages, 4 figures. Includes 8 pages of supplementary material

2511.06396 2026-03-19 cs.AI cs.CR

Efficient LLM Safety Evaluation through Multi-Agent Debate

Dachuan Lin, Guobin Shen, Zihao Yang, Tianrong Liu, Dongcheng Zhao, Yi Zeng

Comments 15 pages, 5 figures, 10 tables. Updated abstract to fix an inconsistency issue with the main paper: HAJailBench size (12,000 -> 11,100)

2511.05482 2026-03-19 cs.LG

SoilX: Calibration-Free Comprehensive Soil Sensing through Contrastive Cross-Component Learning

Kang Yang, Yuanlin Yang, Yuning Chen, Sikai Yang, Xinyu Zhang, Wan Du

2511.03369 2026-03-19 cs.CL stat.ML

Silenced Biases: The Dark Side LLMs Learned to Refuse

Rom Himelstein, Amit LeVi, Brit Youngmann, Yaniv Nemcovsky, Avi Mendelson

Comments Accepted to The 40th Annual AAAI Conference on Artificial Intelligence - AI Alignment Track (Oral)

2511.02933 2026-03-19 cs.CV cs.AI

Generative Hints

Andy Dimnaku, Abdullah Yusuf Kavranoglu, Yaser Abu-Mostafa

Comments 15 pages, 15 figures

2511.01419 2026-03-19 cs.CV

Towards One-step Causal Video Generation via Adversarial Self-Distillation

Yongqi Yang, Huayang Huang, Xu Peng, Xiaobin Hu, Donghao Luo, Jiangning Zhang, Chengjie Wang, Yu Wu

Comments Published as a conference paper at ICLR 2026