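arXivDaily: a daily academic digest synced with the full arXiv dataset, with AI-generated summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and other fields.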

2511.19117 2026-03-18 cs.CV physics.optics

3M-TI: High-Quality Mobile Thermal Imaging via Calibration-free Multi-Camera Cross-Modal Diffusion

Minchong Chen, Xiaoyun Yuan, Junzhe Wan, Jianing Zhang, Jun Zhang

Comments Accepted by CVPR 2026, Code: https://github.com/work-submit/3MTI, Project page: https://lab.xiaoyunyuan.net/index.html?project=3m-ti

Abstract

The miniaturization of thermal sensors for mobile platforms inherently limits their spatial resolution and textural fidelity, leading to blurry and less informative images. Existing thermal super-resolution (SR) methods can be grouped into single-image and RGB-guided approaches: the former struggles to recover fine structures from limited information, while the latter relies on accurate and laborious cross-camera calibration, which hinders practical deployment and robustness. Here, we propose 3M-TI, a calibration-free Multi-camera cross-Modality diffusion framework for Mobile Thermal Imaging. At its core, 3M-TI integrates a cross-modal self-attention module (CSM) into the diffusion UNet, replacing the original self-attention layers to adaptively align thermal and RGB features throughout the denoising process, without requiring explicit camera calibration. This design enables the diffusion network to leverage its generative prior to enhance spatial resolution, structural fidelity, and texture detail in the super-resolved thermal images. Extensive evaluations on real-world mobile thermal cameras and public benchmarks validate our superior performance, achieving state-of-the-art results in both visual quality and quantitative metrics. More importantly, the thermal images enhanced by 3M-TI lead to substantial gains in critical downstream tasks like object detection and segmentation, underscoring its practical value for robust mobile thermal perception systems. More materials: https://github.com/work-submit/3MTI.
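
As a rough illustration of how attention can relate two modalities without a geometric calibration step, the sketch below lets thermal tokens attend over a joint sequence of thermal and RGB tokens, so matches are found by feature similarity rather than by pixel position. It is not the paper's CSM: the function name `joint_attention`, the identity query/key/value projections, and the toy 2-D tokens are all assumptions made for this example.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def joint_attention(thermal_tokens, rgb_tokens):
    """Thermal queries attend over thermal + RGB keys/values.

    Assumption: identity projections (no learned weights), one head.
    Returns (outputs, weights), one row per thermal token.
    """
    context = thermal_tokens + rgb_tokens          # joint key/value sequence
    scale = 1.0 / math.sqrt(len(thermal_tokens[0]))
    outputs, weights = [], []
    for q in thermal_tokens:
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in context]
        w = softmax(scores)
        weights.append(w)
        outputs.append(
            [sum(wj * v[d] for wj, v in zip(w, context)) for d in range(len(q))]
        )
    return outputs, weights


# Toy tokens: each thermal token has a stronger RGB counterpart
# pointing in the same feature direction.
thermal = [[1.0, 0.0], [0.0, 1.0]]
rgb = [[4.0, 0.0], [0.0, 4.0], [0.0, 0.0]]
outputs, weights = joint_attention(thermal, rgb)

# Index of the most-attended context token for each thermal query.
best = [row.index(max(row)) for row in weights]
```

In this toy setup each thermal query puts most of its weight on the RGB token that shares its feature direction, wherever that token sits in the sequence.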

2511.18444 2026-03-18 cs.CV

SineProject: Machine Unlearning for Stable Vision Language Alignment

Arpit Garg, Hemanth Saratchandran, Simon Lucey

Comments Accepted at CVPR 2026

2511.18344 2026-03-18 cs.CV

A Tri-Modal Dataset and a Baseline System for Tracking Unmanned Aerial Vehicles

Tianyang Xu, Jinjie Gu, Xuefeng Zhu, XiaoJun Wu, Josef Kittler

Comments V3

2511.15190 2026-03-18 cs.LG cs.AI

Masked Auto-Regressive Variational Acceleration: Fast Inference Makes Practical Reinforcement Learning

Yuxuan Gu, Weimin Bai, Yifei Wang, Weijian Luo, He Sun

2511.12832 2026-03-18 cs.CL cs.AI

From Passive to Persuasive: Localized Activation Injection for Empathy and Negotiation

Niranjan Chebrolu, Kokil Jaidka, Gerard Christopher Yeo

2511.10376 2026-03-18 cs.CV cs.RO

MSGNav: Unleashing the Power of Multi-modal 3D Scene Graph for Zero-Shot Embodied Navigation

Xun Huang, Shijia Zhao, Yunxiang Wang, Xin Lu, Wanfa Zhang, Rongsheng Qu, Weixin Li, Yunhong Wang, Chenglu Wen

Comments 18 pages, Accepted by CVPR 2026

2511.09036 2026-03-18 cs.LG cs.AI

FedSDWC: Federated Synergistic Dual-Representation Weak Causal Learning for OOD

Zhenyuan Huang, Hui Zhang, Wenzhong Tang, Haijun Yang

2511.08628 2026-03-18 cs.CV cs.AI

Learning Topology-Driven Multi-Subspace Fusion for Grassmannian Deep Network

Xuan Yu, Tianyang Xu

Comments Accepted at AAAI 2026

2511.07923 2026-03-18 cs.CV cs.AI

Exploring the Underwater World Segmentation without Extra Training

Bingyu Li, Tao Huo, Da Zhang, Zhiyuan Zhao, Junyu Gao, Xuelong Li

2511.06680 2026-03-18 cs.CL

Steering LLMs toward Korean Local Speech: Iterative Refinement Framework for Faithful Dialect Translation

Keunhyeung Park, Seunguk Yu, Youngbin Kim

Comments Accepted to LREC 2026

2511.05549 2026-03-18 cs.LG cs.AI cs.IR

AGRAG: Advanced Graph-based Retrieval-Augmented Generation for LLMs

Yubo Wang, Haoyang Li, Fei Teng, Lei Chen

Comments ICDE 2026 Camera-ready

2511.02580 2026-03-18 cs.CV cs.AI cs.GR cs.LG

TAUE: Training-free Noise Transplant and Cultivation Diffusion Model

Daichi Nagai, Ryugo Morita, Shunsuke Kitada, Hitoshi Iyatomi

Comments Accepted to CVPR 2026 Findings. The first two authors contributed equally. Project Page: https://iyatomilab.github.io/TAUE

2510.26683 2026-03-18 cs.CL cs.AI

Evontree: Ontology Rule-Guided Self-Evolution of Large Language Models

Mingchen Tu, Zhiqiang Liu, Juan Li, Liangyurui Liu, Junjie Wang, Lei Liang, Wen Zhang

2510.24318 2026-03-18 cs.LG cs.AI

Transformers can do Bayesian Clustering

Prajit Bhaskaran, Tom Viering

2510.23685 2026-03-18 cs.LG cs.AI

Parallel BiLSTM-Transformer networks for forecasting chaotic dynamics

Junwen Ma, Mingyu Ge, Yisen Wang, Yong Zhang, Weicheng Fu

Comments 9 pages, 7 figures

DOI: 10.1063/5.0314572
Journal ref: AIP Advances 16, 035302 (2026)

Abstract

The nonlinear nature of chaotic systems results in extreme sensitivity to initial conditions and highly intricate dynamical behaviors, posing fundamental challenges for accurately predicting their evolution. To overcome the limitation that conventional approaches fail to capture both local features and global dependencies in chaotic time series simultaneously, this study proposes a parallel predictive framework integrating Transformer and Bidirectional Long Short-Term Memory (BiLSTM) networks. The hybrid model employs a dual-branch architecture, where the Transformer branch mainly captures long-range dependencies while the BiLSTM branch focuses on extracting local temporal features. The complementary representations from the two branches are fused in a dedicated feature-fusion layer to enhance predictive accuracy. As illustrative examples, the model's performance is systematically evaluated on two representative tasks in the Lorenz system. The first is autonomous evolution prediction, in which the model recursively extrapolates system trajectories from the time-delay embeddings of the state vector to evaluate long-term tracking accuracy and stability. The second is inference of unmeasured variables, where the model reconstructs the unobserved states from the time-delay embeddings of partial observations to assess its state-completion capability. The results consistently indicate that the proposed hybrid framework outperforms both single-branch architectures across tasks, demonstrating its robustness and effectiveness in chaotic system prediction.
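
The time-delay embeddings mentioned in the abstract can be sketched as follows: integrate the Lorenz system, then stack lagged copies of one observed coordinate into input vectors. This is a minimal sketch under stated assumptions; the classical parameters (sigma = 10, rho = 28, beta = 8/3), the RK4 integrator, the step size, and the embedding dimension and lag are illustrative choices, not values taken from the paper.

```python
def lorenz_rhs(state, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Right-hand side of the Lorenz equations (classical parameters assumed)."""
    x, y, z = state
    return (sigma * (y - x), x * (rho - z) - y, x * y - beta * z)


def rk4_step(state, dt):
    """One fourth-order Runge-Kutta step."""
    def shifted(base, slope, h):
        return tuple(b + h * s for b, s in zip(base, slope))

    k1 = lorenz_rhs(state)
    k2 = lorenz_rhs(shifted(state, k1, dt / 2))
    k3 = lorenz_rhs(shifted(state, k2, dt / 2))
    k4 = lorenz_rhs(shifted(state, k3, dt))
    return tuple(
        s + dt / 6 * (a + 2 * b + 2 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def simulate(n_steps, dt=0.01, start=(1.0, 1.0, 1.0)):
    """Return the trajectory as a list of (x, y, z) tuples, start included."""
    traj = [start]
    for _ in range(n_steps):
        traj.append(rk4_step(traj[-1], dt))
    return traj


def delay_embed(series, dim, lag):
    """Vectors (s[t], s[t-lag], ..., s[t-(dim-1)*lag]) for every valid t."""
    span = (dim - 1) * lag
    return [
        tuple(series[t - k * lag] for k in range(dim))
        for t in range(span, len(series))
    ]


traj = simulate(1000)
xs = [p[0] for p in traj]              # observe only the x coordinate
emb = delay_embed(xs, dim=3, lag=5)    # candidate inputs for a predictor
```

For the second task in the abstract, each embedded vector of `xs` would be paired with the unobserved `y` and `z` values at the same time index as the regression target.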

2510.21721 2026-03-18 cs.AI cs.HC

PREFINE: Personalized Story Generation via Simulated User Critics and User-Specific Rubric Generation

Kentaro Ueda, Takehiro Takayanagi

2510.20644 2026-03-18 cs.LG cs.IT math.IT

Connecting Jensen-Shannon and Kullback-Leibler Divergences: A New Bound for Representation Learning

Reuben Dorent, Polina Golland, William Wells

Comments Accepted at NeurIPS 2025. This revised version provides a proof of Lemma B.5, previously stated as a conjecture in the original submission. Code available at https://github.com/ReubenDo/JSDlowerbound/

2510.18546 2026-03-18 cs.RO cs.AI

EfficientNav: Towards On-Device Object-Goal Navigation with Navigation Map Caching and Retrieval

Zebin Yang, Sunjian Zheng, Tong Xie, Tianshi Xu, Bo Yu, Fan Wang, Jie Tang, Shaoshan Liu, Meng Li

Comments NeurIPS 2025

2510.18229 2026-03-18 cs.CV

Unbiased Object Detection Beyond Frequency with Visually Prompted Image Synthesis

Xinhao Cai, Liulei Li, Gensheng Pei, Tao Chen, Jinshan Pan, Yazhou Yao, Wenguan Wang

Comments Accepted by ICLR 2026

2510.15398 2026-03-18 cs.CV cs.AI

MARIS: Marine Open-Vocabulary Instance Segmentation with Geometric Enhancement and Semantic Alignment

Bingyu Li, Feiyu Wang, Da Zhang, Zhiyuan Zhao, Junyu Gao, Xuelong Li

2510.13972 2026-03-18 cs.LG cs.CV physics.med-ph

Distributional Consistency Loss: Beyond Pointwise Data Terms in Inverse Problems

George Webber, Andrew J. Reader

Comments Author's accepted version (ICLR 2026)

Abstract

Recovering true signals from noisy measurements is a central challenge in inverse problems spanning medical imaging, geophysics, and signal processing. Current methods balance signal priors (regularization) with agreement with noisy data (data-fidelity). Conventional data-fidelity loss functions, such as mean-squared error (MSE) or negative log-likelihood, seek pointwise agreement with noisy measurements, often leading to overfitting to noise. In this work, we instead evaluate data-fidelity collectively by testing whether the observed measurements are statistically consistent with the noise distributions implied by the current estimate. We introduce distributional consistency (DC) loss, a data-fidelity objective that replaces pointwise matching with distribution-level calibration. DC loss acts as a direct and practical plug-in replacement for standard data consistency terms: i) it is compatible with modern unsupervised regularizers that operate without paired measurement-ground-truth data, ii) it is optimized in the same way as traditional losses, and iii) it avoids overfitting to measurement noise without early stopping or priors. Its scope naturally fits many practical inverse problems where the measurement-noise distribution is known and where the measured dataset consists of many independent noisy values. We demonstrate efficacy in two key example application areas: i) in image denoising with deep image prior, using DC instead of MSE loss removes the need for early stopping and achieves higher PSNR; ii) in medical image reconstruction from Poisson-noisy data, DC loss reduces artifacts in highly iterated reconstructions and enhances the efficacy of hand-crafted regularization. These results position DC loss as a statistically grounded, performance-enhancing alternative to conventional fidelity losses for an important class of unsupervised noise-dominated inverse problems.
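
The contrast between pointwise and distribution-level agreement can be illustrated with a probability integral transform check. Under known Gaussian noise, if the estimate is correct, the values Φ((y − x)/σ) should look uniform on [0, 1]. The sketch below is only an illustration of that idea and should not be read as the paper's DC loss: the function `pit_uniformity_gap`, the Gaussian noise model, and the sine test signal are assumptions made here.

```python
import math
import random


def normal_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def mse(a, b):
    """Pointwise data-fidelity: mean squared difference."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b)) / len(a)


def pit_uniformity_gap(measurements, estimate, sigma):
    """Mean squared gap between sorted PIT values and uniform quantiles.

    Near zero when the residuals look like draws from N(0, sigma^2);
    large when the estimate fits the data too closely or too loosely.
    """
    u = sorted(
        normal_cdf((y - x) / sigma) for y, x in zip(measurements, estimate)
    )
    n = len(u)
    return sum((ui - (i + 0.5) / n) ** 2 for i, ui in enumerate(u)) / n


rng = random.Random(0)                       # fixed seed for repeatability
sigma = 0.3
truth = [math.sin(0.1 * i) for i in range(500)]
noisy = [t + rng.gauss(0.0, sigma) for t in truth]

# An estimate that copies the noisy data has perfect MSE but residuals
# that are all zero, which is inconsistent with the noise model.
mse_overfit = mse(noisy, noisy)
gap_overfit = pit_uniformity_gap(noisy, noisy, sigma)

# The true signal has nonzero MSE but residuals consistent with the noise.
mse_truth = mse(noisy, truth)
gap_truth = pit_uniformity_gap(noisy, truth, sigma)
```

Here MSE prefers the estimate that memorizes the noise, while the distribution-level check prefers the true signal, which is the failure mode of pointwise losses that the abstract describes.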

2510.13898 2026-03-18 cs.CL

Attribution Quality in AI-Generated Content: Benchmarking Style Embeddings and LLM Judges

Misam Abbas

Comments Accepted for publication at the 2025 IEEE ICDM Workshop on "Grounding Documents with Reasoning, Agents, Retrieval, and Attribution". This is the author-submitted version; not yet published.

2510.10402 2026-03-18 cs.LG cs.AI cs.CE

Controllable Graph Generation with Diffusion Models via Inference-Time Tree Search Guidance

Jiachi Zhao, Zehong Wang, Yamei Liao, Chuxu Zhang, Yanfang Ye

Comments Accepted by WWW 2026

Abstract

Graph generation is a fundamental problem in graph learning with broad applications across Web-scale systems, knowledge graphs, and scientific domains such as drug and material discovery. Recent approaches leverage diffusion models for step-by-step generation, yet unconditional diffusion offers little control over desired properties, often leading to unstable quality and difficulty in incorporating new objectives. Inference-time guidance methods mitigate these issues by adjusting the sampling process without retraining, but they remain inherently local, heuristic, and limited in controllability. To overcome these limitations, we propose TreeDiff, a Monte Carlo Tree Search (MCTS) guided dual-space diffusion framework for controllable graph generation. TreeDiff is a plug-and-play inference-time method that expands the search space while keeping computation tractable. Specifically, TreeDiff introduces three key designs to make it practical and scalable: (1) a macro-step expansion strategy that groups multiple denoising updates into a single transition, reducing tree depth and enabling long-horizon exploration; (2) a dual-space denoising mechanism that couples efficient latent-space denoising with lightweight discrete correction in graph space, ensuring both scalability and structural fidelity; and (3) a dual-space verifier that predicts long-term rewards from partially denoised graphs, enabling early value estimation and removing the need for full rollouts. Extensive experiments on 2D and 3D molecular generation benchmarks, under both unconditional and conditional settings, demonstrate that TreeDiff achieves state-of-the-art performance. Notably, TreeDiff exhibits favorable inference-time scaling: it continues to improve with additional computation, while existing inference-time methods plateau early under limited resources.
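
The effect of macro-step expansion on tree depth can be shown with simple arithmetic: if each tree transition bundles k denoising updates, a schedule of T updates needs about T / k tree levels instead of T. The sketch below illustrates only that bookkeeping. The `denoise_step` placeholder, the values of T and k, and the function names are hypothetical and do not come from the paper.

```python
import math


def denoise_step(x, t):
    """Hypothetical single denoising update (placeholder: shrink x by 10%)."""
    return 0.9 * x


def macro_step(x, t_start, k, t_end=0):
    """Apply up to k consecutive denoising updates as one tree transition.

    Returns the new state and the timestep reached; the last macro-step
    of a schedule may contain fewer than k updates.
    """
    t = t_start
    while t > t_end and t_start - t < k:
        x = denoise_step(x, t)
        t -= 1
    return x, t


def tree_depth(total_steps, k):
    """Number of tree levels needed when each transition covers k updates."""
    return math.ceil(total_steps / k)


depth_single = tree_depth(1000, 1)     # one update per transition
depth_macro = tree_depth(1000, 50)     # fifty updates per transition
state, t_reached = macro_step(1.0, t_start=10, k=4)
```

With these illustrative numbers the search tree shrinks from 1000 levels to 20, which is what makes long-horizon exploration cheaper per rollout.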

2510.09881 2026-03-18 cs.CV

LTGS: Long-Term Gaussian Scene Chronology From Sparse View Updates

Minkwan Kim, Seungmin Lee, Junho Kim, Young Min Kim

Comments Accepted to CVPR 2026 Findings

2510.06383 2026-03-18 cs.CL cs.AI

Protecting De-identified Documents from Search-based Linkage Attacks

Pierre Lison, Mark Anderson

2510.06122 2026-03-18 cs.LG stat.ML

PolyGraph Discrepancy: a classifier-based metric for graph generation

Markus Krimmel, Philip Hartout, Karsten Borgwardt, Dexiong Chen

Comments Camera-ready version published at ICLR 2026

2510.04476 2026-03-18 cs.CL cs.AI

Compressed Convolutional Attention: Efficient Attention in a Compressed Latent Space

Tomas Figliolia, Nicholas Alonso, Rishi Iyer, Quentin Anthony, Beren Millidge

2510.04282 2026-03-18 cs.CV

Flexible and Efficient Spatio-Temporal Transformer for Sequential Visual Place Recognition

Yu Kiu Lau, Chao Chen, Ge Jin, Chen Feng

Comments 8 pages, 6 figures

2510.00458 2026-03-18 cs.CV

VLOD-TTA: Test-Time Adaptation of Vision-Language Object Detectors

Atif Belal, Heitor R. Medeiros, Marco Pedersoli, Eric Granger

2509.26307 2026-03-18 cs.LG

Attribution-Guided Decoding

Piotr Komorowski, Elena Golimblevskaia, Reduan Achtibat, Thomas Wiegand, Sebastian Lapuschkin, Wojciech Samek

Comments Published as a conference paper at ICLR 2026