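arXivDaily daily academic digest: synced with the full arXiv dataset, with AI summaries and translation, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and related fields.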

2604.12322 2026-04-15 cs.CV

Self-Adversarial One Step Generation via Condition Shifting

Deyuan Liu, Peng Sun, Yansen Han, Zhenglin Cheng, Chuyan Chen, Tao Lin

Abstract

The push for efficient text-to-image synthesis has moved the field toward one-step sampling, yet existing methods still face a three-way trade-off among fidelity, inference speed, and training efficiency. Approaches that rely on external discriminators can sharpen one-step performance, but they often introduce training instability, high GPU memory overhead, and slow convergence, which complicates scaling and parameter-efficient tuning. In contrast, regression-based distillation and consistency objectives are easier to optimize, but they typically lose fine details when constrained to a single step. We present APEX, built on a key theoretical insight: adversarial correction signals can be extracted endogenously from a flow model through condition shifting. Applying a transformation creates a shifted condition branch whose velocity field serves as an independent estimator of the model's current generation distribution, yielding a gradient that is provably GAN-aligned and replacing the sample-dependent discriminator terms that cause gradient vanishing. This discriminator-free design is architecture-preserving, making APEX a plug-and-play framework compatible with both full-parameter and LoRA-based tuning. Empirically, our 0.6B model surpasses FLUX-Schnell 12B (20$\times$ more parameters) in one-step quality. With LoRA tuning on Qwen-Image 20B, APEX reaches a GenEval score of 0.89 at NFE=1 in 6 hours, surpassing the original 50-step teacher (0.87) and providing a 15.33$\times$ inference speedup. Code is available at https://github.com/LINs-lab/APEX.

2604.12321 2026-04-15 cs.CL

ToxiTrace: Gradient-Aligned Training for Explainable Chinese Toxicity Detection

Boyang Li, Hongzhe Shou, Yuanyuan Liang, Jingbin Zhang, Fang Zhou

Comments Accepted to ACL 2026 Findings

2604.12318 2026-04-15 cs.CV

Cell Instance Segmentation via Multi-Task Image-to-Image Schrödinger Bridge

Hayato Inoue, Shota Harada, Shumpei Takezaki, Ryoma Bise

2604.12315 2026-04-15 cs.CV cs.MM

GTPBD-MM: A Global Terraced Parcel and Boundary Dataset with Multi-Modality

Zhiwei Zhang, Xingyuan Zeng, Xinkai Kong, Kunquan Zhang, Haoyuan Liang, Bohan Shi, Juepeng Zheng, Jianxi Huang, Yutong Lu, Haohuan Fu

Comments 15 pages, 11 figures. Submitted to ACM Multimedia 2026 Dataset Track

2604.12312 2026-04-15 cs.CL

CompliBench: Benchmarking LLM Judges for Compliance Violation Detection in Dialogue Systems

Jingbo Yang, Guanyu Yao, Bairu Hou, Xinghan Yang, Nikolai Glushnev, Iwona Bialynicka-Birula, Duo Ding, Shiyu Chang

2604.12309 2026-04-15 cs.CV

Towards Realistic and Consistent Orbital Video Generation via 3D Foundation Priors

Rong Wang, Ruyi Zha, Ziang Cheng, Jiayu Yang, Pulak Purkait, Hongdong Li

Comments Accepted to CVPR 2026

2604.12308 2026-04-15 cs.CL

ContextLens: Modeling Imperfect Privacy and Safety Context for Legal Compliance

Haoran Li, Yulin Chen, Huihao Jing, Wenbin Hu, Tsz Ho Li, Chanhou Lou, Hong Ting Tsang, Sirui Han, Yangqiu Song

Comments Accepted by ACL 26

2604.12307 2026-04-15 cs.CV

Boosting Robust AIGI Detection with LoRA-based Pairwise Training

Ruiyang Xia, Qi Zhang, Yaowen Xu, Zhaofan Zou, Hao Sun, Zhongjiang He, Xuelong Li

Comments 3rd place (3/514) technical report (CVPRW-26) at the NTIRE 2026: Robust AI-Generated Image Detection in the Wild Challenge

2604.12304 2026-04-15 cs.LG cs.SY eess.SY

Beyond Weather Correlation: A Comparative Study of Static and Temporal Neural Architectures for Fine-Grained Residential Energy Consumption Forecasting in Melbourne, Australia

Prasad Nimantha Madusanka Ukwatta Hewage, Hao Wu

Comments 22 pages, 6 figures. Earlier preprint versions: Zenodo https://doi.org/10.5281/zenodo.19158396; SSRN https://papers.ssrn.com/sol3/papers.cfm?abstract_id=6453198

Abstract

Accurate short-term residential energy consumption forecasting at sub-hourly resolution is critical for smart grid management, demand response programmes, and renewable energy integration. While weather variables are widely acknowledged as key drivers of residential electricity demand, the relative merit of incorporating temporal autocorrelation (the sequential memory of past consumption) over static meteorological features alone remains underexplored at fine-grained (5-minute) temporal resolution for Australian households. This paper presents a rigorous empirical comparison of a Multilayer Perceptron (MLP) and a Long Short-Term Memory (LSTM) recurrent network applied to two real-world Melbourne households: House 3 (a standard grid-connected dwelling) and House 4 (a rooftop solar photovoltaic-integrated household). Both models are trained on 14 months of 5-minute interval smart meter data (March 2023 to April 2024) merged with official Bureau of Meteorology (BOM) daily weather observations, yielding over 117,000 samples per household. The LSTM, operating on 24-step (2-hour) sliding consumption windows, achieves coefficients of determination of R^2 = 0.883 (House 3) and R^2 = 0.865 (House 4), compared to R^2 = -0.055 and R^2 = 0.410 for the corresponding weather-driven MLPs, differences of 93.8 and 45.5 percentage points. These results establish that temporal autocorrelation in the consumption sequence dominates meteorological information for short-term forecasting at 5-minute granularity. Additionally, we demonstrate an asymmetry introduced by solar generation: for the PV-integrated household, the MLP achieves R^2 = 0.410, revealing implicit solar forecasting from weather-time correlations. A persistence baseline analysis and seasonal stratification contextualise model performance. We propose a hybrid weather-augmented LSTM and federated learning extensions as directions for future work.
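
To make the quantities in this abstract concrete, the following is a minimal sketch of how 24-step sliding windows, a persistence baseline, and R^2 might be computed. It is not the authors' code: the synthetic sine-wave load, the function names `make_windows` and `r_squared`, and the choice of "repeat the last reading" as the persistence rule are all assumptions made for illustration.

```python
from math import sin, pi

STEPS_PER_WINDOW = 24  # 24 steps x 5 minutes = 2 hours, as stated in the abstract


def make_windows(series, width=STEPS_PER_WINDOW):
    """Pair each run of `width` consecutive readings with the reading that follows it."""
    return [(series[i:i + width], series[i + width])
            for i in range(len(series) - width)]


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Synthetic stand-in for 5-minute smart meter data (NOT the paper's dataset):
# three days at 288 readings per day, following a smooth daily cycle.
load = [1.0 + 0.5 * sin(2 * pi * t / 288) for t in range(288 * 3)]

windows = make_windows(load)
targets = [target for _, target in windows]

# Assumed persistence rule: predict that the next reading equals the last one seen.
persistence = [window[-1] for window, _ in windows]
# Reference predictor: always output the mean of the targets, which gives R^2 = 0.
mean_only = [sum(targets) / len(targets)] * len(targets)

r2_persistence = r_squared(targets, persistence)
r2_mean = r_squared(targets, mean_only)

# Gaps between the reported LSTM and MLP scores, in percentage points.
gap_house3 = (0.883 - (-0.055)) * 100
gap_house4 = (0.865 - 0.410) * 100
```

Because R^2 compares a model against always predicting the mean, a negative value such as the reported -0.055 means the model did worse than that constant predictor. On a smooth series like the synthetic one here, even the persistence rule scores close to 1, which is why a persistence baseline is a useful yardstick for short-horizon forecasts.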

2604.12303 2026-04-15 cs.LG

Labeled TrustSet Guided: Batch Active Learning with Reinforcement Learning

Guofeng Cui, Yang Liu, Pichao Wang, Hankai Hsu, Xiaohang Sun, Xiang Hao, Zhu Liu

Comments Published as a conference paper at IJCNN 2026

2604.12292 2026-04-15 cs.SD cs.CV cs.MM

CoSyncDiT: Cognitive Synchronous Diffusion Transformer for Movie Dubbing

Gaoxiang Cong, Liang Li, Jiaxin Ye, Zhedong Zhang, Hongming Shan, Yuankai Qi, Qingming Huang

2604.12286 2026-04-15 cs.CV

LiveMoments: Reselected Key Photo Restoration in Live Photos via Reference-guided Diffusion

Clara Xue, Zizheng Yan, Zhenning Shi, Yuhang Yu, Jingyu Zhuang, Qi Zhang, Jinwei Chen, Qingnan Fan

Comments Accepted by ICLR 2026

2604.12285 2026-04-15 cs.AI

GAM: Hierarchical Graph-based Agentic Memory for LLM Agents

Zhaofen Wu, Hanrong Zhang, Fulin Lin, Wujiang Xu, Xinran Xu, Yankai Chen, Henry Peng Zou, Shaowen Chen, Weizhi Zhang, Xue Liu, Philip S. Yu, Hongwei Wang

Comments 18 pages, 6 figures

2604.12282 2026-04-15 cs.CL

Towards Robust Real-World Spreadsheet Understanding with Multi-Agent Multi-Format Reasoning

Houxing Ren, Mingjie Zhan, Zimu Lu, Ke Wang, Yunqiao Yang, Haotian Hou, Hongsheng Li

Comments Accepted to ACL 2026 (main conference)

2604.12281 2026-04-15 cs.CV cs.AI

MAST: Mask-Guided Attention Mass Allocation for Training-Free Multi-Style Transfer

Dongkyung Kang, Jaeyeon Hwang, Junseo Park, Minji Kang, Yeryeong Lee, Beomseok Ko, Hanyoung Roh, Jeongmin Shin, Hyeryung Jang

Comments 16 pages, 16 figures, 6 tables

2604.12274 2026-04-15 cs.RO

Asymptotically Stable Gait Generation and Instantaneous Walkability Determination for Planar Almost Linear Biped with Knees

Fumihiko Asano, Ning Lei, Taiki Sedoguchi

Comments Accepted for presentation at the IEEE International Conference on Robotics and Automation (ICRA), 2026. This version includes a correction to a typographical error in one equation

2604.12273 2026-04-15 cs.LG cs.CV

SubFlow: Sub-mode Conditioned Flow Matching for Diverse One-Step Generation

Yexiong Lin, Jia Shi, Shanshan Ye, Wanyu Wang, Yu Yao, Tongliang Liu

2604.12271 2026-04-15 cs.LG

RoleMAG: Learning Neighbor Roles in Multimodal Graphs

Yilong Zuo, Xunkai Li, Zhihan Zhang, Ronghua Li, Guoren Wang

2604.12270 2026-04-15 cs.CV

DreamStereo: Towards Real-Time Stereo Inpainting for HD Videos

Yuan Huang, Sijie Zhao, Jing Cheng, Hao Xu, Shaohui Jiao

2604.12262 2026-04-15 cs.CL cs.AI

CascadeDebate: Multi-Agent Deliberation for Cost-Aware LLM Cascades

Raeyoung Chang, Dongwook Kwon, Jisoo Lee, Nikhil Verma

Comments 12 pages, 6 figures, 4 tables, 1 algorithm

2604.12260 2026-04-15 cs.LG cs.DC eess.SP

Decentralized Learning via Random Walk with Jumps

Zonghong Liu, Matthew Dwyer, Salim El Rouayheb

2604.12257 2026-04-15 cs.CV

Style-Decoupled Adaptive Routing Network for Underwater Image Enhancement

Hang Xu, Chen Long, Bing Wang, Hao Chen, Zhen Dong

2604.12255 2026-04-15 cs.CV cs.AI

ARGen: Affect-Reinforced Generative Augmentation towards Vision-based Dynamic Emotion Perception

Huanzhen Wang, Ziheng Zhou, Jiaqi Song, Li He, Yunshi Lan, Yan Wang, Wenqiang Zhang

2604.12251 2026-04-15 cs.CV

ArtifactWorld: Scaling 3D Gaussian Splatting Artifact Restoration via Video Generation Models

Xinliang Wang, Yifeng Shi, Zhenyu Wu

Comments The second author is the corresponding author

2604.12250 2026-04-15 cs.AI cs.CL cs.GT cs.MA

How memory can affect collective and cooperative behaviors in an LLM-Based Social Particle Swarm

Taisei Hishiki, Takaya Arita, Reiji Suzuki

Comments 12 pages, 6 figures and 2 tables

2604.12247 2026-04-15 cs.CL cs.AI cs.LG

SpecBound: Adaptive Bounded Self-Speculation with Layer-wise Confidence Calibration

Zhuofan Wen, Yang Feng

Comments ACL 2026 Findings

2604.12245 2026-04-15 cs.LG cs.AI cs.CV cs.NE

Socrates Loss: Unifying Confidence Calibration and Classification by Leveraging the Unknown

Sandra Gómez-Gálvez, Tobias Olenyi, Gillian Dobbie, Katerina Taškova

Comments Published at TMLR 2026. https://openreview.net/forum?id=DONqw1KhHq Video: https://youtu.be/7WuSkC-aWW8?si=9fgq5ZN7euIyGZGU Code: https://github.com/sandruskyi/SocratesLoss

2604.12243 2026-04-15 cs.CL cs.AI

Continuous Knowledge Metabolism: Generating Scientific Hypotheses from Evolving Literature

Jinkai Tao, Yubo Wang, Xiaoyu Liu, Menglin Yang

Comments 32 pages, 6 figures

Abstract

Scientific hypothesis generation requires tracking how knowledge evolves, not just what is currently known. We introduce Continuous Knowledge Metabolism (CKM), a framework that processes scientific literature through sliding time windows and incrementally updates a structured knowledge base as new findings arrive. We present CKM-Lite, an efficient variant that achieves strong predictive coverage through incremental accumulation, outperforming batch processing on hit rate (+2.8%, p=0.006), hypothesis yield (+3.6, p<0.001), and best-match alignment (+0.43, p<0.001) while reducing token cost by 92%. To understand what drives these differences, we develop CKM-Full, an instrumented variant that categorizes each new finding as novel, confirming, or contradicting, detects knowledge change signals, and conditions hypothesis generation on the full evolution trajectory. Analyzing 892 hypotheses generated by CKM-Full across 50 research topics, alongside parallel runs of the other variants, we report four empirical observations: (1) incremental processing outperforms the batch baseline across predictive and efficiency metrics; (2) change-aware instrumentation is associated with higher LLM-judged novelty (Cohen's d=3.46) but lower predictive coverage, revealing a quality-coverage trade-off; (3) a field's trajectory stability is associated with hypothesis success (r=-0.28, p=0.051), suggesting boundary conditions for literature-based prediction; (4) knowledge convergence signals are associated with nearly 5x higher hit rate than contradiction signals, pointing to differential predictability across change types. These findings suggest that the character of generated hypotheses is shaped not only by how much literature is processed, but also by how it is processed. They further indicate that evaluation frameworks must account for the quality-coverage trade-off rather than optimize for a single metric.
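
The idea of updating a knowledge base window by window, and labelling each finding as novel, confirming, or contradicting, can be illustrated with a toy sketch. This is not CKM itself: the abstract does not describe its data model, so the `(year, claim, supported)` tuples, the exact-match lookup on claim text, the non-overlapping yearly windows, and the names `categorize` and `process_windows` are all hypothetical simplifications.

```python
from collections import Counter


def categorize(knowledge, claim, supported):
    """Label a finding relative to what the knowledge base already holds."""
    if claim not in knowledge:
        return "novel"
    return "confirming" if knowledge[claim] == supported else "contradicting"


def process_windows(findings, window_years=1):
    """Consume (year, claim, supported) tuples in time order, one window at a time.

    Returns the final knowledge base and, for each non-empty window, a count
    of how many findings fell into each category.
    """
    knowledge = {}
    history = []
    ordered = sorted(findings, key=lambda finding: finding[0])
    if not ordered:
        return knowledge, history

    first_year = ordered[0][0]
    current, counts = 0, Counter()
    for year, claim, supported in ordered:
        window = (year - first_year) // window_years
        if window != current:
            # Close the previous window before starting the next one.
            history.append((current, dict(counts)))
            current, counts = window, Counter()
        counts[categorize(knowledge, claim, supported)] += 1
        knowledge[claim] = supported  # incremental update: latest finding wins
    history.append((current, dict(counts)))
    return knowledge, history


# Made-up findings, purely for illustration.
findings = [
    (2021, "X improves Y", True),
    (2021, "Z causes W", True),
    (2022, "X improves Y", True),
    (2023, "Z causes W", False),
]
knowledge, history = process_windows(findings)
```

In this toy run the first window contains only novel findings, the second a confirmation, and the third a contradiction. A real system would presumably match findings semantically rather than by exact string, which is the part this sketch deliberately leaves out.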

2604.12237 2026-04-15 cs.LG cs.AI cs.CL

MolMem: Memory-Augmented Agentic Reinforcement Learning for Sample-Efficient Molecular Optimization

Ziqing Wang, Yibo Wen, Abhishek Pandy, Han Liu, Kaize Ding

2604.12231 2026-04-15 cs.CL cs.IR

Thought-Retriever: Don't Just Retrieve Raw Data, Retrieve Thoughts for Memory-Augmented Agentic Systems

Tao Feng, Pengrui Han, Guanyu Lin, Ge Liu, Jiaxuan You

Journal ref: Transactions on Machine Learning Research (TMLR), 04/2026

Abstract

Large language models (LLMs) have transformed AI research thanks to their powerful internal capabilities and knowledge. However, existing LLMs still fail to effectively incorporate massive external knowledge when interacting with the world. Although retrieval-augmented LLMs have been proposed to mitigate the issue, they are still fundamentally constrained by the context length of LLMs, as they can only retrieve the top-K raw data chunks from an external knowledge base, which often consists of millions of data chunks. Here we propose Thought-Retriever, a novel model-agnostic algorithm that helps LLMs generate output conditioned on arbitrarily long external data, without being constrained by the context length or number of retrieved data chunks. Our key insight is to let an LLM fully leverage its intermediate responses generated when solving past user queries (thoughts), filtering meaningless and redundant thoughts, organizing them in thought memory, and retrieving the relevant thoughts when addressing new queries. This effectively equips LLM-based agents with a self-evolving long-term memory that grows more capable through continuous interaction. Besides algorithmic innovation, we further meticulously prepare a novel benchmark, AcademicEval, which requires an LLM to faithfully leverage ultra-long context to answer queries based on real-world academic papers. Extensive experiments on AcademicEval and two other public datasets validate that Thought-Retriever remarkably outperforms state-of-the-art baselines, achieving an average increase of at least 7.6% in F1 score and 16% in win rate across various tasks. More importantly, we further demonstrate two exciting findings: (1) Thought-Retriever can indeed help an LLM self-evolve after solving more user queries; (2) Thought-Retriever learns to leverage deeper thoughts to answer more abstract user queries.
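
The loop the abstract describes (store past thoughts, filter meaningless and redundant ones, retrieve the relevant ones for a new query) can be sketched in a few lines. This is only an illustration of the general pattern, not the Thought-Retriever algorithm: the bag-of-words similarity, the 0.9 redundancy threshold, and the `ThoughtMemory` class with its `add` and `retrieve` methods are all assumptions, and a real system would likely use learned embeddings and an LLM-based filter.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy bag-of-words vector standing in for a learned embedding."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[word] * b[word] for word in a if word in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


class ThoughtMemory:
    """Hypothetical store of past intermediate responses ("thoughts")."""

    def __init__(self, redundancy_threshold=0.9):
        self.thoughts = []
        self.redundancy_threshold = redundancy_threshold

    def add(self, thought):
        """Store a thought unless it is empty or near-identical to an existing one."""
        vector = embed(thought)
        if not vector:
            return False  # nothing usable in it
        if any(cosine(vector, embed(kept)) >= self.redundancy_threshold
               for kept in self.thoughts):
            return False  # redundant with something already stored
        self.thoughts.append(thought)
        return True

    def retrieve(self, query, k=2):
        """Return the k stored thoughts most similar to the query."""
        query_vector = embed(query)
        ranked = sorted(self.thoughts,
                        key=lambda kept: cosine(query_vector, embed(kept)),
                        reverse=True)
        return ranked[:k]


memory = ThoughtMemory()
added_first = memory.add("Paper A evaluates retrieval with F1 score on long documents")
added_duplicate = memory.add("Paper A evaluates retrieval with F1 score on long documents")
added_second = memory.add("Paper B studies win rate of agents with long-term memory")
added_empty = memory.add("")
top = memory.retrieve("Which paper reports F1 score?", k=1)
```

Here the duplicate and the empty thought are rejected, and the query about F1 score retrieves the Paper A thought. Because retrieval operates over stored thoughts rather than raw chunks, the memory can grow with each solved query, which is the self-evolving behaviour the abstract refers to.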
