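arXivDaily daily academic digest: synced with the full arXiv dataset, with AI-generated summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and other fields.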

2505.19807 2026-03-27 cs.LG stat.ML

Density Ratio-Free Doubly Robust Proxy Causal Learning

Bariscan Bozkurt, Houssam Zenati, Dimitri Meunier, Liyuan Xu, Arthur Gretton

Comments NeurIPS published version

Abstract

We study the problem of causal function estimation in the Proxy Causal Learning (PCL) framework, where confounders are not observed but proxies for the confounders are available. Two main approaches have been proposed: outcome bridge-based and treatment bridge-based methods. In this work, we propose two kernel-based doubly robust estimators that combine the strengths of both approaches, and naturally handle continuous and high-dimensional variables. Our identification strategy builds on a recent density ratio-free method for treatment bridge-based PCL; furthermore, in contrast to previous approaches, it does not require indicator functions or kernel smoothing over the treatment variable. These properties make it especially well-suited for continuous or high-dimensional treatments. By using kernel mean embeddings, we propose the first density ratio-free doubly robust estimators for proxy causal learning, which have closed-form solutions and strong uniform consistency guarantees. Our estimators outperform existing methods on PCL benchmarks, including a prior doubly robust method that requires both kernel smoothing and density ratio estimation.

2505.12200 2026-03-27 cs.CV

CompBench: Benchmarking Complex Instruction-guided Image Editing

Bohan Jia, Wenxuan Huang, Yuntian Tang, Junbo Qiao, Jincheng Liao, Shaosheng Cao, Fei Zhao, Zhaopeng Feng, Zhouhong Gu, Zhenfei Yin, Lei Bai, Wanli Ouyang, Lin Chen, Fei Zhao, Yao Hu, Zihan Wang, Yuan Xie, Shaohui Lin

2504.06055 2026-03-27 cs.LG

A Trustworthy By Design Classification Model for Building Energy Retrofit Decision Support

Panagiota Rempi, Sotiris Pelekis, Alexandros Menelaos Tzortzis, Evangelos Spiliotis, Evangelos Karakolis, Christos Ntanos, Dimitris Askounis

DOI: 10.1016/j.enbuild.2026.117340

Abstract

Improving energy efficiency in residential buildings is critical to combating climate change and reducing greenhouse gas emissions. Retrofitting existing buildings, which contribute a significant share of energy use, is therefore a key priority, especially in regions with outdated building stock. Artificial Intelligence (AI) and Machine Learning (ML) can automate retrofit decision-making and find retrofit strategies. However, their use faces challenges of data availability, model transparency, and compliance with national and EU AI regulations including the AI Act, ethics guidelines and the ALTAI. This paper presents a trustworthy-by-design ML-based decision support framework that recommends energy efficiency strategies for residential buildings using minimal user-accessible inputs. The framework merges Conditional Tabular Generative Adversarial Networks (CTGAN) to augment limited and imbalanced data with a neural network-based multi-label classifier that predicts potential combinations of retrofit actions. To support explanation and trustworthiness, an Explainable AI (XAI) layer using SHapley Additive exPlanations (SHAP) clarifies the rationale behind recommendations and guides feature engineering. Two case studies validate performance and generalization: the first leveraging a well-established, large EPC dataset for England and Wales; the second using a small, imbalanced post-retrofit dataset from Latvia (RETROFIT-LAT). Results show that the framework can handle diverse data conditions and improve performance by up to 53% compared to the baseline. Overall, the proposed framework provides a feasible, interpretable, and trustworthy AI system for building retrofit decision support through assured performance, usability, and transparency to aid stakeholders in prioritizing effective energy investments and support regulation-compliant, data-driven innovation in sustainable energy transition.

2504.01053 2026-03-27 cs.CV cs.AI

Knowledge-Base based Semantic Image Transmission Using CLIP

Chongyang Li, Yanmei He, Tianqian Zhang, Mingjian He, Shouyin Liu

2503.08371 2026-03-27 cs.LG

Density Ratio-based Proxy Causal Learning Without Density Ratios

Bariscan Bozkurt, Ben Deaner, Dimitri Meunier, Liyuan Xu, Arthur Gretton

Comments AISTATS 2025 accepted, 81 pages

2501.11770 2026-03-27 cs.CL cs.CY cs.SI

The Value of Nothing: Multimodal Extraction of Human Values Expressed by TikTok Influencers

Alina Starovolsky-Shitrit, Alon Neduva, Naama Appel Doron, Itamar Gafni, Ella Daniel, Oren Tsur

2501.04480 2026-03-27 cs.AI cs.RO

Research on environment perception and behavior prediction of intelligent UAV based on semantic communication

Kechong Ren, Li Gao, Qi Guan

Comments The author list of this manuscript is incorrect and incomplete. This version is an unauthorized early draft without approval from all authors

2411.18195 2026-03-27 cs.LG

Scalable Multi-Objective Reinforcement Learning with Fairness Guarantees using Lorenz Dominance

Dimitris Michailidis, Willem Röpke, Diederik M. Roijers, Sennay Ghebreab, Fernando P. Santos

Comments 32 pages. Published in Journal of Artificial Intelligence Research, Vol. 85, Article 31

2411.17501 2026-03-27 cs.LG cs.AI

The Limits of Inference Scaling Through Resampling

Benedikt Stroebl, Sayash Kapoor, Arvind Narayanan

2411.15869 2026-03-27 cs.CV

Self-Calibrated CLIP for Training-Free Open-Vocabulary Segmentation

Sule Bai, Yong Liu, Yifei Han, Haoji Zhang, Yansong Tang, Jie Zhou, Jiwen Lu

Comments Accepted by IEEE TIP

Abstract

Recent advancements in pre-trained vision-language models like CLIP have enabled the task of open-vocabulary segmentation. CLIP demonstrates impressive zero-shot capabilities in various downstream tasks that require holistic image understanding. However, due to the image-level contrastive learning and fully global feature interaction, ViT-based CLIP struggles to capture local details, resulting in poor performance in segmentation tasks. Our analysis of ViT-based CLIP reveals that anomaly tokens emerge during the forward process, attracting disproportionate attention from normal patch tokens and thereby diminishing spatial awareness. To address this issue, we propose Self-Calibrated CLIP (SC-CLIP), a training-free method that calibrates CLIP to generate finer representations while preserving its original generalization ability, without introducing new parameters or relying on additional backbones. Specifically, we mitigate the negative impact of anomaly tokens from two complementary perspectives. First, we explicitly identify the anomaly tokens and replace them based on local context. Second, we reduce their influence on normal tokens by enhancing feature discriminability and attention correlation, leveraging the inherent semantic consistency within CLIP's mid-level features. In addition, we introduce a two-pass strategy that effectively integrates multi-level features to enrich local details under the training-free setting. Together, these strategies enhance CLIP's feature representations with improved granularity and semantic coherence. Experimental results demonstrate the effectiveness of SC-CLIP, achieving state-of-the-art results across all datasets and surpassing previous methods by 9.5%. Notably, SC-CLIP boosts the performance of vanilla CLIP ViT-L/14 by 6.8 times. Our source code is available at https://github.com/SuleBai/SC-CLIP.

2410.20894 2026-03-27 cs.AI cs.LG

Working Paper: Active Causal Structure Learning with Latent Variables: Towards Learning to Detour in Autonomous Robots

Pablo de los Riscos, Fernando J. Corbacho

Comments 44 pages, 12 figures

2410.15281 2026-03-27 cs.RO cs.AI cs.CL cs.HC

LLM4AD: Large Language Models for Autonomous Driving -- Concept, Review, Benchmark, Experiments, and Future Trends

Can Cui, Yunsheng Ma, Sung-Yeon Park, Zichong Yang, Yupeng Zhou, Peiran Liu, Juanwu Lu, Juntong Peng, Jiaru Zhang, Ruqi Zhang, Lingxi Li, Yaobin Chen, Jitesh H. Panchal, Amr Abdelraouf, Rohit Gupta, Kyungtae Han, Ziran Wang

Comments The paper was accepted by the Proceedings of the IEEE

2409.09575 2026-03-27 cs.RO

Traffic Scene Generation from Natural Language Description for Autonomous Vehicles with Large Language Model

Bo-Kai Ruan, Hao-Tang Tsui, Yung-Hui Li, Hong-Han Shuai

Comments Accepted by WAD@CVPR2026

2408.13366 2026-03-27 cs.CL cs.AI cs.LG

CodeRefine: A Pipeline for Enhancing LLM-Generated Code Implementations of Research Papers

Ekaterina Trofimova, Emil Sataev, Abhijit Singh Jowhari

Comments The results mentioned in the paper are non-reproducible. We have rechecked the metrics, and they do not match the ones provided in the paper. Therefore, we accept that this article is neither suitable nor up to the mark for the scientific community and must be withdrawn. We fully understand the consequences, and wish to retract this article

2408.05696 2026-03-27 cs.LG q-bio.QM

SMILES-Mamba: Chemical Mamba Foundation Models for Drug ADMET Prediction

Bohao Xu, Yingzhou Lu, Chenhao Li, Ling Yue, Xiao Wang, Tianfan Fu, Minjie Shen, Lulu Chen

2404.05290 2026-03-27 cs.CV cs.AI

MindSet: Vision. A toolbox for testing DNNs on key psychological experiments

Valerio Biscione, Milton L. Montero, Marin Dujmovic, Gaurav Malhotra, Dong Yin, Guillermo Puebla, Federico Adolfi, Rachel F. Heaton, John E. Hummel, Benjamin D. Evans, Karim Habashy, Jeffrey S. Bowers

Comments 34 pages, 12 figures. Updated version with additional model evaluations

2401.12546 2026-03-27 cs.LG cs.SY eess.SY math.OC

On Building Myopic MPC Policies using Supervised Learning

Christopher A. Orrico, Bokan Yang, Dinesh Krishnamoorthy

Comments Updated version available as arXiv:2508.05804

2312.10807 2026-03-27 cs.RO

Bridging Language and Action: A Survey of Language-Conditioned Robot Manipulation

Xiangtong Yao, Hongkuan Zhou, Oier Mees, Yuan Meng, Ted Xiao, Yonatan Bisk, Jean Oh, Edward Johns, Mohit Shridhar, Dhruv Shah, Jesse Thomason, Kai Huang, Joyce Chai, Zhenshan Bing, Alois Knoll

2312.10431 2026-03-27 cs.LG stat.ML

Continuous Diffusion for Mixed-Type Tabular Data

Markus Mueller, Kathrin Gruber, Dennis Fok

Comments Published at ICLR 2025

2208.09843 2026-03-27 cs.CV

CODER: Coupled Diversity-Sensitive Momentum Contrastive Learning for Image-Text Retrieval

Haoran Wang, Dongliang He, Wenhao Wu, Boyang Xia, Min Yang, Fu Li, Yunlong Yu, Zhong Ji, Errui Ding, Jingdong Wang

Comments Accepted by ECCV 2022

2203.01222 2026-03-27 cs.RO

Chance-Constrained Iterative Linear-Quadratic Stochastic Games

Hai Zhong, Yutaka Shimizu, Jianyu Chen

Comments Updated version of the published IEEE RA-L paper. Assumption 1 and strategy space definition revised to make the information structure explicit. Theorem 1 assumptions are more explicit. No changes to algorithm or experimental results

2110.11736 2026-03-27 cs.LG

MANDERA: Malicious Node Detection in Federated Learning via Ranking

Wanchuang Zhu, Benjamin Zi Hao Zhao, Simon Luo, Tongliang Liu, Ke Deng

Comments 21 pages, 11 figures, The Annals of Applied Statistics

2603.25097 2026-03-27 cs.AI

ElephantBroker: A Knowledge-Grounded Cognitive Runtime for Trustworthy AI Agents

Cristian Lupascu, Alexandru Lupascu

2603.25093 2026-03-27 cs.LG

Process-Aware AI for Rainfall-Runoff Modeling: A Mass-Conserving Neural Framework with Hydrological Process Constraints

Mohammad A. Farmani, Hoshin V. Gupta, Ali Behrangi, Muhammad Jawad, Sadaf Moghisi, Guo-Yue Niu

2603.25091 2026-03-27 cs.CV cs.AI

Pixelis: Reasoning in Pixels, from Seeing to Acting

Yunpeng Zhou

Comments 28 pages, 16 figures, 18 tables

2603.25089 2026-03-27 cs.CV

THEMIS: Towards Holistic Evaluation of MLLMs for Scientific Paper Fraud Forensics

Tzu-Yen Ma, Bo Zhang, Zichen Tang, Junpeng Ding, Haolin Tian, Yuanze Li, Zhuodi Hao, Zixin Ding, Zirui Wang, Xinyu Yu, Shiyao Peng, Yizhuo Zhao, Ruomeng Jiang, Yiling Huang, Peizhi Zhao, Jiayuan Chen, Weisheng Tan, Haocheng Gao, Yang Liu, Jiacheng Liu, Zhongjun Yang, Jiayu Huang, Haihong E

Comments Accepted to ICLR 2026

2603.25088 2026-03-27 cs.CV

Visual Attention Drifts, but Anchors Hold: Mitigating Hallucination in Multimodal Large Language Models via Cross-Layer Visual Anchors

Chengxu Yang, Jingling Yuan, Chuang Hu, Jiawei Jiang

2603.25083 2026-03-27 cs.CV cs.AI

Learning domain-invariant features through channel-level sparsification for Out-of-Distribution Generalization

Haoran Pei, Yuguang Yang, Kexin Liu, Juan Zhang, Baochang Zhang

2603.25077 2026-03-27 cs.CV

Bridging Perception and Reasoning: Token Reweighting for RLVR in Multimodal LLMs

Jinda Lu, Junkang Wu, Jinghan Li, Kexin Huang, Shuo Yang, Guoyin Wang, Jiancan Wu, Xiang Wang, Xiangnan He

2603.25075 2026-03-27 cs.AI

Sparse Visual Thought Circuits in Vision-Language Models

Yunpeng Zhou