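arXivDaily: a daily academic digest that syncs the full arXiv feed and provides AI-generated summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering and systems, and other fields.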

2512.10720 2026-04-03 cs.LG

Beyond the Black Box: Identifiable Interpretation and Control in Generative Models via Causal Minimality

Lingjing Kong, Shaoan Xie, Guangyi Chen, Yuewen Sun, Xiangchen Song, Eric P. Xing, Kun Zhang

Abstract

Deep generative models, while revolutionizing fields like image and text generation, largely operate as opaque "black boxes", hindering human understanding, control, and alignment. While methods like sparse autoencoders (SAEs) show remarkable empirical success, they often lack theoretical guarantees, risking subjective insights. Our primary objective is to establish a principled foundation for interpretable generative models. We demonstrate that the principle of causal minimality (favoring the simplest causal explanation) can endow the latent representations of modern generative models with clear causal interpretation and robust, component-wise identifiable control. We introduce a novel theoretical framework for hierarchical selection models, where higher-level concepts emerge from the constrained composition of lower-level variables, better capturing the complex dependencies in data generation. Under theoretically derived minimality conditions, we show that learned representations can be equivalent to the true latent variables of the data-generating process. Empirically, applying these constraints to leading text-to-image diffusion models allows us to extract their innate hierarchical concept graphs, offering fresh insights into their internal knowledge organization. Furthermore, these causally grounded concepts serve as levers for fine-grained model steering, paving the way for transparent, reliable systems.

2511.09219 2026-04-03 cs.LG

Planning in Branch-and-Bound: Model-Based Reinforcement Learning for Exact Combinatorial Optimization

Paul Strang, Zacharie Alès, Côme Bissuel, Olivier Juan, Safia Kedad-Sidhoum, Emmanuel Rachelson

2510.22028 2026-04-03 cs.CL

Penalizing Length: Uncovering Systematic Bias in Quality Estimation Metrics

Yilin Zhang, Wenda Xu, Zhongtao Liu, Tetsuji Nakagawa, Markus Freitag

2510.21884 2026-04-03 cs.CL cs.AI

Support-Contra Asymmetry in LLM Explanations

Avinash Patil

Comments: 17 pages, 12 figures, 4 tables

2510.18520 2026-04-03 cs.LG stat.ME

Partial VOROS: A Cost-aware Performance Metric for Binary Classifiers with Precision and Capacity Constraints

Christopher Ratigan, Kyle Heuton, Carissa Wang, Lenore Cowen, Michael C. Hughes

Comments: In Proceedings of the International Conference on Artificial Intelligence and Statistics (AISTATS), 2026

2510.15555 2026-04-03 cs.LG

Doubly Robust Estimation of Causal Effects in Strategic Equilibrium Systems

Sibo Xiao

Comments: In systems with causal effects, a large majority of individuals are mistakenly classified as using a certain strategy by the strategic equilibrium solver, resulting in the introduction of this feature as an independent variable in causal inference without specificity. This method may have an inherent error.

2510.05509 2026-04-03 cs.CV

Be Tangential to Manifold: Discovering Riemannian Metric for Diffusion Models

Shinnosuke Saito, Takashi Matsubara

2509.07013 2026-04-03 cs.LG q-bio.PE stat.ME

Generalized Machine Learning for Fast Calibration of Agent-Based Epidemic Models

Sima Najafzadehkhoei, George Vega Yon, Derek S. Meyer, Bernardo Modenesi

2508.10703 2026-04-03 cs.AI

GenOM: Ontology Matching with Description Generation and Large Language Model

Yiping Song, Jiaoyan Chen, Renate A. Schmidt

Comments: Accepted for publication in World Wide Web (Springer). This version includes revisions based on peer review.

2508.06763 2026-04-03 cs.CV cs.AI

SafePLUG: Empowering Multimodal LLMs with Pixel-Level Insight and Temporal Grounding for Traffic Accident Understanding

Zihao Sheng, Zilin Huang, Yansong Qu, Jiancong Chen, Yuhao Luo, Yen-Jung Chen, Yue Leng, Sikai Chen

Comments: The code, dataset, and model checkpoints will be made publicly available at: https://zihaosheng.github.io/SafePLUG

DOI: 10.23919/CHAIN.2026.000005

Abstract

Multimodal large language models (MLLMs) have achieved remarkable progress across a range of vision-language tasks and demonstrate strong potential for traffic accident understanding. However, existing MLLMs in this domain primarily focus on coarse-grained image-level or video-level comprehension and often struggle to handle fine-grained visual details or localized scene components, limiting their applicability in complex accident scenarios. To address these limitations, we propose SafePLUG, a novel framework that empowers MLLMs with both Pixel-Level Understanding and temporal Grounding for comprehensive traffic accident analysis. SafePLUG supports both arbitrary-shaped visual prompts for region-aware question answering and pixel-level segmentation based on language instructions, while also enabling the recognition of temporally anchored events in traffic accident scenarios. To advance the development of MLLMs for traffic accident understanding, we curate a new dataset containing multimodal question-answer pairs centered on diverse accident scenarios, with detailed pixel-level annotations and temporal event boundaries. Experimental results show that SafePLUG achieves strong performance on multiple tasks, including region-based question answering, pixel-level segmentation, temporal event localization, and accident event understanding. These capabilities lay a foundation for fine-grained understanding of complex traffic scenes, with the potential to improve driving safety and enhance situational awareness in smart transportation systems. The code, dataset, and model checkpoints will be made publicly available at: https://zihaosheng.github.io/SafePLUG

2508.00855 2026-04-03 cs.LG cs.CE physics.flu-dyn

A Residual Guided strategy with Generative Adversarial Networks in training Physics-Informed Transformer Networks

Ziyang Zhang, Feifan Zhang, Weidong Tang, Lei Shi, Tailai Chen

2504.04665 2026-04-03 cs.LG cs.SY eess.SY

A Simultaneous Approach for Training Neural Differential-Algebraic Systems of Equations

Laurens R. Lueg, Victor Alves, Daniel Schicksnus, John R. Kitchin, Carl D. Laird, Lorenz T. Biegler

Abstract

Scientific machine learning is an emerging field that broadly describes the combination of scientific computing and machine learning to address challenges in science and engineering. Within the context of differential equations, this has produced highly influential methods, such as neural ordinary differential equations (NODEs). Recent works extend this line of research to consider neural differential-algebraic systems of equations (DAEs), where some unknown relationships within the DAE are learned from data. Training neural DAEs, similarly to neural ODEs, is computationally expensive, as it requires the solution of a DAE for every parameter update. Further, the rigorous consideration of algebraic constraints is difficult within common deep learning training algorithms such as stochastic gradient descent. In this work, we apply the simultaneous approach to neural DAE problems, resulting in a fully discretized nonlinear optimization problem, which is solved to local optimality and simultaneously obtains the neural network parameters and the solution to the corresponding DAE. We extend recent work demonstrating the simultaneous approach for neural ODEs, by presenting a general framework to solve neural DAEs, with explicit consideration of hybrid models, where some components of the DAE are known, e.g. physics-informed constraints. Furthermore, we present a general strategy for improving the performance and convergence of the nonlinear programming solver, based on solving an auxiliary problem for initialization and approximating Hessian terms. We achieve promising results in terms of accuracy, model generalizability and computational cost, across different problem settings such as sparse data, unobserved states and multiple trajectories. Lastly, we provide several promising future directions to improve the scalability and robustness of our approach.

2503.03485 2026-04-03 cs.LG q-bio.QM

TEDDY: A Family Of Foundation Models For Understanding Single Cell Biology

Alexis Chevalier, Soumya Ghosh, Urvi Awasthi, James Watkins, Julia Bieniewska, Nichita Mitrea, Olga Kotova, Kirill Shkura, Andrew Noble, Michael Steinbaugh, Vijay Sadashivaiah, George Dasoulas, Julien Delile, Christoph Meier, Leonid Zhukov, Iya Khalil, Srayanta Mukherjee, Judith Mueller

Comments: ICML 2025 Generative AI and Biology (GenBio) Workshop

2502.13024 2026-04-03 cs.LG math.OC

Fragility-aware Classification for Understanding Risk and Improving Generalization

Chen Yang, Zheng Cui, Daniel Zhuoyu Long, Jin Qi, Ruohan Zhan

2310.19603 2026-04-03 cs.LG cs.NA cs.NE math.NA math.PR stat.ML

Transformers Can Solve Non-Linear and Non-Markovian Filtering Problems in Continuous Time For Conditionally Gaussian Signals

Blanka Horvath, Anastasis Kratsios, Yannick Limmer, Xuwei Yang

2210.13277 2026-04-03 cs.LG cs.DC math.OC

CompressedScaffnew: The First Theoretical Double Acceleration of Communication from Local Training and Compression in Distributed Optimization

Laurent Condat, Ivan Agarský, Peter Richtárik

2604.02071 2026-04-03 cs.CV cs.AI cs.LG

Mining Instance-Centric Vision-Language Contexts for Human-Object Interaction Detection

Soo Won Seo, KyungChae Lee, Hyungchan Cho, Taein Son, Nam Ik Cho, Jun Won Choi

Comments: Accepted to CVPR 2026. Code: https://github.com/nowuss/InCoM-Net

2604.02068 2026-04-03 cs.CV econ.EM

Network Structure in UK Payment Flows: Evidence on Economic Interdependencies and Implications for Real-Time Measurement

Aditya Humnabadkar

Comments: Accepted for poster presentation at the ESCoE Conference on Economic Measurement 2026

2604.02061 2026-04-03 cs.AI

Diff-KD: Diffusion-based Knowledge Distillation for Collaborative Perception under Corruptions

Pengcheng Lyu, Chaokun Zhang, Gong Chen, Tao Tang, Zhaoxiang Luo

2604.02055 2026-04-03 cs.CV

True to Tone? Quantifying Skin Tone Fidelity and Bias in Photographic-to-Virtual Human Pipelines

Gabriel Ferri Schneider, Erick Menezes, Rafael Mecenas, Paulo Knob, Victor Araujo, Soraia Raupp Musse

Comments: 20 pages, 10 figures

2604.02051 2026-04-03 cs.LG cs.CL

Ouroboros: Dynamic Weight Generation for Recursive Transformers via Input-Conditioned LoRA Modulation

Jaber Jaber, Osama Jaber

Comments: 10 pages, 5 tables, 1 figure, 1 algorithm. Code: https://github.com/RightNow-AI/ouroboros

2604.02048 2026-04-03 cs.CV

Jagle: Building a Large-Scale Japanese Multimodal Post-Training Dataset for Vision-Language Models

Issa Sugiura, Keito Sasagawa, Keisuke Nakao, Koki Maeda, Ziqi Yin, Zhishen Yang, Shuhei Kurita, Yusuke Oda, Ryoko Tokuhisa, Daisuke Kawahara, Naoaki Okazaki

Comments: 18 pages, 7 figures

2604.02047 2026-04-03 cs.CL cs.AI

Goose: Anisotropic Speculation Trees for Training-Free Speculative Decoding

Tao Jin, Phuong Minh Nguyen, Naoya Inoue

2604.02045 2026-04-03 cs.CL cs.AI

BidirLM: From Text to Omnimodal Bidirectional Encoders by Adapting and Composing Causal LLMs

Nicolas Boizard, Théo Deschamps-Berger, Hippolyte Gisserot-Boukhlef, Céline Hudelot, Pierre Colombo

Comments: 30 pages, 16 figures, 10 tables

2604.02043 2026-04-03 cs.CL cs.AI eess.AS

Tracking the emergence of linguistic structure in self-supervised models learning from speech

Marianne de Heer Kloots, Martijn Bentum, Hosein Mohebbi, Charlotte Pouw, Gaofei Shen, Willem Zuidema

2604.02040 2026-04-03 cs.CV

Efficient Reasoning via Thought Compression for Language Segmentation

Qing Zhou, Shiyu Zhang, Yuyu Jia, Junyu Gao, Weiping Ni, Junzheng Wu, Qi Wang

2604.02038 2026-04-03 cs.RO

O-ConNet: Geometry-Aware End-to-End Inference of Over-Constrained Spatial Mechanisms

Haoyu Sun, Meng Zhao, Tianhao Wang, Jianxu Wu

Comments: 8 pages, 5 figures

2604.02034 2026-04-03 cs.AI

AI in Insurance: Adaptive Questionnaires for Improved Risk Profiling

Diogo Silva, João Teixeira, Bruno Lima

2604.02032 2026-04-03 cs.CV cs.LG

IndoorCrowd: A Multi-Scene Dataset for Human Detection, Segmentation, and Tracking with an Automated Annotation Pipeline

Sebastian-Ion Nae, Radu Moldoveanu, Alexandra Stefania Ghita, Adina Magda Florea

Comments: Accepted at the Conference on Computer Vision and Pattern Recognition Workshops 2026

2604.02031 2026-04-03 cs.CV cs.AI

Rare-Aware Autoencoding: Reconstructing Spatially Imbalanced Data

Alejandro Castañeda Garcia, Jan van Gemert, Daan Brinks, Nergis Tömen