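arXivDaily: a daily academic digest synced with the full arXiv dataset, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems science, and other fields.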

2602.20157 2026-02-24 cs.CV

Flow3r: Factored Flow Prediction for Scalable Visual Geometry Learning

Zhongxiao Cong, Qitao Zhao, Minsik Jeon, Shubham Tulsiani

Comments CVPR 2026. Project website: https://flow3r-project.github.io/

Abstract

Current feed-forward 3D/4D reconstruction systems rely on dense geometry and pose supervision -- expensive to obtain at scale and particularly scarce for dynamic real-world scenes. We present Flow3r, a framework that augments visual geometry learning with dense 2D correspondences ('flow') as supervision, enabling scalable training from unlabeled monocular videos. Our key insight is that the flow prediction module should be factored: predicting flow between two images using geometry latents from one and pose latents from the other. This factorization directly guides the learning of both scene geometry and camera motion, and naturally extends to dynamic scenes. In controlled experiments, we show that factored flow prediction outperforms alternative designs and that performance scales consistently with unlabeled data. Integrating factored flow into existing visual geometry architectures and training with ~800K unlabeled videos, Flow3r achieves state-of-the-art results across eight benchmarks spanning static and dynamic scenes, with its largest gains on in-the-wild dynamic videos where labeled data is most scarce.
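
The idea that flow between two views depends on the geometry of one and the pose of the other has a classical counterpart for static scenes, sketched below. This is not Flow3r's learned module, which the abstract says operates on latents. It is a minimal geometric illustration assuming a pinhole camera with shared intrinsics `K`, and the helper names `backproject` and `induced_flow` are hypothetical.

```python
import numpy as np

def backproject(depth, K):
    """Lift a depth map (H, W) to 3D points (H, W, 3) in the camera frame."""
    H, W = depth.shape
    u, v = np.meshgrid(np.arange(W), np.arange(H))
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).astype(float)
    rays = pix @ np.linalg.inv(K).T          # rays with unit z component
    return rays * depth[..., None]

def induced_flow(points_a, R_ab, t_ab, K):
    """Flow from view A to view B for a static scene.

    points_a: (H, W, 3) 3D point per pixel in camera-A coordinates (geometry of A).
    R_ab, t_ab: rotation (3, 3) and translation (3,) taking A coordinates to B (pose of B).
    K: (3, 3) pinhole intrinsics, assumed identical for both views.
    """
    H, W, _ = points_a.shape
    points_b = points_a @ R_ab.T + t_ab      # express A's points in camera B
    proj = points_b @ K.T                    # homogeneous pixel coordinates in B
    uv_b = proj[..., :2] / proj[..., 2:3]    # perspective divide
    u, v = np.meshgrid(np.arange(W), np.arange(H))
    uv_a = np.stack([u, v], axis=-1).astype(float)
    return uv_b - uv_a                       # per-pixel displacement (du, dv)
```

For a fronto-parallel plane at constant depth and a pure sideways translation, the returned flow is a constant horizontal shift, which makes the function easy to sanity-check.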

2602.20153 2026-02-24 stat.ML cs.LG stat.ME

JUCAL: Jointly Calibrating Aleatoric and Epistemic Uncertainty in Classification Tasks

Jakob Heiss, Sören Lambrecht, Jakob Weissteiner, Hanna Wutte, Žan Žurič, Josef Teichmann, Bin Yu

Comments 11 pages + appendix. Preliminary version of an ongoing project that will be expanded with further evaluations

Abstract

We study post-calibration uncertainty for trained ensembles of classifiers. Specifically, we consider both aleatoric (label noise) and epistemic (model) uncertainty. Among the most popular and widely used calibration methods in classification are temperature scaling (i.e., pool-then-calibrate) and conformal methods. However, the main shortcoming of these calibration methods is that they do not balance the proportion of aleatoric and epistemic uncertainty. Not balancing these uncertainties can severely misrepresent predictive uncertainty, leading to overconfident predictions in some input regions while being underconfident in others. To address this shortcoming, we present a simple but powerful calibration algorithm Joint Uncertainty Calibration (JUCAL) that jointly calibrates aleatoric and epistemic uncertainty. JUCAL jointly calibrates two constants to weight and scale epistemic and aleatoric uncertainties by optimizing the negative log-likelihood (NLL) on the validation/calibration dataset. JUCAL can be applied to any trained ensemble of classifiers (e.g., transformers, CNNs, or tree-based methods), with minimal computational overhead, without requiring access to the models' internal parameters. We experimentally evaluate JUCAL on various text classification tasks, for ensembles of varying sizes and with different ensembling strategies. Our experiments show that JUCAL significantly outperforms SOTA calibration methods across all considered classification tasks, reducing NLL and predictive set size by up to 15% and 20%, respectively. Interestingly, even applying JUCAL to an ensemble of size 5 can outperform temperature-scaled ensembles of size up to 50 in terms of NLL and predictive set size, resulting in up to 10 times smaller inference costs. Thus, we propose JUCAL as a new go-to method for calibrating ensembles in classification.
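
For context, the pool-then-calibrate temperature-scaling baseline named in the abstract can be sketched as follows. This is not JUCAL itself: the abstract does not give the form of its two constants, so only the single-temperature baseline is shown. The function names, the grid search, and the synthetic data generator are assumptions made for illustration.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)    # stabilize the exponentials
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def pool_then_calibrate(member_logits, temperature):
    """Average member probabilities, then temperature-scale the pooled log-probs.

    member_logits: (M, N, C) logits from M ensemble members on N inputs.
    """
    pooled = softmax(member_logits).mean(axis=0)             # (N, C)
    return softmax(np.log(pooled + 1e-12) / temperature)

def pooled_nll(member_logits, labels, temperature):
    """Mean negative log-likelihood of the calibrated pooled prediction."""
    probs = pool_then_calibrate(member_logits, temperature)
    picked = probs[np.arange(len(labels)), labels]
    return float(-np.mean(np.log(picked + 1e-12)))

def fit_temperature(member_logits, labels, grid=None):
    """Pick the temperature minimizing NLL on a calibration set (grid search)."""
    if grid is None:
        grid = np.geomspace(0.1, 10.0, 200)
    losses = [pooled_nll(member_logits, labels, t) for t in grid]
    return float(grid[int(np.argmin(losses))])

def make_overconfident_ensemble(n=500, classes=3, members=5, seed=0):
    """Hypothetical synthetic data: confident logits that are right about 70% of the time."""
    rng = np.random.default_rng(seed)
    labels = rng.integers(0, classes, n)
    correct = rng.random(n) < 0.7
    predicted = np.where(correct, labels, (labels + 1) % classes)
    base = 10.0 * np.eye(classes)[predicted]
    noise = rng.normal(0.0, 0.5, (members, n, classes))
    return base[None] + noise, labels
```

On overconfident data such as the synthetic set above, the fitted temperature comes out greater than 1, softening the pooled prediction. A single temperature rescales the pooled distribution as a whole, which is the limitation the abstract attributes to this family of methods.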

2602.20152 2026-02-24 cs.LG cs.AI stat.ML

Behavior Learning (BL): Learning Hierarchical Optimization Structures from Data

Zhenyao Ma, Yue Liang, Dongxu Li

Comments ICLR 2026

2602.20151 2026-02-24 stat.ME cs.LG math.ST stat.ML stat.TH

Conformal Risk Control for Non-Monotonic Losses

Anastasios N. Angelopoulos

2602.20144 2026-02-24 eess.SY cs.AI cs.NI cs.SY

Agentic AI for Scalable and Robust Optical Systems Control

Zehao Wang, Mingzhe Han, Wei Cheng, Yue-Kai Huang, Philip Ji, Denton Wu, Mahdi Safari, Flemming Holtorf, Kenaish AlQubaisi, Norbert M. Linke, Danyang Zhuo, Yiran Chen, Ting Wang, Dirk Englund, Tingjun Chen

2602.20137 2026-02-24 cs.CV

Do Large Language Models Understand Data Visualization Rules?

Martin Sinnona, Valentin Bonas, Emmanuel Iarussi, Viviana Siless

2602.20134 2026-02-24 cs.GT cs.AI

Modeling Epidemiological Dynamics Under Adversarial Data and User Deception

Yiqi Su, Christo Kurisummoottil Thomas, Walid Saad, Bud Mishra, Naren Ramakrishnan

2602.20133 2026-02-24 cs.NE cs.AI cs.CL

AdaEvolve: Adaptive LLM Driven Zeroth-Order Optimization

Mert Cemri, Shubham Agrawal, Akshat Gupta, Shu Liu, Audrey Cheng, Qiuyang Mang, Ashwin Naren, Lutfi Eren Erdogan, Koushik Sen, Matei Zaharia, Alex Dimakis, Ion Stoica

2602.20132 2026-02-24 cs.LG

LAD: Learning Advantage Distribution for Reasoning

Wendi Li, Sharon Li

2602.20130 2026-02-24 cs.CL cs.AI

To Reason or Not to: Selective Chain-of-Thought in Medical Question Answering

Zaifu Zhan, Min Zeng, Shuang Zhou, Yiran Song, Xiaoyi Chen, Yu Hou, Yifan Wu, Yang Ruan, Rui Zhang

2602.20127 2026-02-24 cs.IT math.IT

Enormous Fluid Antenna Systems (E-FAS)--Part II: Channel Estimation

Farshad Rostami Ghadi, Kai-Kit Wong, Masoud Kaveh, Hao Xu, Baiyang Liu, Kin-Fai Tong, Chan-Byoung Chae

Abstract

Enormous fluid antenna systems (E-FAS) have recently emerged as a new wireless architecture in which intelligent metasurfaces act as guided electromagnetic interfaces, enabling surface-wave (SW) propagation with much lower attenuation and more control than conventional space-wave transmission. While prior work has reported substantial power gains under perfect channel state information (CSI), the impact of practical channel acquisition on E-FAS performance remains largely unexplored. This paper presents the first comprehensive analysis of E-FAS-assisted downlink transmission under pilot-based channel estimation. We develop an estimation framework for the equivalent end-to-end channel and derive closed-form expressions for the statistics of the minimum mean-square-error (MMSE) channel estimate and its estimation error. Building on these results, we analyze both single-user and multiuser operation while explicitly accounting for the training overhead. For the single-user case, we characterize the outage probability and achievable rate with imperfect CSI, and reveal an inherent signal-to-noise ratio (SNR) saturation phenomenon caused by residual self-interference. For the multiuser case, we study zero-forcing (ZF) precoding based on imperfect channel estimates and show that the system becomes interference-limited in the high SNR regime because of residual inter-user interference. Furthermore, we quantify the trade-off between spatial multiplexing gains and pilot overhead when the number of users increases. Analytical findings are validated via Monte Carlo simulations and benchmarked against least-squares (LS) estimation and conventional non-E-FAS transmission. The results reveal that despite CSI imperfections and training costs, E-FAS retains substantial performance advantages and provides robustness enabled by its amplified large-scale channel gain.
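
The gap between LS and MMSE pilot-based estimation that the abstract benchmarks can be illustrated with a textbook scalar Rayleigh-fading channel. This is not the paper's E-FAS end-to-end channel model, and the closed forms below are the standard single-pilot results rather than the paper's derived expressions. The function names and parameter values are assumptions.

```python
import numpy as np

def theory_mse(beta, pilot_power, noise_var):
    """Closed-form MSE of LS and MMSE estimates for y = sqrt(p) * h + n."""
    mse_ls = noise_var / pilot_power
    mse_mmse = beta * noise_var / (pilot_power * beta + noise_var)
    return mse_ls, mse_mmse

def simulate_mse(beta, pilot_power, noise_var, trials=200_000, seed=0):
    """Monte Carlo estimate of the same two MSEs.

    h ~ CN(0, beta) is the channel and n ~ CN(0, noise_var) is receiver noise.
    """
    rng = np.random.default_rng(seed)

    def complex_normal(var):
        real = rng.standard_normal(trials)
        imag = rng.standard_normal(trials)
        return np.sqrt(var / 2.0) * (real + 1j * imag)

    h = complex_normal(beta)
    y = np.sqrt(pilot_power) * h + complex_normal(noise_var)

    h_ls = y / np.sqrt(pilot_power)                       # least squares
    gain = np.sqrt(pilot_power) * beta / (pilot_power * beta + noise_var)
    h_mmse = gain * y                                     # linear MMSE

    mse_ls = float(np.mean(np.abs(h - h_ls) ** 2))
    mse_mmse = float(np.mean(np.abs(h - h_mmse) ** 2))
    return mse_ls, mse_mmse
```

In this scalar model the MMSE error is never larger than the LS error, and the two coincide as the pilot SNR grows, since the MMSE gain approaches the LS scaling.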

2602.20126 2026-02-24 cs.LG cs.IT math.IT math.ST stat.ML stat.TH

Adaptation to Intrinsic Dependence in Diffusion Language Models

Yunxiao Zhao, Changxiao Cai

2602.20120 2026-02-24 cs.CY

Enhancing Capstone Program Workflow: A Case Study on a Platform for Managing Academic-Industry Projects

Rafael Corsi Ferrao, Luciano Pereira Soares

2602.20119 2026-02-24 cs.RO cs.AI cs.CV

NovaPlan: Zero-Shot Long-Horizon Manipulation via Closed-Loop Video Language Planning

Jiahui Fu, Junyu Nan, Lingfeng Sun, Hongyu Li, Jianing Qian, Jennifer L. Barry, Kris Kitani, George Konidaris

Comments 25 pages, 15 figures. Project webpage: https://nova-plan.github.io/

2602.20117 2026-02-24 cs.AI cs.LG

ReSyn: Autonomously Scaling Synthetic Environments for Reasoning Models

Andre He, Nathaniel Weir, Kaj Bostrom, Allen Nie, Darion Cassel, Sam Bayless, Huzefa Rangwala

2602.20114 2026-02-24 cs.CV cs.AI

Benchmarking Unlearning for Vision Transformers

Kairan Zhao, Iurie Luca, Peter Triantafillou

2602.20113 2026-02-24 cs.SD cs.AI

StyleStream: Real-Time Zero-Shot Voice Style Conversion

Yisi Liu, Nicholas Lee, Gopala Anumanchipalli

2602.20111 2026-02-24 cs.LG

Reliable Abstention under Adversarial Injections: Tight Lower Bounds and New Upper Bounds

Ezra Edelman, Surbhi Goel

Abstract

We study online learning in the adversarial injection model introduced by [Goel et al. 2017], where a stream of labeled examples is predominantly drawn i.i.d. from an unknown distribution $\mathcal{D}$, but may be interspersed with adversarially chosen instances without the learner knowing which rounds are adversarial. Crucially, labels are always consistent with a fixed target concept (the clean-label setting). The learner is additionally allowed to abstain from predicting, and the total error counts the mistakes whenever the learner decides to predict and incorrect abstentions when it abstains on i.i.d. rounds. Perhaps surprisingly, prior work shows that oracle access to the underlying distribution yields $O(d^2 \log T)$ combined error for VC dimension $d$, while distribution-agnostic algorithms achieve only $\tilde{O}(\sqrt{T})$ for restricted classes, leaving open whether this gap is fundamental. We resolve this question by proving a matching $\Omega(\sqrt{T})$ lower bound for VC dimension $1$, establishing a sharp separation between the two information regimes. On the algorithmic side, we introduce a potential-based framework driven by \emph{robust witnesses}, small subsets of labeled examples that certify predictions while remaining resilient to adversarial contamination. We instantiate this framework using two combinatorial dimensions: (1) \emph{inference dimension}, yielding combined error $\tilde{O}(T^{1-1/k})$ for classes of inference dimension $k$, and (2) \emph{certificate dimension}, a new relaxation we introduce. As an application, we show that halfspaces in $\mathbb{R}^2$ have certificate dimension $3$, obtaining the first distribution-agnostic bound of $\tilde{O}(T^{2/3})$ for this class. This is notable since [Blum et al. 2021] showed halfspaces are not robustly learnable under clean-label attacks without abstention.

2602.20104 2026-02-24 cs.AI cs.HC cs.LG

Align When They Want, Complement When They Need! Human-Centered Ensembles for Adaptive Human-AI Collaboration

Hasan Amin, Ming Yin, Rajiv Khanna

Comments AAAI 2026

2602.20100 2026-02-24 cs.CV cs.AI eess.IV

Transcending the Annotation Bottleneck: AI-Powered Discovery in Biology and Medicine

Soumick Chatterjee

2602.20097 2026-02-24 cs.DC

Mitigating Artifacts in Pre-quantization Based Scientific Data Compressors with Quantization-aware Interpolation

Pu Jiao, Sheng Di, Jiannan Tian, Mingze Xia, Xuan Wu, Yang Zhang, Xin Liang, Franck Cappello

2602.20094 2026-02-24 cs.AI

CausalFlip: A Benchmark for LLM Causal Judgment Beyond Semantic Matching

Yuzhe Wang, Yaochen Zhu, Jundong Li

Comments 8 pages plus references, 3 figures, 3 tables. Under review

Abstract

As large language models (LLMs) witness increasing deployment in complex, high-stakes decision-making scenarios, it becomes imperative to ground their reasoning in causality rather than spurious correlations. However, strong performance on traditional reasoning benchmarks does not guarantee true causal reasoning ability of LLMs, as high accuracy may still arise from memorizing semantic patterns instead of analyzing the underlying true causal structures. To bridge this critical gap, we propose a new causal reasoning benchmark, CausalFlip, designed to encourage the development of new LLM paradigms or training algorithms that ground LLM reasoning in causality rather than semantic correlation. CausalFlip consists of causal judgment questions built over event triples that could form different confounder, chain, and collider relations. Based on this, for each event triple, we construct pairs of semantically similar questions that reuse the same events but yield opposite causal answers, where models that rely heavily on semantic matching are systematically driven toward incorrect predictions. To further probe models' reliance on semantic patterns, we introduce a noisy-prefix evaluation that prepends causally irrelevant text before intermediate causal reasoning steps without altering the underlying causal relations or the logic of the reasoning process. We evaluate LLMs under multiple training paradigms, including answer-only training, explicit Chain-of-Thought (CoT) supervision, and a proposed internalized causal reasoning approach that aims to mitigate explicit reliance on correlation in the reasoning process. Our results show that explicit CoT can still be misled by spurious semantic correlations, whereas internalizing reasoning steps yields substantially improved causal grounding, suggesting that it is promising to better elicit the latent causal reasoning capabilities of base LLMs.
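
The three relations over an event triple can be made concrete with a small sketch: the same events A, B, C give different answers to "does A cause C?" depending on how the edges are arranged. This illustrates the graph structures only. The benchmark's actual question templates and answer rules are not given in the abstract, and `STRUCTURES` and `causes` are hypothetical names.

```python
from typing import Dict, List, Set, Tuple

Edge = Tuple[str, str]  # (cause, effect)

# Same three events, three different causal arrangements.
STRUCTURES: Dict[str, List[Edge]] = {
    "chain": [("A", "B"), ("B", "C")],        # A -> B -> C
    "confounder": [("B", "A"), ("B", "C")],   # A <- B -> C
    "collider": [("A", "B"), ("C", "B")],     # A -> B <- C
}

def causes(edges: List[Edge], source: str, target: str) -> bool:
    """Return True if a directed path runs from source to target."""
    children: Dict[str, Set[str]] = {}
    for parent, child in edges:
        children.setdefault(parent, set()).add(child)
    stack, seen = [source], set()
    while stack:
        node = stack.pop()
        for nxt in children.get(node, ()):
            if nxt == target:
                return True
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return False

# Answer to "does A cause C?" under each arrangement.
answers = {name: causes(edges, "A", "C") for name, edges in STRUCTURES.items()}
```

Only the chain yields a directed path from A to C. Under the confounder and collider arrangements the events are the same but the answer flips, which is the kind of contrast a semantic-matching shortcut cannot resolve.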

2602.20093 2026-02-24 cs.IR

ManCAR: Manifold-Constrained Latent Reasoning with Adaptive Test-Time Computation for Sequential Recommendation

Kun Yang, Yuxuan Zhu, Yazhe Chen, Siyao Zheng, Bangyang Hong, Kangle Wu, Yabo Ni, Anxiang Zeng, Cong Fu, Hui Li

Comments 15 pages, 7 figures

2602.20084 2026-02-24 cs.CV

Do Large Language Models Understand Data Visualization Principles?

Martin Sinnona, Valentin Bonas, Viviana Siless, Emmanuel Iarussi

2602.20082 2026-02-24 cs.PL

Machine-Generated, Machine-Checked Proofs for a Verified Compiler (Experience Report)

Zoe Paraskevopoulou

2602.20080 2026-02-24 cs.CY

The Digital Gorilla: Rebalancing Power in the Age of AI

M. Alejandra Parra-Orlandoni, Roxanne A. Schnyder, Christopher J. Mallet

Comments 49 pages, 2 figures, preprint

Abstract

Contemporary artificial intelligence (AI) policy suffers from a basic categorical error. Existing frameworks rely on analogizing AI to inherited technology types -- such as products, platforms, or infrastructure -- and in doing so generate overlapping, often contradictory governance regimes. This "analogy trap" obscures a fundamental transformation: certain advanced AI systems no longer function solely as instruments through which existing institutions exercise power, but as de facto centers of power that shape information, coordinate behavior, and structure social and economic realities at scale. This article offers a new conceptual foundation for AI governance by treating such systems as a fourth societal actor -- what we term the "Digital Gorilla" -- alongside People, the State, and Enterprises. It develops a Four Societal Actors framework that maps how power flows among these actors across five power modalities (economic, epistemic, narrative, authoritative, physical) and uses this map to diagnose where AI capabilities disturb established allocations of authority, concentrate power, or erode accountability. Drawing on constitutional principles of separated powers and federalism, the article advances a federalized, polycentric governance architecture and institutionalizes dynamic checks and balances among the four actors, rather than today's more reactive and compliance-driven approaches. Reframing AI governance in this way shifts the inquiry from how to control a risky technology to how to design institutions capable of accommodating these increasingly powerful and autonomous digital systems without sacrificing democratic legitimacy, the rule of law, or the production of public goods, and it recasts familiar debates in administrative, constitutional, and corporate law as questions of power allocation in a four-actor system.

2602.20079 2026-02-24 cs.CV

SemanticNVS: Improving Semantic Scene Understanding in Generative Novel View Synthesis

Xinya Chen, Christopher Wewer, Jiahao Xie, Xinting Hu, Jan Eric Lenssen

2602.20076 2026-02-24 eess.SY cs.AI cs.RO cs.SY

Robust Taylor-Lagrange Control for Safety-Critical Systems

Wei Xiao, Christos Cassandras, Anni Li

Comments 7 pages

2602.20074 2026-02-24 cs.GT

Computational Social Choice: Research & Development

Dorothea Baumeister, Ratip Emin Berker, Niclas Boehmer, Sylvain Bouveret, Andreas Darmann, Piotr Faliszewski, Martin Lackner, Jérôme Lang, Nicholas Mattei, Arianna Novaro

Comments Accepted to AAMAS '26: Blue Sky Track

2602.20068 2026-02-24 cs.CV cs.LG

The Invisible Gorilla Effect in Out-of-distribution Detection

Harry Anthony, Ziyun Liang, Hermione Warr, Konstantinos Kamnitsas

Comments Accepted at CVPR 2026