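arXivDaily: a daily academic digest synced with the full arXiv dataset, with AI-generated summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and more.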

2605.00501 2026-05-04 cs.LG

LambdaRankIC: Directly Optimizing Rank IC for Financial Prediction

Yan Lin, Yihong Su, Yi Yang

Details

English abstract

In financial predictions, the performance of machine learning models is often assessed by Rank IC, which is the Spearman rank correlation between the model predictions and the realized asset returns. Despite its wide adoption, most existing models are trained using regression losses or ranking objectives that may not align with Rank IC. We propose LambdaRankIC, a novel learning-to-rank approach that directly optimizes Rank IC. We circumvent the non-differentiability of the ranking operator by deriving the closed-form expression for the lambda gradients induced by the pairwise rank swaps, which enables efficient gradient-based optimization within the LambdaRank framework. We implement LambdaRankIC as a custom objective in XGBoost. Theoretically, we show that our approach optimizes an upper bound on Rank IC. We evaluate the proposed approach on both simulated and real-world financial data. In simulation studies, LambdaRankIC accurately recovers the true ranking structure in noiseless settings and consistently outperforms regression-based and NDCG-oriented ranking methods under low signal-to-noise ratios and heavy-tailed noise regimes. In empirical experiments using real market data, LambdaRankIC achieves the best out-of-sample performance on evaluation metrics commonly used in finance, including Rank IC, ICIR, monthly return, and Sharpe ratio. These results show that directly optimizing Rank IC can yield substantial improvements over conventional learning objectives in financial predictions when the full-order ranking quality is the primary goal.
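As a minimal illustration of the metric named in the abstract, the sketch below computes Rank IC as the Spearman rank correlation between predictions and realized returns for one cross-section. The function names (`average_ranks`, `pearson`, `rank_ic`) and the sample numbers are assumptions made for this example. It shows only the evaluation metric, not the LambdaRankIC objective or its lambda gradients, which the abstract does not spell out.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def rank_ic(predictions, realized_returns):
    """Rank IC: Spearman correlation, i.e. Pearson correlation of the ranks."""
    return pearson(average_ranks(predictions), average_ranks(realized_returns))


# Hypothetical one-period cross-section of four assets.
preds = [0.3, 0.1, 0.2, 0.4]
rets = [0.02, -0.01, 0.03, 0.01]
print(round(rank_ic(preds, rets), 6))
```

Because only the ranks enter the computation, any monotone rescaling of the predictions leaves the score unchanged, which is one reason a regression loss on raw returns need not align with this metric.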


2605.00500 2026-05-04 cs.LG

Scaling Federated Linear Contextual Bandits via Sketching

Hantao Yang, Hong Xie, Xutong Liu, Defu Lian

2605.00498 2026-05-04 cs.CV

GOR-IS: 3D Gaussian Object Removal in the Intrinsic Space

Yonghao Zhao, Yupeng Gao, Jian Yang, Jin Xie, Beibei Wang

2605.00496 2026-05-04 cs.CV cs.RO

High-Speed Vision Improves Zero-Shot Semantic Understanding of Human Actions

Yongpeng Cao, Yuji Yamakawa

2605.00495 2026-05-04 cs.SD cs.CV

MMAudio-LABEL: Audio Event Labeling via Audio Generation for Silent Video

Kazuya Tateishi, Akira Takahashi, Atsuo Hiroe, Hirofumi Takeda, Shusuke Takahashi, Yuki Mitsufuji

Comments Accepted to the CVPR 2026 Sight and Sound Workshop

2605.00490 2026-05-04 cs.LG

Distance metric learning for conditional anomaly detection

Michal Valko, Milos Hauskrecht

Comments Published at FLAIRS 2008 (21st International Florida AI Research Society Conference)

2605.00489 2026-05-04 cs.LG

Revealing graph bandits for maximizing local influence

Alexandra Carpentier, Michal Valko

Comments Published at AISTATS 2016 (19th International Conference on Artificial Intelligence and Statistics)

2605.00488 2026-05-04 cs.LG

Trading off rewards and errors in multi-armed bandits

Akram Erraqabi, Alessandro Lazaric, Michal Valko, Emma Brunskill, Yun-En Liu

Comments Published at AISTATS 2017 (20th International Conference on Artificial Intelligence and Statistics)

2605.00482 2026-05-04 cs.LG cs.AI

Scalable Context-Aware Graph Attention for Unsupervised Anomaly Detection in Large-Scale Mobile Networks

Sara Malacarne, Eirik Hoel-Høiseth, Erlend Aune, David Zsolt Biró, Massimiliano Ruocco

Comments This work has been submitted to the IEEE for possible publication

2605.00480 2026-05-04 cs.CV

Leveraging Vision-Language Models as Weak Annotators in Active Learning

Phuong Ngoc Nguyen, Kaito Shiku, Ryoma Bise, Seiichi Uchida, Shinnosuke Matsuo

Comments Accepted at ICIP2026

2605.00475 2026-05-04 cs.RO cs.CV

MSACT: Multistage Spatial Alignment for Stable Low-Latency Fine Manipulation

Xianbo Cai, Hideyuki Ichiwara, Masaki Yoshikawa, Tetsuya Ogata

Comments 8 pages, 6 figures

2605.00474 2026-05-04 cs.CV

From Local to Global to Mechanistic: An iERF-Centered Unified Framework for Interpreting Vision Models

Yearim Kim, Sangyu Han, Nojun Kwak

Comments Accepted to IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2026

Details

DOI: 10.1109/TPAMI.2026.3688582
Journal ref: IEEE Transactions on Pattern Analysis and Machine Intelligence, 2026

English abstract

Modern vision models achieve remarkable accuracy, but explaining where evidence arises, what the model encodes, and how internal computations assemble that evidence remains fragmented. We introduce an iERF-centric framework that unifies local, global, and mechanistic interpretability around a single analysis unit: the pointwise feature vector (PFV) paired with its instance-specific Effective Receptive Field (iERF). On the local side, Sharing Ratio Decomposition (SRD) expresses each PFV as a mixture of upstream PFVs via sharing ratios and propagates iERFs to construct class-discriminative saliency maps. SRD yields high-resolution, activation-faithful explanations, is robust to targeted manipulation and noise, and remains activation-agnostic across common nonlinearities. For the global view, we introduce Concept-Anchored Feature Explanation (CAFE), which utilizes the iERF as a semantic label, grounding abstract latent vectors in verifiable pixel-level evidence. With CAFE, we address the challenge of non-localized sparse autoencoder latents, especially in Transformers, where early self-attention mixes distant context. To answer how representations are composed through depth, we propose the Interlayer Concept Graph with Interlayer Concept Attribution (ICAT), which quantifies concept-to-concept influence while isolating layer pairs; an interlayer insertion/deletion protocol identifies Integrated Gradients as the most faithful instantiation. Empirically, across ResNet50, VGG16, and ViTs, our framework outperforms baselines in both fidelity and robustness, successfully interprets dispersed SAE features, and exposes dominant concept routes in correct, misclassified, and adversarial cases. Grounded in iERFs, our approach provides a coherent, evidence-backed map from pixels to concepts to decisions.
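The abstract names Integrated Gradients as the most faithful instantiation of ICAT but does not describe how it is applied between layers. The sketch below shows only the generic Integrated Gradients computation on a toy function with a known gradient. It is not the authors' interlayer procedure, and the function names, the toy function `f`, and the step count are assumptions made for this example.

```python
def integrated_gradients(grad_fn, x, baseline, steps=50):
    """Approximate Integrated Gradients with a midpoint Riemann sum.

    grad_fn maps a point (list of floats) to the gradient of the model
    output at that point. Attribution i is
    (x_i - baseline_i) * average gradient_i along the straight path.
    """
    diffs = [xi - bi for xi, bi in zip(x, baseline)]
    totals = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps  # midpoint of the k-th sub-interval
        point = [bi + alpha * d for bi, d in zip(baseline, diffs)]
        grads = grad_fn(point)
        totals = [t + g for t, g in zip(totals, grads)]
    return [d * t / steps for d, t in zip(diffs, totals)]


# Toy stand-in for a model: f(x) = sum of squares, gradient 2x.
def f(x):
    return sum(v * v for v in x)


def grad_f(x):
    return [2.0 * v for v in x]


x = [1.0, 2.0, -3.0]
baseline = [0.0, 0.0, 0.0]
attributions = integrated_gradients(grad_f, x, baseline)

# Completeness check: attributions should sum to f(x) - f(baseline).
print(abs(sum(attributions) - (f(x) - f(baseline))) < 1e-9)
```

The completeness property checked at the end (attributions summing to the output difference between input and baseline) is the usual sanity test for an Integrated Gradients implementation.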


2605.00473 2026-05-04 cs.LG math.OC

Near-optimal and Efficient First-Order Algorithm for Multi-Task Learning with Shared Linear Representation

Shihong Ding, Fangyu Du, Cong Fang

2605.00471 2026-05-04 cs.RO

Stereo Multistage Spatial Attention for Real-Time Mobile Manipulation Under Visual Scale Variation and Disturbances

Xianbo Cai, Hideyuki Ichiwara, Hyogo Hiruma, Masaki Yoshikawa, Hiroshi Ito, Tetsuya Ogata

Comments 8 pages, 10 figures

2605.00468 2026-05-04 cs.CL

ReLay: Personalized LLM-Generated Plain-Language Summaries for Better Understanding, but at What Cost?

Joey Chan, Yikun Han, Jingyuan Chen, Samuel Fang, Lauren D. Gryboski, Alexandra Lee, Sheel Tanna, Qingqing Zhu, Zhiyong Lu, Lucy Lu Wang, Yue Guo

2605.00467 2026-05-04 cs.LG stat.ML

Batch Normalization for Neural Networks on Complex Domains

Xuan Son Nguyen, Nistor Grozavu

2605.00466 2026-05-04 cs.LG cs.AI

PAMod: Modeling Cyclical Shifts via Phase-Amplitude Modulation for Non-stationary Time Series Forecasting

Yingbo Zhou, Yutong Ye, Shuhao Li, Rui Qian, Qiang Huang, Lemao Liu, Li Sun, Dejing Dou

2605.00458 2026-05-04 cs.LG eess.SP

Federated Learning with Hypergradient-based Online Update of Aggregation Weights

Ayano Nakai-Kasai, Tadashi Wadayama

2605.00448 2026-05-04 cs.CV eess.IV

Learning from Compressed CT: Feature Attention Style Transfer and Structured Factorized Projections for Resource-Efficient Medical Image Analysis

Shadid Yousuf, S. M. Mahbubur Rahman, Mohammed Imamul Hassan Bhuiyan

2605.00444 2026-05-04 cs.CV

Scaling Video Understanding via Compact Latent Multi-Agent Collaboration

Kerui Chen, Jinglu Wang, Jianrong Zhang, Ming Li, Yan Lu, Hehe Fan

Comments 12 pages

2605.00443 2026-05-04 cs.LG cs.CV

Adaptive Equilibrium: Dynamic Weighting Framework for Generalized Interruption of DeepFake Models

Hongrui Zheng, Liejun Wang, Zhiqing Guo

Comments 11 pages, 5 figures

2605.00440 2026-05-04 cs.AI cs.CL cs.HC

On the Role of Artificial Intelligence in Human-Machine Symbiosis

Ching-Chun Chang, Yuchen Guo, Hanrui Wang, Timo Spinde, Isao Echizen

2605.00438 2026-05-04 cs.AI cs.RO

Thinking in Text and Images: Interleaved Vision-Language Reasoning Traces for Long-Horizon Robot Manipulation

Jinkun Liu, Haohan Chi, Lingfeng Zhang, Yifan Xie, YuAn Wang, Long Chen, Hangjun Ye, Xiaoshuai Hao, Wenbo Ding

2605.00436 2026-05-04 cs.CL cs.AI

Impact of Task Phrasing on Presumptions in Large Language Models

Kenneth J. K. Ong

2605.00434 2026-05-04 cs.CV

LIMSSR: LLM-Driven Sequence-to-Score Reasoning under Training-Time Incomplete Multimodal Observations

Huangbiao Xu, Huanqi Wu, Xiao Ke, Yuxin Peng

Comments ICML 2026 [Spotlight]

2605.00431 2026-05-04 cs.SD cs.CV cs.LG eess.AS

MMAudioReverbs: Video-Guided Acoustic Modeling for Dereverberation and Room Impulse Response Estimation

Akira Takahashi, Ryosuke Sawata, Shusuke Takahashi, Yuki Mitsufuji

Comments Accepted to the CVPR 2026 Sight and Sound Workshop

2605.00423 2026-05-04 cs.LG

GD4: Graph-based Discrete Denoising Diffusion for MIMO Detection

Qincheng Lu, Sitao Luan, Xiao-Wen Chang

2605.00422 2026-05-04 cs.LG cs.AI

BWLA: Breaking the Barrier of W1AX Post-Training Quantization for LLMs

Zhixiong Zhao, Zukang Xu, Dawei Yang

Comments Accepted by ACL-Main 2026

2605.00421 2026-05-04 cs.CL cs.AI cs.LG

RadLite: Multi-Task LoRA Fine-Tuning of Small Language Models for CPU-Deployable Radiology AI

Pankaj Gupta, Kartik Bose

2605.00410 2026-05-04 cs.CL cs.AI

Agent Capsules: Quality-Gated Granularity Control for Multi-Agent LLM Pipelines

Aninda Ray

Comments 17 pages, 7 figures. Code: https://github.com/aray-17/agent-capsules

Details

English abstract

A multi-agent pipeline with N agents typically issues N LLM calls per run. Merging agents into fewer calls (compound execution) promises token savings, but naively merged calls silently degrade quality through tool loss and prompt compression. We present Agent Capsules, an adaptive execution runtime that treats multi-agent pipeline execution as an optimization problem with empirical quality constraints. The runtime instruments coordination overhead per group, scores composition opportunity, selects among three compound execution strategies, and gates every mode switch on rolling-mean output quality. A controlled negative result confirms that injecting more context into a merged call worsens compression rather than relieving it, so the framework's escalation ladder (standard, then two-phase, then sequential) recovers quality by moving toward per-agent dispatch rather than by rewriting merged prompts. On LLM-judged quality, the controller matches a hand-tuned oracle on every measured (model, group, mode) cell: routing compound whenever the oracle would, and reverting to fine whenever quality would fail the floor, without per-model configuration. Against a hand-crafted LangGraph implementation of a 14-agent competitive intelligence pipeline, Agent Capsules uses 51% fewer fine-mode input tokens and 42% fewer compound-mode input tokens, at +0.020 and +0.017 quality respectively. Against a DSPy implementation of a 5-agent due diligence pipeline, the framework uses 19% fewer tokens than uncompiled DSPy at quality parity, and 68% fewer tokens than MIPROv2 at +0.052 quality. Even before compound mode fires, the runtime delivers efficiency through automatic policy resolution, cache-aligned prompts, and topology-aware context injection, matching both hand-tuned and compile-time baselines without training data or per-pipeline engineering.
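To make the gating idea concrete, the sketch below shows one way a mode switch could be gated on rolling-mean output quality, stepping along the ladder the abstract describes (standard, then two-phase, then sequential) and ending at per-agent "fine" dispatch. It is an illustration under stated assumptions, not the Agent Capsules runtime. The class name `QualityGate`, the `floor` and `window` parameters, and the rule of clearing the window after each escalation are all hypothetical.

```python
from collections import deque

# Ladder from the abstract, with per-agent "fine" dispatch as the last rung.
LADDER = ["standard", "two_phase", "sequential", "fine"]


class QualityGate:
    """Hypothetical controller: escalate when rolling-mean quality dips."""

    def __init__(self, floor=0.8, window=3):
        self.floor = floor                  # assumed minimum acceptable quality
        self.scores = deque(maxlen=window)  # most recent quality scores
        self.level = 0                      # index into LADDER

    @property
    def mode(self):
        return LADDER[self.level]

    def record(self, score):
        """Add one quality score and return the mode for the next run."""
        self.scores.append(score)
        window_full = len(self.scores) == self.scores.maxlen
        rolling_mean = sum(self.scores) / len(self.scores)
        can_escalate = self.level < len(LADDER) - 1
        if window_full and rolling_mean < self.floor and can_escalate:
            self.level += 1      # move one rung toward per-agent dispatch
            self.scores.clear()  # assumption: re-measure in the new mode
        return self.mode


gate = QualityGate(floor=0.8, window=2)
for score in [0.9, 0.5, 0.9, 0.9]:
    print(score, gate.record(score))
```

Waiting for a full window before acting keeps a single low score from triggering a switch; a real controller would also need a rule for returning to compound execution, which the abstract mentions but does not detail.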
