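arXivDaily: a daily academic digest synced with the full arXiv dataset, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and other fields.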

2604.09439 2026-04-13 cs.IR cs.AI

TME-PSR: Time-aware, Multi-interest, and Explanation Personalization for Sequential Recommendation

Qingzhuo Wang, Leilei Wen, Juntao Chen, Kunyu Peng, Ruiyang Qin, Zhihua Wei, Wen Shen

Abstract

In this paper, we propose a sequential recommendation model that integrates Time-aware personalization, Multi-interest personalization, and Explanation personalization for Personalized Sequential Recommendation (TME-PSR). That is, we consider the differences across different users in temporal rhythm preference, multiple fine-grained latent interests, and the personalized semantic alignment between recommendations and explanations. Specifically, the proposed TME-PSR model employs a dual-view gated time encoder to capture personalized temporal rhythms, a lightweight multihead Linear Recurrent Unit architecture that enables fine-grained sub-interest modeling with improved efficiency, and a dynamic dual-branch mutual information weighting mechanism to achieve personalized alignment between recommendations and explanations. Extensive experiments on real-world datasets demonstrate that our method consistently improves recommendation accuracy and explanation quality, at a lower computational cost.

2604.09426 2026-04-13 cs.HC cs.AI cs.IR

Three Modalities, Two Design Probes, One Prototype, and No Vision: Experience-Based Co-Design of a Multi-modal 3D Data Visualization Tool

Sanchita S. Kamath, Aziz N Zeidieh, Venkatesh Potluri, Sile O'Modhrain, Kenneth Perry, JooYoung Seo

2604.09421 2026-04-13 eess.IV cs.CV cs.MM

Multi-task Just Recognizable Difference for Video Coding for Machines: Database, Model, and Coding Application

Junqi Liu, Yun Zhang, Xiaoxia Huang, Long Xu, Weisi Lin

Comments Submitted to IEEE Transactions on Circuits and Systems for Video Technology

Abstract

Just Recognizable Difference (JRD) boosts coding efficiency for machine vision through visibility threshold modeling, but is currently limited to a single-task scenario. To address this issue, we propose a Multi-Task JRD (MT-JRD) dataset and an Attribute-assisted MT-JRD (AMT-JRD) model for Video Coding for Machines (VCM), enhancing both prediction accuracy and coding efficiency. First, we construct a dataset comprising 27,264 JRD annotations from machines, supporting three representative tasks: object detection, instance segmentation, and keypoint detection. Second, we propose the AMT-JRD prediction model, which integrates a Generalized Feature Extraction Module (GFEM) and a Specialized Feature Extraction Module (SFEM) to facilitate joint learning across multiple tasks. Third, we incorporate object attribute information into object-wise JRD prediction through the Attribute Feature Fusion Module (AFFM), which introduces prior knowledge about object size and location. This design compensates for the limitations of relying solely on image features and enhances the model's capacity to represent the perceptual mechanisms of machine vision. Finally, we apply the AMT-JRD model to VCM, where the accurately predicted JRDs are used to reduce the coding bit rate while preserving accuracy across multiple machine vision tasks. Extensive experimental results demonstrate that AMT-JRD achieves precise and robust multi-task prediction with a mean absolute error of 3.781 and error variance of 5.332 across three tasks, outperforming the state-of-the-art single-task prediction model by 6.7% and 6.3%, respectively. Coding experiments further reveal that, compared to the baseline VVC and JPEG, the AMT-JRD-based VCM improves Bjontegaard Delta-mean Average Precision (BD-mAP) by an average of 3.861% and 7.886%, respectively.

2604.09413 2026-04-13 cs.CY cs.AI

Yes, But Not Always. Generative AI Needs Nuanced Opt-in

Wiebke Hutiri, Morgan Scheuerman, Shruti Nagpal, Austin Hoag, Alice Xiang

2604.09378 2026-04-13 cs.CR cs.AI

BadSkill: Backdoor Attacks on Agent Skills via Model-in-Skill Poisoning

Guiyao Tie, Jiawen Shi, Pan Zhou, Lichao Sun

Comments 4 pages, 4 figures

Abstract

Agent ecosystems increasingly rely on installable skills to extend functionality, and some skills bundle learned model artifacts as part of their execution logic. This creates a supply-chain risk that is not captured by prompt injection or ordinary plugin misuse: a third-party skill may appear benign while concealing malicious behavior inside its bundled model. We present BadSkill, a backdoor attack formulation that targets this model-in-skill threat surface. In BadSkill, an adversary publishes a seemingly benign skill whose embedded model is backdoor-fine-tuned to activate a hidden payload only when routine skill parameters satisfy attacker-chosen semantic trigger combinations. To realize this attack, we train the embedded classifier with a composite objective that combines classification loss, margin-based separation, and poison-focused optimization, and evaluate it in an OpenClaw-inspired simulation environment that preserves third-party skill installation and execution while enabling controlled multi-model study. Our benchmark spans 13 skills, including 8 triggered tasks and 5 non-trigger control skills, with a combined main evaluation set of 571 negative-class queries and 396 trigger-aligned queries. Across eight architectures (494M to 7.1B parameters) from five model families, BadSkill achieves up to 99.5% average attack success rate (ASR) across the eight triggered skills while maintaining strong benign-side accuracy on negative-class queries. In poison-rate sweeps on the standard test split, a 3% poison rate already yields 91.7% ASR. The attack remains effective across the evaluated model scales and under five text perturbation types. These findings identify model-bearing skills as a distinct model supply-chain risk in agent ecosystems and motivate stronger provenance verification and behavioral vetting for third-party skill artifacts.

2604.09370 2026-04-13 q-bio.QM cs.CV

Cluster-First Labelling: An Automated Pipeline for Segmentation and Morphological Clustering in Histology Whole Slide Images

Muhammad Haseeb Ahmad, Sharmila Rajendran, Damion Young, Jon Mason

Comments 7 pages, 4 figures

2604.09369 2026-04-13 q-bio.BM cs.LG q-bio.QM

Biologically-Grounded Multi-Encoder Architectures as Developability Oracles for Antibody Design

Simon J. Crouzet

Comments ICLR 2026 Workshop on Generative and Experimental Perspectives for Biomolecular Design

2604.09368 2026-04-13 cs.MM cs.CV

Through Their Eyes: Fixation-aligned Tuning for Personalized User Emulation

Lingfeng Huang, Huizhong Guo, Tianjun Wei, Yingpeng Du, Zhu Sun

2604.09360 2026-04-13 cs.SE cs.AI

LLM-Rosetta: A Hub-and-Spoke Intermediate Representation for Cross-Provider LLM API Translation

Peng Ding

2604.09351 2026-04-13 eess.SY cs.MA cs.RO cs.SY

Decentralized Opinion-Integrated Decision making at Unsignalized Intersections via Signed Networks

Bhaskar Varma, Ying Shuai Quan, Karl D. von Ellenrieder, Paolo Falcone

Comments Submitted to CDC 2026 with L-CSS Parallel option

2604.09321 2026-04-13 eess.IV cs.CV

UHD Low-Light Image Enhancement via Real-Time Enhancement Methods with Clifford Information Fusion

Xiaohan Wang, Chen Wu, Dawei Zhao, Guangwei Gao, Dianjie Lu, Guijuan Zhang, Linwei Fan, Xu Lu, Shuai Wu, Hang Wei, Zhuoran Zheng

2604.09320 2026-04-13 physics.chem-ph cs.LG

Transferable FB-GNN-MBE Framework for Potential Energy Surfaces: Data-Adaptive Transfer Learning in Deep Learned Many-Body Expansion Theory

Siqi Chen, Zhiqiang Wang, Yili Shen, Xianqi Deng, Xi Cheng, Cheng-Wei Ju, Jun Yi, Guo Ling, Dieaa Alhmoud, Hui Guan, Zhou Lin

Comments Under review with The Journal of Chemical Physics. Main text: 23 pages, 11 figures, and 1 table. Supplementary Materials: 28 pages, 6 figures, 15 tables, 4 pseudo-algorithms

2604.09313 2026-04-13 eess.IV cs.CV

Compositional-Degradation UAV Image Restoration: Conditional Decoupled MoE Network and A Benchmark

Jinquan Yan, Zhicheng Zhao, Zhengzheng Tu, Chenglong Li, Jin Tang, Bin Luo

Abstract

UAV images are critical for applications such as large-area mapping, infrastructure inspection, and emergency response. However, in real-world flight environments, a single image is often affected by multiple degradation factors, including rain, haze, and noise, undermining downstream task performance. Current unified restoration approaches typically rely on implicit degradation representations that entangle multiple factors into a single condition, causing mutual interference among heterogeneous corrections. To this end, we propose DAME-Net, a Degradation-Aware Mixture-of-Experts Network that decouples explicit degradation perception from degradation-conditioned reconstruction for compositional UAV image restoration. Specifically, we design a Factor-wise Degradation Perception module (FDPM) to provide explicit per-factor degradation cues for the restoration stage through multi-label prediction with label-similarity-guided soft alignment, replacing implicit entangled conditions with interpretable and generalizable degradation descriptions. Moreover, we develop a Conditioned Decoupled MoE module (CDMM) that leverages these cues for stage-wise conditioning, spatial-frequency hybrid processing, and mask-constrained decoupled expert routing, enabling selective factor-specific correction while suppressing irrelevant interference. In addition, we construct the Multi-Degradation UAV Restoration benchmark (MDUR), the first large-scale UAV benchmark for compositional UAV image restoration, with 43 degradation configurations from single degradations to four-factor composites and standardized seen/unseen splits. Extensive experiments on MDUR demonstrate consistent improvements over representative unified restoration methods, with greater gains on unseen and higher-order composite degradations. Downstream experiments further validate benefits for UAV object detection.

2604.09309 2026-04-13 stat.ML cs.LG stat.CO

Iterative Identification Closure: Amplifying Causal Identifiability in Linear SEMs

Ziyi Ding, Xiao-Ping Zhang

2604.09306 2026-04-13 quant-ph cs.AI cs.NI

SatQNet: Satellite-assisted Quantum Network Entanglement Routing Using Directed Line Graph Neural Networks

Tobias Meuser, Jannis Weil, Aninda Lahiri, Marius Paraschiv

2604.09263 2026-04-13 math.OC cs.LG cs.NA math.NA

Natural Riemannian gradient for learning functional tensor networks

Nikolas Klug, Michael Ulbrich, André Uschmajew, Marius Willner

2604.07248 2026-04-13 physics.optics cs.CV

TurPy: a physics-based and differentiable optical turbulence simulator for algorithmic development and system optimization

Joseph L. Greene, Alfred Moore, Iris Ochoa, Emily Kwan, Patrick Marano, Christopher R. Valenta

Comments 19 pages, 7 figures, 1 table. Presented at 2026 SPIE DS Synthetic Data for Artificial Intelligence and Machine Learning: Tools, Techniques, and Applications IV

2604.02598 2026-04-13 cs.HC cs.AI cs.PL

Explorable Theorems: Making Written Theorems Explorable by Grounding Them in Formal Representations

Hita Kambhamettu, Will Crichton, Sean Welleck, Harrison Goldstein, Andrew Head

2602.05862 2026-04-13 stat.ML cs.LG math.ST stat.TH

Distribution-free two-sample testing with blurred total variation distance

Rohan Hore, Rina Foygel Barber

Comments 47 pages, 4 figures

2601.22160 2026-04-13 cs.GR cs.AI

Screen, Cache, and Match: A Training-Free Causality-Consistent Reference Frame Framework for Human Animation

Jianan Wang, Nailei Hei, Li He, Huanzhen Wang, Aoxing Li, Yingkai Zhao, Yuxuan Lin, Haofen Wang, Chunyang Wang, Yan Wang, Wenqiang Zhang

2511.22321 2026-04-13 quant-ph cs.AI cs.NI

RELiQ: Scalable Entanglement Routing via Reinforcement Learning in Quantum Networks

Tobias Meuser, Jannis Weil, Aninda Lahiri, Marius Paraschiv

2510.21588 2026-04-13 q-bio.NC cs.LG

Contribution of task-irrelevant stimuli to drift of neural representations

Farhad Pashakhanloo

Comments NeurIPS 2025 (camera ready)

Journal ref: Advances in Neural Information Processing Systems (NeurIPS) 39 (2025)

Abstract

Biological and artificial learners are inherently exposed to a stream of data and experience throughout their lifetimes and must constantly adapt to, learn from, or selectively ignore the ongoing input. Recent findings reveal that, even when the performance remains stable, the underlying neural representations can change gradually over time, a phenomenon known as representational drift. Studying the different sources of data and noise that may contribute to drift is essential for understanding lifelong learning in neural systems. However, a systematic study of drift across architectures and learning rules, and the connection to task, are missing. Here, in an online learning setup, we characterize drift as a function of data distribution, and specifically show that the learning noise induced by task-irrelevant stimuli, which the agent learns to ignore in a given context, can create long-term drift in the representation of task-relevant stimuli. Using theory and simulations, we demonstrate this phenomenon both in Hebbian-based learning -- Oja's rule and Similarity Matching -- and in stochastic gradient descent applied to autoencoders and a supervised two-layer network. We consistently observe that the drift rate increases with the variance and the dimension of the data in the task-irrelevant subspace. We further show that this yields different qualitative predictions for the geometry and dimension-dependency of drift than those arising from Gaussian synaptic noise. Overall, our study links the structure of stimuli, task, and learning rule to representational drift and could pave the way for using drift as a signal for uncovering underlying computation in the brain.
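
For readers unfamiliar with Oja's rule, one of the Hebbian learning rules named in the abstract above, the following is a minimal sketch of the standard online update on synthetic two-dimensional data. It does not reproduce the paper's drift experiments; the function name `oja_train`, the learning rate, the initial weights, and the data distribution are all illustrative assumptions.

```python
import random

def oja_train(samples, eta=0.005, w_init=(0.6, 0.8)):
    """Run Oja's online update w <- w + eta * y * (x - y * w), with y = w . x."""
    w = list(w_init)
    for x in samples:
        # Linear neuron output for this sample.
        y = sum(wi * xi for wi, xi in zip(w, x))
        # Hebbian growth term (y * x) with Oja's decay term (y^2 * w).
        w = [wi + eta * y * (xi - y * wi) for wi, xi in zip(w, x)]
    return w

# Hypothetical data: high variance along axis 0, low variance along axis 1.
rng = random.Random(0)
samples = [(rng.gauss(0.0, 2.0), rng.gauss(0.0, 0.5)) for _ in range(5000)]

w = oja_train(samples)
norm = sum(wi * wi for wi in w) ** 0.5
print("learned weights:", w, "norm:", norm)
```

Under these assumptions the weight vector should align with the high-variance axis while keeping a norm close to one, which is the usual behavior of Oja's rule as a principal-component extractor.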

2502.08691 2026-04-13 cs.SI cs.AI

AgentSociety: Large-Scale Simulation of LLM-Driven Generative Agents Advances Understanding of Human Behaviors and Society

Jinghua Piao, Yuwei Yan, Jun Zhang, Nian Li, Junbo Yan, Xiaochong Lan, Zhihong Lu, Zhiheng Zheng, Jing Yi Wang, Di Zhou, Chen Gao, Fengli Xu, Fang Zhang, Ke Rong, Jun Su, Yong Li

2501.19038 2026-04-13 stat.ML cs.LG

Conformal Prediction in Hierarchical Classification with Constrained Representation Complexity

Thomas Mortier, Alireza Javanmardi, Yusuf Sale, Eyke Hüllermeier, Willem Waegeman

2408.01257 2026-04-13 cs.SI cs.AI cs.CY cs.HC cs.LG

Detection and Characterization of Coordinated Online Behavior: A Survey

Lorenzo Mannocci, Michele Mazza, Anna Monreale, Maurizio Tesconi, Stefano Cresci

2604.09280 2026-04-13 eess.IV cs.CV

AMO-ENE: Attention-based Multi-Omics Fusion Model for Outcome Prediction in Extra Nodal Extension and HPV-associated Oropharyngeal Cancer

Gautier Hénique, William Le, Gabriel Dayan, Coralie Brodeur, Kristoff Nelson, Apostolos Christopoulos, Edith Filion, Phuc-Felix Nguyen-Tan, Laurent Letourneau-Guillon, Houda Bahig, Samuel Kadoury

Abstract

Extranodal extension (ENE) is an emerging prognostic factor in human papillomavirus (HPV)-associated oropharyngeal cancer (OPC), although it is currently omitted as a clinical staging criterion. Recent works have advocated for the inclusion of imaging-detected ENE (iENE) as a prognostic marker in HPV-positive OPC staging. However, several practical limitations continue to hinder its clinical integration, including inconsistencies in segmentation, low contrast in the periphery of metastatic lymph nodes on CT imaging, and laborious manual annotations. To address these limitations, we propose a fully automated end-to-end pipeline that uses computed tomography (CT) images with clinical data to assess the status of nodal ENE and predict treatment outcomes. Our approach includes a hierarchical 3D semi-supervised segmentation model designed to detect and delineate relevant iENE from radiotherapy planning CT scans. From these segmentations, a set of radiomics and deep features are extracted to train an imaging-detected ENE grading classifier. The predicted ENE status is then evaluated for its prognostic value and compared with existing staging criteria. Furthermore, we integrate these nodal features with primary tumor characteristics in a multimodal, attention-based outcome prediction model, providing a dynamic framework for outcome prediction. Our method is validated in an internal cohort of 397 HPV-positive OPC patients treated with radiation therapy or chemoradiotherapy between 2009 and 2020. For outcome prediction at the 2-year mark, our pipeline surpassed baseline models with 88.2% (4.8) in AUC for metastatic recurrence, 79.2% (7.4) for overall survival, and 78.1% (8.6) for disease-free survival. We also obtain a concordance index of 83.3% (6.5) for metastatic recurrence, 71.3% (8.9) for overall survival, and 70.0% (8.1) for disease-free survival, making it feasible for clinical decision making.

2604.09229 2026-04-13 cs.NE cs.AI q-bio.NC

The Fast Lane Hypothesis: Von Economo Neurons Implement a Biological Speed-Accuracy Tradeoff

Esila Keskin

Comments 7 pages, 5 figures. Code available at https://github.com/esila-keskin/fast-lane-hypothesis

Abstract

Von Economo neurons (VENs) are large bipolar projection neurons found exclusively in the anterior cingulate cortex (ACC) and frontal insula of species with complex social cognition, including humans, great apes, and cetaceans. Their selective depletion in frontotemporal dementia (FTD) and altered development in autism implicate them in rapid social decision-making, yet no computational model of VEN function has previously existed. We introduce the Fast Lane Hypothesis: VENs implement a biological speed-accuracy tradeoff (SAT) by providing a sparse, fast projection pathway that enables rapid social decisions at the cost of deliberate processing accuracy. We model VENs as fast leaky integrate-and-fire (LIF) neurons with membrane time constant 5 ms and sparse dendritic fan-in of eight afferents, compared to 20 ms and eighty afferents for standard pyramidal neurons, within a spiking cortical circuit of 2,000 neurons trained on a social discrimination task. Networks are evaluated under three clinically motivated conditions across 10 independent random seeds: typical (2% VENs), autism-like (0.4% VENs), and FTD-like (post-training VEN ablation). All configurations achieve equivalent asymptotic classification accuracy (99.4%), consistent with the prediction that VENs modulate decision speed rather than representational capacity. Temporal analysis confirms that VENs produce median first-spike latencies 4 ms earlier than pyramidal neurons. At a fixed decision threshold, the typical condition is significantly faster than FTD-like (t=-23.31, p<0.0001), while autism-like is intermediate (mean RT=26.91+/-9.01 ms vs. typical 20.70+/-2.02 ms; p=0.078). A preliminary evolutionary analysis shows qualitative correspondence between model-optimal VEN fraction and the primate phylogenetic gradient. To our knowledge, this is the first computational model that asks what a Von Economo neuron actually computes.
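
The effect of the membrane time constant mentioned in the abstract above can be illustrated with a minimal leaky integrate-and-fire sketch. This is not the paper's spiking circuit: it models a single neuron under a constant input, and the drive level, threshold, time step, and the function name `first_spike_latency_ms` are illustrative assumptions. Only the two time constants (5 ms and 20 ms) come from the abstract.

```python
import math

def first_spike_latency_ms(tau_ms, drive=1.5, threshold=1.0,
                           dt_ms=0.01, t_max_ms=200.0):
    """Euler-integrate dv/dt = (drive - v) / tau from v = 0 and return
    the time in ms at which v first reaches the threshold, or None."""
    v = 0.0
    t = 0.0
    while t < t_max_ms:
        v += dt_ms * (drive - v) / tau_ms
        t += dt_ms
        if v >= threshold:
            return t
    return None

fast = first_spike_latency_ms(5.0)    # VEN-like time constant from the abstract
slow = first_spike_latency_ms(20.0)   # pyramidal-like time constant from the abstract

# Closed-form latency for constant drive: tau * ln(drive / (drive - threshold)).
print(f"tau = 5 ms:  {fast:.2f} ms (analytic {5.0 * math.log(3.0):.2f} ms)")
print(f"tau = 20 ms: {slow:.2f} ms (analytic {20.0 * math.log(3.0):.2f} ms)")
```

With identical input, the neuron with the shorter time constant crosses threshold about four times sooner, which is the intuition behind treating a small time constant as a fast pathway. The latencies here depend entirely on the assumed drive and threshold and are not comparable to the figures reported in the abstract.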

2604.09227 2026-04-13 eess.IV cs.CV

Training-free, Perceptually Consistent Low-Resolution Previews with High-Resolution Image for Efficient Workflows of Diffusion Models

Wongi Jeong, Hoigi Seo, Se Young Chun

2604.09208 2026-04-13 stat.ML cs.LG

A Predictive View on Streaming Hidden Markov Models

Gerardo Duran-Martin

2604.09200 2026-04-13 cs.CY cs.AI cs.HC

Artificial intelligence can persuade people to take political actions

Kobi Hackenburg, Luke Hewitt, Caroline Wagner, Ben M. Tappin, Christopher Summerfield

Comments 13 pages, 4 figures