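arXivDaily daily academic digest: synced with the full arXiv dataset, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and other fields.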

2603.12628 2026-03-16 q-bio.NC cs.AI eess.SP

Towards unified brain-to-text decoding across speech production and perception

Zhizhang Yuan, Yang Yang, Gaorui Zhang, Baowen Cheng, Zehan Wu, Yuhao Xu, Xiaoying Liu, Liang Chen, Ying Mao, Meng Li

Comments 37 pages, 9 figures

Abstract

Speech production and perception are the main ways humans communicate daily. Prior brain-to-text decoding studies have largely focused on a single modality and alphabetic languages. Here, we present a unified brain-to-sentence decoding framework for both speech production and perception in Mandarin Chinese. The framework exhibits strong generalization ability, enabling sentence-level decoding when trained only on single-character data and supporting characters and syllables unseen during training. In addition, it allows direct and controlled comparison of neural dynamics across modalities. Mandarin speech is decoded by first classifying syllable components in Hanyu Pinyin, namely initials and finals, from neural signals, followed by a post-trained large language model (LLM) that maps sequences of toneless Pinyin syllables to Chinese sentences. To enhance LLM decoding, we designed a three-stage post-training and two-stage inference framework based on a 7-billion-parameter LLM, achieving overall performance that exceeds larger commercial LLMs with hundreds of billions of parameters or more. In addition, several characteristics were observed in Mandarin speech production and perception: speech production involved neural responses across broader cortical regions than auditory perception; channels responsive to both modalities exhibited similar activity patterns, with speech perception showing a temporal delay relative to production; and decoding performance was broadly comparable across hemispheres. Our work not only establishes the feasibility of a unified decoding framework but also provides insights into the neural characteristics of Mandarin speech production and perception. These advances contribute to brain-to-text decoding in logosyllabic languages and pave the way toward neural language decoding systems supporting multiple modalities.
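The first decoding stage described in the abstract (classifying initials and finals, then forming toneless Pinyin syllables) can be illustrated with a minimal sketch. The label sets `INITIALS` and `FINALS`, the score lists, and both function names are hypothetical and chosen only for illustration; the paper's actual classifier and its post-trained LLM stage are not reproduced here.

```python
# Hypothetical, heavily truncated label sets (assumption, not from the paper).
INITIALS = ["", "b", "n", "h", "zh"]  # "" stands for a zero initial
FINALS = ["a", "i", "ao", "ong"]


def decode_syllable(initial_scores, final_scores):
    """Pick the highest-scoring initial and final and join them."""
    i = max(range(len(INITIALS)), key=lambda k: initial_scores[k])
    f = max(range(len(FINALS)), key=lambda k: final_scores[k])
    return INITIALS[i] + FINALS[f]


def decode_sequence(per_char_scores):
    """Turn per-character (initial_scores, final_scores) pairs into a
    space-separated toneless Pinyin string."""
    return " ".join(decode_syllable(a, b) for a, b in per_char_scores)


# Made-up classifier outputs for two characters.
scores = [
    ([0.10, 0.10, 0.70, 0.05, 0.05], [0.10, 0.80, 0.05, 0.05]),
    ([0.00, 0.00, 0.10, 0.90, 0.00], [0.10, 0.10, 0.70, 0.10]),
]
pinyin = decode_sequence(scores)
print(pinyin)
```

In the pipeline the abstract describes, a string like this would then be passed to the LLM, which maps the toneless syllables to a Chinese sentence.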

2603.12627 2026-03-16 stat.ML cs.IT cs.LG math.IT

Batched Kernelized Bandits: Refinements and Extensions

Chenkai Ma, Keqin Chen, Jonathan Scarlett

2603.12625 2026-03-16 cs.IR cs.AI cs.CV

VLM4Rec: Multimodal Semantic Representation for Recommendation with Large Vision-Language Models

Ty Valencia, Burak Barlas, Varun Singhal, Ruchir Bhatia, Wei Yang

Comments 13 pages, 4 figures, 1 table

2603.12615 2026-03-16 cs.CY cs.AI cs.CL cs.HC

Literary Narrative as Moral Probe: A Cross-System Framework for Evaluating AI Ethical Reasoning and Refusal Behavior

David C. Flynn

Comments 27 pages, 6 tables. Target: Minds and Machines (Springer)

2603.12586 2026-03-16 cs.IR cs.LG

Deferred is Better: A Framework for Multi-Granularity Deferred Interaction of Heterogeneous Features

Yi Xu, Moyu Zhang, Chaofan Fan, Jinxin Hu, Yu Zhang, Xiaoyi Zeng

Abstract

Click-through rate (CTR) prediction models estimate the probability of a user-item click by modeling interactions across a vast feature space. A fundamental yet often overlooked challenge is the inherent heterogeneity of these features: their sparsity and information content vary dramatically. For instance, categorical features like item IDs are extremely sparse, whereas numerical features like item price are relatively dense. Prevailing CTR models have largely ignored this heterogeneity, employing a uniform feature interaction strategy that inputs all features into the interaction layers simultaneously. This approach is suboptimal, as the premature introduction of low-information features can inject significant noise and mask the signals from information-rich features, which leads to model collapse and hinders the learning of robust representations. To address the above challenge, we propose a Multi-Granularity Information-Aware Deferred Interaction Network (MGDIN), which adaptively defers the introduction of features into the feature interaction process. MGDIN's core mechanism operates in two stages: First, it employs a multi-granularity feature grouping strategy to partition the raw features into distinct groups with more homogeneous information density in different granularities, thereby mitigating the effects of extreme individual feature sparsity and enabling the model to capture feature interactions from diverse perspectives. Second, a delayed interaction mechanism is implemented through a hierarchical masking strategy, which governs when and how each group participates by masking low-information groups in the early layers and progressively unmasking them as the network deepens. This deferred introduction allows the model to establish a robust understanding based on high-information features before gradually incorporating sparser information from other groups...
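The hierarchical masking idea (low-information groups masked in early layers, then progressively unmasked) can be sketched as a per-layer mask schedule. This is only an illustration under stated assumptions: the function name `deferred_mask_schedule`, the per-group information scores, and the linear unmasking rule are all hypothetical, and the abstract describes MGDIN's deferral as adaptive without giving its exact form.

```python
def deferred_mask_schedule(group_info, num_layers):
    """Return one mask per layer (1 = group participates, 0 = masked).

    group_info: hypothetical information score per feature group.
    Assumption: groups are unmasked in order of decreasing score, and the
    number of visible groups grows linearly from 1 to all groups.
    """
    n = len(group_info)
    # Rank groups from most to least informative.
    order = sorted(range(n), key=lambda g: -group_info[g])
    masks = []
    for layer in range(num_layers):
        if num_layers == 1:
            visible = n
        else:
            visible = 1 + (n - 1) * layer // (num_layers - 1)
        active = set(order[:visible])
        masks.append([1 if g in active else 0 for g in range(n)])
    return masks


# Three groups with made-up scores, interacting over three layers.
schedule = deferred_mask_schedule([0.9, 0.1, 0.5], num_layers=3)
for layer, mask in enumerate(schedule):
    print(layer, mask)
```

With these made-up scores, only the highest-scoring group participates in the first layer, and the lowest-scoring group joins in the last layer.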

2603.12581 2026-03-16 eess.IV cs.AI cs.CV

Multiscale Structure-Guided Latent Diffusion for Multimodal MRI Translation

Jianqiang Lin, Zhiqiang Shen, Peng Cao, Jinzhu Yang, Osmar R. Zaiane, Xiaoli Liu

2603.12562 2026-03-16 stat.ML cs.CV cs.LG

Variational Garrote for Sparse Inverse Problems

Kanghun Lee, Hyungjoon Soh, Junghyo Jo

Comments 10 pages, 4 figures

2603.12525 2026-03-16 stat.ML cond-mat.dis-nn cs.LG

EB-RANSAC: Random Sample Consensus based on Energy-Based Model

Muneki Yasuda, Nao Watanabe, Kaiji Sekimoto

2603.12508 2026-03-16 cs.HC cs.AI

ELLA: Generative AI-Powered Social Robots for Early Language Development at Home

Victor Nikhil Antony, Shiye Cao, Shuning Wang, Chien-Ming Huang

2603.12500 2026-03-16 cs.CE cs.AI

TRACE: Temporal Rule-Anchored Chain-of-Evidence on Knowledge Graphs for Interpretable Stock Movement Prediction

Qianggang Ding, Haochen Shi, Luis Castejón Lozano, Miguel Conner, Juan Abia, Luis Gallego-Ledesma, Joshua Fellowes, Gerard Conangla Planes, Adam Elwood, Bang Liu

2603.12475 2026-03-16 cs.SE cs.AI cs.HC

The Perfection Paradox: From Architect to Curator in AI-Assisted API Design

Mak Ahmad, Andrew Macvean, JJ Geewax, David Karger

Comments 6 pages, 2 figures, 3 tables; Poster paper at CHI EA 2026 (Extended Abstracts of the ACM CHI Conference on Human Factors in Computing Systems)

2603.12470 2026-03-16 physics.space-ph cs.AI

CLARE: Classification-based Regression for Electron Temperature Prediction

Michael Liang, Blake DeHaas, Naomi Maruyama, Xiangning Chu, Takumi Abe, Koh-Ichiro Oyama

Comments 19 pages, 8 figures. Submitted to JGR: Machine Learning and Computation. Research conducted at CU Boulder LASP with support from NASA and JAXA

2603.12465 2026-03-16 cs.DC cs.LG cs.PF

TaxBreak: Unmasking the Hidden Costs of LLM Inference Through Overhead Decomposition

Prabhu Vellaisamy, Shreesh Tripathi, Vignesh Natarajan, Surya Santhan Thenarasu, Shawn Blanton, John P. Shen

Comments Accepted at IEEE ISPASS 2026. Copyright assigned to IEEE

2603.12455 2026-03-16 cs.CR cs.AI

Operationalising Cyber Risk Management Using AI: Connecting Cyber Incidents to MITRE ATT&CK Techniques, Security Controls, and Metrics

Emad Sherif, Iryna Yevseyeva, Vitor Basto-Fernandes, Allan Cook

2603.12450 2026-03-16 cs.CR cs.LG

Bridging the Gap Between Security Metrics and Key Risk Indicators: An Empirical Framework for Vulnerability Prioritization

Emad Sherif, Iryna Yevseyeva, Vitor Basto-Fernandes, Allan Cook

2603.12449 2026-03-16 physics.ao-ph cs.LG

FloeNet: A mass-conserving global sea ice emulator that generalizes across climates

William Gregory, Mitchell Bushuk, James Duncan, Elynn Wu, Adam Subel, Spencer K. Clark, Bill Hurlin, Oliver Watt-Meyer, Alistair Adcroft, Chris Bretherton, Laure Zanna

Comments 4 Figures, 18 supplementary figures

2603.12446 2026-03-16 cs.NI cs.SD

RadEar: A Self-Supervised RF Backscatter System for Voice Eavesdropping and Separation

Qijun Wang, Peihao Yan, Chunqi Qian, Huacheng Zeng

Comments Accepted by IEEE INFOCOM 2026

2603.12445 2026-03-16 eess.IV cs.AI cs.CV cs.LG

Unmasking Biases and Reliability Concerns in Convolutional Neural Networks Analysis of Cancer Pathology Images

Michael Okonoda, Eder Martinez, Abhilekha Dalal, Lior Shamir

Comments Electronics, published

2603.12440 2026-03-16 cs.DC cs.LG

KernelFoundry: Hardware-aware evolutionary GPU kernel optimization

Nina Wiedemann, Quentin Leboutet, Michael Paulitsch, Diana Wofk, Benjamin Ummenhofer

2603.12396 2026-03-16 cs.IR cs.AI

Test-Time Strategies for More Efficient and Accurate Agentic RAG

Brian Zhang, Deepti Guntur, Zhiyang Zuo, Abhinav Sharma, Shreyas Chaudhari, Wenlong Zhao, Franck Dernoncourt, Puneet Mathur, Ryan Rossi, Nedim Lipka

2603.12374 2026-03-16 econ.EM cs.LG

The Privacy-Utility Trade-Off of Location Tracking in Ad Personalization

Mohammad Mosaffa, Omid Rafieian

Comments 57 pages, 11 figures. Digital advertising, causal inference, and machine learning

2603.12368 2026-03-16 cs.IR cs.CL

Multi-Step Semantic Reasoning in Generative Retrieval

Steven Dong, Yubao Tang, Maarten de Rijke

Comments Accepted at ECIR 2026

2603.12351 2026-03-16 stat.ML cs.LG q-bio.QM stat.CO stat.ME

Probabilistic Joint and Individual Variation Explained (ProJIVE) for Data Integration

Raphiel J. Murden, Ganzhong Tian, Deqiang Qiu, Benjamin B. Risk

2603.12316 2026-03-16 cond-mat.dis-nn cs.LG cs.NE

Pruning-induced phases in fully-connected neural networks: the eumentia, the dementia, and the amentia

Haining Pan, Nakul Aggarwal, J. H. Pixley

Comments 14 pages, 15 figures

2603.12290 2026-03-16 cs.IR cs.AI

Detecting Miscitation on the Scholarly Web through LLM-Augmented Text-Rich Graph Learning

Huidong Wu, Haojia Xiang, Jingtong Gao, Xiangyu Zhao, Dengsheng Wu, Jianping Li

2603.12274 2026-03-16 cs.MA cs.CE cs.CL

DIALECTIC: A Multi-Agent System for Startup Evaluation

Jae Yoon Bae, Simon Malberg, Joyce Galang, Andre Retterath, Georg Groh

Comments Accepted at EACL 2026 Industry Track

2603.12269 2026-03-16 cs.AR cs.AI cs.LG

DART: Input-Difficulty-AwaRe Adaptive Threshold for Early-Exit DNNs

Parth Patne, Mahdi Taheri, Christian Herglotz, Maksim Jenihhin, Milos Krstic, Michael Hübner

2603.12268 2026-03-16 cs.DC cs.LG

A Holistic Framework for Automated Configuration Recommendation for Cloud Service Monitoring

Anson Bastos, Shreeya Venneti, Anjaly Parayil, Ayush Choure, Chetan Bansal, Rujia Wang

2603.11850 2026-03-16 eess.IV cs.CV cs.DC

Deep Learning-based Assessment of the Relation Between the Third Molar and Mandibular Canal on Panoramic Radiographs using Local, Centralized, and Federated Learning

Johan Andreas Balle Rubak, Sara Haghighat, Sanyam Jain, Mostafa Aldesoki, Akhilanand Chaurasia, Sarah Sadat Ehsani, Faezeh Dehghan Ghanatkaman, Ahmad Badruddin Ghazali, Julien Issa, Basel Khalil, Rishi Ramani, Ruben Pauwels

2603.11647 2026-03-16 cs.MM cs.CV cs.SD

OmniForcing: Unleashing Real-time Joint Audio-Visual Generation

Yaofeng Su, Yuming Li, Zeyue Xue, Jie Huang, Siming Fu, Haoran Li, Ying Li, Zezhong Qian, Haoyang Huang, Nan Duan

Comments 14 pages