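arXivDaily: a daily academic digest synced with the full arXiv dataset, with AI-generated summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering and systems science, and more.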

2602.23103 2026-02-27 cs.CV

SpectralMamba-UNet: Frequency-Disentangled State Space Modeling for Texture-Structure Consistent Medical Image Segmentation

Fuhao Zhang, Lei Liu, Jialin Zhang, Ya-Nan Zhang, Nan Mu

Abstract

Accurate medical image segmentation requires effective modeling of both global anatomical structures and fine-grained boundary details. Recent state space models (e.g., Vision Mamba) offer efficient long-range dependency modeling. However, their one-dimensional serialization weakens local spatial continuity and high-frequency representation. To this end, we propose SpectralMamba-UNet, a novel frequency-disentangled framework to decouple the learning of structural and textural information in the spectral domain. Our Spectral Decomposition and Modeling (SDM) module applies discrete cosine transform to decompose low- and high-frequency features, where low frequency contributes to global contextual modeling via a frequency-domain Mamba and high frequency preserves boundary-sensitive details. To balance spectral contributions, we introduce a Spectral Channel Reweighting (SCR) mechanism to form channel-wise frequency-aware attention, and a Spectral-Guided Fusion (SGF) module to achieve adaptive multi-scale fusion in the decoder. Experiments on five public benchmarks demonstrate consistent improvements across diverse modalities and segmentation targets, validating the effectiveness and generalizability of our approach.
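
The frequency split described in the abstract can be illustrated with a minimal sketch. This is not the authors' SDM module: it only shows how a 2-D discrete cosine transform can separate a feature map into a low-frequency part and a high-frequency part. The function names and the `u + v < cutoff` rule for what counts as "low" are assumptions made for illustration.

```python
import math


def dct_matrix(n):
    """Orthonormal DCT-II basis, returned as an n x n list of rows."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def transpose(m):
    return [list(r) for r in zip(*m)]


def dct2(x):
    """2-D DCT-II computed separably: C_h @ x @ C_w^T."""
    c_h, c_w = dct_matrix(len(x)), dct_matrix(len(x[0]))
    return matmul(matmul(c_h, x), transpose(c_w))


def idct2(coeffs):
    """Inverse of dct2; valid because the basis is orthonormal."""
    c_h, c_w = dct_matrix(len(coeffs)), dct_matrix(len(coeffs[0]))
    return matmul(matmul(transpose(c_h), coeffs), c_w)


def split_frequencies(x, cutoff):
    """Return (low, high) spatial-domain parts of x.

    Hypothetical rule: coefficient (u, v) is 'low' when u + v < cutoff.
    The two parts always sum back to x.
    """
    coeffs = dct2(x)
    low = [[c if u + v < cutoff else 0.0 for v, c in enumerate(row)]
           for u, row in enumerate(coeffs)]
    high = [[c - l for c, l in zip(c_row, l_row)]
            for c_row, l_row in zip(coeffs, low)]
    return idct2(low), idct2(high)
```

With `cutoff=1` only the DC coefficient is kept, so the low-frequency part is the constant mean of the input and the high-frequency part carries everything else. In the paper's setting the two parts would then feed different branches; that routing is not sketched here.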

2602.23101 2026-02-27 cs.CV

Locally Adaptive Decay Surfaces for High-Speed Face and Landmark Detection with Event Cameras

Paul Kielty, Timothy Hanley, Peter Corcoran

Abstract

Event cameras record luminance changes with microsecond resolution, but converting their sparse, asynchronous output into dense tensors that neural networks can exploit remains a core challenge. Conventional histograms or globally-decayed time-surface representations apply fixed temporal parameters across the entire image plane, which in practice creates a trade-off between preserving spatial structure during still periods and retaining sharp edges during rapid motion. We introduce Locally Adaptive Decay Surfaces (LADS), a family of event representations in which the temporal decay at each location is modulated according to local signal dynamics. Three strategies are explored, based on event rate, Laplacian-of-Gaussian response, and high-frequency spectral energy. These adaptive schemes preserve detail in quiescent regions while reducing blur in regions of dense activity. Extensive experiments on public data show that LADS consistently improves both face detection and facial landmark accuracy compared to standard non-adaptive representations. At 30 Hz, LADS achieves higher detection accuracy and lower landmark error than either baseline, and at 240 Hz it mitigates the accuracy decline typically observed at higher frequencies, sustaining 2.44% normalized mean error for landmarks and 0.966 mAP50 in face detection. These high-frequency results even surpass the accuracy reported in prior works operating at 30 Hz, setting new benchmarks for event-based face analysis. Moreover, by preserving spatial structure at the representation stage, LADS supports the use of much lighter network architectures while still retaining real-time performance. These results highlight the importance of context-aware temporal integration for neuromorphic vision and point toward real-time, high-frequency human-computer interaction systems that exploit the unique advantages of event cameras.
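
The idea of a locally modulated decay can be sketched as follows. This is not the LADS formulation from the paper: it is a minimal exponential time surface in which the decay constant at each pixel shrinks as the local event count grows, loosely mirroring the event-rate strategy named in the abstract. The function name, the linear mapping from activity to decay constant, and all default parameter values are assumptions.

```python
import math


def adaptive_time_surface(events, width, height, t_ref,
                          tau_max=0.05, tau_min=0.005, radius=1):
    """Build a time surface with a per-pixel decay constant.

    events: iterable of (x, y, t) tuples with t <= t_ref, in seconds.
    Busy neighbourhoods get a short tau (fast decay, less blur);
    quiet neighbourhoods get a long tau (slow decay, structure kept).
    """
    last_t = [[None] * width for _ in range(height)]
    count = [[0] * width for _ in range(height)]
    for x, y, t in events:
        if last_t[y][x] is None or t > last_t[y][x]:
            last_t[y][x] = t
        count[y][x] += 1

    def local_count(cx, cy):
        # Sum of event counts in a (2 * radius + 1)^2 window, clipped at borders.
        total = 0
        for yy in range(max(0, cy - radius), min(height, cy + radius + 1)):
            for xx in range(max(0, cx - radius), min(width, cx + radius + 1)):
                total += count[yy][xx]
        return total

    local = [[local_count(x, y) for x in range(width)] for y in range(height)]
    peak = max(max(row) for row in local) or 1  # avoid division by zero

    surface = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if last_t[y][x] is None:
                continue  # pixels that never fired stay at 0
            activity = local[y][x] / peak  # normalised to [0, 1]
            tau = tau_max - activity * (tau_max - tau_min)
            surface[y][x] = math.exp(-(t_ref - last_t[y][x]) / tau)
    return surface
```

For two pixels whose latest events are equally old, the one in the quieter neighbourhood keeps a larger surface value, which is the qualitative behaviour the abstract describes for quiescent regions.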

2602.23093 2026-02-27 cs.AI cs.SI physics.soc-ph

Three AI-agents walk into a bar . . . . 'Lord of the Flies' tribalism emerges among smart AI-Agents

Dhwanil M. Mori, Neil F. Johnson

2602.23088 2026-02-27 cs.CV

Cytoarchitecture in Words: Weakly Supervised Vision-Language Modeling for Human Brain Microscopy

Matthew Sutton, Katrin Amunts, Timo Dickscheid, Christian Schiffer

Comments 8 pages, 3 figures, submitted for inclusion at a conference

2602.23079 2026-02-27 cs.CL cs.CR cs.LG

Assessing Deanonymization Risks with Stylometry-Assisted LLM Agent

Boyang Zhang, Yang Zhang

2602.23075 2026-02-27 cs.CL cs.IR

CiteLLM: An Agentic Platform for Trustworthy Scientific Reference Discovery

Mengze Hong, Di Jiang, Chen Jason Zhang, Zichang Guo, Yawen Li, Jun Chen, Shaobo Cui, Zhiyang Su

Comments Accepted by TheWebConf 2026 Demo Track

2602.23071 2026-02-27 cs.CL cs.AI

Quantity Convergence, Quality Divergence: Disentangling Fluency and Accuracy in L2 Mandarin Prosody

Yuqi Shi, Hao Yang, Xiyao Lu, Jinsong Zhang

2602.23070 2026-02-27 cs.SD cs.AI cs.CL eess.AS

Make It Hard to Hear, Easy to Learn: Long-Form Bengali ASR and Speaker Diarization via Extreme Augmentation and Perfect Alignment

Sanjid Hasan, Risalat Labib, A H M Fuad, Bayazid Hasan

Comments 4 pages, 2 figures

2602.23068 2026-02-27 cs.SD

TADA: A Generative Framework for Speech Modeling via Text-Acoustic Dual Alignment

Trung Dang, Sharath Rao, Ananya Gupta, Christopher Gagne, Panagiotis Tzirakis, Alice Baird, Jakub Piotr Cłapa, Peter Chin, Alan Cowen

2602.23062 2026-02-27 cs.CL

Toward Automatic Filling of Case Report Forms: A Case Study on Data from an Italian Emergency Department

Gabriela Anna Kaczmarek, Pietro Ferrazzi, Lorenzo Porta, Vicky Rubini, Bernardo Magnini

2602.23060 2026-02-27 cs.LG

RhythmBERT: A Self-Supervised Language Model Based on Latent Representations of ECG Waveforms for Heart Disease Detection

Xin Wang, Burcu Ozek, Aruna Mohan, Amirhossein Ravari, Or Zilbershot, Fatemeh Afghah

2602.23057 2026-02-27 cs.CL cs.AI

Affine-Scaled Attention: Towards Flexible and Stable Transformer Attention

Jeongin Bae, Baeseong Park, Gunho Park, Minsub Kim, Joonhyung Lee, Junhee Yoo, Sunghyeon Woo, Jiwon Ryu, Se Jung Kwon, Dongsoo Lee

Comments Preprint. 14 pages, 11 figures

2602.23056 2026-02-27 cs.AI cs.SY eess.SY

Learning-based Multi-agent Race Strategies in Formula 1

Giona Fieni, Joschua Wüthrich, Marc-Philippe Neumann, Christopher H. Onder

2602.23053 2026-02-27 cs.RO

Marinarium: a New Arena to Bring Maritime Robotics Closer to Shore

Ignacio Torroba, David Dorner, Victor Nan Fernandez-Ayala, Mart Kartasev, Joris Verhagen, Elias Krantz, Gregorio Marchesini, Carl Ljung, Pedro Roque, Chelsea Sidrane, Linda Van der Spaa, Nicola De Carli, Petter Ogren, Christer Fuglesang, Jana Tumova, Dimos V. Dimarogonas, Ivan Stenius

2602.23051 2026-02-27 cs.RO

An Empirical Analysis of Cooperative Perception for Occlusion Risk Mitigation

Aihong Wang, Tenghui Xie, Fuxi Wen, Jun Li

Comments Accepted for publication in IEEE Internet of Things Journal (Regular Article), 2026. DOI: 10.1109/JIOT.2026.3668184

2602.23050 2026-02-27 cs.LG

Latent Matters: Learning Deep State-Space Models

Alexej Klushyn, Richard Kurle, Maximilian Soelch, Botond Cseke, Patrick van der Smagt

Comments Published at NeurIPS 2021

2602.23043 2026-02-27 cs.CV

D-FINE-seg: Object Detection and Instance Segmentation Framework with multi-backend deployment

Argo Saakyan, Dmitry Solntsev

Comments 6 pages, 4 figures, 5 tables

2602.23035 2026-02-27 cs.LG

Learning Disease-Sensitive Latent Interaction Graphs From Noisy Cardiac Flow Measurements

Viraj Patel, Marko Grujic, Philipp Aigner, Theodor Abart, Marcus Granegger, Deblina Bhattacharjee, Katharine Fraser

2602.23031 2026-02-27 cs.CV

Small Object Detection Model with Spatial Laplacian Pyramid Attention and Multi-Scale Features Enhancement in Aerial Images

Zhangjian Ji, Huijia Yan, Shaotong Qiao, Kai Feng, Wei Wei

2602.23017 2026-02-27 cs.RO

DigiArm: An Anthropomorphic 3D-Printed Prosthetic Hand with Enhanced Dexterity for Typing Tasks

Dean Zadok, Tom Naamani, Yuval Bar-Ratson, Elisha Barash, Oren Salzman, Alon Wolf, Alex M. Bronstein, Nili Krausz

2602.22998 2026-02-27 cs.RO

A Perspective on Open Challenges in Deformable Object Manipulation

Ryan Paul McKennaa, John Oyekan

Comments 28 pages, 7 Figures

2602.22988 2026-02-27 cs.LG cs.AI

Residual Koopman Spectral Profiling for Predicting and Preventing Transformer Training Instability

Bum Jun Kim, Shohei Taniguchi, Makoto Kawano, Yusuke Iwasawa, Yutaka Matsuo

Comments 23 pages, 7 figures

2602.22981 2026-02-27 cs.AI

RepSPD: Enhancing SPD Manifold Representation in EEGs via Dynamic Graphs

Haohui Jia, Zheng Chen, Lingwei Zhu, Xu Cao, Yasuko Matsubara, Takashi Matsubara, Yasushi Sakurai

2602.22973 2026-02-27 cs.AI

Modeling Expert AI Diagnostic Alignment via Immutable Inference Snapshots

Dimitrios P. Panagoulias, Evangelia-Aikaterini Tsichrintzi, Georgios Savvidis, Evridiki Tsoureli-Nikita

2602.22963 2026-02-27 cs.AI

FactGuard: Agentic Video Misinformation Detection via Reinforcement Learning

Zehao Li, Hongwei Yu, Hao Jiang, Qiang Sheng, Yilong Xu, Baolong Bi, Yang Li, Zhenlong Yuan, Yujun Cai, Zhaoqi Wang

2602.22960 2026-02-27 cs.CV

UCM: Unifying Camera Control and Memory with Time-aware Positional Encoding Warping for World Models

Tianxing Xu, Zixuan Wang, Guangyuan Wang, Li Hu, Zhongyi Zhang, Peng Zhang, Bang Zhang, Song-Hai Zhang

Comments Project Page: https://humanaigc.github.io/ucm-webpage/

2602.22955 2026-02-27 cs.CV cs.AI

MM-NeuroOnco: A Multimodal Benchmark and Instruction Dataset for MRI-Based Brain Tumor Diagnosis

Feng Guo, Jiaxiang Liu, Yang Li, Qianqian Shi, Mingkun Xu

2602.22952 2026-02-27 cs.RO

Automated Robotic Needle Puncture for Percutaneous Dilatational Tracheostomy

Yuan Tang, Bruno V. Adorno, Brendan A. McGrath, Andrew Weightman

2602.22945 2026-02-27 cs.CV

Cross-Task Benchmarking of CNN Architectures

Kamal Sherawat, Vikrant Bhati

2602.22940 2026-02-27 cs.RO

Considering Perspectives for Automated Driving Ethics: Collective Risk in Vehicular Motion Planning

Leon Tolksdorf, Arturo Tejada, Christian Birkner, Nathan van de Wouw

Comments 17 pages, 6 figures, 2 tables