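arXivDaily daily academic digest: synced with the full arXiv dataset, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering and systems, and other fields.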

2601.16187 2026-01-23 cs.MA cs.GT cs.SY eess.SY

Average Unfairness in Routing Games

Pan-Yang Su, Arwa Alanqary, Bryce L. Ferguson, Manxi Wu, Alexandre M. Bayen, Shankar Sastry

Comments 14 pages, 5 figures, 1 table. Accepted for publication at the 25th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2026)

2408.09253 2026-01-23 cs.RO cs.SY eess.SY

Reinforcement Learning Compensated Model Predictive Control for Off-road Driving on Unknown Deformable Terrain

Prakhar Gupta, Jonathon M. Smereka, Yunyi Jia

Comments Submitted to IEEE Transactions on Intelligent Vehicles as a Regular Paper; was withdrawn in March 2025. A revised version of this manuscript was submitted to ACC 2025 review as a regular paper in Sep 2025

2601.16149 2026-01-23 eess.SY cs.SY math.OC

Interconnection-based Model Reduction for Linear Hybrid Systems

Zirui Niu, Giordano Scarciotti, Alessandro Astolfi

Comments 17 pages

2601.16077 2026-01-23 eess.AS

Loose coupling of spectral and spatial models for multi-channel diarization and enhancement of meetings in dynamic environments

Adrian Meise, Tobias Cord-Landwehr, Christoph Boeddeker, Marc Delcroix, Tomohiro Nakatani, Reinhold Haeb-Umbach

Comments Accepted at ICASSP 2026

2601.16064 2026-01-23 eess.IV cs.CV

Phi-SegNet: Phase-Integrated Supervision for Medical Image Segmentation

Shams Nafisa Ali, Taufiq Hasan

Comments 10 pages, 7 figures

Abstract

Deep learning has substantially advanced medical image segmentation, yet achieving robust generalization across diverse imaging modalities and anatomical structures remains a major challenge. A key contributor to this limitation lies in how existing architectures, ranging from CNNs to Transformers and their hybrids, primarily encode spatial information while overlooking frequency-domain representations that capture rich structural and textural cues. Although a few recent studies have begun exploring spectral information at the feature level, supervision-level integration of frequency cues, which is crucial for fine-grained object localization, remains largely untapped. To this end, we propose Phi-SegNet, a CNN-based architecture that incorporates phase-aware information at both architectural and optimization levels. The network integrates Bi-Feature Mask Former (BFMF) modules that blend neighboring encoder features to reduce semantic gaps, and Reverse Fourier Attention (RFA) blocks that refine decoder outputs using phase-regularized features. A dedicated phase-aware loss aligns these features with structural priors, forming a closed feedback loop that emphasizes boundary precision. Evaluated on five public datasets spanning X-ray, US, histopathology, MRI, and colonoscopy, Phi-SegNet consistently achieved state-of-the-art performance, with an average relative improvement of 1.54 ± 1.26% in IoU and 0.98 ± 0.71% in F1-score over the next best-performing model. In cross-dataset generalization scenarios involving unseen datasets from the known domain, Phi-SegNet also exhibits robust and superior performance, highlighting its adaptability and modality-agnostic design. These findings demonstrate the potential of leveraging spectral priors in both feature representation and supervision, paving the way for generalized segmentation frameworks that excel in fine-grained object localization.
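
For readers unfamiliar with the two metrics quoted in the abstract above, the following is a minimal sketch of the standard IoU and F1 (Dice) definitions for binary masks, plus a relative-improvement helper. It is not the authors' evaluation code: the function names `iou_and_f1` and `relative_improvement`, the flat 0/1 mask representation, and the empty-mask convention are all assumptions made for illustration.

```python
def iou_and_f1(pred, target):
    """Return (IoU, F1) for two equal-length binary masks given as flat 0/1 sequences.

    Hypothetical helper using the standard definitions:
      IoU = TP / (TP + FP + FN)
      F1  = 2*TP / (2*TP + FP + FN)
    """
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")
    tp = sum(1 for p, t in zip(pred, target) if p and t)
    fp = sum(1 for p, t in zip(pred, target) if p and not t)
    fn = sum(1 for p, t in zip(pred, target) if not p and t)
    if tp + fp + fn == 0:
        # Both masks are empty; treating this as perfect agreement is an assumed convention.
        return 1.0, 1.0
    iou = tp / (tp + fp + fn)
    f1 = 2 * tp / (2 * tp + fp + fn)
    return iou, f1


def relative_improvement(new, baseline):
    """Relative improvement of `new` over `baseline`, in percent (hypothetical helper)."""
    return 100.0 * (new - baseline) / baseline


# Toy 4-pixel example: one true positive, one false positive, one false negative.
iou, f1 = iou_and_f1([1, 1, 0, 0], [1, 0, 1, 0])
# iou is 1/3 and f1 is 0.5 for this toy input.
```

A relative improvement, as reported in the abstract, is measured against the baseline's score rather than as an absolute difference in percentage points.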

2601.16062 2026-01-23 cs.RO cs.SY eess.SY

Improve the autonomy of the SE2(3) group based Extended Kalman Filter for Integrated Navigation: Theoretical Analysis

Jiarui Cui, Maosong Wang, Wenqi Wu, Peiqi Li, Xianfei Pan

2601.16061 2026-01-23 eess.SY cs.SY

Dynamic Tactile Sensing System and Soft Actor Critic Reinforcement Learning for Inclusion Characterization

John Bannan, Nazia Rahman, Chang-Hee Won

2601.16054 2026-01-23 eess.SP

Hybrid Channel Estimation with Quantized Phase Feedback for Over-the-Air Computation

Martin Dahl, Erik G. Larsson

Comments ICASSP 2026

2601.16023 2026-01-23 eess.AS cs.HC

Timbre-Aware LLM-based Direct Speech-to-Speech Translation Extendable to Multiple Language Pairs

Lalaram Arya, Mrinmoy Bhattacharjee, Adarsh C. R., S. R. Mahadeva Prasanna

Comments 13 pages

Abstract

Direct Speech-to-Speech Translation (S2ST) has gained increasing attention for its ability to translate speech from one language to another, while reducing error propagation and latency inherent in traditional cascaded pipelines. However, existing direct S2ST systems continue to face notable challenges, including instability in semantic-acoustic alignment when parallel speech data is scarce, difficulty in preserving speaker identity, and limited multilingual scalability. In this work, we introduce DS2ST-LM, a scalable, single-stage direct S2ST framework leveraging a multilingual Large Language Model (LLM). The architecture integrates a Whisper speech encoder, a learnable projection module, a Qwen2-0.5B LLM, and a timbre-controlled vocoder. We construct GigaS2S-1000, a 1000-hour bilingual corpus by extending the GigaST dataset with high-fidelity synthetic target speech, and show that this synthetic data alleviates data scarcity to some extent. We investigate two semantic token generation strategies: speech-derived S3 tokens and text-derived tokens generated by a pre-trained LLM, and analyze their impact on training stability and semantic consistency. We further evaluate three projection architectures (Linear, Conv1D-Linear, and Q-Former) and observe that while higher-capacity projectors converge faster, the simple Linear projector achieves higher performance. Extensive experiments demonstrate that DS2ST-LM outperforms traditional cascaded and ST (Qwen-Audio) + TTS baselines across both lexical (BLEU, METEOR) and semantic (BLEURT, COMET) metrics, while extending to multiple language pairs, including French, Spanish, German, Hindi, Bengali, and Urdu. Furthermore, we incorporate timbre-aware speech synthesis to preserve speaker information, enabling DS2ST-LM to surpass prior direct S2ST systems in both speaker similarity and perceptual naturalness.

2601.16014 2026-01-23 eess.SY cs.SY

Stability Analysis of Power-Electronics-Dominated Grids Using Scaled Relative Graphs

Eder Baron-Prada, Adolfo Anta, Florian Dörfler

Comments Submitted for possible publication

2601.16012 2026-01-23 eess.SP

Low-Complexity Sparse Superimposed Coding for Ultra Reliable Low Latency Communications

Yanfeng Zhang, Xi'an Fan, Xu Zhu, Jinkai Zheng, Hui Liang, Weiwei Yang, Tom H. Luan

2601.16011 2026-01-23 eess.IV cs.AI

THOR: A Versatile Foundation Model for Earth Observation Climate and Society Applications

Theodor Forgaard, Jarle H. Reksten, Anders U. Waldeland, Valerio Marsocci, Nicolas Longépé, Michael Kampffmeyer, Arnt-Børre Salberg

Comments 25 pages

2601.15973 2026-01-23 eess.SP cs.IT math.IT

Performance Scaling Laws for PD Array-based Receivers in IM/DD Optical Wireless Communication Systems

Aravindh Krishnamoorthy, Robert Schober, Harald Haas

Comments 5 pages, 4 figures. This work has been submitted to the IEEE for possible publication

2601.15952 2026-01-23 eess.SP q-bio.QM

Reconstructing Patched or Partial Holograms to allow for Whole Slide Imaging with a Self-Referencing Holographic Microscope

Philip Groult, Julia D. Sistermanns, Ellen Emken, Oliver Hayden, Wolfgang Utschick

Comments © 2026 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.

2601.15872 2026-01-23 cs.SD cs.CV cs.LG cs.MM eess.AS

PF-D2M: A Pose-free Diffusion Model for Universal Dance-to-Music Generation

Jaekwon Im, Natalia Polouliakh, Taketo Akama

Comments 4 pages, 2 figures

2601.15863 2026-01-23 eess.SP

Time-Varying Rician K-factor in Measured Vehicular Channels at cmWave and mmWave Bands

Faruk Pasic, Markus Hofer, Thomas Zemen, Andreas F. Molisch, Christoph F. Mecklenbräuker

Comments Published at the 19th European Conference on Antennas and Propagation (EuCAP), 2025

Journal ref 2025 19th European Conference on Antennas and Propagation (EuCAP), 2025

2601.15831 2026-01-23 eess.SP

Performance Analysis of Digital Beamforming mmWave MIMO with Low-Resolution DACs/ADCs

Faruk Pasic, Mariam Mussbah, Stefan Schwarz, Markus Rupp, Fredrik Tufvesson, Christoph F. Mecklenbräuker

Comments Published at the IEEE Radio and Antenna Days of the Indian Ocean (RADIO), 2025

Journal ref 2025 IEEE Radio and Antenna Days of the Indian Ocean (RADIO), 2025

2601.15819 2026-01-23 eess.SP

Dual-Mapping Sparse Vector Transmission for Short Packet URLLC

Yanfeng Zhang, Xu Zhu, Jinkai Zheng, Weiwei Yang, Xianhua Yu, Haiyong Zeng, Yujie Liu, Yong Liang Guan

2601.15816 2026-01-23 eess.SY cs.AI cs.SY

Virtual Traffic Police: Large Language Model-Augmented Traffic Signal Control for Unforeseen Incidents

Shiqi Wei, Qiqing Wang, Kaidi Yang

2601.15733 2026-01-23 eess.SP

Bistatic ISAC: Practical Challenges and Solutions

Lucas Giroto, Marcus Henninger, Alexander Felix, Maximilian Bauhofer, Taewon Jeong, Umut Utku Erdem, Stephan ten Brink, Thomas Zwick, Benjamin Nuss, Silvio Mandelli

2601.15729 2026-01-23 cs.RO cs.AI cs.SY eess.SY

DualShield: Safe Model Predictive Diffusion via Reachability Analysis for Interactive Autonomous Driving

Rui Yang, Lei Zheng, Ruoyu Yao, Jun Ma

Comments 8 pages, 5 figures

2601.15676 2026-01-23 cs.SD cs.LG eess.AS

Bridging the Perception Gap: A Lightweight Coarse-to-Fine Architecture for Edge Audio Systems

Hengfan Zhang, Yueqian Lin, Hai Helen Li, Yiran Chen

Comments 10 pages, 3 figures, 2 tables. Preprint

2601.15653 2026-01-23 eess.AS eess.SP

Distributed Multichannel Active Noise Control with Asynchronous Communication

Junwei Ji, Dongyuan Shi, Boxiang Wang, Ziyi Yang, Haowen Li, Woon-Seng Gan

2601.15626 2026-01-23 eess.SY cs.AI cs.SY

Bridging Qualitative Rubrics and AI: A Binary Question Framework for Criterion-Referenced Grading in Engineering

Lili Chen, Winn Wing-Yiu Chow, Stella Peng, Bencheng Fan, Sachitha Bandara

Comments Proceedings of the 36th Annual Conference of the Australasian Association for Engineering Education (AAEE 2025)

2601.15622 2026-01-23 eess.SY cs.SY

Design, Modelling, and Control of Magnetic Ball Suspension System

Sampson E. Nwachukwu

Comments 8 pages

2601.15621 2026-01-23 cs.SD cs.CL eess.AS

Qwen3-TTS Technical Report

Hangrui Hu, Xinfa Zhu, Ting He, Dake Guo, Bin Zhang, Xiong Wang, Zhifang Guo, Ziyue Jiang, Hongkun Hao, Zishan Guo, Xinyu Zhang, Pei Zhang, Baosong Yang, Jin Xu, Jingren Zhou, Junyang Lin

Comments https://github.com/QwenLM/Qwen3-TTS

2601.15602 2026-01-23 eess.SP cs.IT math.IT

Does 6G Need a New Waveform: Comparing Zak-OTFS with CP-OFDM

Imran Ali Khan, Saif Khan Mohammed, Ronny Hadani, Ananthanarayanan Chockalingam, Robert Calderbank, Anton Monk, Shachar Kons, Shlomo Rakib, Yoav Hebron

Comments This work has been submitted to the IEEE for possible publication

2601.15597 2026-01-23 cs.LG eess.SP

Neural Nonlinear Shrinkage of Covariance Matrices for Minimum Variance Portfolio Optimization

Liusha Yang, Siqi Zhao, Shuqi Chai

2601.15596 2026-01-23 cs.SD cs.AI eess.AS

DeepASMR: LLM-Based Zero-Shot ASMR Speech Generation for Anyone of Any Voice

Leying Zhang, Tingxiao Zhou, Haiyang Sun, Mengxiao Bi, Yanmin Qian

2601.15584 2026-01-23 eess.SP

Amalgamated CHIRP and OFDM for ISAC

Pankaj Kumar, Mohammed El-Hajjar, Ibrahim A. Hemadeh, Yasser Mestrah, Suraj Srivastava, Aditya K. Jagannatham, Lajos Hanzo