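arXivDaily: a daily academic digest synced with the full arXiv dataset, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and more.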

2506.12007 2026-02-11 cs.LG cs.CV physics.comp-ph

SIMSHIFT: A Benchmark for Adapting Neural Surrogates to Distribution Shifts

Paul Setinek, Gianluca Galletti, Thomas Gross, Dominik Schnürer, Johannes Brandstetter, Werner Zellinger

2506.10357 2026-02-11 cs.AI

Optimus-3: Dual-Router Aligned Mixture-of-Experts Agent with Dual-Granularity Reasoning-Aware Policy Optimization

Zaijing Li, Yuquan Xie, Rui Shao, Gongwei Chen, Weili Guan, Dongmei Jiang, Yaowei Wang, Liqiang Nie

Comments 16 pages, 12 figures

Abstract

Developing generalist agents capable of solving open-ended tasks in visually rich, dynamic environments remains a core pursuit of embodied AI. While Minecraft has emerged as a compelling benchmark, existing agents often suffer from fragmented cognitive abilities, lacking the synergy between reflexive execution (System 1) and deliberative reasoning (System 2). In this paper, we introduce Optimus-3, a generalist agent that organically integrates these dual capabilities within a unified framework. To achieve this, we address three fundamental challenges. First, to overcome the scarcity of reasoning data, we propose a Knowledge-Enhanced Automated Data Generation Pipeline. It synthesizes high-quality System 2 reasoning traces from raw System 1 interaction trajectories, effectively mitigating hallucinations via injection of domain knowledge. We release the resulting dataset, OptimusM⁴, to the community. Second, to reconcile the dichotomous computational requirements of the dual systems, we design a Dual-Router Aligned MoE Architecture. It employs a Task Router to prevent task interference via parameter decoupling, and a Layer Router to dynamically modulate reasoning depth, creating a computational "Fast Path" for System 1 and a "Deep Path" for System 2. Third, to activate the reasoning capabilities of System 2, we propose the Dual-Granularity Reasoning-Aware Policy Optimization (DGRPO) algorithm. It enforces Process-Outcome Co-Supervision via dual-granularity dense rewards, ensuring consistency between the thought process and the answer. Extensive evaluations demonstrate that Optimus-3 surpasses existing state-of-the-art methods on both System 2 (21% on Planning, 66% on Captioning, 76% on Embodied QA, 3.4× on Grounding, and 18% on Reflection) and System 1 (3% on Long-Horizon Action) tasks, with a notable 60% success rate on open-ended tasks.

2506.09278 2026-02-11 cs.CV cs.LG cs.RO

UFM: A Simple Path towards Unified Dense Correspondence with Flow

Yuchen Zhang, Nikhil Keetha, Chenwei Lyu, Bhuvan Jhamb, Yutian Chen, Yuheng Qiu, Jay Karhade, Shreyas Jha, Yaoyu Hu, Deva Ramanan, Sebastian Scherer, Wenshan Wang

Comments Project Page: https://uniflowmatch.github.io/

2506.01102 2026-02-11 cs.CV

Keystep Recognition using Graph Neural Networks

Julia Lee Romero, Kyle Min, Subarna Tripathi, Morteza Karimzadeh

Journal ref Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops, 2025, pp. 7624-7633

2505.19013 2026-02-11 cs.LG cs.AI econ.GN q-fin.EC stat.ML

Faithful Group Shapley Value

Kiljae Lee, Ziqi Liu, Weijing Tang, Yuan Zhang

Comments Accepted to NeurIPS 2025

2505.18083 2026-02-11 cs.LG cs.RO

What Do You Need for Compositional Generalization in Diffusion Planning?

Quentin Clark, Florian Shkurti

Comments 8 Pages

2505.17653 2026-02-11 cs.AI

GeoGramBench: Benchmarking the Geometric Program Reasoning in Modern LLMs

Shixian Luo, Zezhou Zhu, Yu Yuan, Yuncheng Yang, Lianlei Shan, Yong Wu

Comments Accepted to ICLR 2026

2505.15355 2026-02-11 cs.CL cs.LG cs.NE cs.SD eess.AS

Decoding Phone Pairs from MEG Signals Across Speech Modalities

Xabier de Zuazo, Eva Navas, Ibon Saratxaga, Mathieu Bourguignon, Nicola Molinaro

Comments 21 pages, 4 figures, 1 graphical abstract, submitted to Computer Speech and Language (special issue on Iberian Languages)

DOI: 10.1016/j.csl.2026.101939

Abstract

Understanding the neural mechanisms underlying speech production is essential for both advancing cognitive neuroscience theory and developing practical communication technologies. In this study, we investigated magnetoencephalography signals to decode phones from brain activity during speech production and perception (passive listening and voice playback) tasks. Using a dataset comprising 17 participants, we performed pairwise phone classification, extending our analysis to 15 phonetic pairs. Multiple machine learning approaches, including regularized linear models and neural network architectures, were compared to determine their effectiveness in decoding phonetic information. Our results demonstrate significantly higher decoding accuracy during speech production (76.6%) compared to passive listening and playback modalities (~51%), emphasizing the richer neural information available during overt speech. Among the models, the Elastic Net classifier consistently outperformed more complex neural networks, highlighting the effectiveness of traditional regularization techniques when applied to limited and high-dimensional MEG datasets. Besides, analysis of specific brain frequency bands revealed that low-frequency oscillations, particularly Delta (0.2-3 Hz) and Theta (4-7 Hz), contributed the most substantially to decoding accuracy, suggesting that these bands encode critical speech production-related neural processes. Despite using advanced denoising methods, it remains unclear whether decoding solely reflects neural activity or if residual muscular or movement artifacts also contributed, indicating the need for further methodological refinement. Overall, our findings underline the critical importance of examining overt speech production paradigms, which, despite their complexity, offer opportunities to improve brain-computer interfaces to help individuals with severe speech impairments.

2505.13027 2026-02-11 cs.LG

Deconstructing Positional Information: From Attention Logits to Training Biases

Zihan Gu, Ruoyu Chen, Han Zhang, Hua Zhang, Yue Hu

Comments Accepted by ICLR 2026

2505.12709 2026-02-11 cs.LG

Pave Your Own Path: Graph Gradual Domain Adaptation on Fused Gromov-Wasserstein Geodesics

Zhichen Zeng, Ruizhong Qiu, Wenxuan Bao, Tianxin Wei, Xiao Lin, Yuchen Yan, Tarek F. Abdelzaher, Jiawei Han, Hanghang Tong

Comments 35 pages, 10 figures

2505.11239 2026-02-11 cs.LG

Massive-STEPS: Massive Semantic Trajectories for Understanding POI Check-ins -- Dataset and Benchmarks

Wilson Wongso, Hao Xue, Flora D. Salim

2505.10936 2026-02-11 cs.CL

Cochain: Balancing Insufficient and Excessive Collaboration in LLM Agent Workflows

Jiaxing Zhao, Hongbin Xie, Yuzhen Lei, Xuan Song, Zhuoran Shi, Lianxin Li, Shuangxue Liu, Linguo Xie, Haoran Zhang

Comments 35 pages, 23 figures

2505.10438 2026-02-11 cs.LG cs.SY eess.SY

Koopman Eigenfunction-Based Identification and Optimal Nonlinear Control of Turbojet Engine

David Grasev

Comments 35 pages, 29 figures; accepted for publication in Springer Nonlinear Dynamics

Journal ref Nonlinear Dyn 114, 205 (2026)

2505.02076 2026-02-11 cs.AI cs.MA

Leveraging LLM Agents and Digital Twins for Fault Handling in Process Plants

Milapji Singh Gill, Javal Vyas, Artan Markaj, Felix Gehlhoff, Mehmet Mercangöz

2504.17490 2026-02-11 cs.LG cs.AI

Plasticine: Accelerating Research in Plasticity-Motivated Deep Reinforcement Learning

Mingqi Yuan, Qi Wang, Guozheng Ma, Caihao Sun, Bo Li, Xin Jin, Yunbo Wang, Xiaokang Yang, Wenjun Zeng, Dacheng Tao, Jiayu Chen

Comments 21 pages, 7 figures

2504.16612 2026-02-11 cs.CV cs.LG

Federated EndoViT: Pretraining Vision Transformers via Federated Learning on Endoscopic Image Collections

Max Kirchner, Alexander C. Jenke, Sebastian Bodenstedt, Fiona R. Kolbinger, Oliver L. Saldanha, Jakob N. Kather, Martin Wagner, Stefanie Speidel

Comments Preprint submitted to MIDL

2504.16081 2026-02-11 cs.CV cs.CL

Survey of Video Diffusion Models: Foundations, Implementations, and Applications

Yimu Wang, Xuye Liu, Wei Pang, Li Ma, Shuai Yuan, Paul Debevec, Ning Yu

Comments Accepted by TMLR

2504.15688 2026-02-11 cs.CL

Subject islands do not reduce to construction-specific discourse function

Mandy Cartner, Matthew Kogan, Nikolas Webster, Matthew Wagers, Ivy Sichel

Journal ref Cognition, Volume 271, 2026, 106467, ISSN 0010-0277

DOI: 10.1016/j.cognition.2026.106467

Abstract

The term islands in linguistics refers to phrases from which extracting an element results in ungrammaticality (Ross, 1967). Grammatical subjects are considered islands because extracting a sub-part of a subject results in an ill-formed sentence, despite having a clear intended meaning (e.g., "Which topic did the article about inspire you?"). The generative tradition, which views syntax as autonomous of meaning and function, attributes this ungrammaticality to the abstract movement dependency between the wh-phrase and the subject-internal position with which it is associated for interpretation. However, research on language that emphasizes its communicative function suggests instead that syntactic constraints, including islands, can be explained based on the way different constructions package information. Accordingly, Abeillé et al. (2020) suggest that the islandhood of subjects is specific to the information structure of wh-questions, and propose that subjects are not islands for movement, but for focusing, due to their discourse-backgroundedness. This predicts that other constructions that differ in their information structure from wh-questions, but still involve movement, should not create a subject island effect. We test this prediction in three large-scale acceptability studies, using a super-additive design that singles out subject island violations, in three different constructions: wh-questions, relative clauses, and topicalization. We report evidence for a subject island effect in each construction type, despite only wh-questions introducing what Abeillé et al. (2020) call "a clash in information structure." We argue that this motivates an account of islands in terms of abstract, syntactic representations, independent of the communicative function associated with the constructions.

2504.01928 2026-02-11 cs.CL cs.LG

Is the Reversal Curse a Binding Problem? Uncovering Limitations of Transformers from a Basic Generalization Failure

Boshi Wang, Huan Sun

Comments ICLR 2026

2503.22516 2026-02-11 cs.LG cs.CV

Ice-FMBench: A Foundation Model Benchmark for Sea Ice Type Segmentation

Samira Alkaee Taleghan, Morteza Karimzadeh, Andrew P. Barrett, Walter N. Meier, Farnoush Banaei-Kashani

Journal ref ACM SIGSPATIAL PoIDS 2025

DOI: 10.1145/3764922.3771202

Abstract

Accurate segmentation and mapping of sea ice types is crucial for safe polar navigation, offshore operations, and climate monitoring. While deep learning has demonstrated strong potential for automating sea ice type segmentation, its success often relies on access to extensive expert-labeled datasets, which are both resource-intensive and time-consuming to create. However, foundation models (FMs), recently developed through self-supervised training on large-scale datasets, have demonstrated impressive performance. Nevertheless, their applicability to sea ice type segmentation based on Synthetic Aperture Radar (SAR) imagery remains uncertain due to the unique challenges posed by sea ice, such as intricate geophysical patterns, pronounced seasonal variability, and SAR-specific artifacts like banding, scalloping, and heterogeneous backscatter, as well as the fact that SAR data in polar regions are often acquired using specialized sensor modes that differ markedly from those used to collect FM training data at lower latitudes, limiting their direct transferability to polar environments. To address this gap, we contribute: (1) IceFMBench, a comprehensive benchmark framework for evaluation of the state-of-the-art remote sensing FMs on the sea ice type segmentation task using Sentinel-1 SAR imagery, where IceFMBench is composed of a widely used standardized dataset, diverse evaluation metrics, and a representative set of selected remote sensing FM models suitable for sea ice type segmentation, with the ability to include new models side by side with the existing models; (2) an extensive comparative evaluation of the representative FMs using IceFMBench, with additional case studies to assess performance of the top-performing model in terms of transferability across temporal and spatial domains; and (3) a multi-teacher knowledge distillation approach to address lack of spatiotemporal transferability.

2503.20240 2026-02-11 cs.CV

Unconditional Priors Matter! Improving Conditional Generation of Fine-Tuned Diffusion Models

Prin Phunyaphibarn, Phillip Y. Lee, Jaihoon Kim, Minhyuk Sung

Comments WACV 2026; Project Page: https://unconditional-priors-matter.github.io/

2503.20127 2026-02-11 cs.RO cs.NI

TURBO: Utility-Aware Bandwidth Allocation for Cloud-Augmented Autonomous Control

Peter Schafhalter, Alexander Krentsel, Hongbo Wei, Joseph E. Gonzalez, Sylvia Ratnasamy, Scott Shenker, Ion Stoica

Comments 34 pages, 13 figures

2503.18753 2026-02-11 cs.CV

Self-Supervised Learning Based on Transformed Image Reconstruction for Equivariance-Coherent Feature Representation

Qin Wang, Alessio Quercia, Benjamin Bruns, Abigail Morrison, Hanno Scharr, Kai Krajsek

Comments AAAI2026 oral

Abstract

Self-supervised learning (SSL) methods have achieved remarkable success in learning image representations allowing invariances in them - but therefore discarding transformation information that some computer vision tasks actually require. While recent approaches attempt to address this limitation by learning equivariant features using linear operators in feature space, they impose restrictive assumptions that constrain flexibility and generalization. We introduce a weaker definition for the transformation relation between image and feature space denoted as equivariance-coherence. We propose a novel SSL auxiliary task that learns equivariance-coherent representations through intermediate transformation reconstruction, which can be integrated with existing joint embedding SSL methods. Our key idea is to reconstruct images at intermediate points along transformation paths, e.g. when training on 30-degree rotations, we reconstruct the 10-degree and 20-degree rotation states. Reconstructing intermediate states requires the transformation information used in augmentations, rather than suppressing it, and therefore fosters features containing the augmented transformation information. Our method decomposes feature vectors into invariant and equivariant parts, training them with standard SSL losses and reconstruction losses, respectively. We demonstrate substantial improvements on synthetic equivariance benchmarks while maintaining competitive performance on downstream tasks requiring invariant representations. The approach seamlessly integrates with existing SSL methods (iBOT, DINOv2) and consistently enhances performance across diverse tasks, including segmentation, detection, depth estimation, and video dense prediction. Our framework provides a practical way for augmenting SSL methods with equivariant capabilities while preserving invariant performance.

2503.17684 2026-02-11 cs.CL cs.AI

Can LLMs Automate Fact-Checking Article Writing?

Dhruv Sahnan, David Corney, Irene Larraz, Giovanni Zagni, Ruben Miguez, Zhuohan Xie, Iryna Gurevych, Elizabeth Churchill, Tanmoy Chakraborty, Preslav Nakov

Comments Accepted to TACL 2026, pre-MIT Press publication version

2503.11146 2026-02-11 cs.LG

Layer-wise Update Aggregation with Recycling for Communication-Efficient Federated Learning

Jisoo Kim, Sungmin Kang, Sunwoo Lee

Comments NeurIPS 2025

2503.07038 2026-02-11 cs.CV

Find your Needle: Small Object Image Retrieval via Multi-Object Attention Optimization

Michael Green, Matan Levy, Issar Tzachor, Dvir Samuel, Nir Darshan, Rami Ben-Ari

Comments Accepted to NeurIPS 2025

Journal ref The Thirty-nine Annual Conference on Neural Information Processing Systems, 2025

2503.03200 2026-02-11 cs.CV cs.RO

Transformer-Based Spatio-Temporal Association of Apple Fruitlets

Harry Freeman, George Kantor

Journal ref 2025 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Hangzhou, China, 2025, pp. 3018-3025

2502.11067 2026-02-11 cs.LG

A Survey on Active Feature Acquisition Strategies

Linus Aronsson, Arman Rahbar, Morteza Haghir Chehreghani

2502.08987 2026-02-11 cs.LG cs.AI

Neural Force Field: Few-shot Learning of Generalized Physical Reasoning

Shiqian Li, Ruihong Shen, Yaoyu Tao, Chi Zhang, Yixin Zhu

Comments 27 pages, ICLR 2026

2502.06747 2026-02-11 cs.CV

Wandering around: A bioinspired approach to visual attention through object motion sensitivity

Giulia D'Angelo, Victoria Clerico, Chiara Bartolozzi, Matej Hoffmann, P. Michael Furlong, Alexander Hadjiivanov

Journal ref Neuromorphic Computing and Engineering 5.2 (2025): 024019

DOI: 10.1088/2634-4386/addc90

Abstract

Active vision enables dynamic visual perception, offering an alternative to static feedforward architectures in computer vision, which rely on large datasets and high computational resources. Biological selective attention mechanisms allow agents to focus on salient Regions of Interest (ROIs), reducing computational demand while maintaining real-time responsiveness. Event-based cameras, inspired by the mammalian retina, enhance this capability by capturing asynchronous scene changes enabling efficient low-latency processing. To distinguish moving objects while the event-based camera is in motion the agent requires an object motion segmentation mechanism to accurately detect targets and center them in the visual field (fovea). Integrating event-based sensors with neuromorphic algorithms represents a paradigm shift, using Spiking Neural Networks to parallelize computation and adapt to dynamic environments. This work presents a Spiking Convolutional Neural Network bioinspired attention system for selective attention through object motion sensitivity. The system generates events via fixational eye movements using a Dynamic Vision Sensor integrated into the Speck neuromorphic hardware, mounted on a Pan-Tilt unit, to identify the ROI and saccade toward it. The system, characterized using ideal gratings and benchmarked against the Event Camera Motion Segmentation Dataset, reaches a mean IoU of 82.2% and a mean SSIM of 96% in multi-object motion segmentation. The detection of salient objects reaches 88.8% accuracy in office scenarios and 89.8% in low-light conditions on the Event-Assisted Low-Light Video Object Segmentation Dataset. A real-time demonstrator shows the system's 0.12 s response to dynamic scenes. Its learning-free design ensures robustness across perceptual scenes, making it a reliable foundation for real-time robotic applications serving as a basis for more complex architectures.
