A Simple State Space Model Excels at Multivariate Time Series Classification
Hassan Saadatmand, Geoffrey I. Webb, Hamid Rezatofighi, Mahsa Salehi
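AI summary: This paper systematically studies diagonal state space models (S4D) and input-dependent state space models (the Mamba family) on large-scale time series classification tasks. It finds that S4D outperforms the Mamba variants in both accuracy and efficiency, and it proposes two lightweight modifications, MS4 and MS4N, which match or surpass deep learning models with 2x to 10x more parameters on multiple benchmarks.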
Details
Structured state space models (SSMs) have recently emerged as a promising foundation for sequence modeling, with Mamba-based architectures demonstrating strong performance through input-dependent state transitions, albeit at considerable complexity. However, their application to time-series classification (TSC) has been largely limited to Mamba-style architectures, leaving the broader SSM design space underexplored. We present the first systematic study spanning diagonal SSMs (S4D) and input-dependent SSMs (Mamba family) on large-scale TSC benchmarks, asking whether such complexity is necessary for top performance. Our results reveal a surprising finding: S4D consistently outperforms Mamba-based variants in both accuracy and efficiency, challenging the assumption that increased complexity translates to meaningful gains in TSC. Building on this, we introduce MS4, lightweight modifications to S4D via a linear input projection and channel-mixing mechanism, and MS4N, a normalized variant that stabilizes state dynamics with negligible overhead. Evaluated on 59 datasets across MONSTER (up to 60 million samples, 50K timesteps, 82 classes) and the UEA benchmark, against 15 baselines, MS4 and MS4N consistently outperform Mamba-based models while remaining more efficient, and MS4N matches or surpasses competing deep learning models that are roughly 2x and 10x larger in parameters. These results position lightweight structured SSMs as a compelling alternative to scaling complexity for TSC.
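For readers unfamiliar with diagonal SSMs, the following is a minimal sketch of the kind of layer S4D is built on. It is not the authors' implementation. It assumes a single input channel, zero-order-hold discretization, and an S4D-Lin style initialization of the diagonal state matrix; all function names and the toy values of `N`, `B`, `C`, `dt`, and `u` are hypothetical. The linear input projection, channel mixing, and normalization that the abstract attributes to MS4 and MS4N are not specified here and so are not sketched. The example shows that, because the state matrix is diagonal and does not depend on the input, the step-by-step recurrence and a single causal convolution with a precomputed kernel give the same output.

```python
import cmath
import math


def discretize(A, B, dt):
    """Zero-order-hold discretization; elementwise because A is diagonal."""
    A_bar = [cmath.exp(dt * a) for a in A]
    B_bar = [(ab - 1.0) / a * b for ab, a, b in zip(A_bar, A, B)]
    return A_bar, B_bar


def ssm_scan(u, A, B, C, dt):
    """Recurrent form: x_k = A_bar * x_{k-1} + B_bar * u_k, y_k = Re(sum_n C_n * x_k[n])."""
    A_bar, B_bar = discretize(A, B, dt)
    x = [0j] * len(A)  # one complex state per diagonal entry
    y = []
    for u_k in u:
        x = [ab * xn + bb * u_k for ab, bb, xn in zip(A_bar, B_bar, x)]
        y.append(sum(c * xn for c, xn in zip(C, x)).real)
    return y


def ssm_kernel(A, B, C, dt, length):
    """Convolutional form: K_l = Re(sum_n C_n * A_bar_n**l * B_bar_n)."""
    A_bar, B_bar = discretize(A, B, dt)
    return [
        sum(c * ab**l * bb for c, ab, bb in zip(C, A_bar, B_bar)).real
        for l in range(length)
    ]


def causal_conv(u, K):
    """Direct O(L^2) causal convolution; real implementations typically use an FFT."""
    return [sum(K[k - j] * u[j] for j in range(k + 1)) for k in range(len(u))]


# Hypothetical toy setup (assumed values, chosen only for illustration).
N = 4                                                 # state size
A = [complex(-0.5, math.pi * n) for n in range(N)]    # S4D-Lin style diagonal
B = [1.0 + 0j] * N
C = [complex(1.0 / (n + 1), 0.5) for n in range(N)]
u = [math.sin(0.3 * k) for k in range(32)]            # one real-valued input channel

y_scan = ssm_scan(u, A, B, C, dt=0.1)
y_conv = causal_conv(u, ssm_kernel(A, B, C, 0.1, len(u)))
```

In this sketch the kernel depends only on `A`, `B`, `C`, and `dt`, so it can be computed once and reused for every input. Input-dependent models such as Mamba make the transition a function of the input, which rules out a fixed kernel; this is one way to read the added complexity that the abstract refers to.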