Phase transitions for the noisy transformer model in arbitrary dimension
Kyunghoo Mun, Matthew Rosenzweig
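AI summary: Studies the McKean--Vlasov free energy of the unnormalized self-attention (USA) noisy transformer model on spheres of arbitrary dimension, proves a sharp dichotomy for global minimizers, and gives the critical conditions separating continuous and discontinuous phase transitions.
Details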
- Comments
- 18 pages
We study the McKean--Vlasov free energy on the unit sphere associated with the unnormalized self-attention (USA) model for noisy transformer dynamics. We prove a sharp global-minimizer dichotomy in every dimension $d\ge2$. There is a unique $\beta_*^{(d)}>0$ such that \begin{equation*} \frac{I_{d/2+1}(\beta_*^{(d)})}{I_{d/2}(\beta_*^{(d)})}=\frac1d, \end{equation*} where $I_\nu$ is the modified Bessel function of the first kind. For $0<\beta\le \beta_*^{(d)}$, the uniform density remains the unique global minimizer up to the linear-stability threshold \begin{equation*} K_\#^{(d)}(\beta)=\frac{\beta^{d/2}}{2^{d/2}\Gamma(d/2)I_{d/2}(\beta)}, \end{equation*} and the phase transition is continuous. For $\beta>\beta_*^{(d)}$, the uniform density is not globally minimizing at $K_\#^{(d)}(\beta)$, so the critical coupling satisfies $K_c<K_\#^{(d)}(\beta)$ and the transition is discontinuous. This result generalizes the authors' recent $d=2$ work arXiv:2604.16288 to arbitrary dimension. The proof uses the sharp Beckner--Onofri/logarithmic Hardy--Littlewood--Sobolev (HLS) inequality on the sphere, together with a Funk--Hecke/Bessel coefficient computation and a degree-two quartic obstruction.
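The two closed-form quantities in the abstract can be evaluated numerically. The following is a minimal sketch, not the authors' code: it assumes only the two displayed formulas, evaluates $I_\nu$ by its standard power series, and locates the root of the Bessel-ratio equation by bisection. The function names (`bessel_i`, `bessel_ratio`, `beta_star`, `k_sharp`) and the truncation and bracketing choices are hypothetical and chosen for illustration.

```python
import math


def bessel_i(nu, x, terms=80):
    """Modified Bessel function of the first kind I_nu(x), for x > 0 and nu >= 0.

    Uses the power series sum_k (x/2)^(2k+nu) / (k! * Gamma(k+nu+1)),
    truncated after `terms` terms (an illustrative choice, adequate for moderate x).
    """
    half = x / 2.0
    term = half ** nu / math.gamma(nu + 1.0)  # k = 0 term
    total = term
    for k in range(1, terms):
        # Ratio of consecutive series terms is (x/2)^2 / (k * (k + nu)).
        term *= half * half / (k * (k + nu))
        total += term
    return total


def bessel_ratio(d, beta):
    """Left-hand side of the threshold equation: I_{d/2+1}(beta) / I_{d/2}(beta)."""
    nu = d / 2.0
    return bessel_i(nu + 1.0, beta) / bessel_i(nu, beta)


def beta_star(d, iterations=100):
    """Solve I_{d/2+1}(beta) / I_{d/2}(beta) = 1/d by bisection (assumes d >= 2)."""
    target = 1.0 / d
    lo, hi = 1e-3, 1.0
    # Expand the bracket until the ratio exceeds the target; the ratio tends to 1.
    while bessel_ratio(d, hi) <= target:
        hi *= 2.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if bessel_ratio(d, mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def k_sharp(d, beta):
    """Linear-stability threshold K_#^{(d)}(beta) as displayed in the abstract."""
    nu = d / 2.0
    return beta ** nu / (2.0 ** nu * math.gamma(nu) * bessel_i(nu, beta))


if __name__ == "__main__":
    for d in (2, 3, 4, 8):
        b = beta_star(d)
        print(f"d={d}: beta_* ~ {b:.6f}, K_#(beta_*) ~ {k_sharp(d, b):.6f}")
```

Two sanity checks follow directly from the formulas rather than from the paper: the recurrence $I_\nu - I_{\nu+2} = \tfrac{2(\nu+1)}{x} I_{\nu+1}$ gives $I_{\nu+1}(x)/I_\nu(x) < x/(2\nu+2)$, so the computed root should exceed $(d+2)/d$, and the leading series term gives $K_\#^{(d)}(\beta)\to d/2$ as $\beta\to0$.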