Near-Optimal Generalized Private Testing
Anamay Chaturvedi, Monika Henzinger, Jalaj Upadhyay
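AI summary This paper proposes a generalized private testing mechanism that addresses how to efficiently select a mechanism satisfying a threshold condition while preserving differential privacy, and proves that it is near-optimal in both accuracy and sample complexity.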
Comments 67 pages, 3 tables
Details
In differential privacy (DP), the generalized private testing problem was introduced by Liu and Talwar (STOC 2019). Given a dataset $X \in \mathcal{X}$ and a sequence of black-box $\varepsilon_t$-DP mechanisms $M_t:\mathcal{X}\to\{+1,-1\}$, the analyst must accept the first mechanism whose success probability $p_t=\Pr[M_t(X)=+1]$ exceeds a given threshold $p^*\in(0,1)$, while preserving DP. Accuracy is measured by the gap between $p^*$ and a rejection threshold $\bar{p}$: with probability $1-\beta$, for all $t\geq 1$, $M_t$ is rejected if $p_t\leq\bar{p}$ and accepted if $p_t\geq p^*$. This generalizes the standard private testing problem, whose solution, the Sparse Vector Technique, is ubiquitous in DP. We introduce the Generalized Thresholding Mechanism (GTM) for generalized private testing. For $\varepsilon>0$ and any sequence of $(\varepsilon_t,\delta_t)$-DP mechanisms $M_t$, the GTM is pure $\varepsilon$-DP. For $\theta>0$, $\gamma\in(1,2]$, and $\beta\in(0,1)$, the rejection threshold is $\bar{p}_t=\max\left(p^*/(\gamma\Lambda_t),\, 1-\gamma\Lambda_t(1-p^*)\right)-\delta_t/\varepsilon_t$, where $\Lambda_t=(5t\ln^3(t+2))^{(2+\theta)\varepsilon_t/\varepsilon}\,(4/\beta)^{(3+\theta+2/\theta)\varepsilon_t/\varepsilon}$. With probability $1-\beta$, the number of evaluations of $M_t$ is at most $O\left((\ln(t/\beta)/(\gamma-1)^2)\max(\Lambda_t/p^*,(1-p^*)^{-1})\right)$ for all $t\geq 1$. Our lower bounds prove near-optimality of our accuracy and sample complexity guarantees. Via the GTM, we give a black-box reduction for DP optimization from the continual observation (CO) setting to the batch setting. This gives us the first DP-CO algorithms for many maximization problems. Further, the GTM permits an adaptive choice of acceptance thresholds $(p^*_t)_{t\geq1}$, addressing a challenge mentioned in prior work on using generalized private testing for hyperparameter optimization (Papernot and Steinke, ICLR 2022).
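To get a feel for the quantities in the abstract, the following is a minimal numerical sketch that evaluates $\Lambda_t$, the rejection threshold $\bar{p}_t$, and the expression inside the $O(\cdot)$ evaluation bound. It is not the GTM itself, and it makes several assumptions: the function names `lambda_t`, `rejection_threshold`, and `evaluation_bound_shape` are hypothetical, the first term of the maximum is read as $p^*/(\gamma\Lambda_t)$, and the constant hidden by the $O(\cdot)$ notation is simply ignored.

```python
import math


def lambda_t(t, eps_t, eps, theta, beta):
    """Lambda_t = (5 t ln^3(t+2))^{(2+theta) eps_t/eps} * (4/beta)^{(3+theta+2/theta) eps_t/eps}."""
    ratio = eps_t / eps
    first = (5 * t * math.log(t + 2) ** 3) ** ((2 + theta) * ratio)
    second = (4 / beta) ** ((3 + theta + 2 / theta) * ratio)
    return first * second


def rejection_threshold(p_star, gamma, lam, eps_t, delta_t=0.0):
    """p_bar_t = max(p*/(gamma*Lambda_t), 1 - gamma*Lambda_t*(1 - p*)) - delta_t/eps_t.

    Assumption: the first term is p*/(gamma*Lambda_t), not (p*/gamma)*Lambda_t.
    """
    return max(p_star / (gamma * lam), 1 - gamma * lam * (1 - p_star)) - delta_t / eps_t


def evaluation_bound_shape(t, p_star, gamma, lam, beta):
    """Expression inside the O(.) bound; the hidden constant is NOT included."""
    return (math.log(t / beta) / (gamma - 1) ** 2) * max(lam / p_star, 1 / (1 - p_star))


if __name__ == "__main__":
    # Illustrative parameter values only; they are not taken from the paper.
    eps, eps_t, theta, beta, gamma, p_star = 1.0, 0.01, 1.0, 0.1, 1.5, 0.9
    for t in (1, 10, 100):
        lam = lambda_t(t, eps_t, eps, theta, beta)
        p_bar = rejection_threshold(p_star, gamma, lam, eps_t)
        shape = evaluation_bound_shape(t, p_star, gamma, lam, beta)
        print(f"t={t:>3}  Lambda_t={lam:.4f}  p_bar_t={p_bar:.4f}  bound shape={shape:.1f}")
```

Under this reading, $\Lambda_t$ tends to 1 as $\varepsilon_t/\varepsilon$ tends to 0, so the gap between $p^*$ and $\bar{p}_t$ is governed mainly by $\gamma$ when each mechanism uses only a small fraction of the overall privacy budget.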