2. Overview
Differential privacy (DP)
• Degrees of privacy protection [Dwork+06]
Gibbs posterior
• A generalization of the Bayesian posterior
Contribution
We proved (𝜀, 𝛿)-DP of the Gibbs posterior without boundedness
of the loss
3. Outline
1. Differential privacy
2. Differentially private learning
1. Background
2. Main result: Differential privacy of the Gibbs posterior [Minami+16]
3. Applications
1. Logistic regression
2. Posterior approximation method
6. Privacy constraint in ML & statistics
[Figure: users' data 𝐷 = {𝑋1, 𝑋2, …, 𝑋𝑛} is sent to a curator, who releases a statistic 𝜃]
In many applications of ML & statistics, the data 𝐷 = {𝑋1, …, 𝑋𝑛} contains users' personal information
Problem: Calculate a statistic of interest 𝜃 privately
TBD.
11. Adversarial formulation of privacy
Example: Mean of binary-valued query (Yes: 1, No: 0)
[Figure: the noisy mean of 𝐷 = {𝑋1, 𝑋2, …, 𝑋𝑛} is released; an adversary holding the adjacent dataset 𝐷′ = {𝑋1′, 𝑋2, …, 𝑋𝑛} as auxiliary information tries to infer 𝑋1]
• The noise is small at the scale of 𝜃, so adding it need not deteriorate the accuracy
• The same noise is large at the scale of a single 𝑋𝑖, so privacy is preserved
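The noise-addition scheme above can be made concrete with the Laplace mechanism: replacing one respondent's answer moves a binary mean by at most 1/𝑛, so Laplace noise of scale 1/(𝑛𝜀) gives (𝜀, 0)-DP. A minimal sketch in Python (the function name and inverse-CDF sampling are my own choices, not from the talk):

```python
import math
import random

def private_binary_mean(data, epsilon, rng=None):
    """Release the mean of 0/1 responses with (epsilon, 0)-DP.

    Changing one respondent's answer moves the mean by at most 1/n
    (the sensitivity), so Laplace noise of scale 1/(n * epsilon) suffices.
    """
    rng = rng or random.Random()
    n = len(data)
    true_mean = sum(data) / n
    # Sample Laplace(0, b) noise by inverse-CDF from a uniform draw.
    b = 1.0 / (n * epsilon)
    u = rng.random() - 0.5
    noise = -b * math.copysign(math.log(1 - 2 * abs(u)), u)
    return true_mean + noise
```

With 𝑛 = 1000 respondents and 𝜀 = 1, the noise scale is only 0.001, so the released mean stays accurate while any single answer is masked.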
13. Differential privacy
Idea:
Two “adjacent” datasets differing in a single individual
should be statistically indistinguishable
[Figure: 𝐷 = {𝑋1, 𝑋2, …, 𝑋𝑛} and 𝐷′ = {𝑋1′, 𝑋2, …, 𝑋𝑛} are close in the sense of a “statistical distance”]
14. Differential privacy
Def: Differential Privacy [Dwork+06]
• 𝜀 > 0, 𝛿 ∈ [0, 1): privacy parameters
• 𝜌𝐷 satisfies (𝜀, 𝛿)-differential privacy if,
1. for any adjacent datasets 𝐷, 𝐷′, and
2. for any set 𝐴 ⊂ Θ of outputs,
the following inequality holds:
𝜌𝐷(𝐴) ≤ 𝑒^𝜀 𝜌𝐷′(𝐴) + 𝛿
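The definition can be checked numerically for a simple mechanism. Binary randomized response (answer truthfully with probability 𝑒^𝜀 / (1 + 𝑒^𝜀)) satisfies the bound with 𝛿 = 0; the sketch below, with illustrative names of my own, verifies the probability ratio for every output:

```python
import math

def rr_output_prob(x, report, epsilon):
    """P(report | true answer x) for binary randomized response:
    tell the truth with probability e^eps / (1 + e^eps)."""
    p_truth = math.exp(epsilon) / (1 + math.exp(epsilon))
    return p_truth if report == x else 1 - p_truth

def check_dp(epsilon):
    # The DP inequality rho_D(A) <= e^eps * rho_D'(A) must hold for
    # every output set; for one bit, adjacency means flipping x.
    for report in (0, 1):
        ratio = rr_output_prob(0, report, epsilon) / rr_output_prob(1, report, epsilon)
        if ratio > math.exp(epsilon) + 1e-12:
            return False
    return True
```

The worst-case ratio equals 𝑒^𝜀 exactly, so randomized response is (𝜀, 0)-DP and the bound is tight.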
15. Interpretation of DP
• DP prevents identification with statistical significance
• e.g. the adversary cannot construct a test of power 𝛾 for
𝐻0: 𝑋𝑖 = 𝑋 vs. 𝐻1: 𝑋𝑖 ≠ 𝑋
at the 5% significance level
16. DP and statistical learning
Example: Linear classification
• Find an (𝜀, 𝛿)-DP distribution of hyperplanes
that minimizes the expected classification error:
inf over (𝜀, 𝛿)-DP 𝜌𝐷 of 𝔼𝜃∼𝜌𝐷 𝑅(𝜃)
17. Differentially private learning
Question: What kind of random estimators should we use?
1. Noise addition to a deterministic estimator
• e.g. maximum likelihood estimator + noise
2. Modification of the Bayesian posterior (this work)
18. Outline
1. Differential privacy
2. Differentially private learning
1. Background
2. Main result: Differential privacy of the Gibbs posterior [Minami+16]
3. Applications
1. Logistic regression
2. Posterior approximation method
20. Gibbs posterior
A natural data-dependent distribution in statistics & ML:
𝐺𝛽(𝜃 ∣ 𝐷) ∝ 𝜋(𝜃) exp(−𝛽 ∑ᵢ₌₁ⁿ ℓ(𝜃, 𝑥𝑖))
• ℓ(𝜃, 𝑥): loss function, 𝜋: prior distribution, 𝛽 > 0: inverse temperature
• Includes the Bayesian posterior as the special case
ℓ(𝜃, 𝑥) = − log 𝑝(𝑥 ∣ 𝜃), 𝛽 = 1
• Important in PAC-Bayes theory [Catoni07][Zhang06]
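The definition is easy to evaluate pointwise. A minimal sketch (function and variable names are illustrative): with the loss set to a negative log-likelihood and 𝛽 = 1, the unnormalized log-density below is exactly the Bayesian posterior, here for a Gaussian location model with an 𝑁(0, 1) prior.

```python
def gibbs_log_density(theta, data, loss, log_prior, beta):
    """Unnormalized log Gibbs posterior:
    log G_beta(theta | D) = -beta * sum_i loss(theta, x_i) + log pi(theta) + const.
    """
    return -beta * sum(loss(theta, x) for x in data) + log_prior(theta)

# Special case: loss = negative log-likelihood, beta = 1 recovers Bayes.
nll = lambda theta, x: 0.5 * (x - theta) ** 2   # -log N(x | theta, 1) + const
log_prior = lambda theta: -0.5 * theta ** 2     # N(0, 1) prior
```

For data {1, 2, 3} this posterior peaks at Σ𝑥ᵢ/(𝑛 + 1) = 1.5, matching the conjugate-Gaussian formula.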
23. Gibbs posterior
Problem
• As 𝛽 ↓ 0, 𝐺𝛽(𝜃 ∣ 𝐷) is flattened and gets close to the prior
• Is DP satisfied if we choose 𝛽 > 0 sufficiently small?
Answer: Yes, if…
• ℓ is bounded (previously known)
• 𝛻ℓ is bounded (this work)
25. The exponential mechanism
Theorem [MT07]
An algorithm that draws 𝜃 from a distribution with density proportional to
exp(−𝛽 ℒ(𝜃, 𝐷)) 𝜋(𝜃)
satisfies (𝜀, 0)-DP
• This is the Gibbs posterior if ℒ(𝜃, 𝐷) = ∑ᵢ₌₁ⁿ ℓ(𝜃, 𝑥𝑖)
• 𝛽 has to satisfy 𝛽 ≤ 𝜀 / (2Δℒ)
• Δℒ: the sensitivity, i.e. the worst-case change of ℒ(⋅, 𝐷) over adjacent datasets
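On a finite candidate set the exponential mechanism is a few lines of code. A sketch with 𝛽 = 𝜀/(2Δℒ) as in [MT07]; the sensitivity is passed in by hand and all names are illustrative:

```python
import math
import random

def exponential_mechanism(candidates, loss_fn, data, epsilon, sensitivity, rng):
    """Sample theta with probability proportional to exp(-beta * L(theta, D)),
    where beta = epsilon / (2 * sensitivity)."""
    beta = epsilon / (2 * sensitivity)
    scores = [-beta * sum(loss_fn(t, x) for x in data) for t in candidates]
    m = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - m) for s in scores]
    # Draw a candidate proportionally to its weight.
    r = rng.random() * sum(weights)
    for t, w in zip(candidates, weights):
        r -= w
        if r <= 0:
            return t
    return candidates[-1]
```

Low-loss candidates are exponentially more likely, yet every candidate keeps positive probability, which is what yields the privacy guarantee.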
29. Loss function that does not satisfy (𝜀, 0)-DP
• Logistic loss
ℓ(𝜃, (𝑧, 𝑦)) = log(1 + exp(−𝑦⟨𝜃, 𝑧⟩))
• The max difference of the losses (≈ 𝑀) grows toward +∞ as Diam Θ → ∞
[Figure: ℓ(𝜃, (𝑧, +1)) and ℓ(𝜃, (𝑧, −1)) plotted against 𝜃; the gap 𝑀 diverges to +∞]
We need differential privacy without sensitivity!
30. From bounded to Lipschitz
• In the example of the logistic loss, the first derivative is bounded
• The Lipschitz constant 𝐿 is not influenced by the size Diam Θ of the parameter space
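This boundedness is easy to verify numerically. A one-dimensional sketch (names are my own): the gradient of the logistic loss has magnitude |𝑧| · 𝜎(−𝑦𝜃𝑧) ≤ |𝑧| for every 𝜃, even as 𝜃 → ±∞ where the loss itself diverges.

```python
import math

def logistic_loss_grad(theta, z, y):
    """d/dtheta of log(1 + exp(-y * theta * z)) for scalar theta, z and y in {-1, +1}."""
    t = y * theta * z
    # sigma(-t) = 1 / (1 + e^t), computed without overflow for large |t|
    if t > 0:
        s = math.exp(-t) / (1.0 + math.exp(-t))
    else:
        s = 1.0 / (1.0 + math.exp(t))
    return -y * z * s

# |gradient| <= |z| for every theta: the Lipschitz constant does not
# grow with Diam(Theta), unlike the loss values themselves.
```

So the loss is |𝑧|-Lipschitz on the whole real line, which is exactly the property the main theorem exploits in place of bounded sensitivity.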
31. Main theorem
Theorem [Minami+16]
Assumptions:
1. For all 𝑥 ∈ 𝒳, ℓ(⋅, 𝑥) is 𝐿-Lipschitz and convex
2. The prior is log-strongly-concave, i.e. − log 𝜋(⋅) is 𝑚𝜋-strongly convex
3. Θ = ℝ𝑑
Then the Gibbs posterior 𝐺𝛽,𝐷 satisfies (𝜀, 𝛿)-DP if 𝛽 > 0 is chosen below a threshold (1) depending only on 𝜀, 𝛿, 𝐿, and 𝑚𝜋
Independent of the sensitivity!
32. Outline
1. Differential privacy
2. Differentially private learning
1. Background
2. Main result: Differential privacy of the Gibbs posterior [Minami+16]
3. Applications
1. Logistic regression
2. Posterior approximation method
34. Example: Logistic Loss
• Gaussian prior
𝜋(𝜃) = 𝑁(𝜃 ∣ 0, (𝑛𝜆)⁻¹ 𝐼)
• The Gibbs posterior is given by
𝐺𝛽(𝜃 ∣ 𝐷) ∝ exp(−𝛽 ∑ᵢ₌₁ⁿ log(1 + exp(−𝑦𝑖⟨𝜃, 𝑧𝑖⟩)) − (𝑛𝜆/2)‖𝜃‖²)
• 𝐺𝛽 satisfies (𝜀, 𝛿)-DP if 𝛽 is chosen as in (1) of the main theorem
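The log-density of this Gibbs posterior is straightforward to code. A one-dimensional sketch (function name and the `log1p`-based stabilization are my own choices), combining the logistic loss with the 𝑁(0, (𝑛𝜆)⁻¹) prior:

```python
import math

def gibbs_log_post(theta, data, beta, lam):
    """Unnormalized log Gibbs posterior for the logistic loss with a
    Gaussian prior N(0, (n * lam)^{-1}) (d = 1 for simplicity):
    -beta * sum_i log(1 + exp(-y_i * theta * z_i)) - (n * lam / 2) * theta**2
    """
    n = len(data)
    total_loss = 0.0
    for z, y in data:
        t = -y * theta * z
        # stable evaluation of log(1 + exp(t))
        total_loss += t + math.log1p(math.exp(-t)) if t > 0 else math.log1p(math.exp(t))
    return -beta * total_loss - 0.5 * n * lam * theta ** 2
```

At 𝜃 = 0 every logistic loss equals log 2 and the prior term vanishes, which gives a quick sanity check of the formula.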
35. Langevin Monte Carlo method
• In practice, sampling from the Gibbs posterior can be a
computationally hard problem
• Some approximate sampling methods are used
(e.g. MCMC, VB)
37. Langevin Monte Carlo method
• “Mixing-time” results have been derived for log-concave
distributions [Dalalyan14][Durmus & Moulines15]
• LMC can attain a 𝛾-approximation after a finite number 𝑇 of iterations
• Polynomial time in 𝑛 and 𝛾⁻¹:
𝑇 ∼ 𝑂((𝑛/𝛾²) log(𝑛/𝛾²))
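The Langevin update behind these results is only one line per iteration. A one-dimensional sketch of the unadjusted LMC (the step size, iteration count, and toy target are illustrative, not the tuned choices of the mixing-time analyses):

```python
import math
import random

def lmc_sample(grad_log_post, theta0, step, iters, rng):
    """Unadjusted Langevin Monte Carlo:
    theta <- theta + step * grad_log_post(theta) + sqrt(2 * step) * N(0, 1).
    Returns the final iterate as an approximate posterior sample."""
    theta = theta0
    scale = math.sqrt(2.0 * step)
    for _ in range(iters):
        theta = theta + step * grad_log_post(theta) + scale * rng.gauss(0.0, 1.0)
    return theta

# Toy target: a standard Gaussian, whose grad log-density is -theta.
samples = [lmc_sample(lambda t: -t, 0.0, 0.01, 500, random.Random(i)) for i in range(300)]
```

For a log-concave target like this one, the iterates approach the target distribution; the sample mean and variance land near 0 and 1 respectively.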
38. • I have a Privacy Preservation guarantee
• I have an Approximate Posterior
• (Ah…)
39. Privacy Preserving Approximate Posterior (PPAP)
• We can prove (𝜀, 𝛿′)-DP of the LMC approximation of the Gibbs posterior
Proposition [Minami+16]
• Assume that ℓ and 𝜋 satisfy the assumptions of the main theorem.
• Assume also that ℓ(⋅, 𝑥) is 𝑀-smooth for every 𝑥 ∈ 𝒳.
• Then, after 𝑂((𝑛/𝛾²) log(𝑛/𝛾²)) iterations, the output of the LMC satisfies
(𝜀, 𝛿 + (𝑒^𝜀 + 1)𝛾)-DP.
40. Summary
1. Differentially private learning
= Differential privacy + Statistical learning
2. We developed a new method to prove (𝜀, 𝛿)-DP
for Gibbs posteriors without “sensitivity”
• Applicable to Lipschitz & convex losses
• (+) Guarantee for an approximate sampling method
Thank you!
Editor's Notes
In practical data analysis or machine learning settings, the dataset, denoted by 𝐷, contains users' personal information.
So we want to protect users' data by DP.
I now introduce the formal definition of differential privacy for data-dependent distributions.
(Differential privacy defines the robustness of randomized statistics.)
𝜌𝐷 is a randomized statistic or, equivalently, a data-dependent probability measure on a certain parameter space.
We say that 𝜌𝐷 satisfies (𝜀, 𝛿)-DP if…
Here “adjacent” means “Hamming distance 1”.
The figure is an example of linear classification.
Here the dataset 𝐷 consists of binary-labeled points, and our classifier 𝜃 is a hyperplane.
In the differentially private manner, we release a random hyperplane instead of the usual deterministic one.
\inf_{\rho_D: \; (\varepsilon, \delta)\text{-DP}} \mathbb{E}_{\theta \sim \rho_D} R(\theta)
So our problem (in general) is stated like this.