We describe our approach to method selection.
We aim to build a comprehensive benchmark covering the milestone PPDS approaches from papers accepted at top ML & security conferences.
We first filtered the approaches by publication venue (top ML & security conferences) and by their references. In this step, we excluded some recent works that are covered in our review but have not yet been accepted at any venue (e.g., DPDM [1] and DP-MEPF [2]), while retaining unpublished works that are highly cited (e.g., DPGAN [3]).
We then excluded the approaches that are known to have issues in their privacy analysis. Concretely:
Next, we searched for released code and excluded the approaches without an available code repository on GitHub (e.g., PEARL [7]).
We then made an initial attempt to run each remaining approach following its README, using its reported hyper-parameters on its evaluated settings. We excluded the approaches that failed to run or showed significant discrepancies from the reported results in our reproduction, so as to avoid unfairly representing those methods. Concretely,
In the end, we obtained a list of six approaches: three (DPGAN [3], DPCGAN [13], and GS-WGAN [9]) under minimax training (Section 4.5), one (DP-MERF [14]) under feature alignment (Section 4.2), one (DPGEN [15]) under stochastic simulation (Section 4.3), and one (DP-Sinkhorn [16]) under optimal transport (Section 4.4).
Figure 1. Synthesized images by G-PATE (left) and DataLens (right) when trained on MNIST under $\varepsilon=10$.
We next describe our procedure for hyper-parameter tuning and selection.
Hyper-parameter tuning: We tune the batch size, noise multiplier, gradient clipping norm, and other algorithm-specific hyper-parameters. For each combination of approach and scenario, we tune ~10 sets of hyper-parameters.
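As a rough illustration of the scale of this search, the snippet below sketches how a tuning grid for a DP-SGD-style approach could be enumerated. The parameter names and values are hypothetical placeholders; the actual grids per approach and scenario are listed in Evaluation Results.

```python
from itertools import product

# Hypothetical tuning grid for one DP-SGD-style approach; the values below
# are placeholders, not the grids we actually used.
grid = {
    "batch_size": [64, 128],
    "noise_multiplier": [0.7, 1.0, 1.4],
    "clip_norm": [0.1, 1.0],
}

# Enumerate the candidate hyper-parameter sets (here 2 * 3 * 2 = 12, i.e. ~10).
candidates = [dict(zip(grid, values)) for values in product(*grid.values())]
print(len(candidates), "candidate hyper-parameter sets")
```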
Hyper-parameter selection: The main difficulty of hyper-parameter selection is that there is no single evaluation criterion for synthesized data. In particular, we evaluate synthesized data against 13 classifiers (a CNN, an MLP, and 11 other scikit-learn classifiers). We describe our evaluation and hyper-parameter selection procedure as follows (a minimal code sketch of steps 1 and 2 follows the list):
1. We first measured the accuracy of the synthesized data with three “good” classifiers (MLP, LDA, and logistic regression), whose performance is among the best of the 13 classifiers. For these three classifiers, we used the default hyper-parameters (see the next section) without further tuning. After training, we took the average accuracy of the three classifiers.
2. For each combination of approach and scenario, we selected the top 3-5 sets of hyper-parameters based on this metric (i.e., the average accuracy over the three classifiers).
3. For each of these 3-5 sets of hyper-parameters, we then trained CNNs on the corresponding synthesized data. For CNN training, we tuned the learning rate and kept the best accuracy over all CNN learning rates.
4. We selected the best set of hyper-parameters based on the performance in step 1 and step 3, and evaluated it on the remaining 9 classifiers. Note that we repeated data generation and classifier training 5 times and report the average as our final result.
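The sketch below illustrates steps 1 and 2 under some simplifying assumptions: X_syn/y_syn denote one candidate's synthesized data, X_test/y_test the real test set, and the three "good" classifiers are used with scikit-learn defaults as described in the next section. It is a minimal outline, not our exact evaluation code.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier


def average_accuracy(X_syn, y_syn, X_test, y_test):
    """Step 1: average test accuracy of the three 'good' classifiers,
    each trained on the synthesized data with default hyper-parameters."""
    classifiers = [
        MLPClassifier(),
        LinearDiscriminantAnalysis(),
        LogisticRegression(),
    ]
    accs = [clf.fit(X_syn, y_syn).score(X_test, y_test) for clf in classifiers]
    return float(np.mean(accs))


def select_top_k(candidates, X_test, y_test, k=5):
    """Step 2: keep the top-k hyper-parameter sets by average accuracy.
    `candidates` is a list of (hparams, (X_syn, y_syn)) pairs."""
    scored = sorted(
        candidates,
        key=lambda c: average_accuracy(*c[1], X_test, y_test),
        reverse=True,
    )
    return scored[:k]
```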
We report the sets of hyper-parameters we experimented with and the best set of hyper-parameters we selected in Evaluation Results.
We describe the configurations and hyper-parameters of the 13 classifiers used for evaluation.
CNN: We used the evaluation code eval_cnn.py in the repository of [9]. The CNN architecture is LeNet [17]. We tuned the learning rate of CNN training over 0.01, 0.02, 0.05, and 0.1.
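As a schematic of this learning-rate sweep (not the actual eval_cnn.py), the helper below trains one LeNet per learning rate on the synthesized data and keeps the best test accuracy; train_lenet is a hypothetical callable standing in for the training routine in [9]'s evaluation code.

```python
# Learning rates swept for the CNN classifier.
LEARNING_RATES = [0.01, 0.02, 0.05, 0.1]


def best_cnn_accuracy(train_lenet, X_syn, y_syn, X_test, y_test):
    # `train_lenet` (hypothetical) trains LeNet [17] on the synthesized data
    # at a given learning rate and returns the test accuracy; we keep the
    # best accuracy over the sweep, as described in step 3 above.
    accuracies = [
        train_lenet(X_syn, y_syn, X_test, y_test, lr=lr)
        for lr in LEARNING_RATES
    ]
    return max(accuracies)
```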
MLP: We used the scikit-learn MLP classifier (neural_network.MLPClassifier) with the default hyper-parameters.
logistic_reg, random_forest, gaussian_nb, bernoulli_nb, linear_svc, decision_tree, lda, adaboost, bagging, gbm, xgboost: We used the scikit-learn implementations of these classifiers (and the xgboost package for xgboost). We followed the implementation of synth_data_benchmark.py in the repository of DP-MERF [14] and used the same hyper-parameters for these classifiers.
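For concreteness, the mapping below shows which estimator backs each classifier name. The default constructors are placeholders only; the actual hyper-parameters follow synth_data_benchmark.py in the DP-MERF repository [14].

```python
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import (AdaBoostClassifier, BaggingClassifier,
                              GradientBoostingClassifier, RandomForestClassifier)
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import BernoulliNB, GaussianNB
from sklearn.svm import LinearSVC
from sklearn.tree import DecisionTreeClassifier
from xgboost import XGBClassifier

# Classifier name -> estimator; constructors shown with defaults as
# placeholders for the hyper-parameters taken from synth_data_benchmark.py [14].
CLASSIFIERS = {
    "logistic_reg": LogisticRegression(),
    "random_forest": RandomForestClassifier(),
    "gaussian_nb": GaussianNB(),
    "bernoulli_nb": BernoulliNB(),
    "linear_svc": LinearSVC(),
    "decision_tree": DecisionTreeClassifier(),
    "lda": LinearDiscriminantAnalysis(),
    "adaboost": AdaBoostClassifier(),
    "bagging": BaggingClassifier(),
    "gbm": GradientBoostingClassifier(),
    "xgboost": XGBClassifier(),
}
```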
[1] Dockhorn, Tim, et al. “Differentially private diffusion models.” arXiv preprint arXiv:2210.09929 (2022).
[2] Harder, Frederik, et al. “Differentially private data generation needs better features.” arXiv preprint arXiv:2205.12900 (2022).
[3] Xie, Liyang, et al. “Differentially private generative adversarial network.” arXiv preprint arXiv:1802.06739 (2018).
[4] Takagi, Shun, et al. “P3GM: Private high-dimensional data release via privacy preserving phased generative model.” 2021 IEEE 37th International Conference on Data Engineering (ICDE). IEEE, 2021.
[5] Sarwate, A. “Retraction for symmetric matrix perturbation for differentially-private principal component analysis.” (2017).
[6] Cao, Tianshi, et al. “Don’t generate me: Training differentially private generative models with sinkhorn divergence.” Advances in Neural Information Processing Systems 34 (2021): 12480-12492.
[7] Liew, Seng Pei, Tsubasa Takahashi, and Michihiko Ueno. “PEARL: Data Synthesis via Private Embeddings and Adversarial Reconstruction Learning.” International Conference on Learning Representations. 2022.
[8] Jordon, James, Jinsung Yoon, and Mihaela Van Der Schaar. “PATE-GAN: Generating synthetic data with differential privacy guarantees.” International conference on learning representations. 2019.
[9] Chen, Dingfan, Tribhuvanesh Orekondy, and Mario Fritz. “GS-WGAN: A gradient-sanitized approach for learning differentially private generators.” Advances in Neural Information Processing Systems 33 (2020): 12673-12684.
[10] Long, Yunhui, et al. “G-PATE: scalable differentially private data generator via private aggregation of teacher discriminators.” Advances in Neural Information Processing Systems 34 (2021): 2965-2977.
[11] Wang, Boxin, et al. “DataLens: Scalable privacy preserving training via gradient compression and aggregation.” Proceedings of the 2021 ACM SIGSAC Conference on Computer and Communications Security. 2021.
[12] NVIDIA. “Tensorflow: An Open Source Machine Learning Framework for Everyone.” https://github.com/NVIDIA/tensorflow.
[13] Torkzadehmahani, Reihaneh, Peter Kairouz, and Benedict Paten. “DP-CGAN: Differentially private synthetic data and label generation.” Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops. 2019.
[14] Harder, Frederik, Kamil Adamczewski, and Mijung Park. “DP-MERF: Differentially private mean embeddings with random features for practical privacy-preserving data generation.” International Conference on Artificial Intelligence and Statistics. PMLR, 2021.
[15] Chen, Jia-Wei, et al. “DPGEN: Differentially Private Generative Energy-Guided Network for Natural Image Synthesis.” Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2022.
[16] Cao, Tianshi, et al. “Don’t generate me: Training differentially private generative models with sinkhorn divergence.” Advances in Neural Information Processing Systems 34 (2021): 12480-12492.
[17] LeCun, Yann, et al. “Gradient-based learning applied to document recognition.” Proceedings of the IEEE 86.11 (1998): 2278-2324.