SoK: Privacy-Preserving Data Synthesis

Issues with Existing Repositories

A Common Issue

We idendify a common flaw in the experimental evaluation in several works—when comparing with the baselines, they would directly copy the results from the prior works instead of re-running their code. What’s worse, errors can happen during the copy-paste. For example, the table 3 of DPGEN copied results from Table 1 of DataLens, which copied results from Table 1(c) of G-PATE. The result for CelebA-Gender under $\varepsilon=1$ by DataLens was 0.7058 in DataLens paper but changed to 0.6996 in DPGEN paper.

We strongly advocate re-running the code from the prior works to one’s best ability for academic rigor. Whether one can reproduce the results that are similar to those reported in the prior works, the practice of trying to reproducing the results can further consolidate the community’s understanding of the prior works, and the findings will invariably be of great significance.

DPGEN

Paper url: https://openaccess.thecvf.com/content/CVPR2022/html/Chen_DPGEN_Differentially_Private_Generative_Energy-Guided_Network_for_Natural_Image_Synthesis_CVPR_2022_paper.html

Repository url: https://github.com/chiamuyu/DPGEN

Missing documentation and unprofessional repository: The repository contains a README.md file with no actual content in it. The code is provided in DPGEN-SIMPLE.zip which is not very professional.
Un-runnable code: Extracting the content from DPGEN-SIMPLE.zip gives code that is not runnable. There’s a NCSNRunner in line 161 of main.py, but there’s no NCSNRunner class anywhere in the code. Changing NCSNRunner to Runner can fix the issue. In addition, the Runner class does not contain a test function which was required in main.py.
Incorrect configuration: The authors set random_flip: true for MNIST in ./configs/mnist.yml, which does not make sense. We never want our generator to produce digits that are flipped.
Missing configurations for $\varepsilon=1$ and $\varepsilon=0.2$: The paper reported results for MNIST and Fashion-MNIST on $\varepsilon=1$ and $\varepsilon=0.2$ in Tables 3 and 4, but neither the paper nor the repository provided the hyper-parameters for these two settings. Similarly, the paper reported results on two other datasets CelebA and LSUN-bedroom, but the associated code and configurations are missing in the repository.
Severe limitation of the approach: DPGEN can only generate images but not labels. The authors didn’t mention this in their paper. From our email communication with them, we learned that the way they produce labels is through training a classifier to produce labels. This was not clarified in the paper as well.
Potential privacy leakage: Training a classifier on the private dataset to label the generated data exhibits privacy concerns. We inquired the authors about the issue, and the authors confirmed the issue and further clarified that “To address this, we suggest training the classifier on a dataset different from the one used to train DPGEN. For instance, some studies split their dataset into public and private parts, and only the private dataset needs to be trained with DPGEN.” However, upon examining their code, we found that train_fashion_mnist_cls.ipynb actually trains a classifier on the private dataset, meaning that the results in the paper may not strictly respect the reported $\varepsilon$ values.
Failure to offer due credit: In fact, the approach for data generation comes from the paper Improved Techniques for Training Score-Based Generative Models and the repository is adapted upon theirs. But the authors didn’t even cite this work in their paper; nor did they cite the repository in their repository.

PATE-GAN

Paper url: https://openreview.net/forum?id=S1zk9iRqF7

Repository url: https://github.com/vanderschaarlab/mlforhealthlabpub/tree/main/alg/pategan

In pate_gan.py
- Variable value error: Line 190 goes as for _ in range(k):, yet the code needs to extract the corresponding segment in data by index, i.e., temp_x = x_partition[i] in line 195. The loop variable i here does not get updated; its value is inherited from the loop from line 106-109.
- Exceeding the privacy budget: Line 185 goes as while epsilon_hat < epsilon, which means that the accumulated privacy cost epsilon_hat has already exceeded the budget epsilon at the time of breaking out of the loop, presenting a privacy violation.
In main_pategan_experiment.py
- Variable type error: Line 171-175 specifies the argument “the number of teachers” $k$. Its type should be int instead of float as written in line 175.
Inapplicability to image datasets: PATE-GAN was proposed and evaluated on tabular data only in their paper. Although G-PATE evaluated PATE-GAN and reported non-trivial results, we had similar findings as what GS-WGAN reported in their Appendix C.4 — “the generated samples are classiﬁed as fake by all teacher discriminators and the learning signals (gradients) for student discriminator and the generator vanish.”

DP-Sinkhorn

Paper url: https://openreview.net/forum?id=waWmZSw0mn

Repository url: https://github.com/nv-tlabs/DP-Sinkhorn_code

Variable error: In line 163 of sinkhorn_poisson.py, the global_step is always 1, and therefore the calculated eps never grows, and line 165 will always be false. The variable global_step should be replaced by args.global_step.
Inconsistent hyper-parameter configurations: The best hyper-parameters reported in the paper are inconsistent with the commands in the repository.
Mistakes in statistics: In Appendix D.4, the authors reported that MNIST takes 160,000 training iterations, i.e., 130 epochs (1200 iters/epoch). But in our experiments, MNIST would take more than 3000 epochs.