SoK: Privacy-Preserving Data Synthesis

Issues with Existing Repositories

A Common Issue

We identify a common flaw in the experimental evaluation of several works: when comparing against baselines, they copy results directly from prior works instead of re-running the code. Worse, errors can creep in during the copy-paste. For example, Table 3 of DPGEN copies results from Table 1 of DataLens, which in turn copies results from Table 1(c) of G-PATE. The result for DataLens on CelebA-Gender under $\varepsilon=1$ is 0.7058 in the DataLens paper but appears as 0.6996 in the DPGEN paper.
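
A transcription check of this kind is easy to automate. The following is a minimal sketch, not part of any of the repositories listed here: the helper name `find_mismatches` and the tolerance are assumptions chosen for illustration, and the only data used are the two values quoted above.

```python
from math import isclose

# Values quoted in the text for DataLens on CelebA-Gender at epsilon = 1.
# The first entry is treated as the reference (the original source).
reported = {
    "DataLens paper": 0.7058,
    "DPGEN paper": 0.6996,
}


def find_mismatches(values, tol=1e-4):
    """Return (source, value) pairs that differ from the first entry by more than tol.

    `find_mismatches` and `tol` are hypothetical names used only for this sketch.
    """
    items = list(values.items())
    _, reference = items[0]
    return [
        (name, value)
        for name, value in items[1:]
        if not isclose(value, reference, abs_tol=tol)
    ]


mismatches = find_mismatches(reported)
print(mismatches)
```

Such a check only catches transcription errors; it does not replace re-running the baseline code.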

We strongly advocate re-running the code from prior works to the best of one's ability, for the sake of academic rigor. Whether or not the reported results can be reproduced, the attempt itself consolidates the community's understanding of the prior works, and the findings are valuable either way.

DPGEN

Paper url: https://openaccess.thecvf.com/content/CVPR2022/html/Chen_DPGEN_Differentially_Private_Generative_Energy-Guided_Network_for_Natural_Image_Synthesis_CVPR_2022_paper.html

Repository url: https://github.com/chiamuyu/DPGEN

PATE-GAN

Paper url: https://openreview.net/forum?id=S1zk9iRqF7

Repository url: https://github.com/vanderschaarlab/mlforhealthlabpub/tree/main/alg/pategan

DP-Sinkhorn

Paper url: https://openreview.net/forum?id=waWmZSw0mn

Repository url: https://github.com/nv-tlabs/DP-Sinkhorn_code