New Individualized PATE Versions Support the Training of Machine Learning Models with Individualized Privacy Guarantees
Differential privacy is a technique for protecting the privacy of individuals when their data, such as personal information or medical records, is used for research or analysis. Machine learning models trained on sensitive data can compromise individual privacy, so researchers have proposed methods to train these models while providing privacy guarantees.
PATE (Private Aggregation of Teacher Ensembles) is a differentially private training method: an ensemble of teacher models is trained on partitions of the private data, and their aggregated predictions are used to train a student model, so the student learns from the private data without directly accessing it. Traditional PATE provides a single global privacy guarantee for the entire dataset but does not ensure that the privacy of each individual in the dataset is protected. This is particularly important when the dataset contains sensitive information about individuals, such as medical or financial data. Recently, a new paper entitled “Individualized PATE: Differentially Private Machine Learning with Individual Privacy Guarantees” was published, presenting a method for training machine learning models on sensitive data that guarantees differential privacy for each individual in the dataset. It extends the PATE method from a single global guarantee to per-individual guarantees.
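At the heart of PATE is a noisy aggregation of the teachers’ label votes. The sketch below is an illustrative, minimal version of that mechanism (the function name, noise scale, and use of Laplace noise here are assumptions for the example, not the paper’s code):

```python
import numpy as np

def noisy_max_aggregate(teacher_votes, num_classes, noise_scale=1.0, rng=None):
    """Aggregate teacher label votes with Laplace noise (PATE-style noisy max).

    teacher_votes: 1-D array of class labels, one vote per teacher.
    noise_scale:   scale of the Laplace noise added to each vote count;
                   larger values give stronger privacy but noisier labels.
    """
    rng = np.random.default_rng() if rng is None else rng
    counts = np.bincount(teacher_votes, minlength=num_classes).astype(float)
    counts += rng.laplace(scale=noise_scale, size=num_classes)  # randomize counts
    return int(np.argmax(counts))  # noisy plurality label
```

Because any single record can change at most one teacher’s vote, the vote counts have low sensitivity, which is what makes the added noise sufficient for a differential privacy guarantee.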
The proposed Individualized PATE method trains multiple teachers on different subsets of the data and then aggregates the teachers’ predictions to train a final student model. It relies on differential privacy to ensure that the private data is not compromised, and it requires a secure multi-party computation (MPC) protocol to aggregate the teachers’ predictions.
Concretely, the authors propose to divide the sensitive data into multiple disjoint subsets and to train one teacher on each subset. Each teacher sees only its own subset, so any individual’s record influences at most one teacher. The student model, in contrast, never accesses the private data directly: once the teachers are trained, they make predictions on a separate unlabeled set, and these predictions are aggregated, with calibrated noise, using a secure multi-party computation (MPC) protocol. The MPC protocol ensures that the votes are combined in a way that reveals only the noisy aggregate, preserving the privacy of the individuals in the dataset. The student model is then trained on these aggregated labels, so it learns from the private data without compromising the privacy of any individual.
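The first step above, splitting the sensitive data into disjoint shards, can be sketched as follows (a minimal illustration; the function name and shuffling strategy are assumptions, and the paper’s individualized accounting would additionally control how often each record appears):

```python
import numpy as np

def partition_data(X, y, num_teachers, rng=None):
    """Split a sensitive dataset into disjoint shards, one per teacher.

    Disjointness matters for privacy accounting: each individual's record
    lands in exactly one shard, so changing one record can change at most
    one teacher's vote during aggregation.
    """
    rng = np.random.default_rng() if rng is None else rng
    idx = rng.permutation(len(X))  # shuffle before sharding
    return [(X[s], y[s]) for s in np.array_split(idx, num_teachers)]
```

Each returned `(X_shard, y_shard)` pair is then used to train one teacher model independently of the others.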
An experimental study demonstrates the effectiveness of the proposed method. The experiments cover both synthetic and real-world datasets, using differentially private versions of well-known models such as logistic regression and neural networks as teachers. The results show that the method achieves accurate predictions while providing individual privacy guarantees. In addition, the study shows that this new approach offers stronger privacy guarantees than traditional PATE methods, since it protects the privacy of each individual in the dataset regardless of the presence of other individuals.
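Using simple teachers such as logistic regression, the pipeline described above can be sketched end to end. This is an illustrative stand-in under stated assumptions (the function name and parameters are invented for the example, plain noisy aggregation replaces the secure MPC step, and no individualized privacy accounting is performed):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def pate_label_queries(X, y, X_query, num_teachers=5, noise_scale=1.0, seed=0):
    """Train teachers on disjoint shards of the private data, then label
    student queries with noisy vote aggregation (PATE-style sketch)."""
    rng = np.random.default_rng(seed)
    num_classes = len(np.unique(y))

    # Partition the private data into disjoint shards, one per teacher.
    idx = rng.permutation(len(X))
    teachers = [
        LogisticRegression(max_iter=1000).fit(X[shard], y[shard])
        for shard in np.array_split(idx, num_teachers)
    ]

    # Label each query by a noisy plurality vote over the teachers.
    labels = []
    for xq in X_query:
        votes = np.array([int(t.predict(xq[None, :])[0]) for t in teachers])
        counts = np.bincount(votes, minlength=num_classes).astype(float)
        counts += rng.laplace(scale=noise_scale, size=num_classes)
        labels.append(int(np.argmax(counts)))
    return np.array(labels)
```

The resulting noisy labels would then be used to train the student model on the (otherwise unlabeled) query set.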
In summary, the paper introduces a novel approach, Individualized PATE, which provides stronger privacy guarantees than traditional PATE methods by ensuring that the privacy of each individual in the dataset is protected, regardless of the presence of other individuals. The experimental results demonstrate that the method achieves accurate predictions while providing individual privacy guarantees, though it requires a secure multi-party computation (MPC) protocol to aggregate the teachers’ predictions.
Check out the Paper. All Credit For This Research Goes To the Researchers on This Project.
Mahmoud is a PhD researcher in machine learning. He also holds a bachelor’s degree in physical science and a master’s degree in telecommunications and networking systems. His current research interests include computer vision, stock market prediction, and deep learning. He has published several scientific articles on person re-identification and on the robustness and stability of deep networks.