This AI Paper Studies the Impact of Anonymization for Training Computer Vision Models with a Focus on Autonomous Vehicles Datasets

On Jun 23, 2023

Image anonymization is the practice of modifying or removing sensitive information from images to protect privacy. While important for complying with privacy regulations, anonymization often reduces data quality, which hampers computer vision development. Several challenges exist, such as data degradation, balancing privacy and utility, creating efficient algorithms, and negotiating moral and legal issues. A suitable compromise must be achieved to secure privacy while improving computer vision research and applications.

Previous approaches to image anonymization include traditional methods such as blurring, masking, encryption, and clustering. Recent work focuses on realistic anonymization using generative models to replace identities. However, many methods lack formal guarantees of anonymity, and other cues in the image can still reveal identity. Limited studies have explored the impact on computer vision models, with varying effects depending on the task. Public anonymized datasets are scarce.

In recent research, researchers from the Norwegian University of Science and Technology have directed their attention towards crucial computer vision tasks in the context of autonomous vehicles, specifically instance segmentation and human pose estimation. They have evaluated the performance of full-body and face anonymization models implemented in DeepPrivacy2, aiming to compare the effectiveness of realistic anonymization approaches with conventional methods.

🚀 JOIN the fastest ML Subreddit Community

The steps proposed to assess the impact of anonymization by the article are as follows:

Anonymizing common computer vision datasets.
Training various models using anonymized data.
Evaluating the models on the original validation datasets

The authors propose three full-body and face anonymization techniques: blurring, mask-out, and realistic anonymization. They define the anonymization region based on instance segmentation annotations. Traditional methods include masking out and Gaussian blur, while realistic anonymization uses pre-trained models from DeepPrivacy2. The authors also address global context issues in full-body synthesis through histogram equalization and latent optimization.

The authors conducted experiments to evaluate models trained on anonymized data using three datasets: COCO Pose Estimation, Cityscapes Instance Segmentation, and BDD100K Instance Segmentation. Face anonymization techniques showed no significant performance difference on Cityscapes and BDD100K datasets. However, for COCO pose estimation, both mask-out and blurring techniques led to a significant drop in performance due to the correlation between blurring/masking artifacts and the human body. Full-body anonymization, whether traditional or realistic, resulted in a decline in performance compared to the original datasets. Realistic anonymization performed better but still degraded the results due to keypoint detection errors, synthesis limitations, and global context mismatch. The authors also explored the impact of model size and found that larger models performed worse for face anonymization on the COCO dataset. For full-body anonymization, both standard and multi-modal truncation methods improved performance.

To conclude, the study investigated the impact of anonymization on training computer vision models using autonomous vehicle datasets. Face anonymization had minimal effects on instance segmentation, while full-body anonymization significantly impaired performance. Realistic anonymization was superior to traditional methods but not a complete substitute for real data. Privacy protection without compromising model performance was highlighted. The study had limitations in annotation reliance and model architectures, calling for further research to improve anonymization techniques and address synthesis limitations. Challenges in synthesizing human figures for anonymization in autonomous vehicles were also highlighted.

Check Out The Paper. Don’t forget to join our 25k+ ML SubReddit, Discord Channel, and Email Newsletter, where we share the latest AI research news, cool AI projects, and more. If you have any questions regarding the above article or if we missed anything, feel free to email us at Asif@marktechpost.com

Featured Tools From AI Tools Club

🚀 Check Out 100’s AI Tools in AI Tools Club

Mahmoud is a PhD researcher in machine learning. He also holds a
bachelor’s degree in physical science and a master’s degree in
telecommunications and networking systems. His current areas of
research concern computer vision, stock market prediction and deep
learning. He produced several scientific articles about person re-
identification and the study of the robustness and stability of deep
networks.

Credit: Source link