Description
In the last few decades, the number of machine learning applications has been steadily increasing. Although their architectures and specific purposes vary widely, they all have in common that they require large amounts of input data to learn from. This development, together with commercially available "machine learning as a service" offerings from providers such as Google and Amazon, calls for greater awareness of data privacy in the context of machine learning. On the one hand, distributed learning techniques such as federated learning promise to improve privacy by keeping training data at its sources rather than uploading it to a central (potentially untrusted) service. On the other hand, existing work has already demonstrated various attacks on the privacy of ML models, e.g. membership or feature inference. Privacy, however, is not a precise concept, and various goals and metrics have evolved to quantify it in different domains. This thesis addresses the issue of handling privacy within machine learning by formulating a guideline that supports the selection of suitable privacy metrics for a machine learning application. To this end, the applicability of several known privacy metrics, such as k-anonymity, differential privacy, or the adversary's success probability, is first examined in the context of machine learning. Recommendations for the use of these metrics, depending on the properties of the respective machine learning application, are then formulated. In order to evaluate the applicability of the guideline in combination with a distributed learning setup, a neural network classifying the records of the UCI Adult dataset is implemented and examined according to the guideline.
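As background for one of the metrics named above, the standard definition of epsilon-differential privacy can be stated compactly; this formulation is general textbook material and is not quoted from the thesis itself. A randomized mechanism M is epsilon-differentially private if, for all datasets D and D' differing in a single record and for every set S of possible outputs,

\[
  \Pr[\mathcal{M}(D) \in S] \;\le\; e^{\varepsilon} \cdot \Pr[\mathcal{M}(D') \in S].
\]

Smaller values of epsilon mean that the presence or absence of any single record has less influence on the mechanism's output, which is the property a guideline for selecting privacy metrics would weigh against other metrics such as k-anonymity or the adversary's success probability.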