Jaoua, M. (2023). Data exfiltration attacks on text classification models trained in a federated manner [Diploma Thesis, Technische Universität Wien]. reposiTUm. https://doi.org/10.34726/hss.2023.97105
E194 - Institute of Information Systems Engineering
Data Exfiltration; Adversarial Attacks; Data Hiding; Federated Learning; Machine Learning; Deep Learning; Natural Language Processing; Text Classification
With the rise of federated learning as a privacy-preserving method for training machine learning models, many companies and organizations are interested in collaborating with one another; however, they lack the necessary expertise and resources, or want to ensure time and cost efficiency. They therefore hire third parties to develop machine learning and federated learning pipelines. A malicious third party can perform a data exfiltration attack, which exposes sensitive training data through the model parameters or predictions. We evaluate data exfiltration attacks in both centralized and federated settings, with a focus on text classification models. We explore the white-box sign encoding attack and the black-box trigger attack, and we investigate the parameters that increase the attacks' success rate while preserving the effectiveness of the models. We show that the success rate of white- and black-box data exfiltration attacks depends on the dataset characteristics, the model architecture, the model training, and the attack parameters. We also show that sensitive data can be exfiltrated in the federated setting under both IID and non-IID partitioning.
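To illustrate the idea behind the sign encoding white-box attack mentioned in the abstract, the sketch below hides secret bits in the signs of model parameters and recovers them by inspecting the weights. This is a simplified, post-hoc illustration under assumed conventions (in practice the signs are steered during training, e.g. with a sign-penalty regularizer, so the model stays useful); the function names `encode_signs` and `decode_signs` are hypothetical and not taken from the thesis.

```python
def text_to_bits(text):
    # UTF-8 bytes, most significant bit first.
    return [(byte >> i) & 1 for byte in text.encode("utf-8") for i in range(7, -1, -1)]

def bits_to_text(bits):
    # Rebuild bytes from groups of 8 bits (MSB first).
    data = bytearray()
    for i in range(0, len(bits) - len(bits) % 8, 8):
        byte = 0
        for b in bits[i:i + 8]:
            byte = (byte << 1) | b
        data.append(byte)
    return data.decode("utf-8", errors="ignore")

def encode_signs(params, secret):
    # Force the sign of each parameter: positive for bit 1, negative for bit 0.
    # Magnitudes are kept, so the model's behavior is only mildly perturbed.
    bits = text_to_bits(secret)
    out = list(params)
    for i, bit in enumerate(bits[: len(out)]):
        mag = abs(out[i]) or 1e-3  # avoid a zero weight with no sign
        out[i] = mag if bit else -mag
    return out

def decode_signs(params, n_chars):
    # The attacker only needs read access to the trained weights.
    bits = [1 if p > 0 else 0 for p in params[: n_chars * 8]]
    return bits_to_text(bits)

weights = [0.12, -0.5, 0.03] * 40      # toy stand-in for model parameters
stego = encode_signs(weights, "secret")
print(decode_signs(stego, 6))          # prints "secret"
```

Each parameter carries one bit, so a classifier with millions of weights can, in principle, hold thousands of characters of training data without any visible change to its predictions.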