E194 - Institut für Information Systems Engineering
-
Date (published):
2019
-
Number of Pages:
113
-
Keywords:
fingerprinting; relational database; data ownership verification; machine learning
Abstract:
Fingerprinting digital data is a method of embedding a traceable mark into the data in order to verify the owner and to identify the specific recipient to whom a given copy of the data set was released. This is crucial when releasing data sets to third parties, especially if the release involves a fee, or if the data contains sensitive information, in which case further sharing and subsequent leaks should be deterred. Fingerprinting generally distorts the data set to a certain degree, which creates a trade-off between preserving the utility of the data and the robustness and traceability of the fingerprint. Different types of data require different approaches. Most state-of-the-art techniques are designed specifically for numerical data. In this thesis, we propose an approach for fingerprinting data sets containing categorical data. We further compare several fingerprinting approaches according to their robustness against various types of attacks, such as subset or bit-flipping attacks, and evaluate the effects of fingerprinting on the utility of the data sets, specifically for machine learning tasks.
Additional information:
Title differs from the original; translated by the author