Staynov, S. (2023). A Blockchain-driven approach for secure and scalable provenance management in open data systems [Diploma Thesis, Technische Universität Wien]. reposiTUm. https://doi.org/10.34726/hss.2023.112005
data management; provenance; scalable systems; blockchain
en
Abstract:
As the availability of Linked Open Data (LOD) expands and leads to the creation of more derived and aggregated data, the demand for increasing the trust associated with this data has grown. Provenance tracking has been shown to be a feasible approach to increasing the reliability of data; however, a uniform approach for handling provenance data of LOD at a global level has not yet been established. This presents a challenging issue, as data provenance frequently possesses numerous domain-specific characteristics, which further reinforces the limited interoperability of current solutions across different data management systems. In order to address these problems and achieve a unified and interoperable solution for provenance tracking, it is essential to develop an architecture that can cater to the varied needs of different domains and data catalogs while maintaining a high degree of adaptability and scalability. In recent years, decentralized networks have gained popularity with the emergence of blockchain and distributed ledger technologies. These technologies offer a promising approach to addressing the challenges associated with provenance tracking by providing a secure and tamper-resistant platform for storing and managing provenance information. In this thesis, we propose a platform capable of multi-domain support that leverages state-of-the-art research regarding blockchain-based storage and tamper-resistant management of provenance information. The solution can be used with different existing data management systems for LOD, regardless of their underlying technology or implementation. In this way, we address current limitations in the field of provenance tracking and enhance data quality and trustworthiness across various data management and analytical use cases. A key emphasis is placed on data systems powered by knowledge graphs, which possess the capability to make inferences and offer a more in-depth comprehension of the data.