Title: Designing research repositories using automated workflows and machine actionable data management plans
Other Titles: Definition und Umsetzung von Anforderungen für Forschungsdatenrepositorien [Definition and implementation of requirements for research data repositories]
Language: English
Authors: Bakos, Asztrik 
Qualification level: Diploma
Advisor: Miksa, Tomasz
Assisting Advisor: Rauber, Andreas
Issue Date: 2017
Number of Pages: 77
Using a digital repository is an effective way to share research results. The task is not only to publish, but also to provide clear information on metadata, provenance and licenses. Repositories facilitate the reuse of published scientific material, and digital preservation techniques enable long-term access to the stored data. A repository requires a data management plan, which describes how the data are to be maintained. Researchers, however, may perceive uploading research material as yet another bureaucratic step. They tend to deposit data at a very late stage of the research project, when some of the earlier outputs are no longer available, so the uploaded metadata and provenance information are incomplete. Depositing research results requires knowledge of digital preservation and assistance with the technical infrastructure, both of which cost time and effort. The aim of this work is to offer a method that automates preservation as much as possible and lets researchers concentrate on the scientific aspects of a project. To achieve that, we analyzed how research data management policies influence data management plans and proposed a template that makes them machine-actionable. We built an executable workflow model for the data ingest processes using the Business Process Model and Notation (BPMN) and set up a working demonstration on an Alfresco server. We also extended Alfresco with a plugin that can run arbitrary preservation tools. As a basis for comparison, we configured an Archivematica instance, a classical repository that implements the OAIS reference model with a fixed preservation workflow. By comparing Alfresco and Archivematica, we showed that Alfresco not only preserves the files exactly as Archivematica does, but can also use a more complex preservation workflow.
We concluded that properly depositing research files, in accordance with the data management plan, is possible during the project with minimal effort from the researchers. User interaction can be reduced to uploading the files and starting the workflow; the rest of the preservation is done safely and silently in the background.
Keywords: digital preservation; digital repository; machine actionability; OAIS; workflow; data management; data management plan; BPMN; Alfresco; Archivematica
URI: https://resolver.obvsg.at/urn:nbn:at:at-ubtuw:1-107838
Library ID: AC14543243
Organisation: E188 - Institut für Softwaretechnik und Interaktive Systeme 
Publication Type: Thesis
Appears in Collections: Thesis

Items in reposiTUm are protected by copyright, with all rights reserved, unless otherwise indicated.