<div class="csl-bib-body">
<div class="csl-entry">Sauerwein, C. (2013). <i>Model-driven Benchmark data generation for digital preservation of webpages</i> [Diploma Thesis, Technische Universität Wien]. reposiTUm. https://doi.org/10.34726/hss.2013.22634</div>
</div>
-
dc.identifier.uri
https://doi.org/10.34726/hss.2013.22634
-
dc.identifier.uri
http://hdl.handle.net/20.500.12708/2542
-
dc.description
Title differs according to the author's translation
-
dc.description
Abstract in German
-
dc.description.abstract
Digital Preservation (DP) is the process of keeping digital information accessible and usable in an authentic manner over the long term. Preservation activities are used to guarantee long-term, error-free accessibility of data regardless of technological change. Different approaches based on continuous transformation of data are used to perform these preservation activities, and several tools exist for executing them. Digital objects have significant properties which must be preserved during the transformations. To evaluate these preservation activities, information about these properties (e.g. structure, size) is necessary. Annotations of digital objects with this information serve as ground truth. A benchmark data set can be built from real-world data, but the properties then have to be verified manually. Every automatic analysis depends on the correct interpretation by an analysis program (e.g. a characterization tool). Because these programs must themselves be evaluated, there is a profound lack of annotated benchmark data in Digital Preservation. This lack hinders the evaluation and improvement of digital preservation approaches and tools. This thesis introduces a model-driven benchmark data generation framework for automatically generating benchmark data with corresponding ground truth. The system uses the Model Driven Architecture (MDA) as its underlying concept, which facilitates the use of well-known model-driven engineering tools and frameworks. Instead of analyzing existing benchmark data collections from computer science, it generates benchmark data sets according to property distributions of different kinds of documents (e.g. webpages). The framework specifies ground truths for the Platform Independent and Platform Specific Models of the generated benchmark data. These ground truths, together with the benchmark data, are used for evaluation.
The model-driven benchmark data generation framework is evaluated by generating benchmark data for testing preservation action tools for webpages. Webpages are widely used and pose a complex challenge in digital preservation settings. We define a Platform Independent Model and a Platform Specific Model for representing webpages and demonstrate how benchmark data can be created with them.
en
dc.language
English
-
dc.language.iso
en
-
dc.rights.uri
http://rightsstatements.org/vocab/InC/1.0/
-
dc.title
Model-driven Benchmark data generation for digital preservation of webpages
en
dc.title.alternative
Model-Driven Benchmark Data Generation for Digital Preservation of Webpages
de
dc.type
Thesis
en
dc.type
Hochschulschrift
de
dc.rights.license
In Copyright
en
dc.rights.license
Urheberrechtsschutz
de
dc.identifier.doi
10.34726/hss.2013.22634
-
dc.contributor.affiliation
TU Wien, Österreich
-
dc.rights.holder
Clemens Sauerwein
-
tuw.version
vor
-
tuw.thesisinformation
Technische Universität Wien
-
tuw.publication.orgunit
E188 - Institut für Softwaretechnik und Interaktive Systeme