<div class="csl-bib-body">
<div class="csl-entry">Rauber, A., Gößwein, B., Zwölf, C. M., Schubert, C., Wörister, F., Duncan, J., Flicker, K., Zettsu, K., Meixner, K., McIntosh, L., Jenkyns, R., Pröll, S., Miksa, T., &amp; Parsons, M. A. (2021). Precisely and persistently identifying and citing arbitrary subsets of dynamic data. <i>Harvard Data Science Review</i>, <i>3</i>(4). https://doi.org/10.1162/99608f92.be565013</div>
</div>
-
dc.identifier.uri
http://hdl.handle.net/20.500.12708/176619
-
dc.description.abstract
Precisely identifying arbitrary subsets of data so that these can be reproduced is a daunting challenge in data-driven science, the more so if the underlying data source is dynamically evolving. Yet an increasing number of settings exhibit exactly those characteristics. Larger amounts of data are being continuously ingested from a range of sources (be it sensor values, online questionnaires, documents, etc.), with error correction and quality improvement processes adding to the dynamics. Yet, for studies to be reproducible, for decision-making to be transparent, and for meta studies to be performed conveniently, having a precise identification mechanism to reference, retrieve, and work with such data is essential. The Research Data Alliance (RDA) Working Group on Dynamic Data Citation has published 14 recommendations that are centered around time-stamping and versioning evolving data sources and identifying subsets dynamically via persistent identifiers that are assigned to the queries selecting the respective subsets. These principles are generic and work for virtually any kind of data. In the past few years numerous repositories around the globe have implemented these recommendations and deployed solutions. We provide an overview of the recommendations, reference implementations, and pilot systems deployed and then analyze lessons learned from these implementations. This article provides a basis for institutions and data stewards considering adding this functionality to their data systems.
en
dc.language.iso
en
-
dc.publisher
MIT Press
-
dc.relation.ispartof
Harvard Data Science Review
-
dc.subject
data citation
-
dc.subject
data versioning
-
dc.subject
dynamic data
-
dc.subject
persistent identifiers
-
dc.subject
data stewardship
-
dc.title
Precisely and persistently identifying and citing arbitrary subsets of dynamic data
en
dc.type
Artikel
de
dc.type
Article
en
dc.contributor.affiliation
SBA Research, Austria
-
dc.type.category
Original Research Article
-
tuw.container.volume
3
-
tuw.container.issue
4
-
tuw.journal.peerreviewed
true
-
tuw.peerreviewed
true
-
wb.publication.intCoWork
International Co-publication
-
dcterms.isPartOf.title
Harvard Data Science Review
-
tuw.publication.orgunit
E354-01 - Forschungsbereich Microwave and THz Electronics