<div class="csl-bib-body">
<div class="csl-entry">Rauber, A., Gößwein, B., Zwölf, C. M., Schubert, C., Wörister, F., Duncan, J., Flicker, K., Zettsu, K., Meixner, K., McIntosh, L. D., Jenkyns, R., Pröll, S., Miksa, T., & Parsons, M. A. (2021). Precisely and Persistently Identifying and Citing Arbitrary Subsets of Dynamic Data. <i>Harvard Data Science Review</i>, <i>3</i>(4). https://doi.org/10.1162/99608f92.be565013</div>
</div>
-
dc.identifier.uri
http://hdl.handle.net/20.500.12708/142583
-
dc.description.abstract
Precisely identifying arbitrary subsets of data so that these can be reproduced is a daunting challenge in data-driven science, the more so if the underlying data source is dynamically evolving. Yet, an increasing number of settings exhibit exactly those characteristics: larger amounts of data being continuously ingested from a range of sources (be it sensor values, [online] questionnaires, documents, etc.), with error correction and quality improvement processes adding to the dynamics. Yet, for studies to be reproducible, for decision-making to be transparent, and for meta studies to be performed conveniently, having a precise identification mechanism to reference, retrieve, and work with such data is essential. The Research Data Alliance (RDA) Working Group on Dynamic Data Citation has published 14 recommendations that are centered around time-stamping and versioning evolving data sources and identifying subsets dynamically via persistent identifiers that are assigned to the queries selecting the respective subsets. These principles are generic and work for virtually any kind of data. In the past few years numerous repositories around the globe have implemented these recommendations and deployed solutions. We provide an overview of the recommendations, reference implementations, and pilot systems deployed and then analyze lessons learned from these implementations. This article provides a basis for institutions and data stewards considering adding this functionality to their data systems.
en
dc.description.sponsorship
CDG Christian Doppler Forschungsgesellschaft
-
dc.language.iso
en
-
dc.publisher
MIT Press
-
dc.relation.ispartof
Harvard Data Science Review
-
dc.subject
Dynamic Data
en
dc.subject
large amount of data
en
dc.subject
precise identification mechanism
-
dc.subject
data timestamped
en
dc.subject
data versioned
en
dc.subject
different data settings
en
dc.subject
different disciplines
en
dc.subject
different data types
en
dc.title
Precisely and Persistently Identifying and Citing Arbitrary Subsets of Dynamic Data
en
dc.type
Article
en
dc.type
Artikel
de
dc.contributor.affiliation
Sorbonne Université, France
-
dc.contributor.affiliation
Climate Change Centre Austria, Austria
-
dc.contributor.affiliation
University of Vermont, United States of America
-
dc.contributor.affiliation
National Institute of Information and Communications Technology, Japan
-
dc.contributor.affiliation
Ripeta, Saint Louis, USA
-
dc.contributor.affiliation
Ocean Networks Canada Society, Canada
-
dc.contributor.affiliation
Cropster, Innsbruck, Austria
-
dc.contributor.affiliation
SBA Research, Austria
-
dc.relation.grantno
CDL SQI
-
dcterms.dateSubmitted
2021
-
dc.type.category
Original Research Article
-
tuw.container.volume
3
-
tuw.container.issue
4
-
tuw.journal.peerreviewed
true
-
tuw.peerreviewed
true
-
wb.publication.intCoWork
International Co-publication
-
tuw.project.title
Verbesserung der Sicherheit von Informationsprozessen in Produktionssystemen
-
tuw.researchTopic.id
I2
-
tuw.researchTopic.id
I4a
-
tuw.researchTopic.name
Computer Engineering and Software-Intensive Systems