<div class="csl-bib-body">
<div class="csl-entry">Blatt, A. (2020). <i>Sampling DNS traffic: a day in the life of the .at-zone</i> [Diploma Thesis, Technische Universität Wien]. reposiTUm. https://doi.org/10.34726/hss.2021.80920</div>
</div>
-
dc.identifier.uri
https://doi.org/10.34726/hss.2021.80920
-
dc.identifier.uri
http://hdl.handle.net/20.500.12708/17104
-
dc.description
Title differs, as translated by the author
-
dc.description.abstract
This thesis investigates the added utility of statistical sampling for DNS network traffic analysis, specifically with regard to long-term storage and computation latency. Using DNS log data for a full "day in the life of the Austrian Internet" provided by the Austrian domain registry operator nic.at, three emblematic sampling methods (simple random sampling, systematic sampling and stratified random sampling) are applied to a selection of network traffic features to assess how well they preserve the "true" population parameters. Confirming theoretical considerations and previous research into Internet traffic, it was found that because the query arrival process is highly self-similar, and thus also autocorrelated, systematic sampling leads to very precise estimates, particularly for time-based traffic characteristics. For network traffic features independent of time, all sampling procedures perform essentially the same. Furthermore, it was shown that for tasks not involving very rare phenomena or the estimation of the number of distinct client IP addresses, sampling provides an easy way to explore data quickly: for the analysed features, estimates for frequent traffic patterns (those occurring at least at the level of the sampling fraction) are either practically identical to the true parameter or less than 10% away from it. Used in conjunction with current Big Data technology, these findings could lead to great gains in computation speed and reduced storage requirements. The method that consistently performed best, or was virtually indistinguishable from the others, was systematic sampling, with the added benefit of also being the computationally cheapest.
en
dc.language
English
-
dc.language.iso
en
-
dc.rights.uri
http://rightsstatements.org/vocab/InC/1.0/
-
dc.subject
DNS data
en
dc.subject
Sampling
en
dc.title
Sampling DNS traffic: a day in the life of the .at-zone
en
dc.title.alternative
Stichprobenziehung von DNS Traffic Daten
de
dc.type
Thesis
en
dc.type
Hochschulschrift
de
dc.rights.license
In Copyright
en
dc.rights.license
Urheberrechtsschutz
de
dc.identifier.doi
10.34726/hss.2021.80920
-
dc.contributor.affiliation
TU Wien, Österreich
-
dc.rights.holder
Andreas Blatt
-
dc.publisher.place
Wien
-
tuw.version
vor
-
tuw.thesisinformation
Technische Universität Wien
-
tuw.publication.orgunit
E105 - Institut für Stochastik und Wirtschaftsmathematik
-
dc.type.qualificationlevel
Diploma
-
dc.identifier.libraryid
AC16172321
-
dc.description.numberOfPages
73
-
dc.thesistype
Diplomarbeit
de
dc.thesistype
Diploma Thesis
en
dc.rights.identifier
In Copyright
en
dc.rights.identifier
Urheberrechtsschutz
de
tuw.advisor.staffStatus
staff
-
item.languageiso639-1
en
-
item.openairetype
master thesis
-
item.grantfulltext
open
-
item.fulltext
with Fulltext
-
item.cerifentitytype
Publications
-
item.mimetype
application/pdf
-
item.openairecristype
http://purl.org/coar/resource_type/c_bdcc
-
item.openaccessfulltext
Open Access
-
crisitem.author.dept
E105 - Institut für Stochastik und Wirtschaftsmathematik