<div class="csl-bib-body">
<div class="csl-entry">Illes, I. (2026). <i>Evaluating intrusion detection benchmark datasets via post-analysis of learned attack profiles</i> [Diploma Thesis, Technische Universität Wien]. reposiTUm. https://doi.org/10.34726/hss.2026.139421</div>
</div>
-
dc.identifier.uri
https://doi.org/10.34726/hss.2026.139421
-
dc.identifier.uri
http://hdl.handle.net/20.500.12708/227468
-
dc.description.abstract
Intrusion Detection System datasets are commonly used to build models for detecting network threats and classifying network traffic. These datasets contain captured and generated data for different families of network attack categories. Machine learning algorithms show promising results in classifying and detecting various attack types in these datasets; however, the post-analysis of the key characteristics of the attack classes still needs to be explored in further detail. The question arises whether the attack classes derived from models trained on these datasets are truly representative of attack characteristics in real-world traffic and whether they are discriminatory and transferable, or merely accidental in nature. To address this, a testbed is constructed that handles flow aggregation, labeling, preprocessing, supervised analysis and post-analysis across selected intrusion detection system datasets, namely Kitsune and TII-SSRC-23. The post-analysis provides a framework combining visualization and interpretation methods with statistical metrics. The framework aims to provide insight into the intrinsic attack characteristics learned by machine learning models trained on the selected datasets. Qualitative profiles for each attack type are defined. These profiles are then assessed using domain expertise and applied to real-world network traces provided by the Measurement and Analysis on the Widely Integrated Distributed Environment Internet group to estimate their recurrence and relevance in real-world traffic. The results show that, although some of the extracted attack profiles largely align with domain knowledge, they are strongly influenced by specific dataset configurations and artifacts. Furthermore, among the selected datasets, the discriminative features defining the profiles for the same attack type differ entirely, limiting the transferability of these profiles. The real-world comparison also reveals weaknesses in the intrusion detection system datasets.
The Kitsune dataset shows some realistic and distinct attack patterns, but under-represents real-world variability. TII-SSRC-23 exhibits a single dominant ray, lacking the complexity of real traffic behavior. The resulting insights highlight the importance of rigorous post-analysis in the evaluation of Intrusion Detection System datasets when training and deploying machine learning models. Post-analysis helps uncover dataset biases, artifacts and modeling limitations, enabling the development of intrusion detection systems that generalize beyond the specific datasets on which they are trained.
en
dc.language
English
-
dc.language.iso
en
-
dc.rights.uri
http://rightsstatements.org/vocab/InC/1.0/
-
dc.subject
Intrusion Detection Systems (IDS)
en
dc.subject
Benchmark Datasets
en
dc.subject
Attack Profile Analysis
en
dc.subject
Post-analysis and Interpretability
en
dc.title
Evaluating intrusion detection benchmark datasets via post-analysis of learned attack profiles
en
dc.type
Thesis
en
dc.type
Hochschulschrift
de
dc.rights.license
In Copyright
en
dc.rights.license
Urheberrechtsschutz
de
dc.identifier.doi
10.34726/hss.2026.139421
-
dc.contributor.affiliation
TU Wien, Österreich
-
dc.rights.holder
Isabella Illes
-
dc.publisher.place
Wien
-
tuw.version
vor
-
tuw.thesisinformation
Technische Universität Wien
-
dc.contributor.assistant
Zseby, Tanja
-
tuw.publication.orgunit
E389 - Institute of Telecommunications
-
dc.type.qualificationlevel
Diploma
-
dc.identifier.libraryid
AC17833645
-
dc.description.numberOfPages
134
-
dc.thesistype
Diplomarbeit
de
dc.thesistype
Diploma Thesis
en
dc.rights.identifier
In Copyright
en
dc.rights.identifier
Urheberrechtsschutz
de
tuw.advisor.staffStatus
staff
-
tuw.assistant.staffStatus
staff
-
tuw.advisor.orcid
0000-0001-6081-969X
-
tuw.assistant.orcid
0000-0002-5391-467X
-
item.fulltext
with Fulltext
-
item.grantfulltext
open
-
item.cerifentitytype
Publications
-
item.openairetype
master thesis
-
item.openaccessfulltext
Open Access
-
item.openairecristype
http://purl.org/coar/resource_type/c_bdcc
-
item.mimetype
application/pdf
-
item.languageiso639-1
en
-
crisitem.author.dept
E384-01 - Forschungsbereich Software-intensive Systems