<div class="csl-bib-body">
<div class="csl-entry">Wurzenberger, M. (2021). <i>Resource-efficient log analysis to enable online anomaly detection in cyber security</i> [Dissertation, Technische Universität Wien]. reposiTUm. https://doi.org/10.34726/hss.2021.90967</div>
</div>
-
dc.identifier.uri
https://doi.org/10.34726/hss.2021.90967
-
dc.identifier.uri
http://hdl.handle.net/20.500.12708/17534
-
dc.description.abstract
The sheer number of different attack vectors and the large amount of data produced by computer systems make it impossible to secure network infrastructures using traditional security measures such as antivirus software, firewalls, and signature-based intrusion detection systems (IDS), which mostly detect only known attacks. Additionally, end-to-end encryption, virtualization, and containerization make monitoring and analyzing network traffic non-trivial. Therefore, this thesis investigates the potential of anomaly-based intrusion detection that monitors textual log data, such as system logs, audit logs (syscalls), web logs (e.g., access logs), and application logs. The thesis identifies research gaps in state-of-the-art log-based anomaly detection, including missing online analysis features and the lack of efficient log line parsing without loss of information when analyzing unstructured and semi-structured log data. Furthermore, we propose a novel incremental clustering approach, motivated by high-performance bioinformatics tools, that enables online analysis of large amounts of log lines. Moreover, we introduce a character-based template generator that solves the problem of computing multi-line alignments for arbitrary strings and provides detailed cluster descriptions. This enables the creation of meaningful log line templates that overcome the disadvantages of token-based templates, including the handling of similar but not identical strings and the covering of large parts of log lines with wildcards. State-of-the-art parsers apply lists of regular expressions or signatures. Hence, they require large amounts of resources to process log lines and consequently remove large parts of log messages during the parsing procedure, which leads to a loss of information in the anomaly detection process.
To overcome this weakness and enable detailed online log parsing with a minimum amount of resources, the thesis proposes a parser generator that creates tree-like parsers, which effectively reduce the complexity of parsing without information loss. Finally, we demonstrate the potential of the developed algorithms in three application cases. The first introduces a time series analysis approach that uses the incremental clustering approach in combination with cluster evolution to detect frequency anomalies. Next, we describe a log-based anomaly detection system that applies the tree-like parser generator to enable online intrusion detection with a minimum amount of resources. Lastly, we propose a novel concept that enables automatic evaluation, comparison, and optimization of IDS and their configurations with respect to a specific network infrastructure.
en
dc.language
English
-
dc.language.iso
en
-
dc.rights.uri
http://rightsstatements.org/vocab/InC/1.0/
-
dc.subject
log data analysis
en
dc.subject
intrusion detection systems
en
dc.subject
anomaly detection
en
dc.subject
clustering
en
dc.subject
template generation
en
dc.subject
parser generation
en
dc.subject
machine learning
en
dc.subject
character-based log analysis
en
dc.subject
online data analysis
en
dc.subject
system behavior analysis
en
dc.title
Resource-efficient log analysis to enable online anomaly detection in cyber security