Title: Studying class membership scores in machine learning classification for imbalanced binary data
Language: English
Authors: Katzengruber, Matthias 
Qualification level: Diploma
Advisor: Zseby, Tanja 
Assisting Advisor: Iglesias Vazquez, Felix 
Issue Date: 2020
Number of Pages: 65
Abstract: Machine learning is gaining importance, driven in large part by the growth in computational power. A paramount application of machine learning is anomaly detection, sometimes understood as one-class classification, i.e., a binary classification problem in which there is a significant imbalance between the minority class (anomalies/outliers) and the majority class (normal instances/inliers). Real-life examples of such scenarios include fraud detection and attack detection in network communications. In this work, we study whether the assumption holds that wrongly classified instances lie closer to decision boundaries, and whether this information can help to refine classification performance. We conducted experiments on network traffic and on other imbalanced datasets and found that, as a general rule, classification algorithms are able to leverage class membership scores to improve the “average precision” metric, which is suitable for evaluating imbalanced cases. Hence, class membership scores, defined based on distances to classification thresholds, help to improve classification while preserving model explainability and keeping algorithm complexity low.
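To illustrate why a ranking metric such as average precision can benefit from class membership scores, the following is a minimal sketch and not the thesis's implementation. It assumes a score is the signed distance of an instance to a decision threshold (positive meaning "anomaly"); the labels, the `margin` values, and the helper `average_precision` are hypothetical and chosen only for illustration. Average precision is computed here as the sum of precision values weighted by the recall gained at each distinct score threshold.

```python
def average_precision(y_true, scores):
    """Average precision: sum over thresholds of (recall gain) * precision.

    Instances sharing the same score are treated as one threshold step,
    so tied hard labels (0/1) are handled consistently.
    """
    n_pos = sum(y_true)
    if n_pos == 0:
        raise ValueError("average precision is undefined without positives")

    # Rank instances from the highest to the lowest score.
    pairs = sorted(zip(scores, y_true), key=lambda p: -p[0])

    ap = 0.0
    tp = 0              # true positives found so far
    prev_recall = 0.0
    i = 0
    while i < len(pairs):
        # Consume every instance tied at the current score.
        j = i
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            tp += pairs[j][1]
            j += 1
        precision = tp / j          # j instances are flagged at this threshold
        recall = tp / n_pos
        ap += (recall - prev_recall) * precision
        prev_recall = recall
        i = j
    return ap


# Hypothetical imbalanced sample: 2 anomalies (1) among 10 instances.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]

# Assumed signed distances to the decision threshold (illustrative values).
# The last anomaly is misclassified but lies very close to the boundary.
margin = [-2.0, -1.5, -1.2, -1.0, -0.8, -0.6, 0.3, -0.4, 1.1, -0.1]

# Hard decisions discard the distance information.
hard_labels = [1 if m > 0 else 0 for m in margin]

ap_hard = average_precision(y_true, hard_labels)
ap_scores = average_precision(y_true, margin)

print(f"AP from hard labels: {ap_hard:.3f}")
print(f"AP from scores:      {ap_scores:.3f}")
```

In this toy sample the misclassified anomaly sits just below the threshold, so ranking by score places it ahead of most normal instances and the average precision rises from 0.35 (hard labels) to about 0.83 (scores). The numbers describe only this constructed example and say nothing about the results reported in the thesis.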
Keywords: anomaly detection; machine learning; classification; network traffic analysis
URI: https://doi.org/10.34726/hss.2020.57167
DOI: 10.34726/hss.2020.57167
Library ID: AC15754408
Organisation: E389 - Telecommunications 
Publication Type: Thesis
Appears in Collections: Thesis

Items in reposiTUm are protected by copyright, with all rights reserved, unless otherwise indicated.