ROC curves for multivariate markers

Pérez Fernández, Sonia

doi:10.34726/hss.2021.89290

Record citation link:

https://doi.org/10.34726/hss.2021.89290
http://hdl.handle.net/20.500.12708/17188

Title:

ROC curves for multivariate markers

Citation:

Pérez Fernández, S. (2020). ROC curves for multivariate markers [Dissertation, Technische Universität Wien]. reposiTUm. https://doi.org/10.34726/hss.2021.89290

reposiTUm-DOI:

10.34726/hss.2021.89290

CatalogPlus:

AC16183986

Publication type:

Thesis - Dissertation

Language:

English

Authors:

Pérez Fernández, Sonia

Supervisor:

Filzmoser, Peter

Organizational unit:

E105 - Institut für Stochastik und Wirtschaftsmathematik

Date (published):

2020

Extent:

317

Keywords:

ROC curve; Classification; Multivariate marker; R package

Abstract:

Binary classification is a very common problem whose objective is to correctly determine whether or not a subject has a characteristic of interest. On the basis of a gold standard, the aim is to discriminate between two populations (positive and negative, according to whether or not the subject has the characteristic of interest) by means of a variable, the so-called marker. Any binary classification admits two types of error: classifying a negative subject as positive (false positive) and classifying a positive subject as negative (false negative). The probabilities of these errors are the complement of the specificity (the false-positive rate) and the complement of the sensitivity (the false-negative rate), respectively. The trade-off between the sensitivity (y-axis) and the complement of the specificity (x-axis) is reflected in the receiver operating characteristic (ROC) curve. This graphical statistical method is therefore used to measure and visualize the discrimination performance of the marker under study. The classification accuracy is frequently summarized by the area under the curve (AUC), but the underlying classification rules are rarely shown, since in the standard configuration the decision rules are immediately determined. However, the available information may not immediately discriminate between the two populations, so the decision criterion is not direct. In such cases, different dichotomization criteria should be explored, giving rise to a classification subset.
The main goal of this dissertation is to revisit the definition of the ROC curve in order to graphically analyze the discriminatory capacity of a continuous marker when alternative rules for performing a binary classification are considered. It covers different shapes for the classification regions (defined as those regions in which a subject whose marker value falls inside is classified as positive), as well as flexibility in the nature of the marker under study. On this basis, classification accuracy for multivariate markers may be directly assessed. Graphical representations that reflect different types of classification rules and display the construction of the resulting ROC curve are studied, with the ultimate aim of elucidating the underlying decision rules and preserving their interpretability, where appropriate.
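As a rough illustration of the standard configuration described in the abstract, the sketch below builds an empirical ROC curve for a univariate marker under the usual rule "classify as positive if the marker exceeds a threshold c" and summarizes it with the trapezoidal AUC. It is not taken from the dissertation or its R package: the function names `empirical_roc` and `auc_trapezoid` and the sample values are assumptions made only for this example, and the generalized classification regions studied in the thesis are not covered.

```python
def empirical_roc(neg, pos):
    """Empirical ROC points for the rule 'positive if marker > c'.

    neg, pos: marker values of the negative and positive samples.
    Returns (fpr, tpr), where fpr = 1 - specificity and tpr = sensitivity,
    evaluated over thresholds sweeping from +inf down to -inf.
    """
    # Hypothetical helper, not part of any library: every observed value
    # is a candidate threshold, bracketed by +inf and -inf so that the
    # curve starts at (0, 0) and ends at (1, 1).
    values = sorted(set(neg) | set(pos), reverse=True)
    thresholds = [float("inf")] + values + [float("-inf")]
    fpr = [sum(x > c for x in neg) / len(neg) for c in thresholds]
    tpr = [sum(x > c for x in pos) / len(pos) for c in thresholds]
    return fpr, tpr


def auc_trapezoid(fpr, tpr):
    """Area under the piecewise-linear ROC curve (trapezoidal rule)."""
    return sum(
        (fpr[i + 1] - fpr[i]) * (tpr[i + 1] + tpr[i]) / 2
        for i in range(len(fpr) - 1)
    )


# Made-up marker values, for illustration only.
negatives = [1, 2, 3, 4]
positives = [3, 5, 6, 7]

fpr, tpr = empirical_roc(negatives, positives)
print(auc_trapezoid(fpr, tpr))  # 0.90625
```

With these values the trapezoidal AUC equals the proportion of (positive, negative) pairs in which the positive subject has the larger marker value, counting ties as one half: 14.5 of 16 pairs.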

License:

Copyright protected

Appears in collections:

Thesis