ROC curves for multivariate markers

Pérez Fernández, Sonia

doi:10.34726/hss.2021.89290

Record link:

https://doi.org/10.34726/hss.2021.89290
http://hdl.handle.net/20.500.12708/17188

Title:

ROC curves for multivariate markers

Citation:

Pérez Fernández, S. (2020). ROC curves for multivariate markers [Dissertation, Technische Universität Wien]. reposiTUm. https://doi.org/10.34726/hss.2021.89290

reposiTUm DOI:

10.34726/hss.2021.89290

CatalogPlus:

AC16183986

Publication Type:

Thesis - Dissertation

Language:

English

Authors:

Pérez Fernández, Sonia

Advisor:

Filzmoser, Peter

Organisational Unit:

E105 - Institut für Stochastik und Wirtschaftsmathematik

Date (published):

2020

Number of Pages:

317

Keywords:

ROC curve; Classification; Multivariate marker; R package

Abstract:

Binary classification is a very common problem whose objective is to correctly determine whether or not a subject has a characteristic of interest. On the basis of a gold standard, the aim is to discriminate between two populations (positive and negative, according to whether or not the subject has the characteristic of interest, respectively) by means of a variable, the so-called marker. In any binary classification there are two types of error: classifying a negative subject as positive (false positive) and classifying a positive subject as negative (false negative). The probabilities of these errors are given by the complement of the specificity (the false-positive rate) and the complement of the sensitivity (the false-negative rate), respectively. The trade-off between the sensitivity (y-axis) and the complement of the specificity (x-axis) is reflected in the receiver operating characteristic (ROC) curve. This graphical method is therefore used to measure and visualize the discrimination performance of the marker under study. The classification accuracy is frequently summarized by the area under the curve (AUC), but the underlying classification rules are rarely shown, since in the standard configuration the decision rules are immediately determined. However, the available information may not immediately discriminate between the two populations, so that the decision criterion is not direct. In such a case, different dichotomization criteria should be explored, giving rise to a classification subset.

The main goal of this dissertation is to revisit the definition of the ROC curve in order to graphically analyze the discriminatory capacity of a continuous marker when alternative rules for performing a binary classification are considered. It covers different shapes for the classification regions (defined as those where a subject is classified as positive if their marker value lies inside), as well as flexibility in the nature of the marker under study. On this basis, classification accuracy for multivariate markers may be assessed directly. Graphical representations that reflect different types of classification rules and display the construction of the resulting ROC curve are studied, with the ultimate aim of elucidating the underlying decision rules and preserving their interpretability, if appropriate.
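The quantities named in the abstract can be illustrated with a minimal Python sketch. It is not taken from the dissertation or its R package: the function names (`rates`, `roc_points`, `auc`) and the toy marker values are assumptions chosen for illustration. `rates` evaluates an arbitrary classification region (a subject is classified as positive if the marker value lies inside), `roc_points` builds the empirical ROC curve for the standard rule "positive if marker >= c", and `auc` applies the trapezoidal rule.

```python
from typing import Callable, List, Sequence, Tuple


def rates(neg: Sequence, pos: Sequence,
          in_region: Callable[[object], bool]) -> Tuple[float, float]:
    """Return (1 - specificity, sensitivity) for a classification region.

    `in_region` is a hypothetical predicate: a subject is classified as
    positive when its marker value (scalar or vector) lies inside the region.
    """
    fpr = sum(1 for x in neg if in_region(x)) / len(neg)
    tpr = sum(1 for x in pos if in_region(x)) / len(pos)
    return fpr, tpr


def roc_points(neg: Sequence[float],
               pos: Sequence[float]) -> List[Tuple[float, float]]:
    """Empirical ROC points for the standard rule 'positive if marker >= c'."""
    # Sweep the threshold c from the largest observed value to the smallest.
    thresholds = sorted(set(neg) | set(pos), reverse=True)
    points = [(0.0, 0.0)]  # no subject classified as positive
    for c in thresholds:
        points.append(rates(neg, pos, lambda x, c=c: x >= c))
    return points


def auc(points: Sequence[Tuple[float, float]]) -> float:
    """Area under the piecewise-linear ROC curve (trapezoidal rule)."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Toy marker values (made up for illustration only).
negatives = [0.1, 0.4, 0.35, 0.8]
positives = [0.9, 0.6, 0.7, 0.3]

curve = roc_points(negatives, positives)
area = auc(curve)
```

Because `rates` only needs a membership predicate, the same function accepts regions that are not one-sided thresholds, for example `lambda x: abs(x - 0.5) > 0.3`, or a predicate on a tuple-valued marker. How such regions are chosen and ordered to form an ROC curve is the subject of the dissertation and is not reproduced here.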

License:

In Copyright

Appears in Collections:

Thesis