Maruszczak, D. (2023). Evaluation of energy-based modelling for medical pathology classification [Diploma Thesis, Technische Universität Wien]. reposiTUm. https://doi.org/10.34726/hss.2023.78580
E105 - Institut für Stochastik und Wirtschaftsmathematik
Date (published): 2023
Number of Pages: 154
Keywords: Classification; Neural networks
Abstract:
This thesis investigates and compares the suitability of two neural-network-based approaches for the diagnosis of posteroanterior chest radiographs, where diagnosis is treated as a classification problem. First, we train a convolutional neural network using the standard deep learning methodology. Second, we train the same convolutional architecture using an energy-based methodology, in which the standard classifier for p(y|x) is reinterpreted as an energy-based model of the joint distribution p(x,y). In this setting, the usual class probabilities can still be computed, along with unnormalized values of p(x) and p(x|y). The rationale is that deep learning, while efficient and capable of very high accuracy, has significant drawbacks for reliable uncertainty quantification, specifically in out-of-distribution detection and model calibration. Deep learning classifiers often give overly confident predictions, which can be fatal in sensitive, high-risk application areas such as medicine. Uncertainty quantification is therefore essential for widespread real-world adoption. The energy-based model is considerably more flexible and can be used to build a hybrid model that combines generative and discriminative capabilities, yielding predictions that reflect the model's own uncertainty much better. The model is also versatile enough to support several auxiliary tasks that are briefly investigated in this thesis, such as outlier detection and synthetic sample generation. The energy-based work in this thesis builds in large part on EBM research by Will Grathwohl and Yann LeCun. Research on energy-based modelling has recently attracted strongly increased interest owing to improvements in technology and to the elegance and flexibility of the approach.
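The reinterpretation of a classifier as an energy-based model can be illustrated with a minimal sketch. It assumes the formulation commonly associated with Grathwohl et al., in which the classifier logits f(x)[y] define log p(x,y) up to an unknown constant log Z; the abstract does not spell out the exact formulation used in the thesis, and the function and key names below are hypothetical.

```python
import numpy as np

def logsumexp(a, axis=-1):
    # Numerically stable log(sum(exp(a))) along one axis.
    m = np.max(a, axis=axis, keepdims=True)
    return np.squeeze(m, axis=axis) + np.log(np.sum(np.exp(a - m), axis=axis))

def jem_quantities(logits):
    """Derive energy-based quantities from classifier logits f(x).

    Assumption (not stated in the abstract): p(x, y) = exp(f(x)[y]) / Z,
    with an intractable normalizing constant Z.
    """
    logits = np.asarray(logits, dtype=float)
    # log p(x) + log Z: marginalize the unnormalized joint over classes.
    log_marginal_unnorm = logsumexp(logits, axis=-1)
    # p(y|x) = p(x, y) / p(x); Z cancels, which gives the usual softmax.
    class_probs = np.exp(logits - log_marginal_unnorm[..., None])
    return {
        # log p(x, y) + log Z; for a fixed class y this is also log p(x|y)
        # up to a class-dependent constant.
        "log_joint_unnorm": logits,
        "log_marginal_unnorm": log_marginal_unnorm,
        "class_probs": class_probs,
        # Energy of x: lower energy means higher unnormalized density.
        "energy": -log_marginal_unnorm,
    }

# Hypothetical logits for two images and three classes.
q = jem_quantities([[2.0, 0.0, -1.0], [0.0, 0.0, 0.0]])
print(q["class_probs"])
print(q["energy"])
```

Under this assumption the class probabilities are exactly those of the standard softmax classifier, so the discriminative model is unchanged; only the extra unnormalized densities are new.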
We compare the deep learning classifier and the energy-based model on medical image classification using the chest X-ray dataset CheXpert, published by the Stanford Machine Learning Group. The main experimental findings show that energy-based training yields strong discriminative results and improves out-of-distribution detection and outlier detection compared with a standard deep learning model, while also generating high-quality samples. Previous literature has found that energy-based models also improve model calibration; this could not be fully reproduced. This approach is the first to be tested on medical image data and achieves results rivaling the generative and discriminative state of the art within a single hybrid model.
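As an illustration of what model calibration measures, the following sketch computes the expected calibration error (ECE), a commonly used metric. The abstract does not state which calibration metric the thesis uses, so the choice of ECE, the bin count, and the function name are assumptions.

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    """Weighted average gap between confidence and accuracy per bin.

    confidences: predicted probability of the chosen class, in [0, 1].
    correct: 1 if the prediction was right, 0 otherwise.
    """
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    # Assign each prediction to an equal-width bin of the form (lo, hi].
    bin_ids = np.digitize(confidences, edges[1:-1], right=True)
    ece = 0.0
    for b in range(n_bins):
        mask = bin_ids == b
        if mask.any():
            gap = abs(correct[mask].mean() - confidences[mask].mean())
            ece += mask.mean() * gap  # weight by the fraction of samples in the bin
    return ece

# Hypothetical overconfident model: 90% confidence, 50% accuracy.
print(expected_calibration_error([0.9, 0.9, 0.9, 0.9], [1, 1, 0, 0]))
```

A well-calibrated model has an ECE near zero; an overconfident model, as described for standard deep learning classifiers, shows confidence well above accuracy.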
Additional information:
Deviating title according to the author's translation