Vogl, B. (2024). LLM calibration: A dual approach of post-processing and pre-processing calibration techniques in large language models for medical question answering [Diploma Thesis, Technische Universität Wien]. reposiTUm. https://doi.org/10.34726/hss.2024.118886
-
dc.identifier.uri
https://doi.org/10.34726/hss.2024.118886
-
dc.identifier.uri
http://hdl.handle.net/20.500.12708/198211
-
dc.description.abstract
This thesis investigates the performance of Large Language Models in answering medical multiple-choice questions and explores strategies to enhance their accuracy, confidence estimation, and calibration. Specifically, we analyze the capabilities of GPT-3.5 and Cohere on the MedMCQA dataset, focusing on prompting techniques, revision strategies, and post-processing calibration methods. Our goals include assessing the efficacy of Chain-of-Thought (CoT) prompting, examining the relationship between model confidence and correctness, and evaluating post-processing calibration techniques such as Platt scaling, beta calibration, and isotonic regression. Findings reveal GPT-3.5's superior accuracy compared to Cohere in medical question answering. However, CoT prompting did not significantly improve model performance, suggesting its limited effectiveness in this context. Model confidence correlated with answer accuracy, but discrepancies between predicted and actual performance underscored the importance of robust calibration methods. Revision strategies marginally improved accuracy, with models adjusting responses when prompted to reconsider. Post-processing calibration techniques, particularly isotonic regression, demonstrated significant improvements in the alignment between predicted probabilities and actual outcomes, enhancing model reliability.
en
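The post-processing calibration methods named in the abstract (Platt scaling and isotonic regression) can be illustrated with a short, self-contained sketch. The snippet below is not taken from the thesis: the `confidences` and `correct` arrays are synthetic stand-ins for per-question model confidences and answer correctness, and in practice the calibrators would be fit on a held-out calibration split before being applied to test data.

```python
# Illustrative sketch (not the thesis's code): post-processing calibration of
# model confidence scores with Platt scaling and isotonic regression, plus a
# simple expected calibration error (ECE) check.
import numpy as np
from sklearn.isotonic import IsotonicRegression
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
# Hypothetical data: raw confidences (deliberately over-confident) and 0/1 correctness.
confidences = rng.uniform(0.4, 1.0, size=500)
correct = (rng.uniform(size=500) < 0.6 * confidences).astype(int)

def expected_calibration_error(conf, acc, n_bins=10):
    """Bin predictions by confidence and average |accuracy - confidence| per bin."""
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(bins[:-1], bins[1:]):
        mask = (conf > lo) & (conf <= hi)
        if mask.any():
            ece += mask.mean() * abs(acc[mask].mean() - conf[mask].mean())
    return ece

# Platt scaling: a logistic regression fit on the raw confidence score.
platt = LogisticRegression().fit(confidences.reshape(-1, 1), correct)
platt_conf = platt.predict_proba(confidences.reshape(-1, 1))[:, 1]

# Isotonic regression: a monotone, non-parametric map from confidence to accuracy.
iso = IsotonicRegression(y_min=0.0, y_max=1.0, out_of_bounds="clip")
iso_conf = iso.fit_transform(confidences, correct)

for name, c in [("raw", confidences), ("platt", platt_conf), ("isotonic", iso_conf)]:
    print(f"{name:9s} ECE = {expected_calibration_error(c, correct):.3f}")
```

Both calibrators leave the ranking of answers unchanged and only remap confidence values, which is why they can be applied after the fact to closed-API models such as GPT-3.5 or Cohere.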
dc.language
English
-
dc.language.iso
en
-
dc.rights.uri
http://rightsstatements.org/vocab/InC/1.0/
-
dc.subject
LLM
en
dc.subject
Calibration
en
dc.subject
Chain-of-Thought
en
dc.subject
Medical Questioning
en
dc.subject
Diagnosis
en
dc.subject
LLM Confidence
en
dc.title
LLM calibration: A dual approach of post-processing and pre-processing calibration techniques in large language models for medical question answering
en
dc.type
Thesis
en
dc.type
Hochschulschrift
de
dc.rights.license
In Copyright
en
dc.rights.license
Urheberrechtsschutz
de
dc.identifier.doi
10.34726/hss.2024.118886
-
dc.contributor.affiliation
TU Wien, Österreich
-
dc.rights.holder
Bettina Vogl
-
dc.publisher.place
Wien
-
tuw.version
vor
-
tuw.thesisinformation
Technische Universität Wien
-
tuw.publication.orgunit
E194 - Institut für Information Systems Engineering