<div class="csl-bib-body">
<div class="csl-entry">Sula, J. (2026). <i>Faithful Attention Attribution in Vision Transformers for Chest X-Ray Interpretation</i> [Diploma Thesis, Technische Universität Wien]. reposiTUm. https://doi.org/10.34726/hss.2026.132361</div>
</div>
-
dc.identifier.uri
https://doi.org/10.34726/hss.2026.132361
-
dc.identifier.uri
http://hdl.handle.net/20.500.12708/226961
-
dc.description
Thesis not yet received by the library - data not verified
-
dc.description.abstract
Vision Transformers (ViTs) achieve strong performance in natural and medical imaging, yet their decision processes remain opaque. This is especially problematic in high-stakes settings like chest X-ray interpretation. TransMM is among the strongest attribution methods for ViTs, combining attention with class-specific gradients to highlight influential image patches. We ask whether injecting semantic structure from Sparse Autoencoders (SAEs) can further improve the faithfulness of such attributions. We introduce Feature-Gradient Attribution, which extends TransMM’s principle from attention space to feature space. SAEs are trained on residual streams to decompose activations into sparse, interpretable features, providing per-patch feature activations. We project gradients onto the SAE feature basis and compute feature-gradient scores that capture both which learned features are present and how they influence the target logit. These scores yield per-patch gates that modulate TransMM’s attention maps before relevance propagation, forming a lightweight, semantically informed correction. Across three datasets (chest X-rays, endoscopy, natural images), two architectures (finetuned ViT-B/16 and contrastively pre-trained CLIP ViT-B/32), and three complementary faithfulness metrics, our method improves attribution faithfulness consistently. Improvements are statistically significant (p<0.001) on all three metrics for one dataset and on two of three metrics for the remaining datasets. We observe gains of 10.5-34.8% on SaCo and 9.7-43.0% on Faithfulness Correlation, with Pixel Flipping improving by 1.8-10.8%. Notably, we never observe degradation relative to TransMM on any metric–dataset combination.
en
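The abstract above outlines a feature-gradient gating step: SAE feature activations are combined with gradients projected onto the SAE feature basis, and the resulting per-patch gates modulate TransMM's attention maps before relevance propagation. Below is a minimal sketch of that idea, assuming a PyTorch setting in which an SAE encoder returns sparse per-patch feature activations and its decoder rows span the feature basis; the names (feature_gradient_gates, sae_encoder, sae_decoder, gate_attention) and the rectified-sum gate are illustrative assumptions, not the thesis implementation.

import torch

def feature_gradient_gates(residual, grad_residual, sae_encoder, sae_decoder):
    # residual:      (P, d) residual-stream activations, one row per image patch
    # grad_residual: (P, d) gradient of the target logit w.r.t. the residual stream
    # sae_encoder:   callable mapping (P, d) -> (P, F) sparse feature activations
    # sae_decoder:   (F, d) feature directions forming the SAE feature basis
    feats = sae_encoder(residual)                  # which learned features are present
    grad_on_feats = grad_residual @ sae_decoder.T  # gradient projected onto the feature basis
    scores = feats * grad_on_feats                 # presence x influence, per feature and patch
    gates = scores.clamp(min=0).sum(dim=-1)        # collapse to one non-negative gate per patch
    return gates / (gates.max() + 1e-8)            # normalise so gates modulate rather than rescale

def gate_attention(attn_map, gates):
    # attn_map: (P, P) TransMM-style attention relevance map; gates: (P,)
    # Per-patch gates scale the map before relevance propagation continues.
    return attn_map * gates.unsqueeze(0)

In this sketch the per-feature score is a product of feature presence and its projected gradient, mirroring the abstract's description of scores that capture both which features are active and how they influence the target logit.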
dc.language
English
-
dc.language.iso
en
-
dc.rights.uri
http://rightsstatements.org/vocab/InC/1.0/
-
dc.subject
Vision Transformers
en
dc.subject
Attribution Methods
en
dc.subject
Faithfulness
en
dc.subject
Interpretability
en
dc.subject
Attention-Based Explanations
en
dc.subject
Sparse Autoencoders
en
dc.subject
Feature-Level Interpretability
en
dc.subject
Mechanistic Interpretability
en
dc.title
Faithful Attention Attribution in Vision Transformers for Chest X-Ray Interpretation