<div class="csl-bib-body">
<div class="csl-entry">Weijler, L. M., Mirza, J. M., Sick, L., Ekkazan, C., & Hermosilla, P. (2025). TTT-KD: Test-Time Training for 3D Semantic Segmentation Through Knowledge Distillation From Foundation Models. In <i>2025 International Conference on 3D Vision (3DV)</i> (pp. 1264–1274). IEEE. https://doi.org/10.1109/3DV66043.2025.00120</div>
</div>
-
dc.identifier.uri
http://hdl.handle.net/20.500.12708/221542
-
dc.description.abstract
Test-Time Training (TTT) proposes to adapt a pretrained network to changing data distributions on-the-fly. In this work, we propose the first TTT method for 3D semantic segmentation, TTT-KD, which models Knowledge Distillation (KD) from foundation models (e.g. DINOv2) as a self-supervised objective for adaptation to distribution shifts at test-time. Given access to paired image-pointcloud (2D-3D) data, we first optimize a 3D segmentation backbone for the main task of semantic segmentation using the pointclouds and the task of 2D → 3D KD by using an off-the-shelf 2D pre-trained foundation model. At test-time, our TTT-KD updates the 3D segmentation backbone for each test sample by using the self-supervised task of knowledge distillation before performing the final prediction. Extensive evaluations on multiple indoor and outdoor 3D segmentation benchmarks show the utility of TTT-KD, as it improves performance for both in-distribution (ID) and out-of-distribution (OOD) test datasets. We achieve a gain of up to 13 % mIoU (7 % on average) when the train and test distributions are similar and up to 45 % (20 % on average) when adapting to OOD test samples. The code is available in the following repository.
en
dc.language.iso
en
-
dc.relation.ispartofseries
International Conference on 3D Imaging, Modeling, Processing, Visualization and Transmission (3DIMPVT)
-
dc.subject
domain adaptation
en
dc.subject
point clouds
en
dc.subject
semantic segmentation
en
dc.subject
test-time training
en
dc.title
TTT-KD: Test-Time Training for 3D Semantic Segmentation Through Knowledge Distillation From Foundation Models
en
dc.type
Inproceedings
en
dc.type
Konferenzbeitrag
de
dc.contributor.affiliation
Massachusetts Institute of Technology, United States of America (the)
-
dc.contributor.affiliation
Universität Ulm, Germany
-
dc.contributor.affiliation
Yıldız Technical University, Turkey
-
dc.relation.isbn
979-8-3315-3851-4
-
dc.relation.doi
10.1109/3DV66043.2025
-
dc.relation.issn
2378-3826
-
dc.description.startpage
1264
-
dc.description.endpage
1274
-
dc.type.category
Full-Paper Contribution
-
dc.relation.eissn
2475-7888
-
tuw.booktitle
2025 International Conference on 3D Vision (3DV)
-
tuw.peerreviewed
true
-
tuw.relation.publisher
IEEE
-
tuw.researchTopic.id
I5
-
tuw.researchTopic.name
Visual Computing and Human-Centered Technology
-
tuw.researchTopic.value
100
-
tuw.publication.orgunit
E193-01 - Forschungsbereich Computer Vision
-
tuw.publisher.doi
10.1109/3DV66043.2025.00120
-
dc.description.numberOfPages
11
-
tuw.author.orcid
0000-0003-1660-0329
-
tuw.author.orcid
0000-0001-8578-8332
-
tuw.author.orcid
0009-0004-6524-0715
-
tuw.author.orcid
0009-0002-3853-0480
-
tuw.event.name
2025 International Conference on 3D Vision (3DV)
en
tuw.event.startdate
25-03-2025
-
tuw.event.enddate
28-03-2025
-
tuw.event.online
On Site
-
tuw.event.type
Event for scientific audience
-
tuw.event.country
SG
-
tuw.event.presenter
Weijler, Lisa Magdalena
-
wb.sciencebranch
Informatik
-
wb.sciencebranch
Mathematik
-
wb.sciencebranch.oefos
1020
-
wb.sciencebranch.oefos
1010
-
wb.sciencebranch.value
90
-
wb.sciencebranch.value
10
-
item.openairecristype
http://purl.org/coar/resource_type/c_5794
-
item.cerifentitytype
Publications
-
item.openairetype
conference paper
-
item.fulltext
no Fulltext
-
item.languageiso639-1
en
-
item.grantfulltext
restricted
-
crisitem.author.dept
E193-01 - Forschungsbereich Computer Vision
-
crisitem.author.dept
Massachusetts Institute of Technology, United States of America (the)
-
crisitem.author.dept
Universität Ulm, Germany
-
crisitem.author.dept
Yıldız Technical University, Turkey
-
crisitem.author.dept
E193-01 - Forschungsbereich Computer Vision
-
crisitem.author.orcid
0000-0003-1660-0329
-
crisitem.author.orcid
0009-0004-6524-0715
-
crisitem.author.orcid
0009-0002-3853-0480
-
crisitem.author.parentorg
E193 - Institut für Visual Computing and Human-Centered Technology
-
crisitem.author.parentorg
E193 - Institut für Visual Computing and Human-Centered Technology