<div class="csl-bib-body">
<div class="csl-entry">Spöcklberger, J., Lin, W., Hermosilla, P., Doveh, S., Possegger, H., & Mirza, J. M. (2025). Exploring Modality Guidance to Enhance VFM-Based Feature Fusion for UDA in 3D Semantic Segmentation. In <i>2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)</i> (pp. 4789–4798). IEEE. https://doi.org/10.1109/CVPRW67362.2025.00465</div>
</div>
-
dc.identifier.uri
http://hdl.handle.net/20.500.12708/223670
-
dc.description.abstract
Vision Foundation Models (VFMs) have become a de facto choice for many downstream vision tasks, such as image classification, image segmentation, and object localization. However, they can also provide significant utility for downstream 3D tasks that can leverage cross-modal information (e.g., from paired image data). In our work, we further explore the utility of VFMs for adapting from labeled source data to unlabeled target data for the task of LiDAR-based 3D semantic segmentation. Our method consumes paired 2D-3D (image and point cloud) data and relies on the robust (cross-domain) features from a VFM to train a 3D backbone on a mix of labeled source and unlabeled target data. At the heart of our method lies a fusion network that is guided by both the image and point cloud streams, with their relative contributions adjusted based on the target domain. We extensively compare our proposed methodology with different state-of-the-art methods in several settings and achieve strong performance gains. For example, we achieve an average improvement of 6.5 mIoU (over all tasks) compared with the previous state of the art.
en
dc.language.iso
en
-
dc.subject
3D semantic segmentation
en
dc.subject
3D unsupervised domain adaptation
en
dc.subject
autonomous driving
en
dc.subject
domain adaptation
en
dc.subject
feature fusion
en
dc.subject
LiDAR
en
dc.title
Exploring Modality Guidance to Enhance VFM-Based Feature Fusion for UDA in 3D Semantic Segmentation
en
dc.type
Inproceedings
en
dc.type
Konferenzbeitrag
de
dc.contributor.affiliation
Graz University of Technology, Austria
-
dc.contributor.affiliation
Energieinstitut an der Johannes Kepler Universität Linz, Austria
-
dc.contributor.affiliation
Graz University of Technology (90000) (Graz, AT)
-
dc.contributor.affiliation
Massachusetts Institute of Technology, United States of America (the)
-
dc.relation.isbn
979-8-3315-9994-2
-
dc.relation.doi
10.1109/CVPRW67362.2025
-
dc.relation.issn
2160-7508
-
dc.description.startpage
4789
-
dc.description.endpage
4798
-
dc.type.category
Full-Paper Contribution
-
dc.relation.eissn
2160-7516
-
tuw.booktitle
2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)
-
tuw.peerreviewed
true
-
tuw.relation.publisher
IEEE
-
tuw.researchTopic.id
I5
-
tuw.researchTopic.name
Visual Computing and Human-Centered Technology
-
tuw.researchTopic.value
100
-
tuw.publication.orgunit
E193-01 - Forschungsbereich Computer Vision
-
tuw.publisher.doi
10.1109/CVPRW67362.2025.00465
-
dc.description.numberOfPages
10
-
tuw.author.orcid
0009-0008-0881-9295
-
tuw.author.orcid
0000-0003-2431-0620
-
tuw.author.orcid
0000-0002-5427-9938
-
tuw.event.name
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops
-
tuw.event.startdate
11-06-2025
-
tuw.event.enddate
12-06-2025
-
tuw.event.online
On Site
-
tuw.event.type
Event for scientific audience
-
tuw.event.place
Nashville
-
tuw.event.country
US
-
tuw.event.presenter
Johannes Spöcklberger
-
wb.sciencebranch
Computer Sciences
-
wb.sciencebranch
Mathematics
-
wb.sciencebranch.oefos
1020
-
wb.sciencebranch.oefos
1010
-
wb.sciencebranch.value
90
-
wb.sciencebranch.value
10
-
item.openairetype
conference paper
-
item.openairecristype
http://purl.org/coar/resource_type/c_5794
-
item.cerifentitytype
Publications
-
item.languageiso639-1
en
-
item.grantfulltext
none
-
item.fulltext
no Fulltext
-
crisitem.author.dept
Graz University of Technology, Austria
-
crisitem.author.dept
Energieinstitut an der Johannes Kepler Universität Linz, Austria
-
crisitem.author.dept
E193-01 - Forschungsbereich Computer Vision
-
crisitem.author.dept
Graz University of Technology (90000) (Graz, AT)
-
crisitem.author.dept
Massachusetts Institute of Technology, United States of America (the)
-
crisitem.author.orcid
0009-0008-0881-9295
-
crisitem.author.orcid
0000-0003-2431-0620
-
crisitem.author.orcid
0000-0002-5427-9938
-
crisitem.author.parentorg
E193 - Institut für Visual Computing and Human-Centered Technology