<div class="csl-bib-body">
<div class="csl-entry">Sterzinger, R., Peer, M., & Sablatnig, R. (2025). Few-Shot Segmentation of Historical Maps via Linear Probing of Vision Foundation Models. In X.-C. Yin, D. Karatzas, & D. Lopresti (Eds.), <i>Document Analysis and Recognition – ICDAR 2025 : 19th International Conference Wuhan, China, September 16–21, 2025 Proceedings, Part III</i> (pp. 425–442). Springer. https://doi.org/10.1007/978-3-032-04624-6_25</div>
</div>
-
dc.identifier.uri
http://hdl.handle.net/20.500.12708/221773
-
dc.description.abstract
As rich sources of history, maps provide crucial insights into historical changes, yet their diverse visual representations and limited annotated data pose significant challenges for automated processing. We propose a simple yet effective approach for few-shot segmentation of historical maps, leveraging the rich semantic embeddings of large vision foundation models combined with parameter-efficient fine-tuning. Our method outperforms the state of the art on the Siegfried benchmark dataset in vineyard and railway segmentation, achieving +5% and +13% relative improvements in mIoU in 10-shot scenarios and around +20% in the more challenging 5-shot setting. Additionally, it demonstrates strong performance on the ICDAR 2021 competition dataset, attaining a mean PQ of 67.3% for building block segmentation, despite not being optimized for this shape-sensitive metric, underscoring its generalizability. Notably, our approach maintains high performance even in extremely low-data regimes (10- and 5-shot), while requiring only 689k trainable parameters – just 0.21% of the total model size. Our approach enables precise segmentation of diverse historical maps while drastically reducing the need for manual annotations, advancing automated processing and analysis in the field. Our implementation is publicly available at: https://github.com/RafaelSterzinger/few-shot-map-segmentation.
en
dc.language.iso
en
-
dc.relation.ispartofseries
Lecture Notes in Computer Science
-
dc.subject
Few-Shot Learning
en
dc.subject
Foundation Models
en
dc.subject
Historical Documents
en
dc.subject
Historical Maps
en
dc.subject
Low-Rank Adaptation
en
dc.subject
Semantic Segmentation
en
dc.title
Few-Shot Segmentation of Historical Maps via Linear Probing of Vision Foundation Models
en
dc.type
Inproceedings
en
dc.type
Konferenzbeitrag
de
dc.contributor.editoraffiliation
University of Science and Technology Beijing, China
-
dc.contributor.editoraffiliation
Universitat Autònoma de Barcelona, Spain
-
dc.contributor.editoraffiliation
Lehigh University, United States of America
-
dc.relation.isbn
978-3-032-04624-6
-
dc.relation.issn
0302-9743
-
dc.description.startpage
425
-
dc.description.endpage
442
-
dc.type.category
Full-Paper Contribution
-
dc.relation.eissn
1611-3349
-
tuw.booktitle
Document Analysis and Recognition – ICDAR 2025 : 19th International Conference Wuhan, China, September 16–21, 2025 Proceedings, Part III
-
tuw.container.volume
16025
-
tuw.peerreviewed
true
-
tuw.relation.publisher
Springer
-
tuw.relation.publisherplace
Cham
-
tuw.researchTopic.id
I5
-
tuw.researchTopic.name
Visual Computing and Human-Centered Technology
-
tuw.researchTopic.value
100
-
tuw.publication.orgunit
E193-01 - Forschungsbereich Computer Vision
-
tuw.publication.orgunit
E056-12 - Fachbereich ENROL DP
-
tuw.publication.orgunit
E056-18 - Fachbereich Visual Analytics and Computer Vision Meet Cultural Heritage
-
tuw.publisher.doi
10.1007/978-3-032-04624-6_25
-
dc.description.numberOfPages
18
-
tuw.author.orcid
0009-0001-0029-8463
-
tuw.author.orcid
0000-0001-6843-0830
-
tuw.author.orcid
0000-0003-4195-1593
-
tuw.event.name
The 19th International Conference on Document Analysis and Recognition (ICDAR 2025)
en
tuw.event.startdate
16-09-2025
-
tuw.event.enddate
21-09-2025
-
tuw.event.online
On Site
-
tuw.event.type
Event for scientific audience
-
tuw.event.place
Wuhan
-
tuw.event.country
CN
-
tuw.event.presenter
Sterzinger, Rafael
-
wb.sciencebranch
Informatik
-
wb.sciencebranch
Mathematik
-
wb.sciencebranch.oefos
1020
-
wb.sciencebranch.oefos
1010
-
wb.sciencebranch.value
90
-
wb.sciencebranch.value
10
-
item.openairecristype
http://purl.org/coar/resource_type/c_5794
-
item.cerifentitytype
Publications
-
item.openairetype
conference paper
-
item.fulltext
no Fulltext
-
item.languageiso639-1
en
-
item.grantfulltext
restricted
-
crisitem.author.dept
E193-01 - Forschungsbereich Computer Vision
-
crisitem.author.dept
E193-01 - Forschungsbereich Computer Vision
-
crisitem.author.dept
E193 - Institut für Visual Computing and Human-Centered Technology
-
crisitem.author.orcid
0009-0001-0029-8463
-
crisitem.author.orcid
0000-0001-6843-0830
-
crisitem.author.orcid
0000-0003-4195-1593
-
crisitem.author.parentorg
E193 - Institut für Visual Computing and Human-Centered Technology
-
crisitem.author.parentorg
E193 - Institut für Visual Computing and Human-Centered Technology