<div class="csl-bib-body">
<div class="csl-entry">Lin, T., Dadras, A., Kleber, F., & Sablatnig, R. (2025). DGME-T: Directional Grid Motion Encoding for Transformer-Based Historical Camera Movement Classification. In <i>SUMAC ’25: Proceedings of the 7th International Workshop on analySis, Understanding and proMotion of heritAge Contents</i> (pp. 13–21). The Association for Computing Machinery. https://doi.org/10.1145/3746273.3760209</div>
</div>
-
dc.identifier.uri
http://hdl.handle.net/20.500.12708/223142
-
dc.description.abstract
Camera movement classification (CMC) models trained on contemporary, high-quality footage often degrade when applied to archival film, where noise, missing frames, and low contrast obscure motion cues. We bridge this gap by assembling a unified benchmark that consolidates two modern corpora into four canonical classes and restructures the HISTORIAN collection into five balanced categories. Building on this benchmark, we introduce DGME-T, a lightweight extension to the Video Swin Transformer that injects directional grid motion encoding, derived from optical flow, via a learnable and normalized late-fusion layer. DGME-T raises the backbone's top-1 accuracy from 81.78% to 86.14% and its macro F1 from 82.08% to 87.81% on modern clips, while still improving the demanding World-War-II footage from 83.43% to 84.62% accuracy and from 81.72% to 82.63% macro F1. A cross-domain study further shows that an intermediate fine-tuning stage on modern data increases historical performance by more than five percentage points. These results demonstrate that structured motion priors and transformer representations are complementary and that even a small, carefully calibrated motion head can substantially enhance robustness in degraded film analysis.
en
dc.description.sponsorship
FWF - Österr. Wissenschaftsfonds
-
dc.language.iso
en
-
dc.subject
Historical Camera Movement Classification
en
dc.subject
Historical Video / Archival Film
en
dc.subject
Optical Flow
en
dc.subject
Domain Adaptation
en
dc.subject
Late-Fusion Layer
en
dc.subject
Robustness
en
dc.subject
Digital Heritage & Preservation
en
dc.subject
Video Transformer
en
dc.subject
Directional Grid Motion Encoding
en
dc.title
DGME-T: Directional Grid Motion Encoding for Transformer-Based Historical Camera Movement Classification
en
dc.type
Inproceedings
en
dc.type
Konferenzbeitrag
de
dc.contributor.affiliation
St. Pölten University of Applied Sciences, Austria
-
dc.relation.isbn
979-8-4007-2055-0
-
dc.description.startpage
13
-
dc.description.endpage
21
-
dc.relation.grantno
DFH 37-N
-
dc.type.category
Full-Paper Contribution
-
tuw.booktitle
SUMAC '25: Proceedings of the 7th International Workshop on analySis, Understanding and proMotion of heritAge Contents
-
tuw.peerreviewed
true
-
tuw.relation.publisher
The Association for Computing Machinery
-
tuw.project.title
Visuelle Analytik und Computer Vision treffen auf kulturelles Erbe
-
tuw.researchTopic.id
I5
-
tuw.researchTopic.name
Visual Computing and Human-Centered Technology
-
tuw.researchTopic.value
100
-
tuw.publication.orgunit
E193-01 - Forschungsbereich Computer Vision
-
tuw.publication.orgunit
E056-12 - Fachbereich ENROL DP
-
tuw.publication.orgunit
E056-18 - Fachbereich Visual Analytics and Computer Vision Meet Cultural Heritage
-
tuw.publisher.doi
10.1145/3746273.3760209
-
dc.description.numberOfPages
9
-
tuw.author.orcid
0009-0008-9825-686X
-
tuw.author.orcid
0000-0001-6474-7208
-
tuw.author.orcid
0000-0001-8351-5066
-
tuw.author.orcid
0000-0003-4195-1593
-
tuw.event.name
SUMAC '25: The 7th International Workshop on analySis, Understanding and proMotion of heritAge Contents
en
tuw.event.startdate
27-10-2025
-
tuw.event.enddate
31-10-2025
-
tuw.event.online
On Site
-
tuw.event.type
Event for scientific audience
-
tuw.event.place
Dublin
-
tuw.event.country
IE
-
tuw.event.presenter
Lin, Tingyu
-
wb.sciencebranch
Computer Sciences
-
wb.sciencebranch
Mathematics
-
wb.sciencebranch.oefos
1020
-
wb.sciencebranch.oefos
1010
-
wb.sciencebranch.value
90
-
wb.sciencebranch.value
10
-
item.grantfulltext
restricted
-
item.languageiso639-1
en
-
item.cerifentitytype
Publications
-
item.openairecristype
http://purl.org/coar/resource_type/c_5794
-
item.fulltext
no Fulltext
-
item.openairetype
conference paper
-
crisitem.author.dept
E193-01 - Forschungsbereich Computer Vision
-
crisitem.author.dept
St. Pölten University of Applied Sciences, Austria
-
crisitem.author.dept
E193-01 - Forschungsbereich Computer Vision
-
crisitem.author.dept
E193 - Institut für Visual Computing and Human-Centered Technology
-
crisitem.author.orcid
0009-0008-9825-686X
-
crisitem.author.orcid
0000-0001-6474-7208
-
crisitem.author.orcid
0000-0001-8351-5066
-
crisitem.author.orcid
0000-0003-4195-1593
-
crisitem.author.parentorg
E193 - Institut für Visual Computing and Human-Centered Technology
-
crisitem.author.parentorg
E193 - Institut für Visual Computing and Human-Centered Technology