<div class="csl-bib-body">
<div class="csl-entry">Thoma, M., Villasante, J., Aghajanzadeh, E., Balamuthu Sampath, S., Mori, P., Groetzinger, M., Dylkin, D., Vemparala, M.-R., Fasfous, N., Frickenstein, A., Mueller-Gritschneder, D., & Schlichtmann, U. (2025). Flar-SVD: Fast and Latency-Aware Singular Value Decomposition for Model Compression. In <i>2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)</i> (pp. 1889–1898). IEEE. https://doi.org/10.1109/CVPRW67362.2025.00178</div>
</div>
-
dc.identifier.uri
http://hdl.handle.net/20.500.12708/222805
-
dc.description.abstract
Advanced deep learning architectures have achieved exceptional prediction performance but come with significant computational demands, posing challenges for deployment on resource-constrained hardware such as edge devices. While pruning techniques offer a way to reduce model complexity, they often lead to substantial accuracy loss and can require extensive retraining. Alternatively, Singular Value Decomposition (SVD) provides a promising solution by decomposing model weights into lower-dimensional representations, thus maintaining a closer representation of the original features and preserving accuracy. Despite progress in this domain, approaches targeting vision model architectures typically rely on uniform compression or on slow, computationally expensive rank search methods that do not account for latency improvements. In this paper, we introduce Fast, Latency-Aware Rank Singular Value Decomposition (FLAR-SVD), a novel approach that leverages inherent SVD properties to accelerate the rank search process and incorporates latency tuning to further optimize performance for hardware targets. We demonstrate the capability of our approach across CNN, ViT, and Mamba architectures on both server and edge hardware. For DeiT, we achieve 81.0% accuracy on ImageNet with only 1 epoch of fine-tuning, while reducing latency by 30% relative to the baseline. Our code is available at https://github.com/MoritzTho/FLAR-SVD.
en
dc.language.iso
en
-
dc.subject
convolutional neural networks
en
dc.subject
deep learning
en
dc.subject
edge computing
en
dc.subject
latency-aware optimization
en
dc.subject
model pruning
en
dc.subject
singular value decomposition
en
dc.title
Flar-SVD: Fast and Latency-Aware Singular Value Decomposition for Model Compression
en
dc.type
Inproceedings
en
dc.type
Konferenzbeitrag
de
dc.contributor.affiliation
Technical University of Munich, Germany
-
dc.contributor.affiliation
Technical University of Munich, Germany
-
dc.contributor.affiliation
Technical University of Munich, Germany
-
dc.contributor.affiliation
Technical University of Munich, Germany
-
dc.contributor.affiliation
BMW Group (Germany), Germany
-
dc.contributor.affiliation
Technical University of Munich, Germany
-
dc.contributor.affiliation
Technical University of Munich, Germany
-
dc.contributor.affiliation
BMW Group (Germany), Germany
-
dc.contributor.affiliation
BMW Group (Germany), Germany
-
dc.contributor.affiliation
BMW Group (Germany), Germany
-
dc.contributor.affiliation
Technical University of Munich, Germany
-
dc.relation.isbn
979-8-3315-9994-2
-
dc.relation.doi
10.1109/CVPRW67362.2025
-
dc.description.startpage
1889
-
dc.description.endpage
1898
-
dc.type.category
Full-Paper Contribution
-
tuw.booktitle
2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)
-
tuw.peerreviewed
true
-
tuw.relation.publisher
IEEE
-
tuw.researchTopic.id
I2
-
tuw.researchTopic.name
Computer Engineering and Software-Intensive Systems
-
tuw.researchTopic.value
100
-
tuw.publication.orgunit
E191-02 - Forschungsbereich Embedded Computing Systems
-
tuw.publisher.doi
10.1109/CVPRW67362.2025.00178
-
dc.description.numberOfPages
10
-
tuw.author.orcid
0009-0005-1988-3848
-
tuw.author.orcid
0000-0003-0903-631X
-
tuw.event.name
The IEEE/CVF Conference on Computer Vision and Pattern Recognition 2025
en
tuw.event.startdate
11-06-2025
-
tuw.event.enddate
12-06-2025
-
tuw.event.online
On Site
-
tuw.event.type
Event for scientific audience
-
tuw.event.place
Nashville
-
tuw.event.country
US
-
tuw.event.presenter
Thoma, Moritz
-
wb.sciencebranch
Informatik
-
wb.sciencebranch
Elektrotechnik, Elektronik, Informationstechnik
-
wb.sciencebranch
Mathematik
-
wb.sciencebranch.oefos
1020
-
wb.sciencebranch.oefos
2020
-
wb.sciencebranch.oefos
1010
-
wb.sciencebranch.value
50
-
wb.sciencebranch.value
40
-
wb.sciencebranch.value
10
-
item.openairetype
conference paper
-
item.openairecristype
http://purl.org/coar/resource_type/c_5794
-
item.cerifentitytype
Publications
-
item.languageiso639-1
en
-
item.grantfulltext
none
-
item.fulltext
no Fulltext
-
crisitem.author.dept
Technical University of Munich, Germany
-
crisitem.author.dept
Technical University of Munich, Germany
-
crisitem.author.dept
Technical University of Munich, Germany
-
crisitem.author.dept
Technical University of Munich, Germany
-
crisitem.author.dept
BMW Group (Germany), Germany
-
crisitem.author.dept
Technical University of Munich, Germany
-
crisitem.author.dept
Technical University of Munich, Germany
-
crisitem.author.dept
BMW Group (Germany), Germany
-
crisitem.author.dept
BMW Group (Germany), Germany
-
crisitem.author.dept
BMW Group (Germany), Germany
-
crisitem.author.dept
E191-02 - Forschungsbereich Embedded Computing Systems