reposiTUm: Comparative Analysis of Fashion Captioning for Multimodal Fashion Recommendation

DC Field

Value

Language

dc.contributor.author

Rippberger Fonseca, Maria De Los Angeles Gwendolyn Aglae

dc.contributor.author

Neidhardt, Julia

dc.date.accessioned

2026-02-13T08:26:53Z

dc.date.available

2026-02-13T08:26:53Z

dc.date.issued

2025-09-22

dc.identifier.citation

Rippberger Fonseca, M. D. L. A. G. A., & Neidhardt, J. (2025, September 22). Comparative Analysis of Fashion Captioning for Multimodal Fashion Recommendation [Conference Presentation]. 19th ACM Conference on Recommender Systems (RecSys ’25), Prague, Czechia. http://hdl.handle.net/20.500.12708/226348

dc.identifier.uri

http://hdl.handle.net/20.500.12708/226348

dc.description.abstract

Multimodal information provides new opportunities for recommender systems, especially in the fashion domain, where visual and textual information can be combined to give a comprehensive understanding of a product. In this work, we focus on fashion captioning, a specialized form of image captioning for fashion items. We fine-tune pretrained vision-language models on two distinct fashion datasets to evaluate how effectively they capture dataset-specific ground truths. The fine-tuned models achieve results competitive with those of specifically trained models. The resulting captioning models are applied in two key scenarios: (1) as components for generating richer multimodal embeddings in recommender systems, and (2) for modality imputation, where automatically generated descriptions fill in missing textual data. We show that which modality works best depends on the size of the dataset and the list length, but that none outperforms traditional item-based collaborative filtering on a real-life dataset with over 1M users and 31M transactions. Additionally, we present a detailed analysis of the two fashion datasets, highlighting critical aspects such as item presentation and textual style, which are often overlooked yet essential for effective modeling.

dc.language.iso

dc.subject

Multimodal Recommendation

dc.subject

Fashion Captioning

dc.subject

NLP

dc.subject

Generative AI

dc.title

Comparative Analysis of Fashion Captioning for Multimodal Fashion Recommendation

dc.type

Presentation

dc.type

Talk

dc.type.category

Conference Presentation

tuw.researchTopic.id

tuw.researchTopic.name

Computer Science Foundations

tuw.researchTopic.value

100

tuw.linking

https://github.com/omgwenxx/multimodal-fashion-analysis

tuw.publication.orgunit

E194-04 - Forschungsbereich Data Science

tuw.publication.orgunit

E056-23 - Fachbereich Innovative Combinations and Applications of AI and ML (iCAIML)

tuw.publication.orgunit

E056-27 - Fachbereich Digital Humanism

tuw.author.orcid

0000-0001-7184-1841

tuw.event.name

19th ACM Conference on Recommender Systems (RecSys '25)

tuw.event.startdate

22-09-2025

tuw.event.enddate

26-09-2025

tuw.event.online

On Site

tuw.event.type

Event for scientific audience

tuw.event.place

Prague

tuw.event.country

tuw.event.presenter

Rippberger Fonseca, Maria De Los Angeles Gwendolyn Aglae

wb.sciencebranch

Computer Science

wb.sciencebranch

Economics

wb.sciencebranch.oefos

1020

wb.sciencebranch.oefos

5020

wb.sciencebranch.value

item.openairecristype

http://purl.org/coar/resource_type/c_18cp

item.fulltext

no Fulltext

item.languageiso639-1

item.grantfulltext

none

item.openairetype

conference paper not in proceedings

item.cerifentitytype

Publications

crisitem.author.dept

E194-04 - Forschungsbereich Data Science

crisitem.author.dept

E194-04 - Forschungsbereich Data Science

crisitem.author.orcid

0000-0001-7184-1841

crisitem.author.parentorg

E194 - Institut für Information Systems Engineering

crisitem.author.parentorg

E194 - Institut für Information Systems Engineering

Appears in Collections:

Presentation

Other

Adobe PDF

(731.07 kB)

Preprint of Proceedings Contribution

CC BY 4.0
