<div class="csl-bib-body">
<div class="csl-entry">Schindler, A. (2019). <i>Multi-modal music information retrieval: augmenting audio-analysis with visual computing for improved music Video analysis</i> [Dissertation, Technische Universität Wien]. reposiTUm. https://doi.org/10.34726/hss.2019.72065</div>
</div>
-
dc.identifier.uri
https://doi.org/10.34726/hss.2019.72065
-
dc.identifier.uri
http://hdl.handle.net/20.500.12708/4496
-
dc.description.abstract
This thesis focuses on harnessing the information provided by the visual layer of music videos to augment and improve tasks in the research domain of Music Information Retrieval (MIR). The main hypothesis of this work is based on the observation that certain expressive categories, such as genre or theme, can be recognized solely from the visual content, without the sound being heard. This leads to the hypothesis that there exists a visual language that is used to express mood or genre. It follows that this visual information is music-related and should therefore be beneficial for the corresponding MIR tasks, such as music genre classification or mood recognition. The validation of these hypotheses is first based on a literature search in the Musicology and Music Psychology research domains to identify production processes in music videos or visual branding in the music business. The analytical approach is based on a series of comprehensive experiments and evaluations of visual features concerning their ability to describe music-related information. These evaluations range from low-level visual features to high-level concepts. Additionally, new visual features that capture rhythmic visual patterns are introduced. Experimental results showed that the developed audio-visual approaches improved over the audio-based benchmark in the conducted experiments for the three prominent MIR tasks Artist Identification, Music Genre Classification and Cross-Genre Classification. Finally, the experimental results were compared to the findings from the literature review, which revealed correspondences between the identified production processes and the quantitatively determined audio-visual correlations. Thus, well-known and documented visual stereotypes (e.g., cowboy hat/Country music, swimsuit/Dance, fire/Heavy Metal), the choice of particular colours, and theme-specific symbols could be confirmed.
en
dc.language
English
-
dc.language.iso
en
-
dc.rights.uri
http://rightsstatements.org/vocab/InC/1.0/
-
dc.subject
Music Information Retrieval
en
dc.subject
Multi-Modal Information Retrieval
en
dc.subject
Audio-Visual Analysis
en
dc.subject
Machine Learning
en
dc.title
Multi-modal music information retrieval: augmenting audio-analysis with visual computing for improved music Video analysis
en
dc.type
Thesis
en
dc.type
Hochschulschrift
de
dc.rights.license
In Copyright
en
dc.rights.license
Urheberrechtsschutz
de
dc.identifier.doi
10.34726/hss.2019.72065
-
dc.contributor.affiliation
TU Wien, Österreich
-
dc.rights.holder
Alexander Schindler
-
dc.publisher.place
Wien
-
tuw.version
vor
-
tuw.thesisinformation
Technische Universität Wien
-
tuw.publication.orgunit
E194 - Institut für Information Systems Engineering
-
dc.type.qualificationlevel
Doctoral
-
dc.identifier.libraryid
AC15508631
-
dc.description.numberOfPages
169
-
dc.identifier.urn
urn:nbn:at:at-ubtuw:1-130913
-
dc.thesistype
Dissertation
de
dc.thesistype
Dissertation
en
tuw.author.orcid
0000-0002-4881-6741
-
dc.rights.identifier
In Copyright
en
dc.rights.identifier
Urheberrechtsschutz
de
tuw.advisor.staffStatus
staff
-
tuw.advisor.orcid
0000-0002-9272-6225
-
item.openairetype
doctoral thesis
-
item.fulltext
with Fulltext
-
item.cerifentitytype
Publications
-
item.openaccessfulltext
Open Access
-
item.mimetype
application/pdf
-
item.languageiso639-1
en
-
item.openairecristype
http://purl.org/coar/resource_type/c_db06
-
item.grantfulltext
open
-
crisitem.author.dept
E194 - Institut für Information Systems Engineering