<div class="csl-bib-body">
<div class="csl-entry">Dallinger, D. (2025). <i>Raw Audio Piano Synthesis with Structured State Space Models</i> [Diploma Thesis, Technische Universität Wien]. reposiTUm. https://doi.org/10.34726/hss.2025.128685</div>
</div>
-
dc.identifier.uri
https://doi.org/10.34726/hss.2025.128685
-
dc.identifier.uri
http://hdl.handle.net/20.500.12708/215487
-
dc.description
Thesis not yet received by the library - data not verified
-
dc.description
Title differs according to the author's translation
-
dc.description.abstract
This thesis introduces Piano-SSM, a novel Structured State Space Model (SSM) architecture for real-time raw piano audio synthesis. Unlike conventional neural audio synthesis models, Piano-SSM focuses on computational efficiency by exploiting the advantages of SSMs, such as computational complexity that is linear in the sequence length and constant memory consumption. The proposed model synthesizes audio directly from Musical Instrument Digital Interface (MIDI) input. The network requires neither intermediate spectral representations nor domain-specific expert knowledge, which simplifies training and improves accessibility. Evaluations on the MIDI and Audio Edited for Synchronous TRacks and Organization (MAESTRO) dataset show that Piano-SSM achieves a Multi-Scale Spectral Loss (MSSL) comparable to state-of-the-art models. Moreover, evaluations on the MIDI Aligned Piano Sounds (MAPS) dataset demonstrate the model’s generalization capabilities when trained on very limited data. Further experiments on the MAESTRO dataset highlight the model’s ability to be trained at a high sampling rate while synthesizing at lower sampling rates. Finally, using a custom C++ implementation, the thesis demonstrates Piano-SSM’s ability to synthesize high-quality piano audio in real time.
en
dc.language
English
-
dc.language.iso
en
-
dc.rights.uri
http://rightsstatements.org/vocab/InC/1.0/
-
dc.subject
Deep Neural Networks
en
dc.subject
Machine Learning
en
dc.subject
State Space Models
en
dc.subject
audio classification
en
dc.title
Raw Audio Piano Synthesis with Structured State Space Models