<div class="csl-bib-body">
<div class="csl-entry">Naghibzadeh-Jalali, S.-A. (2018). <i>Sound event detection with deep neural networks</i> [Diploma Thesis, Technische Universität Wien]. reposiTUm. https://doi.org/10.34726/hss.2018.42625</div>
</div>
-
dc.identifier.uri
https://doi.org/10.34726/hss.2018.42625
-
dc.identifier.uri
http://hdl.handle.net/20.500.12708/5458
-
dc.description
Abweichender Titel nach Übersetzung der Verfasserin/des Verfassers
-
dc.description.abstract
Acoustic Sound Event Detection (SED) has been extensively studied over the past years and is considered an emerging topic in Computational Auditory Scene Analysis (CASA) research, which relates to the cocktail party effect. SED systems try to replicate this ability of the human brain, which enables humans to detect events occurring in the environmental sounds around them. Therefore, these systems are trained to classify sound events in the input audio signals. A sound event is a label used by humans to describe and identify an event in an audio sequence. The methodology used in this thesis is Artificial Neural Networks (ANNs), which have already shown robust performance on complicated tasks such as Speech Recognition, Natural Language Processing and Image Classification. Different audio input representations such as the Constant-Q Transform, Mel Frequency Cepstral Coefficients (MFCC) and the Mel Spectrogram were also tested, of which the Mel Spectrogram proved to be the best representation among those mentioned. The ANN architectures studied in this work are the Recurrent Neural Network (RNN) and its extension, the Long Short-Term Memory (LSTM), and the Convolutional Neural Network (CNN). The RNN architecture was chosen for its ability to capture the temporal behaviour of its inputs, and the CNN architecture for its ability to learn high-level features through its convolutional layers. To generalize the constructed models, data augmentation was performed, and the dropout technique was applied to avoid overfitting. To evaluate the performance of these models, two datasets provided by the DCASE community for their DCASE 2017 challenge were used. The experimental results of this thesis show the robustness of deep neural networks in comparison with the conventional Multilayer Perceptron and Support Vector Machines, which serve as the baseline systems.
en
dc.language
English
-
dc.language.iso
en
-
dc.rights.uri
http://rightsstatements.org/vocab/InC/1.0/
-
dc.subject
deep learning
en
dc.subject
deep neural networks
en
dc.subject
audio event detection
en
dc.subject
sound event detection
en
dc.subject
acoustic event detection
en
dc.subject
event detection
en
dc.title
Sound event detection with deep neural networks
en
dc.title.alternative
Akustische Szenenanalyse mit Deep Neural Networks
de
dc.type
Thesis
en
dc.type
Hochschulschrift
de
dc.rights.license
In Copyright
en
dc.rights.license
Urheberrechtsschutz
de
dc.identifier.doi
10.34726/hss.2018.42625
-
dc.contributor.affiliation
TU Wien, Österreich
-
dc.rights.holder
Seyedeh-Anahid Naghibzadeh-Jalali
-
dc.publisher.place
Wien
-
tuw.version
vor
-
tuw.thesisinformation
Technische Universität Wien
-
tuw.publication.orgunit
E188 - Institut für Softwaretechnik und Interaktive Systeme
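The abstract reports that the Mel Spectrogram was the best-performing input representation among those tested. Purely as an illustration (this is not the author's code, and none of it comes from the thesis), the following is a minimal NumPy sketch of log-Mel feature extraction of the kind typically fed to the CNN/RNN models described; all parameter choices (16 kHz sample rate, 512-point FFT, 256-sample hop, 40 Mel bands) are assumptions, not values taken from the work.

```python
import numpy as np

def hz_to_mel(f):
    # Standard HTK-style mel scale conversion
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_mels, n_fft, sr):
    # Triangular filters with centers spaced evenly on the mel scale
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(1, n_mels + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, center):          # rising slope
            fb[i - 1, k] = (k - left) / max(center - left, 1)
        for k in range(center, right):         # falling slope
            fb[i - 1, k] = (right - k) / max(right - center, 1)
    return fb

def mel_spectrogram(y, sr=16000, n_fft=512, hop=256, n_mels=40):
    # Frame the signal, apply a Hann window, take the power spectrum,
    # then project onto the mel filterbank and log-compress
    window = np.hanning(n_fft)
    n_frames = 1 + (len(y) - n_fft) // hop
    frames = np.stack([y[i * hop:i * hop + n_fft] * window
                       for i in range(n_frames)])
    power = np.abs(np.fft.rfft(frames, n=n_fft)) ** 2
    mel = power @ mel_filterbank(n_mels, n_fft, sr).T
    return np.log(mel + 1e-10)

# Demo: one second of a 440 Hz tone at 16 kHz
sr = 16000
t = np.arange(sr) / sr
y = np.sin(2 * np.pi * 440.0 * t)
S = mel_spectrogram(y, sr=sr)
print(S.shape)  # (time frames, mel bands)
```

The resulting (frames × bands) matrix is the kind of 2-D time-frequency input that a CNN can treat as an image and an RNN/LSTM can consume frame by frame, which matches the architectural motivations given in the abstract.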