<div class="csl-bib-body">
<div class="csl-entry">Plattner, M. (2023). <i>On SGD with momentum</i> [Diploma Thesis, Technische Universität Wien]. reposiTUm. https://doi.org/10.34726/hss.2023.106165</div>
</div>
-
dc.identifier.uri
https://doi.org/10.34726/hss.2023.106165
-
dc.identifier.uri
http://hdl.handle.net/20.500.12708/187497
-
dc.description.abstract
Stochastic Gradient Descent (SGD) is the workhorse for training large-scale machine learning applications. Although the convergence rate of its deterministic counterpart, Gradient Descent (GD), can be shown to be accelerated by adaptations that use the notion of momentum, e.g., Heavy Ball (HB) or Nesterov Accelerated Gradient (NAG), the theory could not prove, by means of local convergence analysis, that such modifications provide faster convergence rates in the stochastic setting. This work empirically establishes that a positive momentum coefficient in SGD has the effect of enlarging the algorithm’s learning rate rather than contributing to a boost in performance per se. For the deep learning setting, however, this enlargement tends to be conducted in a way that is robust to unfavorable initialization points. Given these findings, this work derives a heuristic, the Momentum Linear Scaling Rule (MLSR), to transfer from a small-batch setting to a large-batch setting in deep learning while approximately maintaining the same generalization performance.
en
dc.language
English
-
dc.language.iso
en
-
dc.rights.uri
http://rightsstatements.org/vocab/InC/1.0/
-
dc.subject
SGD
en
dc.subject
optimization
en
dc.subject
machine learning
en
dc.subject
momentum
en
dc.title
On SGD with momentum
en
dc.type
Thesis
en
dc.type
Hochschulschrift
de
dc.rights.license
In Copyright
en
dc.rights.license
Urheberrechtsschutz
de
dc.identifier.doi
10.34726/hss.2023.106165
-
dc.contributor.affiliation
TU Wien, Österreich
-
dc.rights.holder
Maximilian Plattner
-
dc.publisher.place
Wien
-
tuw.version
vor
-
tuw.thesisinformation
Technische Universität Wien
-
dc.contributor.assistant
Drucks, Tamara
-
tuw.publication.orgunit
E194 - Institut für Information Systems Engineering