Fränkel, C. (2018). Long short-term memory neural networks for one-step time series forecasting [Diploma Thesis, Technische Universität Wien]. reposiTUm. https://doi.org/10.34726/hss.2018.45860
Keywords: LSTM; forecasting; time series; Bayesian hyperparameter optimization; deep learning; neural networks; Internet of Things
Abstract:
This master's thesis aims to employ Long Short-Term Memory (LSTM) neural networks for one-step and multi-step time series forecasts. For this endeavor, a software stack, including a deep learning framework, is selected, and various machine learning and statistical models are implemented. The performance of the LSTM approaches is compared to carefully chosen benchmark methods on an exemplary real-world problem, and the experiments are run on powerful, cloud-based machines. To provide a methodological framework for time series forecasting projects, a seven-phase process model is elaborated. Furthermore, to allow for model selection of computationally intensive deep learning methods under limited resources, a modified form of blocked cross-validation, together with a multi-stage Bayesian hyperparameter optimization approach, is proposed. The proof of concept of the proposed methodology is conducted on the real-world problem in the domain of electricity demand forecasting. The implemented LSTM model clearly outperformed the benchmark models on all performance measures in the one-step walk-forward out-of-sample test and showed a roughly 10% lower root mean square error than the second-best model, which utilized double seasonal Holt-Winters exponential smoothing. Inspired by work in the area of natural language processing, an encoder-decoder LSTM neural network was implemented for the multi-step scenario, as simpler architectures showed disappointing results. The multi-step LSTM forecaster also proved to be a competitive approach, although the purely statistical model retained the lead; however, due to resource constraints, it was not possible to obtain statements at the same level of validity as in the one-step case. Comparing the LSTM forecaster with the predictive performance of simple recurrent neural networks indicated the added value of the LSTM's more complex, gated cell architecture. A downside of LSTM neural networks is their relatively long training time, which can be a problem for exhaustive hyperparameter searches. On the other hand, LSTM neural networks demonstrated good generalization ability and required comparatively infrequent retraining.
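To make the evaluation scheme concrete, the sketch below illustrates blocked cross-validation splits and a one-step walk-forward out-of-sample test in Python. It is a minimal illustration under assumptions, not the thesis's code: the names blocked_splits and walk_forward_rmse and the predict_next model interface are hypothetical.

```python
# Minimal sketch (assumed names and interfaces, not the thesis's implementation).
import numpy as np

def blocked_splits(n_samples, n_blocks, val_fraction=0.2):
    """Blocked cross-validation: split the series into contiguous,
    non-overlapping blocks; within each block the validation part
    always comes after the training part, preserving temporal order."""
    block_size = n_samples // n_blocks
    for b in range(n_blocks):
        start = b * block_size
        stop = start + block_size
        split = stop - int(block_size * val_fraction)
        yield np.arange(start, split), np.arange(split, stop)

def walk_forward_rmse(model, series, n_test):
    """One-step walk-forward test: predict each of the last n_test points
    from the history observed up to that point, then aggregate as RMSE."""
    errors = []
    for t in range(len(series) - n_test, len(series)):
        y_hat = model.predict_next(series[:t])  # hypothetical model interface
        errors.append(series[t] - y_hat)
    return float(np.sqrt(np.mean(np.square(errors))))
```

Keeping the folds contiguous and placing validation strictly after training is what distinguishes blocked cross-validation from ordinary shuffled k-fold splitting and avoids look-ahead leakage during hyperparameter selection.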
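The encoder-decoder architecture mentioned for the multi-step scenario can likewise be sketched; the Keras framework choice, layer sizes, and window lengths below are placeholder assumptions rather than the configuration used in the thesis.

```python
# Illustrative encoder-decoder LSTM for multi-step forecasting
# (assumed framework and hyperparameters, not the thesis's settings).
import tensorflow as tf

n_in, n_out, n_features, n_units = 168, 24, 1, 64  # e.g. hourly demand windows

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(n_in, n_features)),
    # Encoder: compress the input window into a fixed-size state vector.
    tf.keras.layers.LSTM(n_units),
    # Repeat that vector once per forecast step to feed the decoder.
    tf.keras.layers.RepeatVector(n_out),
    # Decoder: unroll the repeated state into a sequence of hidden outputs.
    tf.keras.layers.LSTM(n_units, return_sequences=True),
    # One scalar forecast per decoder step.
    tf.keras.layers.TimeDistributed(tf.keras.layers.Dense(1)),
])
model.compile(optimizer="adam", loss="mse")
```

With this layout, model.fit takes input windows of shape (batch, n_in, 1) and multi-step targets of shape (batch, n_out, 1), so the whole forecast horizon is produced in a single forward pass.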