Frohner, N. (2023). Advancing state space search for static and dynamic optimization by parallelization and learning [Dissertation, Technische Universität Wien]. reposiTUm. https://doi.org/10.34726/hss.2023.113960
state space search; beam search; adaptive large neighborhood search; data-parallel search; traveling tournament problem; permutation flowshop problem; dynamic and stochastic vehicle routing; same-day delivery problems; value function approximation
Language: en
Abstract:
In this thesis, we aim to advance state space search for static combinatorial optimization and stochastic decision-making problems. With the recent rise of data-driven approaches based on machine learning and the growing availability of high-performance computing resources, we have new tools at our disposal to push the practical limits of existing solution approaches further. Motivated by successful applications of the well-known heuristic state space search method beam search, we implement and evaluate a general, data-parallel beam search framework in Julia for static combinatorial optimization problems. We demonstrate substantial speedups on the challenging traveling tournament problem (TTP), the permutation flowshop problem, and the maximum independent set problem, with medium to high parallel efficiency for sufficiently large problem instances. Large-scale runs allow us to find many new best feasible solutions for long-standing difficult benchmark instances of the TTP.

As a challenging stochastic decision-making problem, we consider a real-world same-hour delivery problem originating from an online supermarket in Vienna that promises to deliver orders within the hour. The task there is to automatically dispatch a fleet of vehicles by planning their delivery routes and potential overtime in reaction to unknown orders arriving dynamically throughout the day, with the objective of maximizing customer satisfaction while keeping delivery costs low. We propose various policies based on a double-horizon approach, scenario sampling, surrogate functions, and value function learning, and compare them with myopic baseline policies. A key observation is that the potentially high computational demand of a classical scenario sampling approach can be replaced by offline training of a machine learning model that is then used effectively in online decision making.
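The central heuristic in the first part, beam search, can be sketched in a few lines. Below is a minimal, purely illustrative Python version (the thesis framework itself is implemented in Julia, and all names and the toy objective here are hypothetical, not taken from the thesis), applied to a small scheduling-flavored example:

```python
import heapq

def beam_search(root, expand, cost, is_goal, beam_width):
    """Minimal beam search sketch: at each level, expand every state in
    the beam and keep only the beam_width cheapest non-goal successors."""
    beam, best = [root], None
    while beam:
        successors = [s for state in beam for s in expand(state)]
        for s in successors:  # track the cheapest goal state seen so far
            if is_goal(s) and (best is None or cost(s) < cost(best)):
                best = s
        frontier = [s for s in successors if not is_goal(s)]
        beam = heapq.nsmallest(beam_width, frontier, key=cost)  # prune
    return best

# Toy example: sequence five jobs to minimize total completion time
# (a flowshop-flavored stand-in; the real benchmark problems are far richer).
jobs = [3, 1, 4, 1, 5]  # hypothetical processing times

def expand(seq):
    # Append any not-yet-scheduled job to the partial sequence.
    return [seq + (j,) for j in range(len(jobs)) if j not in seq]

def cost(seq):
    # Sum of job completion times under this partial order.
    t = total = 0
    for j in seq:
        t += jobs[j]
        total += t
    return total

best = beam_search((), expand, cost,
                   lambda s: len(s) == len(jobs), beam_width=2)
print(best, cost(best))
```

The expansion and evaluation of the states within one beam level are independent of each other, which is what makes a data-parallel variant natural: successor generation and cost computation can be distributed across workers, with synchronization needed only for the level-wise selection of the surviving beam.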