Binder, M. (2021). Shape optimization based on reinforcement learning [Diploma Thesis, Technische Universität Wien]. reposiTUm. https://doi.org/10.34726/hss.2021.86842
The main focus of this thesis is to explore the feasibility of learning-based algorithms such as Reinforcement Learning (RL) as a data-driven alternative to classical optimization algorithms. For this purpose, a simple T-shaped geometry, which can be seen as an abstraction of the flow channel inside a profile extruder, is optimized with two different RL algorithms. First, a test function for optimization is introduced to establish whether the RL algorithm works and whether its training can be improved. Based on this test function, a reward function is shaped and a hyperparameter study is performed. The results show that a dynamic reward function is most suitable for this task and that the standard hyperparameters are sufficient and do not need to be changed.

For the shape optimization task, a specific mass flow ratio between the two outflows of the geometry has to be achieved. The flow channel geometry is parameterized by two different methods: one changes the corner points of the geometry directly, while the other applies Free-Form Deformation (FFD), which deforms a box surrounding the object to change its shape. The experiments are carried out in order of increasing Degrees Of Freedom (DOF), as this turns out to be a measure of task difficulty. The RL algorithms are trained for a fixed number of episodes and evaluated on whether they achieve the pre-defined goal of a specific mass flow ratio and whether learning decreases the number of time steps needed per episode.

The RL algorithms tested, namely Advantage Actor Critic (A2C) and Proximal Policy Optimization (PPO), both achieve the pre-defined goals most of the time. In the tasks with direct changes of the corner coordinates, the algorithms improve their policy, while their performance stays fairly constant for the FFD task, probably because it has too many DOF. In the test cases where the agents can improve their policy, the A2C agent outperforms the PPO agent.
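The goal of a specific mass flow ratio between the two outflows lends itself to a dense reward signal. As a minimal sketch only (the thesis's actual reward function is not reproduced here, and the tolerance and bonus values below are assumptions), one could penalize the relative deviation of the current ratio from the target and end the episode once the ratio is close enough:

```python
def ratio_reward(m_dot_1, m_dot_2, target_ratio, tol=0.01):
    """Hypothetical reward for a two-outflow mass flow ratio goal.

    m_dot_1, m_dot_2 : mass flows through the two outlets (e.g. kg/s).
    target_ratio     : desired value of m_dot_1 / m_dot_2.
    tol              : relative tolerance within which the goal counts as reached.
    Returns (reward, done); the agent receives a dense negative reward
    proportional to the remaining error, plus a bonus on success.
    """
    ratio = m_dot_1 / m_dot_2
    error = abs(ratio - target_ratio) / target_ratio
    done = error <= tol
    reward = -error + (10.0 if done else 0.0)
    return reward, done
```

A reward of this shape gives the agent a gradient toward the goal at every time step, which is one plausible reading of a "dynamic" reward as opposed to a sparse success/failure signal.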
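Free-Form Deformation can be illustrated with a small 2D example. The sketch below is a generic Bézier-lattice FFD, not the thesis's specific implementation: points inside a bounding box are expressed in local coordinates and mapped through a grid of control points weighted by Bernstein polynomials, so displacing a single control point smoothly deforms everything inside the box.

```python
import numpy as np
from math import comb

def ffd_2d(points, control, box_min, box_max):
    """Map 2D points through a Bezier free-form deformation lattice.

    points  : (N, 2) array of points inside the bounding box.
    control : (L, M, 2) array of lattice control points.
    box_min, box_max : opposite corners of the bounding box.
    """
    points = np.asarray(points, float)
    control = np.asarray(control, float)
    L, M, _ = control.shape
    # Local (s, t) coordinates of each point in the box, in [0, 1].
    st = (points - np.asarray(box_min)) / (np.asarray(box_max) - np.asarray(box_min))
    s, t = st[:, 0], st[:, 1]

    def bernstein(n, i, u):
        return comb(n, i) * u**i * (1 - u)**(n - i)

    # Weighted sum over the control lattice (Bezier tensor product).
    out = np.zeros_like(points)
    for i in range(L):
        for j in range(M):
            w = bernstein(L - 1, i, s) * bernstein(M - 1, j, t)
            out += w[:, None] * control[i, j]
    return out

def identity_lattice(box_min, box_max, L, M):
    """Evenly spaced control points, which reproduce the identity mapping."""
    xs = np.linspace(box_min[0], box_max[0], L)
    ys = np.linspace(box_min[1], box_max[1], M)
    return np.stack(np.meshgrid(xs, ys, indexing="ij"), axis=-1)
```

With an undeformed (evenly spaced) lattice the mapping is the identity, by the linear precision of the Bernstein basis; each control point the optimizer may move contributes one DOF per coordinate, which is why the FFD parameterization quickly accumulates more DOF than moving corner points directly.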
The methods for shape optimization introduced in this thesis look very promising and, if further improved, could become a new standard for shape optimization tasks.
Additional information:
Divergent title according to the author's own translation