Zhukova, O. (2018). Algorithm selection and performance prediction for the examination timetabling problem [Diploma Thesis, Technische Universität Wien]. reposiTUm. https://doi.org/10.34726/hss.2018.39302
The Examination Timetabling Problem (ETP)is an important task that appears periodically at universities, colleges and schools. A general definition of the timetabling problem which covers most cases is as follows: a timetabling problem is a problem with four parameters: T, a finite set of times; R, a finite set of resources; M, a finite set of exams; and C, a finite set of constraints. The problem is to assign times and resources to the exams so as to satisfy the constraints as much as possible. However, due to its practical focus, a particular point of interest is the formulation presented on the ITC 2007. Examination timetabling has been studied for years, and NP-completeness of some formulations was proven. Therefore exact methods can not always solve large instances in a reasonable time. Although a lot of different heuristics have been developed, there is no known algorithm that dominates on all problem instances. Moreover, according to the No Free Lunch Theorem, this situation is in fact impossible. Therefore, in order to achieve better performance, we can choose the best performing heuristic for a particular instance using a set of predefined instance characteristics. This task is commonly known as the Algorithm Selection Problem (AS). Additionally, it might be practically useful to be able to predict the quality of the solution obtained by a certain solver on a given instance. This problem has been named the Algorithm Performance Prediction Problem (APP). In this thesis, we present the solution for the Algorithm Selection and the Algorithm Performance Prediction Problems for the ITC2007 formulation of the ETP using Machine Learning techniques. For that, we introduce a set of characteristics which consists of 196 features. Also, we collect 3 State-Of-The-Art heuristics for the ETP and run them on the dataset of 2243 real-world and artificially generated instances. We find none of the algorithms outperforms the others for all instances. Subsequently, we use 6 classification algorithms and 9 regression techniques for solving the AS and the APP problems respectively. Additionally, we investigate the influence of the parameter settings and preprocessing techniques on estimator performance. Moreover, we inspect the importance of particular feature groups for the AS and APP problems. As a result, we succeed in the building of rather accurate performance prediction models for the APP and construction of the AS solver that outperforms all of their underlying heuristics individually.