Neustifter, A. (2010). Efficient profiling in the LLVM compiler infrastructure [Diploma Thesis, Technische Universität Wien]. reposiTUm. https://resolver.obvsg.at/urn:nbn:at:at-ubtuw:1-36698
In computer science profiling is the process of determining the execution frequencies of parts of a program. This can be done by adding counters to the code, each time a part of the program is executed the counter is incremented. At the end of the execution the counters are written to a file for later use. Currently the LLVM Compiler Infrastructure supports the profiling of programs by placing counters in the code and reading the resulting profiling data during consecutive compilations. But these counters are placed with a naïve and inefficient algorithm that uses more counters than necessary. Also the recorded profiling information is not used in the compiler during optimisation or in the code generating backend when recompiling the program. This work tries to improve the existing profiling support in LLVM in two ways.<br />First, the number of counters placed in the code is reduced as presented by Ball and Larus in 1994. Counters are placed only at the leaves of a spanning tree of a function's control flow graph (CFG), which gives an incomplete profile after the program execution. This incomplete profile can be completed by propagating the values of the leaves back into the tree. Secondly, the profiling information is made available to the code generating backend. The CFG modifications and instruction selection passes are modified where necessary to preserve the profiling information so that backend passes and code generation can benefit from it. For example the register allocator is one such backend pass that could benefit since the spilling decisions are based on the execution frequency information. The efficiency of the implemented counter placing algorithm is evaluated by measuring profiling overhead for the naïve and for the improved counter placement. Also several techniques to further improve the counter placement are evaluated, additionally the effects of the profiling being available for the code generator are measured.