Title: Build failure prediction in continuous integration workflows
Language: English
Authors: Rausch, Thomas 
Qualification level: Diploma
Keywords: Empirical Software Engineering; Machine Learning; Predictive Analytics; Continuous Integration; Build Failure Prediction
Advisor: Schulte, Stefan  
Issue Date: 2016
Number of Pages: 128
Abstract: Continuous integration (CI) is a practice in which developers frequently integrate their work into the main stream of development. A CI server monitors the source code repository of a project and automatically executes the software build process when new changes are checked in. If a build fails, developers have to identify and fix the cause of the broken build, which delays the integration process and stalls further development. Large software projects often have long-running builds that exacerbate this problem. Despite the widespread use of CI, little is known about the multiplicity of errors that cause builds to fail. Yet understanding when and why build errors occur is an important step towards improving developer productivity in the CI workflow. By identifying characteristics of development practices that cause build failures, we can predict preliminary results for an integration. This helps developers react to possible problems even before a build is initiated, thereby saving time and resources. In this thesis, we introduce CInsight, a framework for analyzing CI workflows and build failures. We conduct an empirical study on real-world data from 14 open source software projects. Data from source code repositories and build systems are explored to gather qualitative and quantitative evidence about the multiplicity and frequency of CI build errors. Statistical methods are used to examine the relationship between development practices and build failures. Based on the results, we devise a method for CI build failure prediction. Our results show that failing unit tests and violations of code quality rules are the main causes of build failures. The statistical analyses reveal that the type and number of previous errors are the strongest predictors of future failures. Our best prediction models yield average recall and precision values of 0.82 and 0.80, respectively. Furthermore, our approach allows a prediction to be updated during the execution of a build.
URI: https://resolver.obvsg.at/urn:nbn:at:at-ubtuw:1-7311
Library ID: AC13351644
Organisation: E184 - Institut für Informationssysteme 
Publication Type: Thesis
Appears in Collections: Thesis

Items in reposiTUm are protected by copyright, with all rights reserved, unless otherwise indicated.