Kain, T. (2020). Towards a reliable system architecture for autonomous vehicles [Diploma Thesis, Technische Universität Wien]. reposiTUm. https://doi.org/10.34726/hss.2020.75021
Nowadays, vehicles are equipped with various advanced assistance systems that support the driver during the operation of the vehicle. Modern vehicles can, for instance, keep their distance to a preceding vehicle, park autonomously, or switch lanes on highways. Although these functions are highly reliable and well tested, the driver is still required to monitor their behavior and take over control if necessary. In fully autonomous vehicles, i.e., so-called SAE Level 5 vehicles, takeover actions by passengers are excluded. To operate such autonomous vehicles, numerous software applications, including, for example, perception, planning, and vehicle control services, have to interact with each other. Many of these applications are safety-critical, i.e., a failure might result in a hazardous situation. Therefore, to guarantee the safety of the passengers and other road users when a failure causes a safety-critical application to misbehave, measures have to be implemented that ensure safe operation in such situations. In this thesis, we introduce a fail-operational approach for handling failures in a stepwise fashion by adapting the FDIR (“Fault Detection, Isolation, and Recovery”) approach known from the aerospace domain, whereby we reimplemented the steps defined by FDIR to fit the area of autonomous driving. Moreover, we extended the FDIR approach by a system optimization procedure that improves the system stability and efficiency after a safety-critical reconfiguration. Accordingly, we call our approach FDIRO, standing for “Fault Detection, Isolation, Recovery, and Optimization”. Since a fast reconfiguration time is considered an essential requirement of many safety-critical software applications, the detection and isolation steps of the FDIRO approach are designed to be of low complexity. Therefore, these steps can be performed within milliseconds.
To show that the detection and isolation steps can be performed in a short time, we provide a proof-of-concept implementation. We further present an implementation based on integer linear programming for determining recovery actions and introduce concepts for optimizing the system based on context observations.
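To illustrate what low-complexity detection and isolation steps might look like, the following is a minimal sketch in Python. It is not the thesis's proof-of-concept implementation: the abstract does not say how faults are detected, so the heartbeat timeout used here is an assumption, and the names HeartbeatMonitor, isolate, and the 50 ms timeout are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class HeartbeatMonitor:
    """Hypothetical detector (an assumption, not the thesis's mechanism):
    an application counts as faulty once its last heartbeat is older
    than timeout_ms."""
    timeout_ms: float
    last_seen_ms: Dict[str, float] = field(default_factory=dict)

    def record(self, app: str, now_ms: float) -> None:
        # Store the time of the most recent heartbeat of this application.
        self.last_seen_ms[app] = now_ms

    def detect(self, now_ms: float) -> List[str]:
        # One pass over all monitored applications: linear in their number.
        return sorted(
            app
            for app, seen in self.last_seen_ms.items()
            if now_ms - seen > self.timeout_ms
        )


def isolate(faulty: List[str], active: Set[str]) -> Set[str]:
    """Isolation step of this sketch: drop faulty applications from the
    active set so that they can no longer influence the others."""
    return active - set(faulty)


monitor = HeartbeatMonitor(timeout_ms=50.0)
monitor.record("perception", now_ms=0.0)
monitor.record("planning", now_ms=0.0)
monitor.record("perception", now_ms=40.0)  # "planning" stays silent

faulty = monitor.detect(now_ms=60.0)
remaining = isolate(faulty, {"perception", "planning"})
```

Both steps amount to a single pass over the monitored applications, which is the kind of cost that is compatible with reaction times in the millisecond range. The recovery and optimization steps are deliberately left out of this sketch.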