Suntinger, M. (2009). Event-based similarity search and its applications in business analytics [Master Thesis, Technische Universität Wien]. reposiTUm. http://hdl.handle.net/20.500.12708/185660
Similarity Search; Event Sequence Similarity; Event-Based Systems; Complex Event Processing; Time-Series Similarity; Business Process Management; Business Intelligence
en
Abstract:
Event-based systems enable real-time monitoring of business incidents and automated decision making to react to threats or seize time-critical business opportunities. Applications are manifold, ranging from logistics, fraud detection and recommender systems to automated trading. Business incidents are reflected in sequences of events, and understanding these sequences is crucial for designing accurate decision rules. At the same time, analysis tools for event data are still in their infancy. This thesis presents a comprehensive and generic model for similarity search in event data. It examines several application domains to derive requirements for fuzzy retrieval of event sequences. Similarity assessment starts at the level of data fields encapsulated in single events. In addition, occurrence times of events, their order, missing events and redundant events are considered. In a graphical editor, the analyst models search constraints and refines the pattern sequence. The model aims at utmost flexibility, which is achieved by pattern modeling, configurable similarity techniques with different semantics and adjustable weights for similarity features. The algorithm computes the similarity between two event sequences by assigning events in the target sequence to events in the pattern sequence, subject to the given search constraints. An efficient branch-and-bound algorithm finds the best possible assignment, from which the final similarity score is computed. In addition, a novel approach to time-series similarity is introduced and integrated. It slices a time series at decisive turning points of the curve and compares the slopes between these turning points. Applicability in real-world scenarios was evaluated in four case studies. Results are promising for structured business processes of limited length. When appropriate configuration parameters are chosen to focus the search on aspects of interest, the approach is able to reveal whether a reference case is a recurring pattern in the data.
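The assignment step described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation. It assumes a precomputed matrix `sim` of per-event similarities in [0, 1] and ignores the order, timing and weighting constraints the abstract mentions. The function name `best_assignment` and the normalization by pattern length are hypothetical choices made for illustration only.

```python
from typing import List, Optional, Tuple


def best_assignment(sim: List[List[float]]) -> Tuple[float, List[Optional[int]]]:
    """Branch-and-bound search for a one-to-one assignment (illustrative only).

    sim[i][j] is an assumed, precomputed similarity in [0, 1] between
    pattern event i and target event j. A pattern event may stay
    unassigned (a "missing" event), which contributes 0 to the score.
    """
    n = len(sim)
    m = len(sim[0]) if n else 0

    # Optimistic bound: the best score pattern events i..n-1 could still add
    # if the one-to-one restriction were ignored.
    suffix = [0.0] * (n + 1)
    for i in range(n - 1, -1, -1):
        suffix[i] = suffix[i + 1] + max(sim[i], default=0.0)

    best_score = -1.0
    best: List[Optional[int]] = [None] * n
    current: List[Optional[int]] = [None] * n
    used = [False] * m

    def search(i: int, score: float) -> None:
        nonlocal best_score, best
        if score + suffix[i] <= best_score:
            return  # bound: this branch cannot beat the best known assignment
        if i == n:
            best_score, best = score, current.copy()
            return
        # Branch: try target events in descending similarity.
        for j in sorted(range(m), key=lambda j: -sim[i][j]):
            if not used[j]:
                used[j] = True
                current[i] = j
                search(i + 1, score + sim[i][j])
                used[j] = False
        # Branch: leave pattern event i unassigned.
        current[i] = None
        search(i + 1, score)

    search(0, 0.0)
    # Assumed normalization: average similarity per pattern event.
    return (best_score / n if n else 0.0), best
```

For `sim = [[0.9, 0.8], [0.9, 0.1]]`, picking the best match for the first pattern event greedily would leave only a poor match for the second. The search instead returns the assignment `[1, 0]`, which has the higher total.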