Gavric, A., Bork, D., & Proper, H. A. (2024). Enriching Business Process Event Logs with Multimodal Evidence. In The Practice of Enterprise Modeling (pp. 175–191). https://doi.org/10.1007/978-3-031-77908-4_11
Process mining uses event log data to determine which activities were performed, when they occurred, and which entities were involved, providing a data trail for process analysis and improvement. A significant challenge, however, is ensuring that these logs accurately reflect the actual processes. Some processes leave few digital traces, and their event logs often lack details about manual and physical work that does not involve computers or simple sensors. We introduce the Business-knowledge Integration Cycles (BICycle) method and the mm_proc_miner tool, which convert raw, unstructured data from various modalities, such as video, audio, and sensor data, into a structured and unified event log while keeping a human in the loop. Our method analyzes the semantic distance between visible, audible, and textual evidence within a self-hosted joint embedding space. Our approach is designed to (1) preserve the privacy of evidence data, (2) achieve real-time performance and scalability, and (3) prevent AI hallucinations. We also publish a dataset of over 2K processes with 16K steps to facilitate domain inference-related tasks. For the evaluation, we created a novel test dataset in the domain of DNA home kit testing, which we can guarantee was not encountered during the training of the employed AI foundation models. The evaluation shows positive results for both event log enrichment with multimodal evidence and the human-in-the-loop contribution.
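
To illustrate the idea of comparing evidence by semantic distance in a joint embedding space, the following is a minimal sketch and not the mm_proc_miner implementation. It assumes that evidence segments and candidate activity labels have already been embedded into the same vector space by some self-hosted model. The names `cosine_distance`, `label_evidence`, `Event`, and `max_distance`, the threshold value, and the example activity labels are hypothetical. Evidence whose nearest label is too distant is left unlabeled and flagged for a human reviewer instead of being assigned a guessed activity.

```python
import math
from dataclasses import dataclass
from typing import Dict, Optional, Sequence


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Return 1 - cosine similarity; 0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0  # treat a zero vector as unrelated to everything
    return 1.0 - dot / (norm_a * norm_b)


@dataclass
class Event:
    """One row of the unified event log (hypothetical schema)."""
    case_id: str
    timestamp: float
    activity: Optional[str]  # None when no label is close enough
    distance: float
    needs_review: bool       # True routes the event to a human


def label_evidence(
    case_id: str,
    timestamp: float,
    evidence_vec: Sequence[float],
    activity_vecs: Dict[str, Sequence[float]],
    max_distance: float = 0.3,  # assumed threshold, to be tuned per domain
) -> Event:
    """Assign the nearest activity label, or defer to a human reviewer."""
    best_label, best_dist = min(
        ((label, cosine_distance(evidence_vec, vec))
         for label, vec in activity_vecs.items()),
        key=lambda pair: pair[1],
    )
    if best_dist > max_distance:
        # Too far from every known activity: do not guess a label.
        return Event(case_id, timestamp, None, best_dist, True)
    return Event(case_id, timestamp, best_label, best_dist, False)


# Toy 3-dimensional embeddings standing in for real model output.
activities = {"swab cheek": [1.0, 0.0, 0.0], "seal tube": [0.0, 1.0, 0.0]}
confident = label_evidence("case-1", 12.5, [0.9, 0.1, 0.0], activities)
uncertain = label_evidence("case-1", 40.0, [1.0, 1.0, 1.0], activities)
```

In this sketch, `confident` is labeled with the nearest activity, while `uncertain` keeps `activity=None` and `needs_review=True`. Deferring rather than forcing a label is one simple way to combine a human in the loop with a guard against unsupported labels.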