Zaitoun, A., Sagi, T., & Hose, K. (2023). Automated Ontology Evaluation: Evaluating Coverage and Correctness using a Domain Corpus. In Y. Ding, J. Tang, & J. Sequeda (Eds.), WWW ’23 Companion: Companion Proceedings of the ACM Web Conference 2023 (pp. 1127–1137). Association for Computing Machinery. https://doi.org/10.1145/3543873.3587617
ontologies; natural language processing; BERT; Web Ontology Language
en
Abstract:
Ontologies conceptualize domains and are a crucial part of web semantics and information systems. However, re-using an existing ontology for a new task requires a detailed evaluation of the candidate ontology as it may cover only a subset of the domain concepts, contain information that is redundant or misleading, and have inaccurate relations and hierarchies between concepts. Manual evaluation of large and complex ontologies is a tedious task. Thus, a few approaches have been proposed for automated evaluation, ranging from concept coverage to ontology generation from a corpus. Existing approaches, however, are limited by their dependence on external structured knowledge sources, such as a thesaurus, as well as by their inability to evaluate semantic relationships. In this paper, we propose a novel framework to automatically evaluate the domain coverage and semantic correctness of existing ontologies based on domain information derived from text. The approach uses a domain-tuned named-entity-recognition model to extract phrasal concepts. The extracted concepts are then used as a representation of the domain against which we evaluate the candidate ontology’s concepts. We further employ a domain-tuned language model to determine the semantic correctness of the candidate ontology’s relations. We demonstrate our automated approach on several large ontologies from the oceanographic domain and show its agreement with a manual evaluation by domain experts and its superiority over the state of the art.
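Illustrative sketch (not from the paper): the abstract describes evaluating a candidate ontology's concepts against concepts extracted from a domain corpus, but does not say how the two sets are matched or scored. The Python below shows one simple way such a coverage score could be computed, assuming exact matching after light label normalization as a stand-in for whatever matching the authors use. The function names and the example terms are hypothetical, and the extraction step (the domain-tuned named-entity-recognition model) is assumed to have already produced the list of phrases.

```python
import re


def normalize(label: str) -> str:
    """Lowercase a label and turn underscores/hyphens into single spaces."""
    label = re.sub(r"[_\-]+", " ", label.lower())
    return re.sub(r"\s+", " ", label).strip()


def concept_coverage(domain_concepts, ontology_labels):
    """Return (coverage, missing).

    coverage: fraction of distinct domain concepts that have an exact
              normalized match among the ontology labels.
    missing:  the normalized domain concepts with no match.

    Assumption: exact string matching is a simplification chosen for
    illustration; it is not the matching method stated in the abstract.
    """
    domain = {normalize(c) for c in domain_concepts}
    domain.discard("")  # ignore empty phrases
    onto = {normalize(label) for label in ontology_labels}
    if not domain:
        return 0.0, set()
    missing = domain - onto
    return (len(domain) - len(missing)) / len(domain), missing


# Hypothetical inputs: phrases extracted from a corpus, and ontology class labels.
extracted = ["sea surface temperature", "Salinity", "thermocline", "mixed-layer depth"]
labels = ["Sea_Surface_Temperature", "salinity", "Mixed Layer Depth", "chlorophyll"]

coverage, missing = concept_coverage(extracted, labels)
print(coverage, missing)
```

Under these assumptions, a low score together with the list of unmatched phrases would indicate which domain concepts the candidate ontology lacks. Judging the semantic correctness of relations, which the abstract assigns to a domain-tuned language model, is outside what this sketch covers.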