Ahmetaj, S. (2019). Rewriting approaches for ontology-mediated query answering [Dissertation, Technische Universität Wien]. reposiTUm. https://doi.org/10.34726/hss.2019.68700
Ontology-based data access (OBDA, for short) aims to facilitate the access to inherently incomplete and heterogeneous data sources and to retrieve more complete answers by means of an intermediate layer that conceptualizes the domain of interest, known as anontology. The OBDA paradigm is regarded as a key ingredient in modern information management systems, receiving tremendous attention over the past decade. Crucial to OBDA is the problem of answering user queries by taking into account the background knowledge provided by the ontology, viewing the query and the ontology as part of a composite query, called ontology-mediated query (OMQ). We refer to this problem as ontology-mediated query answering (OMQA). To make OMQA scale to large amounts of data and to be useful in practice, OMQA relies on converting the problem of answering OMQs to the problem of evaluating the query directly over the data. This is referred to as OMQ rewriting and is considered to be one of the most promising approaches for OMQA, as it allows for the exploitation of standard database management systems.Despite the fact that OMQA has been a subject of in-depth study in the database and KR research communities over the last decade, there are many challenges that remain to be addressed. In particular, several extensions of the components of the OMQA framework have been considered in order to enhance its expressive power and the application domain. We study three particular ways that enrich OMQA: the database is viewed as partially complete through closed predicates, the ontology languages are given in terms of expressive description logics (DLs) and guarded (disjunctive) tuple-generating dependencies (DTGDs), and the query language is given in terms of a fragment of SPARQL, the standard query language for the Semantic Web. The goal of this thesis is to investigate OMQA in the presence of such extensions and to explore novel rewriting techniques, with the emphasis on polynomial time rewritings. First, we develop a novel and versatile rewriting technique, which can be suitably extended and adapted for various OMQ languages, enabling polynomial time rewritings into variants of Datalog. By employing this technique we present a polynomial time rewriting for the DL ALCHOI in the presence of partial completeness of the database and a restricted class of conjunctive queries (CQs) into Datalog with stable negation.We then adapt it to support guarded DTGDs and a restricted class of CQs. We show that every such OMQ can be translated into a polynomially-sized Datalog program with disjunction (without negation) if the arity of the predicates and number of variables in the DTGDs is bounded by a constant; for non-disjunctive TGDs the rewriting is a plain Datalog program. To the best of our knowledge, these are the first such rewritings that are polynomial. We then study a fragment of SPARQL, called well-designed SPARQL, which extends conjunctive queries with a special operator. In recent years, SPARQL has been widely adopted as a query language for OMQA. However, the semantics of query answering under SPARQL entailment regimes is defined in a more naive and much less expressive way than the certain answer semantics usually adopted in OMQA. To bridge this gap, we introduce an intuitive certain answer semantics for SPARQL and present two rewriting approaches that allow us to obtain the certain answers for this fragment and OWL 2 QL, a standardized ontology language based on a lightweight DL.