<div class="csl-bib-body">
<div class="csl-entry">Selzer, A. (2026). <i>A Structural Approach to Query Optimisation for Efficient Join and Aggregate Processing</i> [Dissertation, Technische Universität Wien]. reposiTUm. https://doi.org/10.34726/hss.2026.138101</div>
</div>
-
dc.identifier.uri
https://doi.org/10.34726/hss.2026.138101
-
dc.identifier.uri
http://hdl.handle.net/20.500.12708/225588
-
dc.description
Arbeit an der Bibliothek noch nicht eingelangt - Daten nicht geprüft
-
dc.description.abstract
Join processing remains a fundamental challenge in database systems, particularly when intermediate results grow large. This problem is frequently encountered in analytical queries, which aggregate data over many relations, but output only a comparably small final result. Although the vast majority of user-generated queries are acyclic, and the foundational result by Yannakakis provides a solution for avoiding unnecessary computation of intermediate results in theory, structure-guided query evaluation remains absent from standard database systems. In this thesis, we work towards closing this gap from theory to practice.To explore the benefits and limitations of structure-guided query processing, we introduce a query-rewriting-based implementation of Yannakakis' algorithm, on top of different DBMSs. Additionally, we identify the class of zero-materialisation answerable (0MA) queries, which can be answered efficiently without materialising join results. Experimental evaluation shows that the rewriting approach can achieve significant speedups on hard instances but fails to outperform in most situations. These results motivate us to frame the problem as an algorithm selection problem -- we therefore apply machine learning to solve this problem, setting up a new data set of queries in the process, and verify that we can reliably decide when to apply the rewriting.Following up on these encouraging results, we go beyond the quite restrictive 0MA class, introducing guarded and piecewise-guarded aggregate queries, allowing us to cover a wide range of aggregate queries while avoiding materialisation. We integrate these optimisations into the query optimiser of Spark SQL by implementing logical optimisations and a new physical operator AggJoin. Through experimental evaluation, we show that this implementation speeds up aggregate processing on many queries of several benchmarks while never degrading performance. Furthermore, to cover arbitrary unguarded queries, where materialisation cannot be fully avoided, we extend the AggJoin to the GroupAggJoin operator, substantially reducing materialisation on these queries as well. By establishing a new benchmark for unguarded queries, we show that this approach successfully handles moderately unguarded queries.Having established efficient techniques for the processing of large classes of acyclic aggregate queries, we focus on the particularly challenging class of cyclic queries. To apply the standard approach of using decompositions to make cyclic queries acyclic successfully, it is clear that we require secondary optimisation objectives in addition to minimising the width of decompositions. To address this challenge, we introduce soft hypertree decompositions and the associated measure of soft hypertree width (shw). By avoiding the special condition of hypertree decompositions, we gain algorithmic flexibility while retaining the tractability of computing width-k decompositions. By developing an end-to-end pipeline, we achieve a seamless integration of these optimisations into DBMSs. Experimental evaluation on cyclic queries confirms that we can achieve significant improvements in decomposition quality, which translates to performance gains in practice.
en
dc.language
English
-
dc.language.iso
en
-
dc.rights.uri
http://rightsstatements.org/vocab/InC/1.0/
-
dc.subject
relational databases
de
dc.subject
join processing
de
dc.subject
aggregate queries
de
dc.subject
acyclic queries
de
dc.subject
cyclic queries
de
dc.subject
structural decompositions
de
dc.subject
relational databases
en
dc.subject
join processing
en
dc.subject
aggregate queries
en
dc.subject
acyclic queries
en
dc.subject
cyclic queries
en
dc.subject
structural decompositions
en
dc.title
A Structural Approach to Query Optimisation for Efficient Join and Aggregate Processing
en
dc.type
Thesis
en
dc.type
Hochschulschrift
de
dc.rights.license
In Copyright
en
dc.rights.license
Urheberrechtsschutz
de
dc.identifier.doi
10.34726/hss.2026.138101
-
dc.contributor.affiliation
TU Wien, Österreich
-
dc.rights.holder
Alexander Selzer
-
dc.publisher.place
Wien
-
tuw.version
vor
-
tuw.thesisinformation
Technische Universität Wien
-
dc.contributor.assistant
Lanzinger, Matthias Paul
-
tuw.publication.orgunit
E192 - Institut für Logic and Computation
-
dc.type.qualificationlevel
Doctoral
-
dc.identifier.libraryid
AC17758013
-
dc.description.numberOfPages
146
-
dc.thesistype
Dissertation
de
dc.thesistype
Dissertation
en
tuw.author.orcid
0000-0002-6867-5448
-
dc.rights.identifier
In Copyright
en
dc.rights.identifier
Urheberrechtsschutz
de
tuw.advisor.staffStatus
staff
-
tuw.assistant.staffStatus
staff
-
tuw.advisor.orcid
0000-0002-1760-122X
-
tuw.assistant.orcid
0000-0002-7601-3727
-
item.openaccessfulltext
Open Access
-
item.mimetype
application/pdf
-
item.grantfulltext
open
-
item.fulltext
with Fulltext
-
item.cerifentitytype
Publications
-
item.openairecristype
http://purl.org/coar/resource_type/c_db06
-
item.languageiso639-1
en
-
item.openairetype
doctoral thesis
-
crisitem.author.dept
E192-02 - Forschungsbereich Databases and Artificial Intelligence