On Biases in Information retrieval models and evaluation

Lipani, Aldo

doi:10.34726/hss.2018.59228

DC Field

Value

Language

dc.contributor.advisor

Hanbury, Allan

dc.contributor.author

Lipani, Aldo

dc.date.accessioned

2020-06-29T08:21:59Z

dc.date.issued

2018

dc.date.submitted

2018-09

dc.identifier.citation

Lipani, A. (2018). On Biases in Information retrieval models and evaluation [Dissertation, Technische Universität Wien]. reposiTUm. https://doi.org/10.34726/hss.2018.59228

dc.identifier.uri

https://doi.org/10.34726/hss.2018.59228

dc.identifier.uri

http://hdl.handle.net/20.500.12708/5405

dc.description.abstract

Der Einzug der modernen Informationstechnologie in unsere Gesellschaft führte in den letzten fünfzig Jahren zu einer rasant wachsenden Menge von digitalen Inhalten. Während das Informationsangebot stetig steigt, bleiben unsere Fähigkeiten zur Informationsverarbeitung unverändert. Aufgrund dieser Überladung mit Informationen kommt dem Information Retrieval (IR) die wichtige Rolle zu, Systeme zu entwickeln, die relevante Informationen von irrelevanten trennen können. Diese Trennung ist allerdings auf Grund der Komplexität des Verstehens, was relevant ist und was nicht, eine schwierige Aufgabe. Um diese Komplexität zu bewältigen, wurde im IR ein empirischer Ansatz gewählt, der zur Entwicklung praktikabler Retrieval-Modelle geführt hat, die einen systematischen Fehler bzw. eine Neigung (Bias) in Richtung relevanter Information aufweisen. Neben diesem Bias treten allerdings auch andere Verzerrungen auf, die problematisch für den Retrieval-Vorgang sind. In dieser Arbeit werden diese problematischen Bias durch die Betrachtung von Retrieval-Systemen als Informationsfilter bzw. Sampling-Prozesse systematisch untersucht. Es werden Bias erforscht, die üblicherweise in zwei Bereichen des IR auftreten: Retrieval-Modelle und Retrieval-Evaluierung. Zunächst wird das Retrieval-Bias von probabilistischen IR-Modellen analysiert und neue Dokument-Prioren entwickelt, um die Retrieval-Leistung zu steigern. Im Anschluss wird das Zugänglichkeits-Bias von Retrieval-Modellen erörtert. Für boolesche Retrieval-Modelle wird ein eigens entwickeltes mathematisches Framework beschrieben. Hinsichtlich des Bias für Retrieval-Evaluierung werden Testdatensätze, welche mittels Pooling-Methode erstellt wurden und somit ein charakteristisches Bias enthalten, analysiert. Um die Zuverlässigkeit der Evaluierung zu verbessern, werden neue Pooling-Strategien beschrieben. Diese Strategien reduzieren das Bias bereits während der Erstellung eines Testdatensatzes.
Schließlich wird für die Maßzahlen Precision- und Recall-at-Cutoff (P@n und R@n) ein neuer Pool-Bias-Schätzer entwickelt, welcher das Bias während der Systemevaluierung reduziert. Um die vorgeschlagenen Methoden dieser Arbeit zu evaluieren, wurden 15 Testdatensätze, vier IR-Metriken und drei Bias-Messverfahren herangezogen. Durch Experimente werden folgende Erkenntnisse gewonnen: durch das Verwenden von Dokument-Prioren basierend auf “Verboseness” wird die Retrieval-Genauigkeit von probabilistischen IR-Modellen gesteigert; das Zugänglichkeits-Bias von booleschen IR-Modellen verschlechtert sich für konjunktive Anfragen mit steigender Länge der Anfragen (für disjunktive Anfragen kann eine leichte Verbesserung festgestellt werden); das Testdatensatz-Bias kann bei der Erstellung des Testdatensatzes durch Pooling-Strategien, welche aus dem Bereich des Reinforcement Learning entlehnt sind (“Multi-Armed Bandit Problem”), verkleinert werden; und das Testdatensatz-Bias kann in der Evaluierung durch die Analyse der Pool-Beteiligung in den einzelnen Durchläufen reduziert werden. Speziell für den letzten Punkt wird gezeigt, dass das Bias für P@n durch die Quantifizierung des neuen Systems gegen die gepoolten Durchläufe und für R@n durch die Auslassung einzelner gepoolter Durchläufe reduziert wird. Diese Arbeit leistet einen wichtigen Beitrag zum Gebiet des IR, indem ein besseres Verständnis von Relevanz durch die Betrachtung von Bias in Retrieval-Modellen und Retrieval-Evaluierung erreicht wird. Die Identifizierung dieser Bias und deren Nutzung bzw. Reduktion führt zur Entwicklung von performanteren IR-Modellen und zu einer Verbesserung der derzeitigen Vorgehensweise hinsichtlich IR-Evaluierung.

dc.description.abstract

The advent of modern information technology has benefited society as the digitisation of content increased over the last half-century. While the processing capability of our species has remained unchanged, the information available to us has increased markedly. In this overload of information, Information Retrieval (IR) has played a prominent role by developing systems capable of separating relevant information from the rest. This separation, however, is a difficult task, rooted in the complexity of understanding what is and what is not relevant. To manage this complexity, IR has developed a strong empirical nature, which has led to grounded retrieval models and, in turn, to retrieval systems empirically designed to be biased towards relevant information. However, other biases have been observed that work against retrieval performance. In this thesis, the reduction of retrieval systems to filters of information, or sampling processes, has allowed us to systematically investigate these biases. We study biases manifesting in two aspects of IR research: retrieval models and retrieval evaluation. We start by identifying retrieval biases in probabilistic IR models and then develop new document priors to improve retrieval performance. Next, we discuss the accessibility bias of retrieval models, and for Boolean retrieval models we develop a mathematical framework of retrievability. For retrieval evaluation biases, we study how test collections are built using the pooling method and how this method introduces bias. Then, to improve the reliability of the evaluation, we first develop new pooling strategies to mitigate this bias at test collection build time and then, for two IR evaluation measures, Precision and Recall at cut-off (P@n and R@n), we develop new pool bias estimators to mitigate it at evaluation time.
Through large-scale experiments involving up to 15 test collections, four IR evaluation measures and three bias measures, we demonstrate that including document priors based on verboseness improves the performance of probabilistic retrieval models; that the accessibility bias of Boolean retrieval models quickly worsens for conjunctive queries as query length increases (while slightly improving for disjunctive queries); that the test collection bias can be lowered at test collection build time by pooling strategies inspired by a well-known problem in reinforcement learning, the multi-armed bandit problem; and that this bias can also be reduced at evaluation time by analysing the runs participating in the pool. For this last point in particular, we show that for P@n the bias is reduced by quantifying the potential of the new system against the pooled runs, and for R@n it is reduced instead by simulating the absence of a pooled run from the set of pooled runs. This thesis contributes to the IR field by giving a better understanding of relevance through the lens of biases in retrieval models and retrieval evaluation. The identification of these biases, and their exploitation or mitigation, leads to the development of better-performing IR models and the improvement of current IR evaluation practice.

dc.language

English

dc.language.iso

dc.rights.uri

http://rightsstatements.org/vocab/InC/1.0/

dc.subject

Relevance

dc.subject

Term Frequency Normalisation

dc.subject

Verboseness

dc.subject

Retrievability

dc.subject

Pooling Method

dc.subject

Pooling Strategy

dc.subject

Pool Bias

dc.subject

Precision at cut-off

dc.subject

Recall at cut-off

dc.title

On Biases in Information retrieval models and evaluation

dc.type

Thesis

dc.type

Hochschulschrift

dc.rights.license

In Copyright

dc.rights.license

Urheberrechtsschutz

dc.identifier.doi

10.34726/hss.2018.59228

dc.contributor.affiliation

TU Wien, Österreich

dc.rights.holder

Aldo Lipani

dc.publisher.place

Wien

tuw.version

vor

tuw.thesisinformation

Technische Universität Wien

dc.contributor.assistant

Lupu, Mihai

tuw.publication.orgunit

E194 - Institut für Information Systems Engineering

dc.type.qualificationlevel

Doctoral

dc.identifier.libraryid

AC15166467

dc.description.numberOfPages

202

dc.identifier.urn

urn:nbn:at:at-ubtuw:1-114910

dc.thesistype

Dissertation

dc.rights.identifier

In Copyright

dc.rights.identifier

Urheberrechtsschutz

tuw.advisor.staffStatus

staff

tuw.assistant.staffStatus

staff

tuw.advisor.orcid

0000-0002-7149-5843

item.languageiso639-1

item.openairetype

doctoral thesis

item.grantfulltext

open

item.fulltext

with Fulltext

item.cerifentitytype

Publications

item.mimetype

application/pdf

item.openairecristype

http://purl.org/coar/resource_type/c_db06

item.openaccessfulltext

Open Access

crisitem.author.dept

E194-04 - Forschungsbereich Data Science

crisitem.author.parentorg

E194 - Institut für Information Systems Engineering

Appears in Collections:

Thesis

Fulltext (Version of Record (published version))

Adobe PDF

(10.98 MB)

In Copyright

Page view(s)

368

checked on Nov 21, 2023

Download(s)

338

checked on Nov 21, 2023
