<div class="csl-bib-body">
<div class="csl-entry">Staudinger, M., Kusa, W., Piroi, F., Lipani, A., & Hanbury, A. (2024). A Reproducibility and Generalizability Study of Large Language Models for Query Generation. In <i>SIGIR-AP 2024: Proceedings of the 2024 Annual International ACM SIGIR Conference on Research and Development in Information Retrieval in the Asia Pacific Region</i> (pp. 186–196). The Association for Computing Machinery. https://doi.org/10.1145/3673791.3698432</div>
</div>
-
dc.identifier.uri
http://hdl.handle.net/20.500.12708/209780
-
dc.description.abstract
Systematic literature reviews (SLRs) are a cornerstone of academic research, yet they are often labour-intensive and time-consuming due to the detailed literature curation process. The advent of generative AI and large language models (LLMs) promises to revolutionize this process by assisting researchers with several tedious tasks, one of them being the generation of effective Boolean queries that select the publications to consider for inclusion in a review. This paper presents an extensive study of Boolean query generation using LLMs for systematic reviews, reproducing and extending the work of Wang et al. and Alaniz et al. Our study investigates the replicability and reliability of results achieved using ChatGPT and compares its performance with open-source alternatives like Mistral and Zephyr to provide a more comprehensive analysis of LLMs for query generation.
To this end, we implemented a pipeline that automatically creates a Boolean query for a given review topic using a previously defined LLM, retrieves all documents matching this query from the PubMed database, and then evaluates the results. With this pipeline, we first assess whether the results obtained using ChatGPT for query generation are reproducible and consistent. We then generalize our results by evaluating the efficacy of open-source models in generating Boolean queries.
Finally, we conduct a failure analysis to identify and discuss the limitations and shortcomings of using LLMs for Boolean query generation. This examination helps identify gaps and potential areas for improvement in the application of LLMs to information retrieval tasks. Our findings highlight the strengths, limitations, and potential of LLMs in the domain of information retrieval and literature review automation. Our code is available online.
en
dc.language.iso
en
-
dc.subject
systematic reviews
en
dc.subject
Boolean query
en
dc.subject
LLMs
en
dc.subject
query generation
en
dc.title
A Reproducibility and Generalizability Study of Large Language Models for Query Generation
en
dc.type
Inproceedings
en
dc.type
Konferenzbeitrag
de
dc.relation.publication
SIGIR-AP 2024: Proceedings of the 2024 Annual International ACM SIGIR Conference on Research and Development in Information Retrieval in the Asia Pacific Region
-
dc.contributor.affiliation
University College London, United Kingdom of Great Britain and Northern Ireland
-
dc.relation.isbn
979-8-4007-0724-7
-
dc.description.startpage
186
-
dc.description.endpage
196
-
dc.type.category
Full-Paper Contribution
-
tuw.booktitle
SIGIR-AP 2024: Proceedings of the 2024 Annual International ACM SIGIR Conference on Research and Development in Information Retrieval in the Asia Pacific Region
-
tuw.relation.publisher
The Association for Computing Machinery
-
tuw.relation.publisherplace
New York, NY, USA
-
tuw.researchTopic.id
I4
-
tuw.researchTopic.name
Information Systems Engineering
-
tuw.researchTopic.value
100
-
tuw.publication.orgunit
E194-04 - Forschungsbereich Data Science
-
tuw.publication.orgunit
E058-06 - Fachbereich Zentrum für Forschungsdatenmanagement
-
tuw.publisher.doi
10.1145/3673791.3698432
-
dc.description.numberOfPages
11
-
tuw.author.orcid
0000-0002-5164-2690
-
tuw.author.orcid
0000-0003-4420-4147
-
tuw.author.orcid
0000-0001-7584-6439
-
tuw.author.orcid
0000-0002-3643-6493
-
tuw.author.orcid
0000-0002-7149-5843
-
tuw.event.name
SIGIR-AP 2024
en
tuw.event.startdate
2024-12-09
-
tuw.event.enddate
2024-12-12
-
tuw.event.online
On Site
-
tuw.event.type
Event for scientific audience
-
tuw.event.place
Tokyo
-
tuw.event.country
JP
-
tuw.event.presenter
Staudinger, Moritz
-
wb.sciencebranch
Computer Sciences
-
wb.sciencebranch
Economics
-
wb.sciencebranch.oefos
1020
-
wb.sciencebranch.oefos
5020
-
wb.sciencebranch.value
90
-
wb.sciencebranch.value
10
-
item.grantfulltext
none
-
item.languageiso639-1
en
-
item.openairetype
conference paper
-
item.cerifentitytype
Publications
-
item.fulltext
no Fulltext
-
item.openairecristype
http://purl.org/coar/resource_type/c_5794
-
crisitem.author.dept
E194-04 - Forschungsbereich Data Science
-
crisitem.author.dept
E194-04 - Forschungsbereich Data Science
-
crisitem.author.dept
E058-06 - Fachbereich Zentrum für Forschungsdatenmanagement
-
crisitem.author.dept
University College London
-
crisitem.author.dept
E194-04 - Forschungsbereich Data Science
-
crisitem.author.orcid
0000-0002-5164-2690
-
crisitem.author.orcid
0000-0003-4420-4147
-
crisitem.author.orcid
0000-0001-7584-6439
-
crisitem.author.orcid
0000-0002-3643-6493
-
crisitem.author.orcid
0000-0002-7149-5843
-
crisitem.author.parentorg
E194 - Institut für Information Systems Engineering
-
crisitem.author.parentorg
E194 - Institut für Information Systems Engineering
-
crisitem.author.parentorg
E058 - Forschungs-, Technologie- und Innovationssupport
-
crisitem.author.parentorg
E194 - Institut für Information Systems Engineering