The IMO Small Challenge: Not-Too-Hard Olympiad Math Datasets for LLMs

Frieder, Simon; Olšák, Mirek; Berner, Julius; Lukasiewicz, Thomas

DC Element

Value

Language

dc.contributor.author

Frieder, Simon

dc.contributor.author

Olšák, Mirek

dc.contributor.author

Berner, Julius

dc.contributor.author

Lukasiewicz, Thomas

dc.date.accessioned

2025-01-30T16:38:48Z

dc.date.available

2025-01-30T16:38:48Z

dc.date.issued

2024

dc.identifier.citation

Frieder, S., Olšák, M., Berner, J., & Lukasiewicz, T. (2024). The IMO Small Challenge: Not-Too-Hard Olympiad Math Datasets for LLMs. In The Second Tiny Papers Track at ICLR 2024. The Twelfth International Conference on Learning Representations (ICLR 2024), Wien, Austria. http://hdl.handle.net/20.500.12708/210292

dc.identifier.uri

http://hdl.handle.net/20.500.12708/210292

dc.description.abstract

We introduce the IMO Small Challenge (IMOSC), as opposed to the IMO Grand Challenge: A text-only, natural-language dataset consisting of mathematical problems from various mathematical competitions. The IMOSC dataset exceeds the difficulty level of current datasets that are widely used for LLM evaluation, such as the MATH dataset, while not being too challenging for the current generation of LLMs. The IMOSC currently contains a carefully curated collection of the easiest possible problems from difficult competitions, such as the International Mathematical Olympiad (IMO). Problem hardness is measured by applying a mixture of (objective and subjective) difficulty filters to the original problems. We release the full dataset under the link below to encourage transparent evaluation of LLMs and theorem provers toward their mathematical proof-generating abilities: www.imo-small-challenge.io

dc.language.iso

dc.subject

IMO Small Challenge

dc.title

The IMO Small Challenge: Not-Too-Hard Olympiad Math Datasets for LLMs

dc.type

Inproceedings

dc.type

Conference contribution

dc.contributor.affiliation

University of Oxford, United Kingdom of Great Britain and Northern Ireland (the)

dc.contributor.affiliation

University of Cambridge, United Kingdom of Great Britain and Northern Ireland (the)

dc.contributor.affiliation

California Institute of Technology, United States of America (the)

dc.type.category

Poster Contribution

tuw.booktitle

The Second Tiny Papers Track at ICLR 2024

tuw.peerreviewed

true

tuw.researchTopic.id

tuw.researchTopic.name

Information Systems Engineering

tuw.researchTopic.value

100

tuw.publication.orgunit

E192-07 - Research Unit Artificial Intelligence Techniques

tuw.publication.orgunit

E192-03 - Research Unit Knowledge Based Systems

dc.description.numberOfPages

tuw.event.name

The Twelfth International Conference on Learning Representations (ICLR 2024)

tuw.event.startdate

07-05-2024

tuw.event.enddate

11-05-2024

tuw.event.online

Hybrid

tuw.event.type

Event for scientific audience

tuw.event.place

Wien

tuw.event.country

tuw.event.presenter

Frieder, Simon

wb.sciencebranch

Computer Science

wb.sciencebranch

Mathematics

wb.sciencebranch.oefos

1020

wb.sciencebranch.oefos

1010

wb.sciencebranch.value

item.languageiso639-1

item.openairetype

conference poster

item.grantfulltext

none

item.fulltext

no Fulltext

item.cerifentitytype

Publications

item.openairecristype

http://purl.org/coar/resource_type/c_6670

crisitem.author.dept

E192-07 - Research Unit Artificial Intelligence Techniques

crisitem.author.dept

University of Cambridge

crisitem.author.dept

California Institute of Technology

crisitem.author.dept

E192-07 - Research Unit Artificial Intelligence Techniques

crisitem.author.parentorg

E192 - Institute of Logic and Computation

crisitem.author.parentorg

E192 - Institute of Logic and Computation

Appears in Collections:

Conference Paper
