Large Language Models for Mathematicians

Frieder, Simon; Berner, Julius; Petersen, Philipp; Lukasiewicz, Thomas

Record link:

http://hdl.handle.net/20.500.12708/192474

Title:

Large Language Models for Mathematicians

Citation:

Frieder, S., Berner, J., Petersen, P., & Lukasiewicz, T. (2023). Large Language Models for Mathematicians. Internationale Mathematische Nachrichten, 254, 1–20. http://hdl.handle.net/20.500.12708/192474

Publication Type:

Article - Original Research Article

Language:

English

Authors:

Frieder, Simon
Berner, Julius
Petersen, Philipp
Lukasiewicz, Thomas

Organisational Unit:

E192-07 - Forschungsbereich Artificial Intelligence Techniques

Journal:

Internationale Mathematische Nachrichten

ISSN:

0020-7926

Date (published):

Dec-2023

Number of Pages:

Peer reviewed:

Keywords:

large language models; ChatGPT; mathematics

Abstract:

Large language models (LLMs) such as ChatGPT have received immense interest for their general-purpose language understanding and, in particular, their ability to generate high-quality text or computer code. For many professions, LLMs represent an invaluable tool that can speed up and improve the quality of work. In this note, we discuss to what extent they can aid professional mathematicians. We first provide a mathematical description of the transformer model used in all modern language models. Based on recent studies, we then outline best practices and potential issues and report on the mathematical abilities of language models. Finally, we shed light on the potential of LLMs to change how mathematicians work.

Research Areas:

Information Systems Engineering: 100%

Science Branch:

1020 - Informatik: 80%
1010 - Mathematik: 20%

Appears in Collections:

Article
