Morais, G., Lemelin, E., Adda, M., & Bork, D. (2025). Enhancing API Labelling with BERT and GPT: An Exploratory Study. In Enterprise Design, Operations, and Computing. EDOC 2024 Workshops (pp. 169–182). https://doi.org/10.1007/978-3-031-79059-1_11
Enterprise Design, Operations, and Computing. EDOC 2024 Workshops
ISBN:
978-3-031-79059-1
Volume:
537
Date (published):
2025
Event name:
28th International Conference on Enterprise Design, Operations, and Computing (EDOC 2024)
Event date:
10-Sep-2024 - 13-Sep-2024
Event place:
Vienna, Austria
Number of Pages:
14
Peer reviewed:
Yes
Keywords:
API classification; BERT; GPT; OpenAPI Specification
Abstract:
Application Programming Interfaces (APIs) enable interaction, integration, and interoperability among applications and services, contributing to their adoption and proliferation. However, discovering APIs has relied on manual, time-consuming, costly processes that jeopardize their reuse potential and accentuate the need for effective API retrieval mechanisms. Leveraging the OpenAPI Specification as a basis, this paper presents an exploratory study that combines BERT and GPT machine learning models to propose a novel API classifier. Our investigation explored the zero-shot learning capabilities of GPT-4 and GPT-3.5, given relevant terms extracted from API descriptions with BERT. The evaluation of our approach on two datasets comprising 940 API descriptions sourced from public repositories yielded an F1-score of 100% on the small dataset (17 APIs) and 39.1% on the large dataset (923 APIs). These results surpass the state of the art on the small dataset by 29 points. On the large dataset, GPT also suggested labels that were not in the provided list. Manual analysis revealed that GPT’s suggested labels fit the API intent better in 18 out of 20 cases, highlighting its potential for unknown classes and mismatch detection. This finding emphasizes the need to improve dataset quality and availability for API research. Our findings show the potential of automated API retrieval and open avenues for future research.