Establishing Traceability Between Natural Language Requirements and Software Artifacts by Combining RAG and LLMs

Ali, Syed Juned; Naganathan, Varun; Bork, Dominik

doi:10.1007/978-3-031-75872-0_16

Record citation link:

http://hdl.handle.net/20.500.12708/205507

Title:

Establishing Traceability Between Natural Language Requirements and Software Artifacts by Combining RAG and LLMs

Citation:

Ali, S. J., Naganathan, V., & Bork, D. (2024). Establishing Traceability Between Natural Language Requirements and Software Artifacts by Combining RAG and LLMs. In Conceptual Modeling (pp. 295–314). https://doi.org/10.1007/978-3-031-75872-0_16

Publisher DOI:

10.1007/978-3-031-75872-0_16

Publication type:

Conference paper - Full-Paper Contribution

Language:

English

Authors:

Ali, Syed Juned
Naganathan, Varun
Bork, Dominik

Organisational unit:

E194-03 - Research Unit of Business Informatics

Published in:

Conceptual Modeling

ISBN:

978-3-031-75872-0

Volume:

15238

Date (published):

2024

Event name:

43rd International Conference on Conceptual Modeling (ER 2024)

Event period:

28 Oct 2024 - 31 Oct 2024

Event location:

Pittsburgh, United States of America

Keywords:

Large Language Models; LLM; Requirements Engineering; Requirements Traceability; Retrieval Augmented Generation

Abstract:

Software Engineering aims to effectively translate stakeholders’ requirements into executable code to fulfill their needs. Traceability from natural language use case requirements to classes in a UML class diagram, subsequently translated into code implementation, is essential in systems development and maintenance. Tasks such as assessing the impact of changes and enhancing software reusability require a clear link between these requirements and their software implementation. However, establishing such links manually across extensive codebases is prohibitively challenging. Requirements, typically articulated in natural language, embody semantics that clarify the purpose of the codebase. Conventional traceability methods, relying on textual similarities between requirements and code, often suffer from low precision due to the semantic gap between high-level natural language requirements and the syntactic nature of code. The advent of Large Language Models (LLMs) provides new methods to address this challenge through their advanced capability to interpret both natural language and code syntax. Furthermore, representing code as a knowledge graph facilitates the use of graph structural information to enhance traceability links. This paper introduces an LLM-supported retrieval augmented generation approach for enhancing requirements traceability to the class diagram of the code, incorporating keyword, vector, and graph indexing techniques, and their integrated application. We present a comparative analysis against conventional methods and among different indexing strategies and parameterizations on the performance. Our results demonstrate how this methodology significantly improves the efficiency and accuracy of establishing traceability links in software development processes.

Research focus:

Information Systems Engineering: 100%

Field of science:

1020 - Computer Sciences: 90%
5020 - Economics: 10%

Appears in collections:

Conference Paper
