Towards LLM-Generated Code Tours for Onboarding

Example of generated code tour step

Abstract

Onboarding new developers is a challenge for any software project. Addressing this challenge relies on human resources (e.g., having a senior developer write documentation or mentor the new developer). One promising solution is using annotated code tours. While this approach partially lifts the need for mentorship, it still requires a senior developer to write this interactive form of documentation. This paper argues that a Large Language Model (LLM) might help with this documentation process. Our approach is to record the stack trace between a failed test and a faulty method. We then extract code snippets from the methods in this stack trace using CodeQL, a static analysis tool, and have them explained by gpt-3.5-turbo-1106, the LLM behind ChatGPT. Finally, we evaluate the quality of a sample of these generated tours using a checklist. We show that the automatic generation of code tours is feasible but has limitations, such as redundant and low-level explanations.

Publication
Proceedings of the 2024 ACM/IEEE International Workshop on NL-based Software Engineering (NLBSE ’24)
Martin Balfroid
PhD Student
Benoît Vanderose
Assistant Professor of Software Engineering
Xavier Devroey
Assistant Professor of Software Testing

My research goal is to ease software testing by exploring new paths to achieve a high level of automation for test case design, generation, selection, and prioritization. My main research interests include search-based and model-based software testing, test suite augmentation, DevOps, and variability-intensive systems.
