Large Language Model

SelfBehave, Generating a Synthetic Behaviour-Driven Development Dataset Using SELF-INSTRUCT

SNAIL Team Members Present Research at ICST 2025 in Naples

This week, two members of our team are attending the IEEE International Conference on Software Testing, Verification and Validation (ICST 2025) in Naples, Italy. On Monday, Martin Balfroid took the stage to share research conducted by Manon Galloy, a SNAILee who defended her master’s thesis in 2024. The presentation, given during the A-MOST workshop, focused on SelfBehave, a novel technique for synthesizing user stories.

Xavier Devroey

Last updated on Apr 17, 2025 · 2 min read

ChatGPT, Could you please generate my exam?

It’s time for the January exams. Three years ago, my colleague Benoît Vanderose and I created a verification and validation course for the MSc in Software Engineering students. From the start, we both thought we needed a practical exam where students actually perform V&V tasks. So, for our exam, students have to analyze a codebase, read SonarQube reports, refactor code, correct bugs, and write tests. This was quite unusual for them (most courses are evaluated using projects or paper exams), and the students’ reaction three years ago was, let’s say, interesting.

Xavier Devroey

Last updated on Feb 11, 2025 · 5 min read

Improving DevX for test case generation using LLM

The goal of the thesis is to explore how LLMs and Retrieval-Augmented Generation (RAG) approaches can be leveraged to enhance automated test case generation in IntelliJ. The internship takes place at JetBrains Amsterdam.

Victor Santelé

Towards LLM-Generated Code Tours for Onboarding

Onboarding new developers is a challenge for any software project. Addressing this challenge relies on human resources (e.g., having a …

Martin Balfroid, Benoît Vanderose, Xavier Devroey

Aladdin: Using natural language to facilitate open data visualisation

This dissertation explores the impact of using natural language in the visualisation of open datasets, focusing on the design and evaluation of Aladdin, a system based on the DSR approach. Aladdin uses advanced natural language processing techniques to transform text queries into interactive data visualisations.

Farid Yomi Feusing

LLM-explained mutation testing reports for teaching software testing

The widespread digitalization of society and the increasing complexity of software make it essential to develop high-quality software test suites. In recent years, several techniques for learning software testing have been developed, including techniques based on mutation testing. At the same time, the recent performance of language models in text comprehension, text generation, and code generation makes them potential candidates for assisting students in learning how to develop tests. To confirm this, an experiment was carried out with students with little experience in software testing, comparing the results obtained by students using a report from a classic mutation testing tool with those of students using a report augmented with hints generated by a language model. The results seem promising: the augmented reports improved the mutation score and mutant coverage more broadly across the group than the classic reports did. In addition, the augmented reports seem to have been most effective for testing methods that modify and retrieve private variable values.

Antoine Piras

SelfBehave: Generating a Behaviour-Driven Development Dataset Using the SELF-INSTRUCT Method

Software development faces persistent challenges in terms of maintainability and efficiency, and this is driving the ongoing search for innovative approaches. Agile methodologies, in particular Behaviour-Driven Development (BDD), have gained ground in society thanks to their ability to promote responsiveness to change and communication between stakeholders. However, as with many methods, the use of BDD can lead to maintenance costs and productivity problems. To meet these challenges, this research investigates the adaptation of advanced automatic data generation techniques, in particular SELF-INSTRUCT, to augment BDD datasets.

Manon Galloy

Leveraging Large Language Models to Automatically Infer RESTful API Specifications

Application Programming Interfaces, known as APIs, are increasingly popular in modern web applications. With APIs, users around the world are able to access a plethora of data contained in numerous server databases. Understanding how an API works requires formal documentation. This documentation is also required by API testing tools, which aim to improve the reliability of APIs. However, as writing API documentation can be time-consuming, API developers often overlook the process, resulting in unavailable, incomplete or informal API documentation. Recent Large Language Model technologies such as ChatGPT have displayed exceptionally efficient capabilities at automating tasks, drawing on training data from billions of resources across the web. Such capabilities could therefore be used to generate API documentation. This Master’s Thesis proposes the first approach to leveraging Large Language Models to automatically infer RESTful API specifications. Preliminary strategies are explored, leading to the implementation of a tool entitled MutGPT. The intent of MutGPT is to discover API features by generating and modifying valid API requests, with the help of Large Language Models. Experimental results demonstrate that MutGPT is capable of sufficiently inferring the specification of the tested APIs, with an average route discovery rate of 82.49% and an average parameter discovery rate of 75.10%. Additionally, MutGPT was capable of discovering 2 undocumented and valid routes of a tested API, which has been confirmed by the relevant developers. Overall, this Master’s Thesis uncovers 2 new contributions:

Alix Decrop
