Projects

Last updated on Oct 30, 2025

Automatic Vulnerability Injection using Genetic Improvement and Static Code Analysers

This thesis explores applying genetic improvement to inject vulnerabilities into programs. Generating vulnerabilities automatically in this manner would make it possible to create datasets of vulnerable programs, which would in turn help train machine-learning models to detect vulnerabilities more efficiently. The idea was put to the test by implementing VulGr, a modified version of PyGGi, a framework dedicated to genetic improvement. VulGr itself uses CodeQL, a static code analyser, offering a new approach to the static detection of vulnerabilities. VulGr’s end goal was to use CodeQL to inject vulnerabilities into programs of the Vul4J dataset. This experiment proved unsuccessful: CodeQL lacked accuracy and was too time-consuming to produce concrete results in an acceptable time span (less than 72 hours). However, the general approach and VulGr remain relevant for future use, as CodeQL is an ongoing community effort and future updates promise to fix the issues mentioned.

Alix Decrop

Last updated on Feb 11, 2025

Leveraging Large Language Models to Automatically Infer RESTful API Specifications

Application Programming Interfaces, known as APIs, are increasingly popular in modern web applications. With APIs, users around the world are able to access a plethora of data contained in numerous server databases. Understanding how an API works requires formal documentation. This documentation is also required by API testing tools, which aim to improve the reliability of APIs. However, as writing API documentation can be time-consuming, API developers often overlook the process, resulting in unavailable, incomplete or informal API documentation. Recent Large Language Model technologies such as ChatGPT, trained on billions of resources across the web, have shown exceptional capabilities at automating tasks. Such capabilities could be used to generate API documentation. This Master’s thesis therefore proposes the first approach that leverages Large Language Models to automatically infer RESTful API specifications. Preliminary strategies are explored, leading to the implementation of a tool named MutGPT. The intent of MutGPT is to discover API features by generating and modifying valid API requests, with the help of Large Language Models. Experimental results demonstrate that MutGPT is capable of sufficiently inferring the specification of the tested APIs, with an average route discovery rate of 82.49% and an average parameter discovery rate of 75.10%. Additionally, MutGPT discovered 2 undocumented and valid routes of a tested API, which has been confirmed by the relevant developers. Overall, this Master’s Thesis uncovers 2 new contributions:

Olivier Hensmans

Last updated on Oct 30, 2025

Study of the impacts of Code Smells on code Testability

Code Smells have been studied for more than 20 years now. They are used to describe a design flaw in a program intuitively. In this study, we wish to identify the impact of some of these Code Smells and, more specifically, their potential impact on Testability. To do this, we will study the state of the research on both Code Smells and Testability. Using those studies, we will define a scope of parameters to define the two concepts. With that information, we will analyse the statistical distribution of our samples and try to understand the relationship between Code Smells and Testability in a corpus of Java projects.

Guillaume Nguyen, Xavier Devroey, Antoine Sacré

Last updated on Jan 30, 2025

CyberExcellence

The CyberExcellence project started in January 2022 under the umbrella of the CyberWal initiative and is funded by the Walloon region. It aims to position Wallonia as a major player in cybersecurity on the national and international map by developing a core framework allowing the implementation of solutions based on practical and thoughtful cybersecurity with a competitive advantage.

Nicolas Riquet, Benoît Vanderose, Xavier Devroey

Last updated on Mar 22, 2023

Defining software debt, an expanded and multidimensional view of socio-technical debt in industry

This project aims at developing a holistic approach to socio-technical debt by designing a framework for helping developers and managers to address software debt and prioritize mitigating actions in an industrial context.

Boris Cherry

Last updated on Feb 11, 2025

JCrashPack2.0: Search-based crash reproduction hardness analysis

This master’s thesis project revisits the links between search-based crash reproduction and software quality metrics to assess the hardness of search-based crash-reproducing test case generation.

Gaetan Delvaux

Last updated on Mar 6, 2023

DeFlake: Exploring test flakiness debugging

Flaky tests are nondeterministic tests: they can give different results without any change to the code. This wastes time and resources. A better understanding of this field should reduce these inconveniences. However, little work has made the effort to bring this knowledge together. This is why this thesis proposes to answer two questions: What are the categories of flaky tests identified? What debugging techniques are the most appropriate for the different categories of flaky tests?
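To make the definition concrete, the sketch below reruns a test several times against unchanged code and reports it as flaky when the outcomes differ. It is only an illustration, not a technique taken from the thesis: the names `run_once`, `is_flaky`, `make_state_dependent_test` and `stable_test` are hypothetical, and the state-dependent test is a contrived stand-in for one category of flakiness (hidden state leaking between runs).

```python
def run_once(test_fn):
    """Return True if the test passes, False if it raises AssertionError."""
    try:
        test_fn()
        return True
    except AssertionError:
        return False


def is_flaky(test_fn, runs=4):
    """Rerun a test without changing the code under test.

    If both passing and failing outcomes are observed, the test is flaky.
    """
    outcomes = {run_once(test_fn) for _ in range(runs)}
    return len(outcomes) > 1


def make_state_dependent_test():
    """Build a contrived test whose result depends on hidden shared state."""
    calls = {"count": 0}

    def test():
        calls["count"] += 1
        # Passes on odd-numbered runs only: state leaks from one run to the next.
        assert calls["count"] % 2 == 1

    return test


def stable_test():
    """A deterministic test: same outcome on every run."""
    assert sorted([3, 1, 2]) == [1, 2, 3]
```

Rerunning can only show that a test is flaky. It does not explain why, which is where the debugging techniques studied in the thesis come in.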

Pierre Ortegat

Last updated on Feb 10, 2023

Unit test generation for Python programs

Applying automated tests to code submitted by students on an automatic grading platform is a useful tool for teaching staff. It makes it possible to provide better feedback, on more exercises, created more quickly. Automated testing methods are analysed, and those whose characteristics make them most interesting in the context of automatically grading student code are selected. The methods retained are grey-box fuzzing and testing combinations of calls on a given structure. Their effectiveness is discussed, and a practical application is developed in the form of a testing library that integrates into the Inginious automatic grading platform. The limitations are analysed, and a protocol for testing the changes they bring is then proposed in order to quantify the gains through a practical experiment.
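As an illustration of testing combinations of calls on a given structure, the sketch below enumerates short call sequences on a student's data structure and compares each result with a reference implementation. It is written under assumptions and does not reproduce the thesis's library: `ReferenceStack`, `BuggyStack`, `CALLS`, `run_sequence` and `first_failing_sequence` are hypothetical names, and the bug is invented for the example.

```python
from itertools import product


class ReferenceStack:
    """Reference solution written by the teacher (hypothetical)."""

    def __init__(self):
        self._items = []

    def push(self, value):
        self._items.append(value)

    def pop(self):
        return self._items.pop() if self._items else None

    def size(self):
        return len(self._items)


class BuggyStack(ReferenceStack):
    """Hypothetical student submission: pop returns the oldest element."""

    def pop(self):
        return self._items.pop(0) if self._items else None


# Calls that may be combined, as (method name, arguments) pairs.
CALLS = [("push", (1,)), ("push", (2,)), ("pop", ()), ("size", ())]


def run_sequence(cls, sequence):
    """Apply a sequence of calls to a fresh instance and record each result."""
    obj = cls()
    return [getattr(obj, name)(*args) for name, args in sequence]


def first_failing_sequence(student_cls, reference_cls, max_length=3):
    """Return the first call sequence on which the two classes disagree, or None."""
    for length in range(1, max_length + 1):
        for sequence in product(CALLS, repeat=length):
            if run_sequence(student_cls, sequence) != run_sequence(reference_cls, sequence):
                return sequence
    return None
```

Here the shortest disagreement is `push(1)`, `push(2)`, `pop()`, which can be reported to the student as a small counterexample.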

Samuel Van De Put

Last updated on Feb 10, 2023

Learning to assert in software testing using mutants

Software testing is a neglected subject in computer science courses. Over the years, methods and tools have appeared to provide educational support for learning it. Mutation testing is a technique used to evaluate the effectiveness of test suites. Recently, a variant called extreme mutation testing, which reduces computational and time costs, has emerged, and Descartes, an extreme mutation engine, was developed. With the support of a plugin called Reneri, Descartes can generate a report informing the developer of potential reasons why mutants remain undetected. In this thesis, an extension for Visual Studio Code has been developed in order to incorporate the information generated by Descartes and Reneri. The purpose of the experiment is to assess whether the inclusion of this data can help master’s students improve their test assertions. The results showed that this information, integrated into an editor, was well received by the students and that it guided them towards refining their test suites.
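The difference between a classic mutant and an extreme mutant, and why an undetected mutant points at a weak assertion, can be sketched as follows. The mutants are written by hand in Python for illustration: they are not output from Descartes or Reneri, and all function names are hypothetical.

```python
def price_with_tax(price, rate):
    """Original function under test."""
    return price + price * rate


def classic_mutant(price, rate):
    """Classic mutation: a single operator is changed ('+' becomes '-')."""
    return price - price * rate


def extreme_mutant(price, rate):
    """Extreme mutation: the whole body is replaced by a default return value."""
    return 0


def weak_test(fn):
    """Weak assertion: only checks that a number comes back."""
    assert isinstance(fn(100, 0.2), (int, float))


def strong_test(fn):
    """Stronger assertion: checks the computed value."""
    assert abs(fn(100, 0.2) - 120) < 1e-9


def kills(test, mutant):
    """A test kills a mutant if it fails when run against the mutant."""
    try:
        test(mutant)
        return False
    except AssertionError:
        return True
```

With these definitions, `weak_test` lets both mutants survive while `strong_test` kills both. A surviving extreme mutant is a fairly blunt signal that the test never checks the returned value.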

Martin Balfroid, Pierre Luycx

Last updated on Feb 10, 2023

MuTEd: A Comparative Study of Classic and Extreme Mutation Testing for Teaching Software Testing

Although software testing is critical in software engineering, studies have shown a significant gap between students’ knowledge of software testing and the industry’s needs, hinting at the need to explore novel approaches to teaching software testing. Among them, classical mutation testing has already proven to be effective in helping students. We hypothesise that extreme mutation testing could be more effective by introducing more obvious mutants to kill. To study this question, we organised an experiment with two undergraduate classes comparing the usage of two tools, one applying classical mutation testing and the other applying extreme mutation testing. The results contradicted our hypothesis: students with access to the classic mutation testing tool obtained a better mutation score, while the others seem to have mostly covered more code. Finally, we have anonymised and published the students’ test suites in adherence to best open-science practices, and we have developed guidance based on previous evaluations and our own results.