We offer several master's thesis projects to students following one of the master's programs at the University of Namur. These projects can cover one or more topics related to the research done in the SNAIL team or explore new directions. It is also possible to propose your own project related to our ongoing research. If you think you have a great idea, do not hesitate to contact us, but make sure you have clearly identified the research aspect and the novelty of your proposal.
The project can be conducted at the computer science faculty, in collaboration with the members of the team, at another Belgian organization (industry, research center, university, …) with which we have an ongoing collaboration, or abroad at another university in our network.
If you study at a different university and you would like to do a research internship in the context of one of our projects, you should ask your own university supervisor to contact us. We have limited places available but are always interested in new research opportunities.
Current and Past Projects
Researchers and practitioners have developed automated test case generators for various languages and platforms, each with different strengths. Comparing these tools requires large-scale empirical evaluations, which demand significant setup and analysis. The JUnit Generation benchmarking infrastructure (JUGE) automates the production and comparison of Java unit tests for multiple purposes, aiming to streamline evaluation, support knowledge transfer, and standardize practices. This master’s thesis will use the existing JUGE infrastructure to conduct an extensive empirical comparison of unit test generators for Java, and provide the resulting dataset to the research community.

The use of Large Language Models (LLMs) for automated test generation offers promising results but remains constrained by issues like hallucinations and prompt size limitations. This thesis investigates the integration of a graph-based Retrieval Augmented Generation (RAG) technique to enhance test generation within TestSpark, an IntelliJ IDEA plugin. We introduce GRACE-TG (Graph-Retrieved Augmented Contextual Enhancement for Test Generation), which constructs a graph of code entities using the Program Structure Interface (PSI) and ranks nodes via a Personalized Weighted PageRank algorithm. This enables precise selection of relevant context for LLMs while significantly reducing input size. Evaluation across 147 real-world Java bugs demonstrates that GRACE-TG reduces prompt sizes by over 97% compared to the current version of TestSpark, with equivalent or improved test coverage. These results suggest that graph-based retrieval is a promising way to improve LLM-based test generation.
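
As a minimal sketch of the ranking step, a personalized weighted PageRank can be implemented by power iteration with the teleportation mass concentrated on the entities tied to the method under test. The toy graph, weights, and seed below are illustrative only, not the actual PSI-derived graph, and mass at dangling nodes is simply dropped for brevity.

```python
# Minimal sketch of Personalized Weighted PageRank over a code-entity graph.
# Toy data: the graph and seed choice are illustrative, not GRACE-TG's PSI graph.

def personalized_pagerank(edges, seeds, alpha=0.85, iters=50):
    """edges: {node: [(neighbor, weight), ...]}; seeds: entities tied to the target."""
    nodes = set(edges) | {v for outs in edges.values() for v, _ in outs}
    # Teleportation jumps back to the seed entities instead of a uniform restart.
    personalization = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    rank = dict(personalization)
    for _ in range(iters):
        nxt = {n: (1 - alpha) * personalization[n] for n in nodes}
        for u, outs in edges.items():
            total = sum(w for _, w in outs)
            for v, w in outs:
                nxt[v] += alpha * rank[u] * (w / total)  # weighted mass propagation
        rank = nxt
    return sorted(rank.items(), key=lambda kv: -kv[1])

# Toy graph: the method under test calls a helper and a logger; the helper reads a field.
edges = {
    "MethodUnderTest": [("Helper.compute", 2.0), ("Logger.log", 0.5)],
    "Helper.compute": [("Config.threshold", 1.0)],
}
print(personalized_pagerank(edges, seeds={"MethodUnderTest"}))
```

The highest-ranked entities are the ones worth packing into the prompt, which is how the ranking translates into smaller LLM inputs.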

This thesis explores the integration of sonification and gamification in development environments to reduce the cognitive load on developers and improve their user experience. A state-of-the-art review was conducted based on scientific literature in various fields such as cognitive load, sonification, and gamification. Subsequently, a tool called EchoCode was developed to address the issue. It is an extension for Visual Studio Code that integrates the concepts of sonification and gamification. The extension offers three main features: personalized audio feedback associated with keyboard shortcuts, audio feedback that adapts to a program's execution outcome (failure or success), and an integrated to-do list with two operating modes to structure tasks in a software development context. This approach aims to relieve the visual channel, support working memory, and maintain optimal concentration.

Hunting for bugs and vulnerabilities is one of the most important tasks in computer science, especially in the context of web applications. Many techniques exist to detect and prevent these issues, one of the most widely used being mutation testing. However, creating mutants manually is a time-consuming and error-prone process. To address this, we combine static analysis with an LLM to automatically generate mutants. In this study, we compare the performance of an LLM in producing mutants based on three different static analysis tools: KAVe, WAP, and the LLM itself. Our results show significant variability between tools. Mutants produced using traditional static analysers vary heavily depending on the type of vulnerability and tend to perform better when tools are combined. With the LLM, the quality of mutants is more consistent across different vulnerabilities, and the overall code coverage is significantly higher than with traditional approaches. On the other hand, while LLM-generated mutants have a higher success rate in passing initial verification, they often contain syntactic or semantic errors in the code. These findings suggest that LLMs are a promising addition to automated vulnerability testing workflows, especially when used in conjunction with static analysis tools. However, further refinement is needed to reduce the generation of incorrect or invalid code and to better align with real-world exploitability.
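
As a hedged sketch of the pipeline's core idea: a static-analysis finding is folded into a prompt asking the model for a minimal mutant. The finding format below is illustrative rather than KAVe's or WAP's actual output, and `call_llm` stands in for whichever model endpoint is used.

```python
# Hedged sketch: turning a static-analysis finding into a mutant-generation prompt.

def call_llm(prompt: str) -> str:
    """Placeholder for the actual LLM call (e.g. an HTTP request to a hosted model)."""
    raise NotImplementedError("wire this to a real model")

def mutant_prompt(source: str, finding: dict) -> str:
    return (
        f"A static analyser reports a {finding['type']} vulnerability at line "
        f"{finding['line']} of the code below. Produce one mutant that re-introduces "
        "this vulnerability class while changing as little code as possible. "
        "Return only the mutated code.\n\n" + source
    )

# Illustrative, KAVe/WAP-style finding on a one-line snippet.
source = "query = \"SELECT * FROM users WHERE name = '\" + request.args['name'] + \"'\""
finding = {"type": "SQL injection", "line": 1}
print(mutant_prompt(source, finding))  # this prompt would be passed to call_llm
```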

The practice of software testing is struggling to become widespread in the software development industry. This observation also applies to the academic world, where students often devote little energy to this activity due to a lack of motivation or time. To address these issues, this study explores the use of gamification as a lever for engagement in software testing. Following a state-of-the-art review of testing techniques and gamification principles applied to software development, an existing IntelliJ plugin was reused and enhanced. Initially centered on a system of achievements, the plugin was supplemented with a leaderboard, enabling a comparison of the two approaches. Achievements are badges, here represented by trophies, awarded when the user reaches certain levels of progress in the plugin. A leaderboard, on the other hand, is a table that ranks the participants according to the points they have earned. The aim is to determine which mode best fosters student involvement while also improving test-writing performance.

Green software engineering is emerging as a crucial response to the rising energy impact of digital technologies, which may soon rival aviation and shipping combined. While several tools aim to help developers track energy consumption and detect regressions, each has its own limitations. This motivated the development of EnergyTrackr, a fully modular and automated tool designed to detect statistically significant energy changes.
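
As a minimal sketch of what detecting a statistically significant energy change can look like, two sets of per-run measurements are compared with a Mann-Whitney U test. The joule samples below are fabricated, and EnergyTrackr's actual statistical pipeline may differ.

```python
# Hedged sketch: flag an energy regression between two commits with a
# non-parametric Mann-Whitney U test. All measurements below are fabricated.
from scipy.stats import mannwhitneyu

baseline = [41.2, 40.8, 42.1, 41.5, 40.9, 41.7]   # energy (J) per run, commit A
candidate = [44.0, 43.6, 44.8, 43.9, 44.3, 44.1]  # energy (J) per run, commit B

stat, p = mannwhitneyu(baseline, candidate, alternative="two-sided")
if p < 0.05:
    delta = sum(candidate) / len(candidate) - sum(baseline) / len(baseline)
    print(f"significant energy change: {delta:+.1f} J per run (p = {p:.4f})")
```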

This thesis explores the impact of socio-technical debt on developer well-being. It combines two approaches: a quantitative approach (a structured questionnaire and a perceived stress scale) and a qualitative approach (a thematic analysis of open-ended responses). This study highlights existing links between perceived debt, social tensions, and workplace stress.

Software engineering for mixed reality headsets is still in its infancy, even though the required hardware has been available for several decades. Existing methods to streamline development are still limited, and developers often find themselves doing things manually. Mixed reality system testing, for example, is often done by hand by developers who have to wear the headsets and test the application as if they were end users. This is a significant time sink in the development process.

This thesis presents the development of a configurable tool that automatically injects web application vulnerabilities into existing Django codebases to support cybersecurity education. The primary goal is to enhance hands-on learning by allowing educators and students to engage with realistic, production-like environments that incorporate well-known security flaws, specifically drawn from the OWASP Top Ten 2021. Instead of generating synthetic applications, the tool modifies authentic Django projects, ensuring pedagogical relevance and structural fidelity.

This master's thesis explores the design and validation of a context management tool integrated into a development environment. Developers constantly use various resources (documentation, terminal, development tools, etc.), which forces them to frequently switch between their development environment and these resources to meet their needs. This often leads to interruptions, distractions, and consequently, a drop in productivity. To address this issue, the study proposes a solution in the form of an extension for the Visual Studio Code IDE, called FlowTabs. This extension brings resources together into a unified interface and uses a relevance algorithm that adapts to the developer's behavior to suggest the most appropriate resources. User testing has shown that it integrates well into the development environment, offering a satisfactory developer experience, cognitive comfort, and effective resource management. This work thus provides an original context management solution, directly embedded in a development environment.
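
The relevance algorithm itself is not detailed here; as a purely hypothetical illustration of the idea, a score combining frequency and recency of resource use could look like the sketch below (the half-life and resource names are invented).

```python
# Hypothetical relevance score: each past access of a resource contributes mass
# that decays exponentially, so frequent *and* recent resources rank highest.
import math
import time

def relevance(access_times, now, half_life=1800.0):
    """Sum of exponentially decayed accesses (30-minute half-life by default)."""
    return sum(math.exp(-math.log(2) * (now - t) / half_life) for t in access_times)

now = time.time()
resources = {
    "docs: VS Code API": [now - 60, now - 400],  # used twice, recently
    "terminal: test run": [now - 7200],          # used once, two hours ago
}
ranked = sorted(resources, key=lambda r: -relevance(resources[r], now))
print(ranked)  # most relevant resource suggested first
```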

Virtual Reality (VR) is increasingly recognized as a technology with substantial commercial potential, progressively integrated into a wide array of everyday applications and supported by an expanding ecosystem of immersive devices. Despite this growth, the long-term adoption and maintenance of VR applications remain limited, particularly due to the lack of formalized software engineering practices adapted to the unique characteristics of VR environments. Among these challenges, the absence of effective, systematic, and reproducible tools for interaction testing constitutes a significant barrier to ensuring application quality and user experience.

Odoo, a company offering business management services through a modular web application, has continuously expanded the scope and complexity of its software over the years. This growing complexity raises challenges in terms of testing performance, readability, and maintainability. While existing tools such as Tours and Hoot provide solutions for system-level and unit-level testing respectively, a gap remains between the two. This thesis introduces a new approach called Hoot-Intégration, which extends the Hoot testing framework to support integration testing by enabling real server interactions in a lightweight environment. The proposed solution is implemented, validated, and compared against existing test strategies at Odoo, showing notable gains in execution speed and test clarity. Although some technical limitations remain, the approach is effective and opens the door to future testing improvements in large, scaling modular systems like Odoo.

The evolution of digital usage has led to a rapid diversification of application runtime environments, ranging from traditional web browsers to immersive extended reality (XR) devices, as well as mobile and desktop platforms. This plurality of environments creates a growing need for solutions that can deliver a consistent user experience while minimizing the development and maintenance efforts associated with multi-platform deployment. However, current approaches often rely on separate implementations for each environment, leading to code duplication, visual and functional inconsistencies, and increased complexity in managing software evolution. These limitations hinder portability and slow down the time-to-market of multi-platform applications.

This thesis investigates the impacts and challenges of integrating CLING, a test generator for Java, into CI/CD (Continuous Integration/Continuous Deployment) pipelines. In a context where test automation is essential to ensure software quality, CLING generates integration tests automatically. Integrating this tool into a Docker environment standardizes and isolates the environments, thus ensuring consistent test execution. The work addresses the technical challenges encountered during this integration, such as configuring and automating necessary operations. It also describes the development of the API used to manage the data generated by CLING. The results show an improvement in the efficiency of the development process, particularly through the reduction of manual interventions. Finally, the thesis offers recommendations for developers and DevOps engineers looking to optimize the integration of test generators into their CI/CD pipelines.

As creative coding in interactive art becomes increasingly popular in the digital art world, the need to test works to ensure that they match the artist's expectations is essential. The problem is that the programs supporting these works can behave unexpectedly or cause problems. Some studies have shown that it is possible to test these projects manually, but the use of automated tests has been little studied. This research aims to explain to what extent it is possible to implement automated functional and performance tests in interactive installation projects using hybrid development tools. To answer this question, we performed a case study on the Wall of fame project in TouchDesigner, combining observations of automated test experiments with an interview. The observations showed that it was possible to carry out functional and performance tests, with limitations on the reliability of the data. Several difficulties were identified: TouchDesigner dependencies, operator limitations, user interface interactions, lack of native test environments, performance test limitations, and maintenance difficulties. Solutions were also found to resolve these issues.

Learning to program, and especially understanding programs, is a difficult task for newcomers. For this reason, aids are provided, such as IDEs, which offer tools that help them avoid syntax and/or semantic errors, depending on the programming language used. However, these aids are not always sufficient to understand the written code, and more often than not, they do not help to understand the errors generated and their causes. For this purpose, code comprehension tools are available to help visualize the code. Some advances have even made it possible to use such visualizations in virtual reality. That is why, with the advent of MR, a draft code visualization application has been proposed. This solution, called codeMR, makes it possible to represent code in three dimensions following the city paradigm of codeCity. To test its viability, an experiment was carried out with 10 people to see whether it could have a future for understanding code through mixed reality. The results showed that the solution has the potential to help code comprehension. However, improvements to the application are still required to ensure optimal use in this context.

This dissertation explores the impact of using natural language in the visualisation of open datasets, focusing on the design and evaluation of Aladdin, a system built following the Design Science Research (DSR) approach. Aladdin uses advanced natural language processing techniques to transform text queries into interactive data visualisations.

Amid growing concern over filter bubbles and content diversity, this thesis explores the impact of feedback mechanisms on user experience with YouTube’s recommendation algorithm. The study examines how increased user control can influence their interactions with the algorithm. Based on user interviews, personas were created to understand user behaviors and expectations. A Chrome extension was developed to allow users to report errors in their recommendation feed. Results indicate that this mechanism enhances user satisfaction and a sense of control, though some limitations suggest areas for future improvements. The study also proposes a methodology to evaluate contextual thematic diversity on YouTube, paving the way for further research into recommendation diversity and self-actualization systems.

This master's thesis investigates how the Language Server Protocol (LSP) can be used to develop a nomadic and ergonomic code editor. The popularity of mobile devices has significantly increased in the past decade, strengthening the transformation of desktop solutions into mobile ones. However, code editing activities, traditionally carried out on a computer, have not yet found real alternatives providing a suitable development environment and multilanguage support on mobile devices. Previous works focus on finding interaction solutions to improve code editing productivity, mainly by adapting the code editor to one single programming language. By integrating language servers through the LSP, we develop new design and interaction solutions enabling multilanguage support in a single mobile code editor. In this thesis, we present a prototype code editor combining interaction solutions found in the literature with LSP functionalities and evaluate it in terms of productivity and usability. This work aims to provide an alternative to the traditional desktop development environment on mobile devices, addressing current technological shifts and transforming the way developers may code in the future.
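
For reference, LSP messages are JSON-RPC payloads framed by a Content-Length header, and this framing is identical for every language server, which is precisely what makes multilanguage support tractable from a single editor. The minimal sketch below builds the standard `initialize` request.

```python
# Minimal sketch of LSP's wire format: a Content-Length header followed by a
# JSON-RPC body. Any LSP-compliant language server consumes exactly this framing.
import json

def frame(message: dict) -> bytes:
    body = json.dumps(message).encode("utf-8")
    return f"Content-Length: {len(body)}\r\n\r\n".encode("ascii") + body

initialize = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "initialize",
    "params": {"processId": None, "rootUri": None, "capabilities": {}},
}
print(frame(initialize).decode("utf-8"))
```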

For many years, Odoo, a company providing business management services, has been constantly expanding the scope and complexity of its software, a web application. In response to that complexity, the introduction of automated testing techniques seems to be the next evolution of the testing tools already available to the company. In the past, other tools for automatically testing web interfaces have been created, but often with limitations. This thesis explores the techniques that can be applied to implement fuzzing on the Odoo web interface. It is shown that some methods do not seem applicable at present, while others work very well. A viable method is proposed and implemented, and different configurations of the method are evaluated. Ultimately, it is shown that the proposed method has some weaknesses, but that future work in this direction is possible.
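
As a much-simplified, hypothetical illustration of the kind of fuzzing involved (the endpoint and field names are invented, and the thesis's actual method is considerably richer), a naive form fuzzer could look like this.

```python
# Naive form fuzzer sketch: POST randomly mutated values and watch for server
# errors. The route is hypothetical; 8069 is only Odoo's default port.
import random
import string

import requests

PAYLOADS = ["", "0", "-1", "' OR 1=1 --", "<script>alert(1)</script>", "é" * 64]

def random_value() -> str:
    if random.random() < 0.5:
        return random.choice(PAYLOADS)  # known tricky inputs
    return "".join(random.choices(string.printable, k=random.randint(1, 40)))

for _ in range(100):
    data = {"name": random_value(), "email": random_value()}
    resp = requests.post("http://localhost:8069/web/fuzz-target", data=data, timeout=5)
    if resp.status_code >= 500:
        print("server error for input:", data)  # candidate defect to triage
```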

The field of automated test case generation has grown considerably in recent years, aiming to reduce software testing costs and find bugs. However, techniques for automatically generating test cases for machine learning libraries still produce low-quality tests, and papers on the subject tend to target Java, whereas the machine learning community mostly works in Python. Some papers have attempted to explain the causes of these poor-quality tests and to make automated test generation possible in Python, but they are fairly recent, and no study has yet attempted to improve these test cases in Python. In this thesis, we introduce two improvements to Pynguin, an automated test case generation tool for Python: generating better test cases for machine learning libraries using structured input data, and better handling crashes from C-extension modules. Based on a set of 7 modules, we show that our approach covers lines of code unreachable with the traditional approach and generates error-revealing test cases. We expect our approach to serve as a starting point for integrating testers' knowledge of program input data more easily into automated test case generation tools and for creating tools that find more crash-inducing bugs.
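
As a hedged illustration of what structured input data can mean here (this is not Pynguin's actual implementation), the sketch below feeds a stand-in ML routine with random NumPy arrays of varied shapes and dtypes. Note that genuine C-level crashes terminate the interpreter and cannot be caught with a try/except, so real tooling would additionally isolate execution, for instance in a subprocess.

```python
# Hedged sketch: structured test inputs for an ML routine, i.e. random NumPy
# arrays with varied shapes and dtypes instead of primitive ints and strings.
import numpy as np

rng = np.random.default_rng(42)

def random_array():
    shape = tuple(rng.integers(1, 5, size=rng.integers(1, 3)))  # 1-2 dims, size 1-4
    dtype = rng.choice(["float64", "float32", "int64"])
    return rng.random(shape).astype(dtype)

def under_test(x):
    return np.linalg.norm(x, axis=0)  # stand-in for a library function under test

for _ in range(5):
    arr = random_array()
    try:
        under_test(arr)
    except Exception as e:  # an error-revealing input becomes a regression test
        print(f"shape {arr.shape} ({arr.dtype}) raised {type(e).__name__}: {e}")
```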

The widespread digitalization of society and the increasing complexity of software make it essential to develop high-quality software test suites. In recent years, several techniques for learning software testing have been developed, including techniques based on mutation testing. At the same time, the recent performance of language models in text comprehension and generation, as well as code generation, makes them potential candidates for assisting students in learning how to develop tests. To confirm this, an experiment was carried out with students with little experience in software testing, comparing the results obtained by students using a report from a classic mutation testing tool with those using a report augmented with hints generated by a language model. The results seem promising, since the augmented reports improved the mutation score and mutant coverage more broadly across the group than the classic reports. In addition, the augmented reports seem to have been most effective for testing methods that modify and retrieve private variable values.

Energy efficiency in computing is an important subject that is increasingly being addressed by researchers and developers. Nowadays, the majority of websites are built using the WordPress CMS, while other developers prefer more secure and energy-efficient static site generators. This study focuses on the server-side energy consumption of these two methods of creating websites. A detailed analysis of the results makes it possible to identify borderline cases and suggest recommendations on the best technology to use, depending on the type of project.

Software development faces persistent challenges in terms of maintainability and efficiency, driving an ongoing search for innovative approaches. Agile methodologies, in particular Behaviour-Driven Development (BDD), have gained ground thanks to their ability to promote responsiveness to change and communication between stakeholders. However, as with many methods, the use of BDD can lead to maintenance costs and productivity problems. To meet these challenges, this research investigates the adaptation of advanced automatic data generation techniques, in particular SELF-INSTRUCT, to augment BDD datasets.

Multiple techniques exist to find vulnerabilities in code, such as static analysis and machine learning. Although machine learning techniques are promising, they need to learn from a large quantity of examples. Since no such large quantity of data exists for vulnerable code, vulnerability injection techniques have been developed to create it. Both vulnerability prediction and injection techniques based on machine learning usually use the same kind of data, namely pairs of vulnerable code, taken just before the fix, and its fixed version. However, using the fixed version is not realistic, as the vulnerability was introduced in a different version of the code that may differ greatly from the fixed version. Therefore, we suggest using pairs made of the code version that introduced the vulnerability and its previous version. This is more realistic, but it is only relevant if machine learning techniques can properly learn from it and if the patterns learned differ significantly from those of the usual method. To verify this, we trained vulnerability prediction models on both kinds of data and compared their performance. Our analysis showed that a model trained on pairs of vulnerable code and their fixed versions is unable to predict vulnerabilities from the vulnerability-introducing versions, and vice versa, even though both models properly learn from their data and detect vulnerabilities on similar data. Therefore, we conclude that using vulnerability-introducing code for machine learning training is more relevant than using the fixed versions.
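
As a hedged sketch of how such pairs could be extracted once an introducing commit is known (for instance via SZZ-style blame analysis), the snippet below pulls the two versions of a file from git history. The repository path, commit hash, and file name are placeholders.

```python
# Hedged sketch: pair the vulnerability-introducing revision of a file with the
# revision just before it. Repo path, commit, and file path are placeholders.
import subprocess

def file_at(repo: str, commit: str, path: str) -> str:
    return subprocess.run(
        ["git", "-C", repo, "show", f"{commit}:{path}"],
        capture_output=True, text=True, check=True,
    ).stdout

repo, introducing, path = "/tmp/project", "abc1234", "src/auth.py"
vulnerable = file_at(repo, introducing, path)       # version introducing the flaw
previous = file_at(repo, f"{introducing}~1", path)  # version just before it
pair = {"before": previous, "after": vulnerable}    # one training example
```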

This thesis explores the idea of applying genetic improvement with the aim of injecting vulnerabilities into programs. Generating vulnerabilities automatically in this manner would allow the creation of datasets of vulnerable programs, which would in turn help train machine-learning models to detect vulnerabilities more efficiently. This idea was put to the test by implementing VulGr, a modified version of PyGGi, a framework dedicated to genetic improvement. VulGr itself uses CodeQL, a static code analyser, offering a new approach to the static detection of vulnerabilities. VulGr's end goal was to use CodeQL to inject vulnerabilities into programs of the Vul4J dataset. This experiment proved unsuccessful: CodeQL lacked accuracy and was too time-consuming to produce concrete results within an acceptable time span (less than 72 hours). However, the general approach and VulGr retain their relevance for future use, as CodeQL is an ongoing community effort whose updates may fix the issues mentioned.

Application Programming Interfaces, known as APIs, are increasingly popular in modern web applications. With APIs, users around the world are able to access a plethora of data contained in numerous server databases. To understand the workings of an API, formal documentation is required. This documentation is also required by API testing tools aimed at improving the reliability of APIs. However, as writing API documentation can be time-consuming, API developers often overlook the process, resulting in unavailable, incomplete, or informal API documentation. Recent Large Language Model technologies such as ChatGPT have displayed exceptionally efficient capabilities at automating tasks, having been trained on billions of resources across the web. Such capabilities could therefore be utilized to generate API documentation. This Master's thesis thus proposes the first approach leveraging Large Language Models to automatically infer RESTful API specifications. Preliminary strategies are explored, leading to the implementation of a tool entitled MutGPT. The intent of MutGPT is to discover API features by generating and modifying valid API requests with the help of Large Language Models. Experimental results demonstrate that MutGPT is capable of sufficiently inferring the specification of the tested APIs, with an average route discovery rate of 82.49% and an average parameter discovery rate of 75.10%. Additionally, MutGPT was capable of discovering two undocumented and valid routes of a tested API, which has been confirmed by the relevant developers. Overall, this Master's thesis makes two new contributions.
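
As a rough, hypothetical illustration of the request-mutation loop: start from a known-valid request, perturb it, and treat every non-error response as evidence about the specification. MutGPT delegates the mutation step to Large Language Models, whereas this sketch mutates randomly, and the base URL, routes, and parameters below are invented.

```python
# Hedged sketch of specification inference by request mutation. The real tool
# lets an LLM choose the mutations; here they are picked at random.
import random

import requests

BASE = "http://localhost:5000/api"  # hypothetical API under test
valid = {"route": "/users", "params": {"page": "1"}}
discovered = set()

for _ in range(50):
    probe = {"route": valid["route"], "params": dict(valid["params"])}
    mutation = random.choice(["tweak_param", "new_param", "sibling_route"])
    if mutation == "tweak_param":
        probe["params"]["page"] = str(random.randint(-1, 9999))
    elif mutation == "new_param":
        probe["params"][random.choice(["limit", "sort", "q"])] = "1"
    else:
        probe["route"] = "/" + random.choice(["user", "users/1", "admin", "items"])
    resp = requests.get(BASE + probe["route"], params=probe["params"], timeout=5)
    if resp.status_code < 400:  # the server understood the mutated request
        discovered.add((probe["route"], tuple(sorted(probe["params"]))))

print(discovered)  # accumulated evidence for routes and their parameters
```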

Code Smells have been studied for more than 20 years now. They are used to intuitively describe a design flaw in a program. In this study, we wish to identify the impact of some of these Code Smells, and more specifically their potential impact on Testability. To do this, we will study the state of the research on both Code Smells and Testability. Using those studies, we will define a set of parameters to characterise the two concepts. With that information, we will analyse the statistical distribution of our samples and try to understand the relationship between Code Smells and Testability in a corpus of Java projects.
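
One plausible analysis step, sketched below with fabricated numbers and with branch coverage standing in for the testability measure, is a rank correlation between per-class smell counts and the testability proxy.

```python
# Hedged sketch: rank-correlate per-class code smell counts with a testability
# proxy (achieved branch coverage here). All numbers are fabricated.
from scipy.stats import spearmanr

smell_counts    = [0, 1, 1, 3, 5, 8, 9, 12]  # smell instances per class
branch_coverage = [0.92, 0.88, 0.85, 0.71, 0.60, 0.44, 0.50, 0.31]

rho, p = spearmanr(smell_counts, branch_coverage)
print(f"Spearman rho = {rho:.2f} (p = {p:.4f})")  # negative rho: smells hurt testability
```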

Learning software testing is a neglected subject in computer science courses. Over the years, methods and tools have appeared to provide educational support for this learning. Mutation testing is a technique used to evaluate the effectiveness of test suites. Recently, a variant called extreme mutation testing, which reduces computational and time costs, has emerged. Descartes, an extreme mutation engine, was developed. With the support of a plugin extension called Reneri, Descartes can generate a report providing the developer with information on potential reasons why mutants remain undetected. In this thesis, a Visual Studio Code extension was developed to incorporate the information generated by Descartes and Reneri. The purpose of the experiment is to assess whether the inclusion of this data can help master's students improve their test assertions. The results showed that this information, integrated into an editor, was well received by the students and guided them towards a refinement of their test suites.
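
To illustrate what an extreme mutant is (transplanted to Python for brevity, whereas Descartes operates on Java), the entire body of a method is replaced with a trivial one; a test suite that still passes on such a mutant never actually verified the method's effect.

```python
# Hedged illustration of extreme mutation: empty the whole method body instead
# of applying a small classical mutation such as flipping an operator.
import ast

SOURCE = """
def price_with_tax(price, rate):
    if price < 0:
        raise ValueError("negative price")
    return price * (1 + rate)
"""

class ExtremeMutator(ast.NodeTransformer):
    def visit_FunctionDef(self, node: ast.FunctionDef):
        node.body = [ast.Return(value=ast.Constant(value=None))]  # body -> return None
        return node

tree = ExtremeMutator().visit(ast.parse(SOURCE))
print(ast.unparse(ast.fix_missing_locations(tree)))  # the extreme mutant's source
```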

Although software testing is critical in software engineering, studies have shown a significant gap between students' knowledge of software testing and the industry's needs, hinting at the need to explore novel approaches to teaching software testing. Among them, classical mutation testing has already proven effective in helping students. We hypothesise that extreme mutation testing could be more effective by introducing more obvious mutants to kill. To study this question, we organised an experiment with two undergraduate classes comparing the usage of two tools, one applying classical mutation testing and the other applying extreme mutation testing. The results contradicted our hypothesis: students with access to the classical mutation testing tool obtained a better mutation score, while the others seem to have mostly covered more code. Finally, we have published the students' anonymised test suites in adherence to open-science best practices, and we have developed guidance based on previous evaluations and our own results.

Many applications are developed for a wide variety of purposes and can provide quality output. Nevertheless, crashes still happen. Many techniques, such as unit testing, peer reviewing, and crash reproduction, are being researched to improve quality by reducing crashes. This thesis contributes to the fast-evolving field of research on crash reproduction tools. These tools seek better reproduction with minimal information as input while delivering correct outputs in various scenarios. Different approaches have previously been tested to gather input-output data, also called benchmarks, but they often take time and manual effort to be usable. The research documented in this thesis endeavours to synthesize crashes using mutation testing to serve as input for crash reproduction tools.