A comparative study of achievements and leaderboards in the context of learning software testing

Software testing has struggled to become a widespread practice in the software development industry. The same holds in academia, where students often devote little energy to testing, owing to a lack of motivation or time. To address these issues, this study explores gamification as a lever for engagement in software testing. Following a state-of-the-art review of testing techniques and of gamification principles applied to software development, an existing IntelliJ plugin was reused and enhanced. Initially centered on a system of achievements, the plugin was extended with a leaderboard, enabling a comparison of the two approaches. Achievements are badges, represented here by trophies, awarded when the user reaches certain levels of progress in the plugin. A leaderboard, by contrast, is a table that ranks participants according to the points they have earned. The aim is to determine which mode better fosters student involvement while also yielding better test-writing performance.
An experiment was carried out following a cross-over protocol with two groups of computer science students. The aim was to measure the impact of each game mode on user involvement, as well as on performance in terms of code coverage and mutation score. Data were collected via a survey distributed to participants and via metrics recorded automatically by the plugin. The results show a preference for the achievements mode, which was perceived as more engaging and more sustainable over the long term. The code coverage and mutation score measures, however, did not clearly distinguish between the two modes. This is probably due to the low proportion of projects that were valid for analysis: of the 45 users present for the experiment, only eight submitted a compilable project for each session. Nevertheless, the effect size analysis suggests a slight trend in favor of achievements, although it does not reach statistical significance.