VaryMinions: Leveraging RNNs to Identify Variants in Variability-intensive Systems' Logs

Sophie Fortz, Paul Temple, Xavier Devroey, Patrick Heymans, Gilles Perrouin

March 2024

Figure: Result of Friedman’s statistical test with Nemenyi’s post-hoc analysis over all datasets and parameterisations.

Abstract

From business processes to course management, variability-intensive software systems (VIS) are now ubiquitous. One can configure these systems’ behaviour by activating options, e.g., to derive variants handling building permits across municipalities or implementing different functionalities (quizzes, forums) for a given course. These customisation facilities allow VIS to support distinct customer requirements while reusing common parts. Customisation thus yields economies of both scope and scale. Behavioural differences amongst variants manifest themselves in event logs. To re-engineer this kind of system, one must know which variant(s) produced which behaviour. Since variant information is rarely present in logs, this paper supports this task by employing machine learning techniques to classify behaviours (event sequences) among variants. Specifically, we train Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) recurrent neural networks to relate event sequences to the variants they belong to, on six different datasets drawn from the configurable process and VIS domains. After evaluating 20 different LSTM/GRU architectures, our results demonstrate that it is possible to learn the trace-to-variant mapping with high accuracy (at least 80% and up to 99%) and at scale, i.e., identifying 50 variants using 5000+ traces for each variant.
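As an illustration of the trace-to-variant mapping described in the abstract, the sketch below shows one way event traces could be turned into inputs and targets for a recurrent classifier. It is not the authors' pipeline: the helper names (`build_vocab`, `encode_traces`, `multi_hot`), the padding scheme, the multi-hot label encoding, and the toy log are all assumptions made for illustration.

```python
from typing import Dict, List, Sequence

PAD = 0  # assumption: index 0 is reserved for padding short traces


def build_vocab(traces: Sequence[Sequence[str]]) -> Dict[str, int]:
    """Map each distinct event name to a positive integer id."""
    vocab: Dict[str, int] = {}
    for trace in traces:
        for event in trace:
            if event not in vocab:
                vocab[event] = len(vocab) + 1  # ids start at 1; 0 is PAD
    return vocab


def encode_traces(
    traces: Sequence[Sequence[str]], vocab: Dict[str, int], max_len: int
) -> List[List[int]]:
    """Turn each trace into a fixed-length list of event ids."""
    encoded = []
    for trace in traces:
        ids = [vocab[event] for event in trace][:max_len]  # truncate long traces
        encoded.append(ids + [PAD] * (max_len - len(ids)))  # right-pad short ones
    return encoded


def multi_hot(
    labels: Sequence[Sequence[str]], variants: Sequence[str]
) -> List[List[int]]:
    """Encode the set of variants that can produce each trace as a 0/1 vector."""
    index = {variant: i for i, variant in enumerate(variants)}
    rows = []
    for variant_set in labels:
        row = [0] * len(variants)
        for variant in variant_set:
            row[index[variant]] = 1
        rows.append(row)
    return rows


# Hypothetical toy log: two traces and the variants assumed to produce them.
traces = [["submit", "review", "approve"], ["submit", "reject"]]
labels = [["city_A", "city_B"], ["city_B"]]

vocab = build_vocab(traces)
X = encode_traces(traces, vocab, max_len=4)
y = multi_hot(labels, ["city_A", "city_B"])
```

A multi-hot target is used here because the abstract speaks of "variant(s)", suggesting that one trace may belong to several variants. Sequences encoded this way could then be fed to an embedding layer followed by an LSTM or GRU layer in a deep learning framework.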

Publication

Empirical Software Engineering

Xavier Devroey

Assistant Professor of Software Testing