Document Type

Article

Publication Date

2020

Abstract

Code smells are symptoms of poor software design and implementation choices. Previous empirical studies have underlined their negative effect on software comprehension, fault-proneness and maintainability. A number of approaches have been proposed to identify code smells in source code, and recent studies have shown the potential of machine learning models in this context. However, previous approaches did not exploit the lexical and syntactical features of the source code; they instead modelled the source code using software metrics only. This paper proposes an approach for detecting the God class smell that uses both the textual features and the metrics of the source code to train three deep learning networks: (i) long short-term memory, (ii) gated recurrent unit and (iii) convolutional neural network. We propose using deep learning networks because they are reported to outperform traditional machine learning models in several domains, including software engineering. To assess the proposed approach, a dataset for the God class smell was built from source code acquired from the “Qualitas Corpus”. Experimental results demonstrated that the three deep learning networks outperformed three traditional machine learning models: Naïve Bayes, random forests and decision trees. Additionally, of the three deep learning networks, the gated recurrent unit model performed best in this context. Furthermore, combining the source code metrics with the textual features enhanced the accuracy of detecting the God class smell.
