Code smells are seen as a major source of technical debt and, as such, should
be detected and removed. However, researchers argue that the subjectivity of
the code smell detection process is a major hindrance to mitigating the problem
of smell-infected code. We proposed the crowdsmelling approach, based on
supervised machine learning techniques, where the wisdom of the crowd (of
software developers) is used to collectively calibrate code smell detection
algorithms, thereby lessening the subjectivity issue. This paper presents the
results of a validation experiment for the crowdsmelling approach. In the
context of three consecutive years of a Software Engineering course, a total
"crowd" of around a hundred teams, with an average of three members each,
classified the presence of three code smells (Long Method, God Class, and Feature
Envy) in Java source code. These classifications were the basis of the oracles
used for training six machine learning algorithms. Over one hundred models were
generated and evaluated to determine which machine learning algorithms had the
best performance in detecting each of the aforementioned code smells. Good
performance was obtained for God Class detection (ROC=0.896 for Naive Bayes)
and Long Method detection (ROC=0.870 for AdaBoostM1), but performance was much
lower for Feature Envy (ROC=0.570 for Random Forest). The results suggest that
crowdsmelling is a feasible approach for the detection of code smells, but
further validation experiments are required to cover more code smells and to
increase external validity.
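
As a rough illustration of the pipeline summarized above, the following sketch shows how crowd classifications might be aggregated into an oracle and how a detector's scores could then be evaluated against it. It rests on several assumptions that are not stated in the text: the oracle is built by simple majority vote, the reported ROC values are taken to mean area under the ROC curve, and the method names, votes, and scores are invented for the example. The helper names `majority_vote_oracle` and `roc_auc` are hypothetical.

```python
def majority_vote_oracle(votes_by_item):
    """Collapse each item's crowd votes (True = smell present) into one label.

    Assumption: simple majority vote; ties are resolved as "no smell".
    """
    oracle = {}
    for item, votes in votes_by_item.items():
        positives = sum(1 for v in votes if v)
        oracle[item] = positives * 2 > len(votes)
    return oracle


def roc_auc(labels, scores):
    """Area under the ROC curve, computed with the rank formulation.

    This is the fraction of (positive, negative) pairs in which the positive
    item receives the higher score; tied pairs count as half.
    """
    pos = [s for label, s in zip(labels, scores) if label]
    neg = [s for label, s in zip(labels, scores) if not label]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Invented data: three team votes per method on "is this a Long Method?"
votes = {
    "Order.process": [True, True, False],
    "Cart.total": [False, False, True],
    "Report.render": [True, True, True],
    "User.getName": [False, False, False],
}
# Invented scores from some trained classifier (higher = more likely a smell)
scores = {
    "Order.process": 0.8,
    "Cart.total": 0.6,
    "Report.render": 0.5,
    "User.getName": 0.1,
}

oracle = majority_vote_oracle(votes)
items = sorted(oracle)
auc = roc_auc([oracle[i] for i in items], [scores[i] for i in items])
print(auc)  # 0.75
```

In this toy case, three of the four (smelly, clean) pairs are ranked correctly, giving 0.75. A value near 0.5, as in the Feature Envy result, would indicate ranking little better than chance.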