
Enhancing the scalability of a genetic algorithm to discover quantitative association rules in large-scale datasets

Abstract

Association rule mining is a well-known methodology for discovering significant and apparently hidden relations among attributes in a subspace of the instances of a dataset. Genetic algorithms have been extensively used to find interesting association rules. However, the rule-matching task in such techniques is usually costly in both computation time and memory. Given the high volume of data generated nowadays, efficient computational techniques have become essential. Hence, this paper aims to improve the scalability of quantitative association rule mining techniques based on genetic algorithms so that they can handle large-scale datasets without loss of quality in the results obtained. For this purpose, a new representation of the individuals, new genetic operators and a windowing-based learning scheme are proposed. Specifically, the proposed techniques are integrated into the multi-objective evolutionary algorithm QARGA-M to assess their performance. Both the standard and the enhanced versions of QARGA-M have been tested on several datasets with different numbers of attributes and instances. Furthermore, the proposed methodologies have been integrated into other existing techniques based on genetic algorithms for discovering quantitative association rules. The comparative analysis shows that applying the proposed techniques significantly reduces the computational cost of QARGA-M and of the other genetic algorithms without losing quality in the results.

Funding: Ministerio de Ciencia y Tecnología (TIN2011-28956-C02-02); Junta de Andalucía (TIC-7528, P12-TIC-1728); Universidad Pablo de Olavide (APPB81309).
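To make the cost of the rule-matching task concrete, the following minimal sketch evaluates one quantitative association rule, whose antecedent and consequent are numeric intervals, by scanning every instance to compute the standard support and confidence measures. It is not taken from QARGA-M: the names `covers` and `evaluate`, the dictionary-based rule encoding and the toy data are assumptions made for illustration only.

```python
from typing import Dict, List, Tuple

# Hypothetical encoding: each side of a rule maps an attribute name
# to a closed interval [low, high].
Interval = Tuple[float, float]
Rule = Tuple[Dict[str, Interval], Dict[str, Interval]]  # (antecedent, consequent)


def covers(intervals: Dict[str, Interval], row: Dict[str, float]) -> bool:
    """Return True if every attribute value of the row lies inside its interval."""
    return all(low <= row[attr] <= high for attr, (low, high) in intervals.items())


def evaluate(rule: Rule, rows: List[Dict[str, float]]) -> Tuple[float, float]:
    """Compute (support, confidence) of a rule with one full pass over the data."""
    antecedent, consequent = rule
    n_antecedent = 0  # instances matching the antecedent
    n_both = 0        # instances matching antecedent and consequent
    for row in rows:
        if covers(antecedent, row):
            n_antecedent += 1
            if covers(consequent, row):
                n_both += 1
    support = n_both / len(rows) if rows else 0.0
    confidence = n_both / n_antecedent if n_antecedent else 0.0
    return support, confidence


# Toy data, invented for this example.
rows = [
    {"temp": 21.0, "ozone": 40.0},
    {"temp": 30.0, "ozone": 85.0},
    {"temp": 33.0, "ozone": 90.0},
    {"temp": 31.0, "ozone": 50.0},
]
# Rule: temp in [29, 35] => ozone in [80, 100]
rule: Rule = ({"temp": (29.0, 35.0)}, {"ozone": (80.0, 100.0)})
support, confidence = evaluate(rule, rows)
```

Because each evaluation is a full pass over the instances, and a genetic algorithm evaluates many candidate rules in every generation, the total cost grows with the product of population size, number of generations and dataset size. This is the kind of overhead that motivates more scalable matching schemes.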
