Discovering gene association networks by multi-objective evolutionary quantitative association rules
In the last decade, the interest in microarray technology has exponentially increased due to its
ability to monitor the expression of thousands of genes simultaneously. The reconstruction of gene
association networks from gene expression profiles is a relevant task and several statistical
techniques have been proposed to build them. The problem lies in the process to discover which
genes are more relevant and to identify the direct regulatory relationships among them. We
developed a multi-objective evolutionary algorithm for mining quantitative association rules to deal
with this problem. We applied our methodology, named GarNet, to a well-known microarray dataset of the
yeast cell cycle. The performance analysis of GarNet was organized in three steps, similarly to the
study performed by Gallo et al. GarNet outperformed the benchmark methods in most cases in terms
of quality metrics of the networks, such as accuracy and precision, which were measured using
the YeastNet database as the true network. Furthermore, the results were consistent with previous
biological knowledge.
Funding: Ministerio de Ciencia y Tecnología TIN2011-28956-C02-02; Junta de Andalucía P11-TIC-752
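As a rough illustration of how accuracy and precision can be measured against a reference network, the sketch below scores a set of inferred undirected edges against a set of reference edges. It is not GarNet's evaluation code: the function names, the toy gene names and the assumption that both networks are undirected and share one gene set are all hypothetical.

```python
from itertools import combinations


def edge_set(pairs):
    """Normalize undirected edges so (a, b) and (b, a) compare equal."""
    return {frozenset(p) for p in pairs}


def network_scores(genes, inferred, reference):
    """Return (precision, accuracy) of an inferred undirected network.

    Assumption: every edge in both networks joins two genes from `genes`.
    """
    inferred, reference = edge_set(inferred), edge_set(reference)
    all_pairs = {frozenset(p) for p in combinations(genes, 2)}
    tp = len(inferred & reference)        # edges found in both networks
    fp = len(inferred - reference)        # inferred but not in the reference
    fn = len(reference - inferred)        # in the reference but missed
    tn = len(all_pairs) - tp - fp - fn    # pairs unlinked in both networks
    precision = tp / (tp + fp) if tp + fp else 0.0
    accuracy = (tp + tn) / len(all_pairs) if all_pairs else 0.0
    return precision, accuracy


# Toy data with hypothetical gene names.
genes = ["g1", "g2", "g3", "g4"]
inferred = [("g1", "g2"), ("g2", "g3"), ("g3", "g4")]
reference = [("g2", "g1"), ("g3", "g4"), ("g1", "g4")]
precision, accuracy = network_scores(genes, inferred, reference)
```

In this toy case two of the three inferred edges appear in the reference, and four of the six possible gene pairs are classified correctly.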
Enhancing the scalability of a genetic algorithm to discover quantitative association rules in large-scale datasets
Association rule mining is a well-known methodology to discover significant and apparently hidden relations among
attributes in a subspace of instances from datasets. Genetic algorithms have been extensively used to find interesting association
rules. However, the rule-matching task of such techniques usually demands substantial computation and memory. The use
of efficient computational techniques has become a task of the utmost importance due to the high volume of generated data
nowadays. Hence, this paper aims at improving the scalability of quantitative association rule mining techniques based on
genetic algorithms to handle large-scale datasets without quality loss in the results obtained. For this purpose, a new
representation of the individuals, new genetic operators and a windowing-based learning scheme are proposed to accomplish
this challenging task. Specifically, the proposed techniques are integrated into the multi-objective evolutionary
algorithm named QARGA-M to assess their performance. Both the standard version and the enhanced one of QARGA-M have
been tested on several datasets with different numbers of attributes and instances. Furthermore, the proposed methodologies
have been integrated into other existing techniques based on genetic algorithms to discover quantitative association rules. The
comparative analysis performed shows significant improvements of QARGA-M and other existing genetic algorithms in terms of
computational costs without losing quality in the results when the proposed techniques are applied.
Funding: Ministerio de Ciencia y Tecnología TIN2011-28956-C02-02; Junta de Andalucía TIC-7528; Junta de Andalucía P12-TIC-1728; Universidad Pablo de Olavide APPB81309
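To make the rule-matching cost concrete, the following minimal sketch evaluates one quantitative association rule, whose antecedent and consequent are numeric intervals, by scanning every instance. It is an illustrative baseline only, not the QARGA-M representation or its windowing scheme; the function names, attribute names and data are hypothetical.

```python
def covers(row, intervals):
    """True if every attribute of `row` named in `intervals` lies in [lo, hi]."""
    return all(lo <= row[attr] <= hi for attr, (lo, hi) in intervals.items())


def support_confidence(rows, antecedent, consequent):
    """Support and confidence of the rule antecedent -> consequent.

    Each call scans all rows once, which is why rule matching dominates
    the run time when many candidate rules are scored on a large dataset.
    """
    n_ant = sum(covers(r, antecedent) for r in rows)
    n_both = sum(covers(r, antecedent) and covers(r, consequent) for r in rows)
    support = n_both / len(rows) if rows else 0.0
    confidence = n_both / n_ant if n_ant else 0.0
    return support, confidence


# Toy dataset with hypothetical attributes x and y.
rows = [
    {"x": 1.0, "y": 10.0},
    {"x": 2.0, "y": 12.0},
    {"x": 2.5, "y": 30.0},
    {"x": 8.0, "y": 11.0},
]
rule_if = {"x": (0.0, 3.0)}      # antecedent: x in [0, 3]
rule_then = {"y": (9.0, 15.0)}   # consequent: y in [9, 15]
support, confidence = support_confidence(rows, rule_if, rule_then)
```

Here three instances satisfy the antecedent and two of them also satisfy the consequent, so the rule holds for half of the dataset and for two thirds of the instances it applies to.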
Mineração de regras de associação diversas em dados meteorológicos temporais de múltiplos pontos geográficos via algoritmo genético / Mining of diverse association rules in temporal meteorological data from multiple geographic points via genetic algorithm
Knowledge of the associations among climatic factors that influence the climate of a given region is important for climate analysis and for short- to long-term planning. However, the traditional methods in the literature for discovering associations have several shortcomings: high computational cost, which prevents their application even to relatively small datasets; the need to tune several critical parameters, such as the support and confidence thresholds of the rules; and a tendency to produce trivial rules. In view of these limitations, this article uses the theory of genetic algorithms and their extensions (Tabu memory and a niching technique, namely Clearing) to develop and test methodologies for mining association rules from quantitative temporal data. The methods were applied to temporal meteorological data from multiple Brazilian cities to mine the meteorological implications of a set of cities for the subsequent meteorological situation in a specific city. The experiments show that one of the developed methods, which combines Tabu memory and Clearing, is quite promising, since it mines a large number of highly diverse rules and does not suffer from convergence problems.
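The Clearing niching technique named above can be sketched as follows. This is a generic version of the procedure, assuming fitness is positive and maximized, and it is not the implementation used in the paper; the function name, the `radius` and `capacity` parameters and the one-dimensional toy population are hypothetical.

```python
def clearing(population, fitness, distance, radius, capacity=1):
    """Return fitness values after clearing, in the same order as `population`.

    Individuals are visited from best to worst. Each surviving individual
    founds a niche, keeps its fitness, and lets at most `capacity` members
    within `radius` keep theirs; the other members have their fitness reset
    to zero.
    """
    order = sorted(range(len(population)), key=lambda i: fitness[i], reverse=True)
    cleared = list(fitness)
    for pos, i in enumerate(order):
        if cleared[i] <= 0:
            continue  # already cleared by a better individual
        winners = 1   # the niche founder itself
        for j in order[pos + 1:]:
            if cleared[j] > 0 and distance(population[i], population[j]) < radius:
                if winners < capacity:
                    winners += 1
                else:
                    cleared[j] = 0.0
    return cleared


# Toy one-dimensional population with two clusters.
population = [0.0, 0.1, 0.2, 5.0, 5.1]
fitness = [0.9, 0.8, 0.7, 0.6, 0.5]
cleared = clearing(population, fitness, lambda a, b: abs(a - b), radius=1.0)
```

With a capacity of one, only the best individual of each cluster keeps its fitness, which is how clearing discourages the population from collapsing onto a single rule.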
Gene Network Biological Validity Based on Gene-Gene Interaction Relevance
In recent years, gene networks have become one of the most useful tools for modeling biological processes. Many gene network inference algorithms have been developed as techniques for extracting knowledge from gene expression data. Ensuring the reliability of the inferred gene relationships is a crucial task in any study in order to show that the algorithms used are precise. Usually, this validation process can be carried out using prior biological knowledge. The metabolic pathways stored in KEGG are one of the most widely used knowledge sources for analyzing relationships between genes. This paper introduces a new methodology, GeneNetVal, to assess the biological validity of gene networks based on the relevance of the gene-gene interactions stored in KEGG metabolic pathways. To this end, a complete conversion of KEGG pathways into gene association networks and a new matching distance based on gene-gene interaction relevance are proposed. The performance of GeneNetVal was established with three different experiments. Firstly, our proposal is tested in a comparative ROC analysis. Secondly, a randomness study is presented to show the behavior of GeneNetVal when the noise in the input network is increased. Finally, the ability of GeneNetVal to detect the biological functionality of the network is shown.