8,364 research outputs found
Web Usage Mining with Evolutionary Extraction of Temporal Fuzzy Association Rules
In Web usage mining, fuzzy association rules that have a temporal property can provide useful knowledge about when associations occur. However, there is a problem with traditional temporal fuzzy association rule mining algorithms. Some rules occur at the intersection of fuzzy sets' boundaries where there is less support (lower membership), so the rules are lost. A genetic algorithm (GA)-based solution is described that uses the flexible nature of the 2-tuple linguistic representation to discover rules that occur at the intersection of fuzzy set boundaries. The GA-based approach is enhanced from previous work by including a graph representation and an improved fitness function. A comparison of the GA-based approach with a traditional approach on real-world Web log data discovered rules that were lost with the traditional approach. The GA-based approach is recommended as complementary to existing algorithms, because it discovers extra rules. © 2013 Elsevier B.V. All rights reserved.
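The abstract above does not spell out how the 2-tuple linguistic representation is used. As a rough illustration only, the sketch below assumes uniformly spaced triangular fuzzy sets and treats the second element of the 2-tuple as a lateral displacement `alpha` in [-0.5, 0.5) that shifts a label's set by a fraction of the distance between adjacent centres. The function names, the triangular shape, and the example range are assumptions, not details taken from the paper.

```python
def triangular(x, left, centre, right):
    """Membership of x in a triangular fuzzy set with the given corners."""
    if x <= left or x >= right:
        return 0.0
    if x <= centre:
        return (x - left) / (centre - left)
    return (right - x) / (right - centre)


def two_tuple_membership(x, label_index, alpha, n_labels, lo, hi):
    """Membership of x in label `label_index` after a lateral shift `alpha`.

    Assumption: labels are evenly spaced over [lo, hi], and alpha moves the
    whole triangle by alpha * (distance between adjacent label centres).
    """
    if not -0.5 <= alpha < 0.5:
        raise ValueError("alpha must lie in [-0.5, 0.5)")
    step = (hi - lo) / (n_labels - 1)
    centre = lo + (label_index + alpha) * step
    return triangular(x, centre - step, centre, centre + step)


# A value sitting exactly between two labels has low membership in both ...
unshifted = two_tuple_membership(2.5, 1, 0.0, n_labels=3, lo=0.0, hi=10.0)
# ... but a displaced label can be centred on it.
shifted = two_tuple_membership(2.5, 1, -0.5, n_labels=3, lo=0.0, hi=10.0)
```

Under these assumptions, a value on the boundary between two fixed fuzzy sets gets only partial membership in each, while a displaced set can cover it fully, which is one way to read the abstract's claim about rules lost at fuzzy set boundaries.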
A Study in function optimization with the breeder genetic algorithm
Optimization is concerned with the finding of global optima
(hence the name) of problems that can be cast in the form of a
function of several variables and constraints thereof. Among the
searching methods, Evolutionary Algorithms have been shown to be
adaptable and general tools that have often outperformed traditional
ad hoc methods. The Breeder Genetic Algorithm (BGA)
combines a direct representation with a nice conceptual
simplicity. This work contains a general description of the algorithm
and a detailed study on a collection of function optimization
tasks. The results show that the BGA is a powerful and reliable
searching algorithm. The main discussion concerns the choice of
genetic operators and their parameters, among which the family of
Extended Intermediate Recombination (EIR) is shown to stand out. In
addition, a simple method to dynamically adjust the operator is
outlined and found to greatly improve on the already excellent overall
performance of the algorithm.
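The abstract names Extended Intermediate Recombination (EIR) without defining it. A minimal sketch follows, assuming the form usually given in the BGA literature: each child coordinate is `x_i + alpha_i * (y_i - x_i)` with `alpha_i` drawn uniformly from `[-d, 1 + d]`. The default `d = 0.25` and the function name are assumptions for illustration, and the study's own operator family and parameter settings may differ.

```python
import random


def extended_intermediate_recombination(x, y, d=0.25, rng=random):
    """Return one child of parents x and y (equal-length lists of floats).

    Each coordinate gets its own alpha ~ U[-d, 1 + d], so the child can fall
    slightly outside the segment between the parents when d > 0.
    """
    if len(x) != len(y):
        raise ValueError("parents must have the same length")
    return [xi + rng.uniform(-d, 1.0 + d) * (yi - xi) for xi, yi in zip(x, y)]


parent_a = [0.0, 10.0]
parent_b = [4.0, 2.0]
child = extended_intermediate_recombination(parent_a, parent_b)
```

With `d = 0` the child always lies between the parents in every coordinate. A positive `d` widens the sampled interval, which is the usual argument for the "extended" variant: it counteracts the tendency of intermediate recombination to shrink the population's spread.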
A new sequential covering strategy for inducing classification rules with ant colony algorithms
Ant colony optimization (ACO) algorithms have been successfully applied to discover a list of classification rules. In general, these algorithms follow a sequential covering strategy, where a single rule is discovered at each iteration of the algorithm in order to build a list of rules. The sequential covering strategy has the drawback of not coping with the problem of rule interaction, i.e., the outcome of a rule affects the rules that can be discovered subsequently since the search space is modified due to the removal of examples covered by previous rules. This paper proposes a new sequential covering strategy for ACO classification algorithms to mitigate the problem of rule interaction, where the order of the rules is implicitly encoded as pheromone values and the search is guided by the quality of a candidate list of rules. Our experiments using 18 publicly available data sets show that the predictive accuracy obtained by a new ACO classification algorithm implementing the proposed sequential covering strategy is statistically significantly higher than the predictive accuracy of state-of-the-art rule induction classification algorithms
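For orientation, the conventional sequential covering loop that the abstract describes as its starting point (not the new strategy the paper proposes) can be sketched as below. The ACO rule search is replaced here by a simple greedy search over single-condition rules, purely as a stand-in. The data layout (a list of dicts), the function names, and the scoring are all assumptions for illustration.

```python
from collections import Counter


def best_single_condition_rule(examples, label_key):
    """Greedy stand-in for the rule search (NOT an ant colony algorithm).

    Picks the (attribute, value) test whose covered examples are purest,
    preferring larger coverage when purity is equal.
    """
    best = None
    for attr in examples[0]:
        if attr == label_key:
            continue
        for value in {e[attr] for e in examples}:
            covered = [e for e in examples if e[attr] == value]
            label, hits = Counter(e[label_key] for e in covered).most_common(1)[0]
            score = (hits / len(covered), len(covered))
            if best is None or score > best[0]:
                best = (score, (attr, value, label))
    return best[1]


def sequential_covering(examples, label_key="label", max_uncovered=0):
    """Discover one rule per iteration and remove the examples it covers."""
    remaining = list(examples)
    rules = []
    while len(remaining) > max_uncovered:
        attr, value, label = best_single_condition_rule(remaining, label_key)
        rules.append((attr, value, label))
        # Removing covered examples changes the search space for later rules.
        remaining = [e for e in remaining if e[attr] != value]
    default = Counter(e[label_key] for e in examples).most_common(1)[0][0]
    return rules, default


def predict(rules, default, example):
    """Apply the ordered rule list; the first matching rule decides."""
    for attr, value, label in rules:
        if example.get(attr) == value:
            return label
    return default


data = [
    {"outlook": "sunny", "windy": "no", "label": "play"},
    {"outlook": "sunny", "windy": "yes", "label": "play"},
    {"outlook": "rain", "windy": "yes", "label": "stay"},
    {"outlook": "rain", "windy": "no", "label": "play"},
]
rules, default = sequential_covering(data)
```

The removal step inside the loop is where the rule interaction problem arises: each rule is chosen against whatever examples earlier rules left behind, so an early choice constrains every later one.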
A Multi-Gene Genetic Programming Application for Predicting Students Failure at School
Despite several efforts, accurately predicting the student failure rate (SFR)
at school remains a core problem faced by many in the educational sector. The
procedures for forecasting SFR are rigid and most often require data
scaling or conversion into binary form, as is the case with the logistic
model, which may lead to loss of information and effect size attenuation. Also,
the high number of factors, incomplete and unbalanced datasets, and black-box
issues, as in Artificial Neural Networks and Fuzzy logic systems, expose the
need for more efficient tools. Currently, the application of Genetic Programming
(GP) holds great promise and has produced tremendously positive results in
different sectors. In this regard, this study developed GPSFARPS, a software
application to provide a robust solution to the prediction of SFR using an
evolutionary algorithm known as multi-gene genetic programming. The approach is
validated by feeding a testing data set to the evolved GP models. Results
obtained from GPSFARPS simulations show its unique ability to evolve a suitable
failure rate expression with fast convergence at 30 generations from a
maximum specified generation of 500. The multi-gene system was also able to
minimize the evolved model expression and accurately predict student failure
rate using a subset of the original expression.
Comment: 14 pages, 9 figures, Journal paper. arXiv admin note: text overlap
with arXiv:1403.0623 by other author
A hierarchical Mamdani-type fuzzy modelling approach with new training data selection and multi-objective optimisation mechanisms: A special application for the prediction of mechanical properties of alloy steels
In this paper, a systematic data-driven fuzzy modelling methodology is proposed, which allows Mamdani fuzzy models to be constructed considering both accuracy (precision) and transparency (interpretability) of fuzzy systems. The new methodology employs a fast hierarchical clustering algorithm to generate an initial fuzzy model efficiently; a training data selection mechanism is developed to identify appropriate and efficient data as learning samples; a high-performance Particle Swarm Optimisation (PSO) based multi-objective optimisation mechanism is developed to further improve the fuzzy model in terms of both the structure and the parameters; and a new tolerance analysis method is proposed to derive the confidence bands relating to the final elicited models. This proposed modelling approach is evaluated using two benchmark problems and is shown to outperform other modelling approaches. Furthermore, the proposed approach is successfully applied to complex high-dimensional modelling problems for manufacturing of alloy steels, using ‘real’ industrial data. These problems concern the prediction of the mechanical properties of alloy steels by correlating them with the heat treatment process conditions as well as the weight percentages of the chemical compositions
QCBA: Postoptimization of Quantitative Attributes in Classifiers based on Association Rules
The need to prediscretize numeric attributes before they can be used in
association rule learning is a source of inefficiencies in the resulting
classifier. This paper describes several new rule tuning steps aiming to
recover information lost in the discretization of numeric (quantitative)
attributes, and a new rule pruning strategy, which further reduces the size of
the classification models. We demonstrate the effectiveness of the proposed
methods on postoptimization of models generated by three state-of-the-art
association rule classification algorithms: Classification based on
Associations (Liu, 1998), Interpretable Decision Sets (Lakkaraju et al, 2016),
and Scalable Bayesian Rule Lists (Yang, 2017). Benchmarks on 22 datasets from
the UCI repository show that the postoptimized models are consistently smaller
-- typically by about 50% -- and have better classification performance on most
datasets
A niching memetic algorithm for simultaneous clustering and feature selection
Clustering is inherently a difficult task, and is made even more difficult when the selection of relevant features is also an issue. In this paper we propose an approach for simultaneous clustering and feature selection using a niching memetic algorithm. Our approach (which we call NMA_CFS) makes feature selection an integral part of the global clustering search procedure and attempts to overcome the problem of identifying less promising locally optimal solutions in both clustering and feature selection, without making any a priori assumption about the number of clusters. Within the NMA_CFS procedure, a variable composite representation is devised to encode both feature selection and cluster centers with different numbers of clusters. Further, local search operations are introduced to refine feature selection and cluster centers encoded in the chromosomes. Finally, a niching method is integrated to preserve the population diversity and prevent premature convergence. In an experimental evaluation we demonstrate the effectiveness of the proposed approach and compare it with other related approaches, using both synthetic and real data