    A Study in function optimization with the breeder genetic algorithm

    Optimization is concerned with finding the global optima (hence the name) of problems that can be cast as a function of several variables, together with constraints on them. Among search methods, Evolutionary Algorithms have been shown to be adaptable and general tools that have often outperformed traditional ad hoc methods. The Breeder Genetic Algorithm (BGA) combines a direct representation with conceptual simplicity. This work contains a general description of the algorithm and a detailed study on a collection of function optimization tasks. The results show that the BGA is a powerful and reliable search algorithm. The main discussion concerns the choice of genetic operators and their parameters, among which the family of Extended Intermediate Recombination (EIR) is shown to stand out. In addition, a simple method to dynamically adjust the operator is outlined and found to greatly improve on the already excellent overall performance of the algorithm.
    Postprint (published version)
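
The abstract above names Extended Intermediate Recombination but does not define it. As a minimal sketch, assuming the commonly cited textbook form of the operator (each offspring gene is `x + alpha * (y - x)` with `alpha` drawn uniformly from `[-delta, 1 + delta]`), it might look as follows. The function name `extended_intermediate_recombination`, the default `delta=0.25`, and the per-gene sampling of `alpha` are illustrative assumptions, not details taken from the paper.

```python
import random


def extended_intermediate_recombination(parent_x, parent_y, delta=0.25, rng=random):
    """Recombine two real-valued parents gene by gene.

    Assumed textbook form of EIR (not confirmed by the abstract):
    child_i = x_i + alpha_i * (y_i - x_i), alpha_i ~ U[-delta, 1 + delta].
    With delta = 0 the child stays inside the segment between the parents;
    delta > 0 lets it fall slightly outside that segment.
    """
    if len(parent_x) != len(parent_y):
        raise ValueError("parents must have the same number of genes")
    child = []
    for x, y in zip(parent_x, parent_y):
        alpha = rng.uniform(-delta, 1.0 + delta)  # fresh alpha for each gene
        child.append(x + alpha * (y - x))
    return child


if __name__ == "__main__":
    rng = random.Random(0)  # seeded only to make the demo repeatable
    print(extended_intermediate_recombination([0.0, 1.0, 2.0], [1.0, 3.0, 2.0], rng=rng))
```

Passing a seeded `random.Random` instance as `rng` keeps runs repeatable; where the parents agree on a gene, the child inherits that value unchanged.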

    Exploiting the accumulated evidence for gene selection in microarray gene expression data

    Machine Learning methods have recently made significant efforts to solve multidisciplinary problems in the field of cancer classification using microarray gene expression data. Feature subset selection methods can play an important role in the modeling process, since these tasks are characterized by a large number of features and few observations, making the modeling a non-trivial undertaking. In this particular scenario, it is extremely important to select genes by taking into account their possible interactions with other gene subsets. This paper shows that, by accumulating the evidence in favour of (or against) each gene along the search process, the obtained gene subsets may constitute better solutions, in terms of predictive accuracy, gene subset size, or both. The proposed technique is extremely simple and applicable at a negligible overhead in cost.
    Postprint (published version)

    Instance and feature weighted k-nearest-neighbors algorithm

    We present a novel method that aims at providing a more stable selection of feature subsets when variations in the training process occur. This is accomplished by using an instance-weighting process (assigning different importances to instances) as a preprocessing step to a feature weighting method that is independent of the learner, and then making good use of both sets of computed weights in a standard Nearest-Neighbours classifier. We report extensive experimentation on well-known benchmark datasets as well as some challenging microarray gene expression problems. Our results show increases in stability for most subset sizes and most problems, without compromising prediction accuracy.
    Peer Reviewed. Postprint (published version)
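
The abstract above does not say how the two sets of weights enter the Nearest-Neighbours classifier. One plausible reading, sketched here purely as an assumption, is that feature weights scale each squared coordinate difference in the distance, while instance weights scale each neighbour's vote. The function `weighted_knn_predict` and its parameters are hypothetical names for illustration.

```python
import math
from collections import defaultdict


def weighted_knn_predict(query, X, y, feature_w, instance_w, k=3):
    """Classify `query` with a k-NN that uses both weight sets.

    Assumptions (not taken from the paper):
    - feature_w[j] multiplies the squared difference on feature j;
    - instance_w[i] is the voting strength of training instance i.
    """
    scored = []
    for xi, label, w in zip(X, y, instance_w):
        dist = math.sqrt(sum(fw * (a - b) ** 2
                             for fw, a, b in zip(feature_w, query, xi)))
        scored.append((dist, label, w))
    scored.sort(key=lambda item: item[0])  # nearest instances first

    votes = defaultdict(float)
    for _, label, w in scored[:k]:
        votes[label] += w  # instance weight acts as vote strength
    return max(votes, key=votes.get)


if __name__ == "__main__":
    X = [[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 6.0]]
    y = ["a", "a", "b", "b"]
    print(weighted_knn_predict([0.2, 0.1], X, y, [1.0, 1.0], [1.0, 1.0, 1.0, 1.0]))
```

Setting a feature weight to zero removes that feature from the distance, which is how a feature weighting scheme can double as feature subset selection in this sketch.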

    Heterogeneous Kohonen networks

    A large number of practical problems involve elements that are described by a mixture of qualitative and quantitative information, and whose description is probably incomplete. The self-organizing map is an effective tool for the visualization of high-dimensional continuous data. In this work, we extend the network and training algorithm to cope with heterogeneous information, as well as missing values. The classification performance on a collection of benchmark data sets is compared in different configurations. Various visualization methods are suggested to help users interpret post-training results.
    Peer Reviewed. Postprint (author's final draft)

    Similarity networks for classification: a case study in the Horse Colic problem

    This paper develops a two-layer neural network in which the neuron model computes a user-defined similarity function between inputs and weights. The neuron transfer function is formed by composing an adapted logistic function with the mean of the partial input-weight similarities. The resulting neuron model is capable of dealing directly with variables of potentially different nature (continuous, fuzzy, ordinal, categorical). There is also provision for missing values. The network is trained using a two-stage procedure very similar to that used to train a radial basis function (RBF) neural network. The network is compared to two types of RBF networks on a non-trivial dataset: the Horse Colic problem, taken as a case study and analyzed in detail.
    Postprint (published version)
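
The neuron described above can be sketched roughly as follows. The abstract gives neither the partial similarity measures nor the exact form of the adapted logistic function, so everything specific here is an assumption: a range-normalised distance for continuous variables, exact match for categorical ones, missing values (encoded as `None`) simply left out of the mean, and a logistic curve centred at 0.5 with a hypothetical `steepness` parameter.

```python
import math


def partial_similarity(x, w, kind, value_range=1.0):
    """Similarity in [0, 1] between one input and one weight (assumed measures)."""
    if kind == "continuous":
        return max(0.0, 1.0 - abs(x - w) / value_range)
    if kind == "categorical":
        return 1.0 if x == w else 0.0
    raise ValueError(f"unsupported variable kind: {kind}")


def similarity_neuron(inputs, weights, kinds, ranges, steepness=10.0):
    """Logistic function applied to the mean of the partial similarities.

    Missing inputs (None) are skipped, so the mean is taken over the
    observed variables only. The logistic form is an illustrative choice.
    """
    sims = [partial_similarity(x, w, kind, r)
            for x, w, kind, r in zip(inputs, weights, kinds, ranges)
            if x is not None]
    if not sims:
        raise ValueError("all inputs are missing")
    mean_sim = sum(sims) / len(sims)
    return 1.0 / (1.0 + math.exp(-steepness * (mean_sim - 0.5)))


if __name__ == "__main__":
    kinds = ["continuous", "categorical"]
    ranges = [10.0, None]  # range is unused for categorical variables
    print(similarity_neuron([3.0, "red"], [4.0, "red"], kinds, ranges))
    print(similarity_neuron([3.0, None], [4.0, "red"], kinds, ranges))
```

Because each variable contributes only through its own similarity measure, supporting fuzzy or ordinal variables would mean adding further branches to `partial_similarity` without changing the neuron itself.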