2,451 research outputs found

    Feature selection for microarray gene expression data using simulated annealing guided by the multivariate joint entropy

    In this work a new way to calculate the multivariate joint entropy is presented. This measure is the basis for a fast information-theoretic evaluation of gene relevance in a microarray gene expression data context. Its low complexity comes from reusing previous computations to calculate the relevance of the current feature. The mu-TAFS algorithm (named as such to differentiate it from previous TAFS algorithms) implements a simulated annealing technique specially designed for feature subset selection. The algorithm is applied to the maximization of gene subset relevance in several public-domain microarray data sets. The experimental results show notably high classification performance and small subsets formed by biologically meaningful genes. Postprint (published version)
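As a rough illustration of the idea in this abstract, the following minimal sketch runs simulated annealing over feature subsets and scores each subset with an entropy-based relevance. It is not the mu-TAFS algorithm: the relevance used here (the empirical mutual information between the selected discrete features and the class label), the single-feature flip move, the geometric cooling schedule, and all function and parameter names are assumptions made for illustration. It also recomputes every entropy from scratch, whereas the abstract describes reusing previous computations.

```python
import math
import random
from collections import Counter


def entropy(items):
    """Empirical entropy in bits of a list of hashable observations."""
    n = len(items)
    return -sum((k / n) * math.log2(k / n) for k in Counter(items).values())


def relevance(rows, labels, subset):
    """Assumed relevance: I(X_S; Y) = H(X_S) + H(Y) - H(X_S, Y)."""
    if not subset:
        return 0.0
    cols = sorted(subset)
    xs = [tuple(row[c] for c in cols) for row in rows]
    xy = list(zip(xs, labels))
    return entropy(xs) + entropy(labels) - entropy(xy)


def anneal(rows, labels, n_features, steps=500, t0=1.0, cooling=0.99, seed=0):
    """Simulated annealing over feature subsets (hypothetical schedule)."""
    rng = random.Random(seed)
    current, cur_score = set(), 0.0
    best, best_score = set(), 0.0
    t = t0
    for _ in range(steps):
        # Neighbour move: flip the membership of one randomly chosen feature.
        cand = current ^ {rng.randrange(n_features)}
        score = relevance(rows, labels, cand)
        delta = score - cur_score
        # Always accept improvements; accept worse moves with prob. exp(delta / t).
        if delta >= 0 or rng.random() < math.exp(delta / t):
            current, cur_score = cand, score
            if cur_score > best_score:
                best, best_score = set(current), cur_score
        t *= cooling  # geometric cooling
    return best, best_score
```

On small samples this relevance never decreases when features are added, so a practical version would need a subset-size penalty or a held-out classifier check; both are left out of the sketch.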

    Media planning by optimizing contact frequencies

    In this paper we study a model to estimate the probability that a target group of an advertising campaign is reached by a commercial message a given number of times. This contact frequency distribution is known to be computationally difficult to calculate because of dependence between the viewing probabilities of advertisements. Our model calculates good estimates of contact frequencies in a very short time based on data that is often available. A media planning model that optimizes effective reach as a function of contact frequencies demonstrates the usefulness of the model. Several local search procedures such as taboo search, simulated annealing and genetic algorithms are applied to find a good media schedule. The results show that local search methods are flexible, fast and accurate in finding media schedules for media planning models based on contact frequencies. The contact frequency model is a potentially useful new tool for media planners.
    Keywords: optimization; contact frequency; effective reach; media planning
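To make the notions of contact frequency distribution and effective reach concrete, here is a minimal sketch under a strong simplifying assumption: each advertisement is seen independently with a known probability. The abstract states that real viewing probabilities are dependent, so this is only a baseline for intuition, not the paper's estimation model. The function names and the `min_contacts` threshold are hypothetical.

```python
def contact_frequency(probs):
    """Distribution of the number of contacts, assuming independent viewings.

    probs[i] is the assumed probability that a target-group member sees
    advertisement i. Returns dist, where dist[k] = P(exactly k contacts).
    """
    dist = [1.0]  # before any advertisement: zero contacts with certainty
    for p in probs:
        nxt = [0.0] * (len(dist) + 1)
        for k, mass in enumerate(dist):
            nxt[k] += mass * (1 - p)   # advertisement not seen
            nxt[k + 1] += mass * p     # advertisement seen
        dist = nxt
    return dist


def effective_reach(probs, min_contacts):
    """P(at least min_contacts contacts) under the same independence assumption."""
    return sum(contact_frequency(probs)[min_contacts:])
```

For example, two advertisements each seen with probability 0.5 give the distribution `[0.25, 0.5, 0.25]`, so the probability of at least one contact is 0.75. A local search over schedules could use `effective_reach` as its objective.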

    Optimizing Product Line Designs: Efficient Methods and Comparisons

    We compare a broad range of optimal product line design methods. The comparisons take advantage of recent advances that make it possible to identify the optimal solution to problems that are too large for complete enumeration. Several of the methods perform surprisingly well, including Simulated Annealing, Product-Swapping and Genetic Algorithms. The Product-Swapping heuristic is remarkable for its simplicity. The performance of this heuristic suggests that the optimal product line design problem may be far easier to solve in practice than indicated by complexity theory.

    Application of Global Optimization Methods for Feature Selection and Machine Learning

    The feature selection process constitutes a commonly encountered problem of global combinatorial optimization. The process reduces the number of features by removing irrelevant and redundant data. This paper proposes a novel immune clonal genetic algorithm, based on the immune clonal algorithm, designed to solve the feature selection problem. The proposed algorithm has stronger exploration and exploitation abilities due to the clonal selection theory, and each antibody in the search space specifies a subset of the possible features. Experimental results show that the proposed algorithm simplifies the feature selection process effectively and obtains higher classification accuracy than other feature selection algorithms.

    Non-monotone Submodular Maximization with Nearly Optimal Adaptivity and Query Complexity

    Submodular maximization is a general optimization problem with a wide range of applications in machine learning (e.g., active learning, clustering, and feature selection). In large-scale optimization, the parallel running time of an algorithm is governed by its adaptivity, which measures the number of sequential rounds needed if the algorithm can execute polynomially many independent oracle queries in parallel. While low adaptivity is ideal, it is not sufficient for an algorithm to be efficient in practice: there are many applications of distributed submodular optimization where the number of function evaluations becomes prohibitively expensive. Motivated by these applications, we study the adaptivity and query complexity of submodular maximization. In this paper, we give the first constant-factor approximation algorithm for maximizing a non-monotone submodular function subject to a cardinality constraint $k$ that runs in $O(\log(n))$ adaptive rounds and makes $O(n \log(k))$ oracle queries in expectation. In our empirical study, we use three real-world applications to compare our algorithm with several benchmarks for non-monotone submodular maximization. The results demonstrate that our algorithm finds competitive solutions using significantly fewer rounds and queries.
    Comment: 12 pages, 8 figures
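For contrast with the round and query counts quoted in this abstract, the sketch below shows the classic sequential greedy heuristic under a cardinality constraint. It is not the paper's algorithm: greedy needs up to k sequential rounds and at most n·k marginal-gain queries, and its usual guarantee applies to monotone objectives. The oracle `f`, the counting of one query per marginal gain, and all names are assumptions for illustration.

```python
def greedy(ground, f, k):
    """Sequential greedy for max f(S) subject to |S| <= k.

    ground: set of candidate items; f: set function oracle (assumed submodular).
    Returns the chosen set and the number of marginal-gain queries made.
    """
    chosen = set()
    queries = 0
    for _ in range(k):  # each iteration is one adaptive (sequential) round
        base = f(chosen)
        best_item, best_gain = None, 0.0
        # Within a round the queries are independent and could run in parallel.
        for item in sorted(ground - chosen):
            gain = f(chosen | {item}) - base
            queries += 1
            if gain > best_gain:
                best_item, best_gain = item, gain
        if best_item is None:  # no item has positive marginal gain
            break
        chosen.add(best_item)
    return chosen, queries


# Hypothetical coverage objective: number of distinct elements covered.
SETS = {0: {1, 2, 3}, 1: {3, 4}, 2: {5}, 3: {1, 2}}


def coverage(selection):
    return len(set().union(*(SETS[i] for i in selection)))
```

Calling `greedy(set(SETS), coverage, 2)` on this toy instance uses two rounds and seven queries; the appeal of low-adaptivity methods is that the number of rounds grows with log n rather than with k.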

    Feature selection in high dimensional regression problems for genomic

    In the context of genomic selection in animal breeding, an important objective is to find explanatory markers for a phenotype under study. To deal with a large number of markers, we propose using combinatorial optimization to perform variable selection. Results show that our approach outperforms some classical and widely used methods on simulated and "close to real" datasets.