2 research outputs found

    Stochastic Local Search Heuristics for Efficient Feature Selection: An Experimental Study

    Feature engineering, including feature selection, plays a key role in data science, knowledge discovery, machine learning, and statistics. Recently, much progress has been made in increasing the accuracy of machine learning for complex problems, in part due to improvements in feature engineering, for example by means of deep learning or feature selection. This progress has, to a large extent, come at the cost of dramatic and perhaps unsustainable increases in the computational resources used. Consequently, there is now a need to emphasize not only accuracy but also computational cost in research on, and applications of, machine learning, including feature selection. With a focus on both the accuracy and computational cost of feature selection, this paper studies stochastic local search (SLS) methods applied to feature selection. With an eye to containing computational cost, we consider an SLS method for efficient feature selection, SLS4FS. SLS4FS is an amalgamation of several heuristics, including filter and wrapper methods, controlled by hyperparameters. While SLS4FS admits, for certain hyperparameter settings, analysis by means of homogeneous Markov chains, our focus here is on experiments with several real-world datasets. Our experimental study suggests that SLS4FS is competitive with several existing methods and is useful in settings where one wants to control the computational cost.
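    The abstract names the ingredients of SLS for feature selection without the paper's specifics. As a concrete illustration, here is a minimal Python sketch of a generic SLS wrapper loop; the function name sls_feature_selection, the single-bit-flip neighborhood, the noise acceptance probability, and the max_evals budget are all illustrative assumptions, not SLS4FS itself.

```python
# Minimal sketch of stochastic local search (SLS) for feature selection.
# NOT the SLS4FS algorithm from the paper; a generic SLS loop under assumed
# design choices (random start, single-bit flips, a noise probability
# for accepting non-improving moves).
import random

def sls_feature_selection(score, n_features, max_evals=200, noise=0.2, seed=0):
    """score: callable mapping a frozenset of feature indices to a float
    (higher is better), e.g. cross-validated accuracy of a wrapper model."""
    rng = random.Random(seed)
    current = frozenset(i for i in range(n_features) if rng.random() < 0.5)
    best, best_val = current, score(current)
    cur_val = best_val
    for _ in range(max_evals):
        flip = rng.randrange(n_features)      # toggle one feature in/out
        cand = current ^ frozenset([flip])    # symmetric difference = flip
        cand_val = score(cand)
        # Accept improving moves; with probability `noise`, accept anyway
        # (the stochastic step that lets the search escape local optima).
        if cand_val >= cur_val or rng.random() < noise:
            current, cur_val = cand, cand_val
        if cur_val > best_val:
            best, best_val = current, cur_val
    return best, best_val
```

    Note that max_evals directly caps the number of score evaluations, which is one simple way to keep computational cost under control, echoing the paper's emphasis on cost as well as accuracy.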

    An Efficient Evolutionary Algorithm for Subset Selection with General Cost Constraints

    In this paper, we study the problem of selecting a subset from a ground set to maximize a monotone objective function f such that a monotone cost function c is bounded by an upper limit. State-of-the-art algorithms include the generalized greedy algorithm and POMC. The former is an efficient fixed-time algorithm, but its performance is limited by its greedy nature. The latter is an anytime algorithm that can find better subsets given more time, but it lacks a polynomial-time approximation guarantee. We propose a new anytime algorithm, EAMC, which employs a simple evolutionary algorithm to optimize a surrogate objective integrating f and c. We prove that EAMC achieves the best known approximation guarantee in polynomial expected running time. Experimental results on the applications of maximum coverage, influence maximization, and sensor placement show the excellent performance of EAMC.
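    To make the setup concrete, below is a minimal Python sketch of an evolutionary loop for cost-constrained subset selection in the spirit of EAMC. The bit-wise 1/n mutation is standard; the surrogate shown (value discounted by cost overrun) is an assumed simplification for illustration, not the exact surrogate analyzed in the paper.

```python
# Minimal sketch of an evolutionary algorithm for subset selection under a
# cost constraint, in the spirit of EAMC. The surrogate below is an assumed
# simplification, not the surrogate with the paper's proven guarantee.
import random

def evolve_subset(f, c, n, budget, iters=2000, seed=0):
    """f, c: callables on a frozenset of indices (monotone value and cost);
    budget: upper limit on c; returns the best feasible subset found."""
    rng = random.Random(seed)

    def surrogate(s):
        # assumed surrogate integrating f and c: penalize budget violations
        return f(s) / (1.0 + max(0.0, c(s) - budget))

    current = frozenset()  # start from the empty set
    best, best_val = current, f(current)
    for _ in range(iters):
        # standard bit-wise mutation: flip each element with probability 1/n
        child = frozenset(i for i in range(n)
                          if (i in current) != (rng.random() < 1.0 / n))
        if surrogate(child) >= surrogate(current):
            current = child
        # track the best feasible solution seen so far (anytime behavior)
        if c(current) <= budget and f(current) > best_val:
            best, best_val = current, f(current)
    return best, best_val
```

    For instance, for maximum coverage one would take f(s) as the size of the union of the sets indexed by s and c(s) as the total cost of the chosen elements; only subsets with c(s) within the budget count toward the returned solution.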
