31,578 research outputs found

    A New Search Algorithm for Feature Selection in Hyperspectral Remote Sensing Images

    A new suboptimal search strategy suitable for feature selection in very high-dimensional remote-sensing images (e.g. those acquired by hyperspectral sensors) is proposed. Each solution of the feature selection problem is represented as a binary string that indicates which features are selected and which are disregarded. In turn, each binary string corresponds to a point of a multidimensional binary space. Given a criterion function to evaluate the effectiveness of a selected solution, the proposed strategy is based on the search for constrained local extremes of such a function in the above-defined binary space. In particular, two different algorithms are presented that explore the space of solutions in different ways. These algorithms are compared with the classical sequential forward selection and sequential forward floating selection suboptimal techniques, using hyperspectral remote-sensing images (acquired by the AVIRIS sensor) as a data set. Experimental results demonstrate the effectiveness of both algorithms, which can be regarded as valid alternatives to classical methods, as they allow interesting tradeoffs between the quality of the selected feature subsets and computational cost.
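
    The binary-string representation described in this abstract can be illustrated with a small local search. The sketch below is only one possible reading, not the authors' algorithms: it assumes the constraint is a fixed number `m` of selected features, that a neighbor is obtained by swapping one selected feature for one unselected feature, and that the criterion is to be maximized. The names `swap_local_search`, `toy_weights`, and `toy_criterion` are hypothetical.

```python
from itertools import product
from typing import Callable, List, Optional, Tuple

Mask = List[bool]  # mask[i] is True when feature i is selected


def swap_local_search(
    n_features: int,
    m: int,
    criterion: Callable[[Mask], float],
    start: Optional[Mask] = None,
) -> Tuple[Mask, float]:
    """Steepest-ascent search over binary strings with exactly m ones.

    Assumption: a neighbor differs by one swap (drop a selected feature,
    add an unselected one), so the subset size never changes.
    """
    mask = [i < m for i in range(n_features)] if start is None else list(start)
    best = criterion(mask)
    while True:
        selected = [i for i, bit in enumerate(mask) if bit]
        unselected = [i for i, bit in enumerate(mask) if not bit]
        best_move, best_value = None, best
        for i, j in product(selected, unselected):
            mask[i], mask[j] = False, True   # try the swap
            value = criterion(mask)
            mask[i], mask[j] = True, False   # undo it
            if value > best_value:
                best_move, best_value = (i, j), value
        if best_move is None:
            # No swap improves the criterion: a constrained local maximum.
            return mask, best
        i, j = best_move
        mask[i], mask[j] = False, True
        best = best_value


# Toy criterion (an assumption for illustration): sum of per-feature weights.
toy_weights = [1, 9, 3, 8, 2]


def toy_criterion(mask: Mask) -> float:
    return sum(w for w, bit in zip(toy_weights, mask) if bit)
```

    Calling `swap_local_search(5, 2, toy_criterion)` starts from the first two features and stops when no single swap raises the criterion. With a real separability measure in place of `toy_criterion`, the result is only a local extreme, which is why such a strategy is suboptimal.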

    Facilitating meta-design techniques for multi-disciplinary conceptual design

    The research reported in this paper was supported by the EU FP6-funded project SimSAC (Simulating Aircraft Stability and Control Characteristics for Use in Conceptual Design).

    Learning the structure of Bayesian Networks: A quantitative assessment of the effect of different algorithmic schemes

    One of the most challenging tasks when adopting Bayesian Networks (BNs) is learning their structure from data. This task is complicated by the huge search space of possible solutions and by the fact that the problem is NP-hard. Hence, full enumeration of all possible solutions is not always feasible, and approximations are often required. However, to the best of our knowledge, a quantitative analysis of the performance and characteristics of the different heuristics for solving this problem has never been done before. For this reason, in this work we provide a detailed comparison of many different state-of-the-art methods for structural learning on simulated data, considering BNs with both discrete and continuous variables and with different rates of noise in the data. In particular, we investigate the performance of different widespread scores and algorithmic approaches proposed for the inference, and the statistical pitfalls within them.

    Reduction of the size of datasets by using evolutionary feature selection: the case of noise in a modern city

    Smart city initiatives have emerged to mitigate the negative effects of the very fast growth of urban areas. Most of the population in our cities is exposed to high levels of noise that cause discomfort and various health problems. These issues may be mitigated by applying different smart city solutions, some of which require highly accurate noise information to provide the best possible quality of service. In this study, we have designed a machine learning approach based on genetic algorithms to analyze noise data captured on the university campus. This method reduces the amount of data required to classify the noise by addressing a feature selection optimization problem. The experimental results show that our approach improved accuracy by 20% (achieving an accuracy of 87% with a reduction of up to 85% of the original dataset). Universidad de Málaga. Campus de Excelencia Internacional Andalucía Tech. This research has been partially funded by the Spanish MINECO and FEDER projects TIN2016-81766-REDT (http://cirti.es) and TIN2017-88213-R (http://6city.lcc.uma.es).
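
    Feature selection with a genetic algorithm, as mentioned in this abstract, can be sketched roughly as follows. This is a generic illustration rather than the study's method: the tournament selection, one-point crossover, bit-flip mutation, elitism, and all parameter values are assumptions, and `ga_feature_selection` and `toy_fitness` are hypothetical names. In practice the fitness would be a classifier's accuracy on the selected features.

```python
import random
from typing import Callable, List, Tuple

Mask = List[bool]  # mask[i] is True when feature i is kept


def ga_feature_selection(
    n_features: int,                      # must be at least 2
    fitness: Callable[[Mask], float],
    pop_size: int = 20,
    generations: int = 30,
    mutation_rate: float = 0.05,
    seed: int = 0,
) -> Tuple[Mask, float]:
    """Evolve bit masks toward higher fitness (generic GA sketch)."""
    rng = random.Random(seed)
    population = [
        [rng.random() < 0.5 for _ in range(n_features)] for _ in range(pop_size)
    ]
    for _ in range(generations):
        # Elitism: carry the current best mask over unchanged.
        next_pop = [max(population, key=fitness)[:]]
        while len(next_pop) < pop_size:
            # Binary tournament selection for each parent.
            p1 = max(rng.sample(population, 2), key=fitness)
            p2 = max(rng.sample(population, 2), key=fitness)
            # One-point crossover, then independent bit-flip mutation.
            cut = rng.randrange(1, n_features)
            child = p1[:cut] + p2[cut:]
            child = [(not b) if rng.random() < mutation_rate else b for b in child]
            next_pop.append(child)
        population = next_pop
    best = max(population, key=fitness)
    return best, fitness(best)


# Toy fitness (an assumption): features 0 and 2 are informative (+2 each),
# and every kept feature costs 1, so smaller subsets are rewarded.
def toy_fitness(mask: Mask) -> float:
    informative = {0, 2}
    gain = sum(2 for i, bit in enumerate(mask) if bit and i in informative)
    return gain - sum(mask)
```

    The per-feature cost in `toy_fitness` stands in for the data-reduction goal: a mask that keeps fewer features scores higher as long as the informative ones survive.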

    Bayesian semiparametric analysis for two-phase studies of gene-environment interaction

    The two-phase sampling design is a cost-efficient way of collecting expensive covariate information on a judiciously selected subsample. It is natural to apply such a strategy for collecting genetic data in a subsample enriched for exposure to environmental factors for gene-environment interaction (G x E) analysis. In this paper, we consider two-phase studies of G x E interaction where phase I data are available on exposure, covariates and disease status. Stratified sampling is done to prioritize individuals for genotyping at phase II conditional on disease and exposure. We consider a Bayesian analysis based on the joint retrospective likelihood of phase I and phase II data. We address several important statistical issues: (i) we consider a model with multiple genes, environmental factors and their pairwise interactions, and employ a Bayesian variable selection algorithm to reduce the dimensionality of this potentially high-dimensional model; (ii) we use the assumption of gene-gene and gene-environment independence to trade off between bias and efficiency in estimating the interaction parameters, through hierarchical priors reflecting this assumption; (iii) we posit a flexible model for the joint distribution of the phase I categorical variables using the nonparametric Bayes construction of Dunson and Xing [J. Amer. Statist. Assoc. 104 (2009) 1042-1051]. Comment: Published at http://dx.doi.org/10.1214/12-AOAS599 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org)