34 research outputs found

    Feature Selection Based on Fuzzy Distances Between Clusters: First Results on Simulated Data

    Automatic feature selection methods are important in many situations where a large set of possible features is available, from which a subset must be selected to compose suitable feature vectors. Several automatic feature selection methods rest on two main components: a selection algorithm and a criterion function. Many commonly adopted criterion functions depend on a distance between the clusters, which is extremely important to the final result. Most inter-cluster distances are better suited to convex sets and do not produce good results for concave clusters or for clusters with overlapping areas. To circumvent these problems, this paper presents a new approach to defining the decision criterion based on fuzzy distances. In our approach, each cluster is fuzzified and a fuzzy distance is applied to the resulting fuzzy sets. Experimental results illustrating the advantages of the new approach are discussed.
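    The abstract does not specify the membership functions or the fuzzy distance used, so the following is only a minimal sketch of the general idea: fuzzify each cluster with an assumed Gaussian membership around its centroid, then compute a membership-weighted average of pairwise point distances. The function names and the `min`-based weighting are illustrative assumptions, not the paper's definitions.

    ```python
    import math

    def centroid(cluster):
        # Component-wise mean of the cluster's points.
        dim = len(cluster[0])
        return [sum(p[i] for p in cluster) / len(cluster) for i in range(dim)]

    def fuzzify(cluster, center, spread=1.0):
        # Assumed Gaussian membership: points near the centroid get degrees near 1.
        return [math.exp(-sum((x - c) ** 2 for x, c in zip(p, center))
                         / (2 * spread ** 2))
                for p in cluster]

    def fuzzy_distance(cluster_a, cluster_b, spread=1.0):
        # Membership-weighted average of pairwise point distances. Because every
        # pair of points contributes (weighted by the smaller membership degree),
        # concave or overlapping clusters are handled more gracefully than with
        # a plain centroid-to-centroid distance.
        mu_a = fuzzify(cluster_a, centroid(cluster_a), spread)
        mu_b = fuzzify(cluster_b, centroid(cluster_b), spread)
        num = den = 0.0
        for p, ma in zip(cluster_a, mu_a):
            for q, mb in zip(cluster_b, mu_b):
                w = min(ma, mb)
                num += w * math.dist(p, q)
                den += w
        return num / den if den else 0.0
    ```

    With this definition, a pair of nearby overlapping clusters yields a smaller fuzzy distance than a pair of well-separated ones, which is the property a criterion function needs.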

    Statistical learning of multi-view face detection

    1 Introduction Pattern recognition problems have two essential issues: (i) feature selection, and (ii) classifier design based on the selected features. Boosting is a method that attempts to boost the accuracy of an ensemble of weak classifiers into a strong one. The AdaBoost algorithm [1] solved many of the practical difficulties of earlier boosting algorithms. Each weak classifier is trained stage-wise to minimize the empirical error on a distribution re-weighted according to the classification errors of the previously trained classifiers. It has been shown that AdaBoost is a sequential forward search procedure using a greedy selection strategy to minimize a certain margin on the training set [4].
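    The stage-wise training and re-weighting described above can be sketched as follows, using decision stumps on a 1-D feature as the weak classifiers. This is a generic AdaBoost sketch, not the paper's multi-view face detector; the stump parameterization (threshold, polarity) is an assumption for illustration.

    ```python
    import math

    def train_adaboost(xs, ys, rounds=5):
        # xs: 1-D feature values; ys: labels in {-1, +1}.
        n = len(xs)
        w = [1.0 / n] * n                  # initial uniform distribution
        ensemble = []                      # list of (alpha, threshold, polarity)
        for _ in range(rounds):
            # Greedy stage: pick the stump minimizing the weighted empirical error.
            best = None
            for thr in sorted(set(xs)):
                for pol in (+1, -1):
                    err = sum(wi for xi, yi, wi in zip(xs, ys, w)
                              if pol * (1 if xi >= thr else -1) != yi)
                    if best is None or err < best[0]:
                        best = (err, thr, pol)
            err, thr, pol = best
            err = max(err, 1e-10)          # avoid division by zero
            alpha = 0.5 * math.log((1 - err) / err)
            ensemble.append((alpha, thr, pol))
            # Re-weight: misclassified instances gain weight for the next stage.
            w = [wi * math.exp(-alpha * yi * pol * (1 if xi >= thr else -1))
                 for xi, yi, wi in zip(xs, ys, w)]
            z = sum(w)
            w = [wi / z for wi in w]
        return ensemble

    def predict(ensemble, x):
        # Strong classifier: sign of the alpha-weighted vote of the stumps.
        s = sum(a * p * (1 if x >= t else -1) for a, t, p in ensemble)
        return 1 if s >= 0 else -1
    ```

    Each round is one step of the sequential forward search: the stump added is the one that looks best under the current re-weighted distribution.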

    Feature Selection Based on Run Covering

    This paper proposes a new feature selection algorithm. First, the data at every attribute are sorted, and contiguous instances with the same class label are grouped into runs. Runs whose length exceeds a given threshold are selected as “valid” runs, which enclose instances separable from the other classes. Second, we count how many runs cover each instance and check how the covering number changes once a feature is eliminated. We then delete the feature that has the least impact on the covering counts over all instances. We compare our method with ReliefF and a method based on mutual information. Evaluation was performed on three image databases, and the experimental results show that the proposed method outperformed the other two.
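    A minimal sketch of the run-covering idea, under assumptions the abstract leaves open (how ties in attribute values are broken, and that elimination impact is the total drop in covering counts): sort each attribute, group same-label neighbors into runs, keep runs above a length threshold, and score each feature by how much removing it reduces the counts.

    ```python
    def valid_runs(values, labels, min_len=2):
        # Sort instances by this attribute's value, then group consecutive
        # instances sharing a class label into runs; keep only "valid" runs
        # that meet the length threshold.
        order = sorted(range(len(values)), key=lambda i: values[i])
        runs, current = [], [order[0]]
        for i in order[1:]:
            if labels[i] == labels[current[-1]]:
                current.append(i)
            else:
                runs.append(current)
                current = [i]
        runs.append(current)
        return [r for r in runs if len(r) >= min_len]

    def covering_counts(data, labels, min_len=2):
        # data: list of feature columns. Count, per instance, how many valid
        # runs across all features cover it.
        counts = [0] * len(labels)
        for col in data:
            for run in valid_runs(col, labels, min_len):
                for i in run:
                    counts[i] += 1
        return counts

    def least_important_feature(data, labels, min_len=2):
        # The feature whose elimination changes the covering counts the least.
        base = covering_counts(data, labels, min_len)
        def impact(j):
            reduced = covering_counts(data[:j] + data[j + 1:], labels, min_len)
            return sum(b - r for b, r in zip(base, reduced))
        return min(range(len(data)), key=impact)
    ```

    Repeatedly deleting `least_important_feature` gives a backward-elimination loop that keeps the features whose runs do the most covering work.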

    Seamless Heterogeneous 3D Tessellation via DWT Domain Smoothing and Mosaicking

    With today's geobrowsers, tessellations are far from smooth for a variety of reasons, the principal ones being lighting differences and resolution heterogeneity. While the former has been dealt with extensively in the literature through classic mosaicking techniques, the latter has received little attention. We focus on this latter aspect and present two DWT domain methods to seamlessly stitch tiles of heterogeneous resolutions. The first method is local: each tile that constitutes the view is subjected to one of three context-based smoothing functions proposed for horizontal, vertical, and radial smoothing, depending on its location in the tessellation. These functions are applied at the DWT subband level and followed by an inverse DWT to give a smoothed tile. The second method assumes the same tessellation scenario, but the view field is treated as a sliding window that may contain parts of tiles from the heterogeneous tessellation. The window is refined in the DWT domain through mosaicking and smoothing, followed by a global inverse DWT. Unlike its traditional sense, the mosaicking employed here targets the heterogeneous resolution. Perceptually, the second method has shown better results than the first. Both methods have been successfully applied to practical examples of both texture and its corresponding DEM for seamless 3D terrain visualization.
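    To illustrate subband-level smoothing followed by an inverse DWT, here is a deliberately reduced sketch: a one-level 1-D Haar transform of a pixel row crossing a tile seam, with the detail (high-frequency) coefficients attenuated around the seam. The paper's actual smoothing functions operate on 2-D subbands with directional (horizontal/vertical/radial) variants; the Haar filter, the window `width`, and the attenuation factor here are all assumptions.

    ```python
    def haar_dwt(signal):
        # One-level Haar DWT: pairwise averages (approximation) and
        # pairwise half-differences (detail).
        a = [(signal[2 * i] + signal[2 * i + 1]) / 2 for i in range(len(signal) // 2)]
        d = [(signal[2 * i] - signal[2 * i + 1]) / 2 for i in range(len(signal) // 2)]
        return a, d

    def haar_idwt(a, d):
        # Exact inverse of haar_dwt above: x = a + d, y = a - d.
        out = []
        for ai, di in zip(a, d):
            out += [ai + di, ai - di]
        return out

    def smooth_seam(row, seam, width=4, atten=0.25):
        # Attenuate detail coefficients in a window around the seam between
        # two tiles, then invert the DWT; the approximation band is untouched,
        # so only the high-frequency discontinuity is softened.
        a, d = haar_dwt(row)
        lo, hi = max(0, seam // 2 - width), min(len(d), seam // 2 + width)
        d = [di * atten if lo <= j < hi else di for j, di in enumerate(d)]
        return haar_idwt(a, d)
    ```

    On a row with an abrupt step at the seam, the maximum sample-to-sample jump after smoothing is strictly smaller than the original step, which is the perceptual effect the subband smoothing aims for.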

    Stability of feature selection methods: a study of metrics across different gene expression datasets

    Analysis of gene-expression data often requires that a subset of genes (features) be selected, and many feature selection (FS) methods have been devised. However, FS methods often generate different lists of features for the same dataset, and users then have to choose which list to use. One approach to supporting this choice is to apply stability metrics to the generated lists and select a list on that basis. The aim of this study is to investigate the behavior of stability metrics applied to feature subsets generated by FS methods. The experiments in this work explore a range of gene expression datasets, FS methods, and expected numbers of features to compare several stability metrics. The stability metrics were used to compare five feature selection methods (SVM, SAM, ReliefF, RFE + RF and LIMMA) on gene expression datasets from the EBI repository. Results show that the studied stability metrics display a high degree of variability; the reason for this is not yet clear and is being investigated further. The final objective of the research, namely to define how to select an FS method, is ongoing work whose partial findings are reported here.
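    The abstract does not name the specific stability metrics compared, so as one representative example, here is a common similarity-based stability metric: the average pairwise Jaccard similarity between the feature subsets an FS method selects across runs or folds. The choice of Jaccard is an assumption for illustration only.

    ```python
    from itertools import combinations

    def jaccard(a, b):
        # Overlap of two feature subsets: |A ∩ B| / |A ∪ B|.
        a, b = set(a), set(b)
        return len(a & b) / len(a | b)

    def stability(feature_lists):
        # Average pairwise Jaccard similarity between the feature lists
        # selected on different runs/folds: 1.0 means identical lists,
        # 0.0 means completely disjoint lists.
        pairs = list(combinations(feature_lists, 2))
        return sum(jaccard(a, b) for a, b in pairs) / len(pairs)
    ```

    A method that returns the same gene list on every fold scores 1.0; one that returns disjoint lists scores 0.0, giving a single number on which lists (or FS methods) can be ranked.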