14 research outputs found

    Statistical estimation of the intrinsic dimensionality of data collections

    A realization f_i(·) from a class F(·) can be represented as a point in a metric space, and the locus of all points belonging to F(·) lies on a surface in this space. The intrinsic dimensionality of F(·), defined as the least number of parameters needed to identify any f_i(·) belonging to F(·), is equal to the topological dimensionality of this surface. Given a sample set of realizations f_i(·) from F(·), a statistical method is presented for estimating the intrinsic dimensionality of F(·).
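
    The abstract does not describe the estimator itself, so the sketch below swaps in a different, well-known technique, a two-radius correlation-dimension estimate, only to illustrate what "estimating intrinsic dimensionality from samples" can look like. The function names, the radii, and the toy data (a one-parameter family embedded in three dimensions) are assumptions made for illustration and are not taken from the paper.

```python
import math
from itertools import combinations


def correlation_fraction(points, radius):
    """Fraction of point pairs whose Euclidean distance is below radius."""
    pairs = list(combinations(points, 2))
    close = sum(1 for p, q in pairs if math.dist(p, q) < radius)
    return close / len(pairs)


def correlation_dimension(points, r_small, r_large):
    """Slope of log C(r) between two radii, used as a rough dimension estimate."""
    c_small = correlation_fraction(points, r_small)
    c_large = correlation_fraction(points, r_large)
    return (math.log(c_large) - math.log(c_small)) / (
        math.log(r_large) - math.log(r_small)
    )


# Hypothetical data: 200 samples of a one-parameter family embedded in 3-D.
line = [(t, 2.0 * t, -t) for t in (i / 199 for i in range(200))]
length = math.dist(line[0], line[-1])

# Both radii are small relative to the extent of the data (an assumption
# that real data would need to be checked against).
estimate = correlation_dimension(line, 0.05 * length, 0.10 * length)
print(f"estimated intrinsic dimensionality: {estimate:.2f}")
```

    For points that need only one parameter to identify them, the estimate should come out close to 1 even though each point has three coordinates. The choice of radii matters in practice: too small and few pairs are counted, too large and boundary effects bias the slope downward.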

    Probability Density of the Maximum Likelihood Elevation Estimate of Radar Targets


    Is Feature Selection Still Necessary?


    A Probabilistic Definition of Intrinsic Dimensionality for Images

    In this paper we address the problem of appropriately representing the intrinsic dimensionality of image neighborhoods. This dimensionality describes the degrees of freedom of a local image patch, and it gives rise to some of the most widely used corner and edge detectors.

    Feature selection with adjustable criteria

    We present a study on a rough-set-based approach to feature selection. Instead of using significance or support, the Parameterized Average Support Heuristic (PASH) considers the overall quality of the potential set of rules. It produces a set of rules with a balanced support distribution over all decision classes. The adjustable parameters of PASH can help users with different approximation needs to extract predictive rules that other methods may ignore. This paper fine-tunes the PASH heuristic and provides experimental results for PASH.

    GA-Facilitated Knowledge Discovery and Pattern Recognition Optimization Applied to the Biochemistry of Protein Solvation

    The authors present a GA optimization technique for cosine-based k-nearest neighbors classification that improves predictive accuracy in a class-balanced manner while simultaneously enabling knowledge discovery. The GA performs feature selection and extraction by searching for feature weights and offsets that maximize cosine classifier performance. GA-selected feature weights determine the relevance of each feature to the classification task. This hybrid GA/classifier provides insight into a notoriously difficult problem in molecular biology: the correct treatment of water molecules mediating ligand binding to proteins. In distinguishing patterns of water conservation and displacement, this method achieves higher accuracy than previous techniques. The data mining capabilities of the hybrid system improve the understanding of the physical and chemical determinants governing favored protein-water binding.
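
    The idea of a GA searching for per-feature weights and offsets that maximize a cosine k-nearest-neighbors classifier can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the chromosome layout (weights followed by offsets), the leave-one-out balanced-accuracy fitness, the crossover and mutation operators, all parameter values, and the toy data are hypothetical.

```python
import math
import random


def transform(x, weights, offsets):
    """Shift each feature by its offset, then scale it by its weight."""
    return [w * (v - o) for v, w, o in zip(x, weights, offsets)]


def cosine(a, b):
    """Cosine similarity; 0.0 when either vector has zero length."""
    na = math.sqrt(sum(v * v for v in a))
    nb = math.sqrt(sum(v * v for v in b))
    if na == 0.0 or nb == 0.0:
        return 0.0
    return sum(p * q for p, q in zip(a, b)) / (na * nb)


def knn_predict(query, train, k):
    """Majority label among the k training points most cosine-similar to query."""
    ranked = sorted(train, key=lambda item: cosine(query, item[0]), reverse=True)
    votes = {}
    for _, label in ranked[:k]:
        votes[label] = votes.get(label, 0) + 1
    return max(votes, key=votes.get)


def balanced_accuracy(chromosome, data, k=3):
    """Leave-one-out mean per-class recall (assumed class-balanced fitness)."""
    n = len(chromosome) // 2
    weights, offsets = chromosome[:n], chromosome[n:]
    points = [(transform(x, weights, offsets), y) for x, y in data]
    hits, totals = {}, {}
    for i, (x, y) in enumerate(points):
        rest = points[:i] + points[i + 1:]
        totals[y] = totals.get(y, 0) + 1
        if knn_predict(x, rest, k) == y:
            hits[y] = hits.get(y, 0) + 1
    return sum(hits.get(c, 0) / totals[c] for c in totals) / len(totals)


def evolve(data, n_features, generations=15, pop_size=12, seed=0):
    """Elitist GA over [weights..., offsets...]; returns best chromosome and history."""
    rng = random.Random(seed)
    pop = [
        [rng.uniform(0, 1) for _ in range(n_features)]
        + [rng.uniform(-1, 1) for _ in range(n_features)]
        for _ in range(pop_size)
    ]
    history = []
    for _ in range(generations):
        pop.sort(key=lambda c: balanced_accuracy(c, data), reverse=True)
        history.append(balanced_accuracy(pop[0], data))
        elite = pop[: pop_size // 2]  # the top half survives unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            child = [rng.choice(pair) for pair in zip(a, b)]  # uniform crossover
            child[rng.randrange(len(child))] += rng.gauss(0, 0.2)  # point mutation
            children.append(child)
        pop = elite + children
    best = max(pop, key=lambda c: balanced_accuracy(c, data))
    return best, history


# Hypothetical toy data: feature 0 separates the classes, feature 1 is noise.
rng = random.Random(1)
toy = [([rng.gauss(1, 0.3), rng.uniform(-3, 3)], "conserved") for _ in range(10)]
toy += [([rng.gauss(-1, 0.3), rng.uniform(-3, 3)], "displaced") for _ in range(10)]
best, history = evolve(toy, n_features=2)
print("weights:", best[:2], "offsets:", best[2:])
```

    Because the top half of each generation is carried over unchanged, the best fitness recorded in `history` never decreases. In a sketch like this, the magnitude of each evolved weight can be read as a rough indication of how much the corresponding feature contributes to the classification.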