2 research outputs found

    Fast Multi-Class Probabilistic Classifier by Sparse Non-parametric Density Estimation

    Model interpretation is essential in many application scenarios, and a classification model that is easy to interpret can provide useful information for further study and improvement. Modern data analysis often involves a long list of variables, especially when the data are collected automatically. Such data sets may not have been collected with a specific analysis target in mind and usually contain redundant features that contribute nothing to the analysis task of interest. Variable selection is a common way to improve model interpretability and is widely used with parametric classification models. There are few studies of variable selection in nonparametric classification models, such as density estimation-based methods, and this is especially the case for multi-class classification. In this study, we address multi-class classification problems using the idea of sparse non-parametric density estimation and propose a method for identifying high-impact variables for each class. We present the asymptotic properties and the computation procedure for the proposed method, together with some suggested sample sizes. We also report numerical results on both synthesized and real data sets.
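For orientation, the sketch below shows a generic density estimation-based classifier of the kind the abstract refers to. It is not the sparse method proposed in the paper, and it performs no variable selection. Each class gets a product Gaussian kernel density estimate, and a point is assigned to the class with the largest prior times estimated density. The function names, the shared bandwidth, and the toy data are all assumptions made for illustration.

```python
import math


def gaussian_kde(sample, bandwidth):
    """Return a product-Gaussian-kernel density estimate built from `sample`."""
    n = len(sample)
    dim = len(sample[0])
    norm = n * (bandwidth * math.sqrt(2 * math.pi)) ** dim

    def density(point):
        total = 0.0
        for row in sample:
            squared_dist = sum((p - r) ** 2 for p, r in zip(point, row))
            total += math.exp(-squared_dist / (2 * bandwidth ** 2))
        return total / norm

    return density


def fit_kde_classifier(features, labels, bandwidth=0.5):
    """Fit one KDE per class; return a function predicting the most likely class."""
    rows_by_class = {}
    for row, label in zip(features, labels):
        rows_by_class.setdefault(label, []).append(row)

    n = len(labels)
    # For each class: (empirical prior, class-conditional density estimate).
    model = {
        label: (len(rows) / n, gaussian_kde(rows, bandwidth))
        for label, rows in rows_by_class.items()
    }

    def predict(point):
        return max(model, key=lambda label: model[label][0] * model[label][1](point))

    return predict


# Hypothetical toy data: three well-separated classes in two dimensions.
toy_features = [
    [0.0, 0.0], [0.1, 0.2], [-0.2, 0.1],
    [3.0, 3.0], [3.1, 2.9], [2.8, 3.2],
    [0.0, 6.0], [0.2, 5.9], [-0.1, 6.1],
]
toy_labels = ["a", "a", "a", "b", "b", "b", "c", "c", "c"]
predict = fit_kde_classifier(toy_features, toy_labels)
```

A baseline like this uses every variable for every class, which is the interpretability gap the abstract describes: it gives no indication of which variables matter for which class.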

    Some New Copula Based Distribution-free Tests of Independence among Several Random Variables

    Over the last couple of decades, several copula-based methods have been proposed in the literature to test for independence among several random variables. But these existing tests are not invariant under monotone transformations of the variables, and they often perform poorly if the dependence among the variables is highly non-monotone in nature. In this article, we propose a copula-based measure of dependency and use it to construct some new distribution-free tests of independence. The proposed measure and the resulting tests are all invariant under permutations and monotone transformations of the variables. Our dependency measure involves a kernel function, and we use the Gaussian kernel for that purpose. We adopt a multi-scale approach, where we look at the results obtained for several choices of the bandwidth parameter associated with the Gaussian kernel and aggregate them judiciously. Large-sample properties of the dependency measure and the resulting tests are derived under appropriate regularity conditions. Several simulated and real data sets are analyzed to compare the performance of the proposed tests with some popular tests available in the literature.
    Comment: arXiv admin note: text overlap with arXiv:1708.0748
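The invariance property mentioned in the abstract can be illustrated with a small sketch. The abstract does not define the paper's dependency measure, so the code below swaps in a different, well-known statistic: a biased HSIC-style estimate with Gaussian kernels, computed on normalized ranks (an empirical copula transform) for two variables. Because ranks do not change under strictly increasing transformations, any statistic computed from them inherits that invariance. The function names, the bandwidth grid, and the choice to return one value per bandwidth (rather than any particular aggregation rule) are assumptions.

```python
import math
import random


def normalized_ranks(values):
    """Map a sample to normalized ranks in (0, 1); assumes there are no ties."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    for position, index in enumerate(order, start=1):
        ranks[index] = position / (n + 1)
    return ranks


def centered_gaussian_gram(points, bandwidth):
    """Gaussian Gram matrix of `points`, double-centered (H K H)."""
    n = len(points)
    gram = [
        [math.exp(-((a - b) ** 2) / (2 * bandwidth ** 2)) for b in points]
        for a in points
    ]
    row_means = [sum(row) / n for row in gram]
    grand_mean = sum(row_means) / n
    # The Gram matrix is symmetric, so column means equal row means.
    return [
        [gram[i][j] - row_means[i] - row_means[j] + grand_mean for j in range(n)]
        for i in range(n)
    ]


def rank_hsic(x, y, bandwidth):
    """Biased HSIC-style statistic computed on the normalized ranks of x and y."""
    n = len(x)
    k = centered_gaussian_gram(normalized_ranks(x), bandwidth)
    l = centered_gaussian_gram(normalized_ranks(y), bandwidth)
    return sum(k[i][j] * l[i][j] for i in range(n) for j in range(n)) / n ** 2


def multi_scale_rank_hsic(x, y, bandwidths=(0.05, 0.1, 0.2, 0.4)):
    """Evaluate the statistic at several bandwidths (hypothetical grid)."""
    return {h: rank_hsic(x, y, h) for h in bandwidths}


# Non-monotone dependence: y is a noisy quadratic function of x.
rng = random.Random(0)
x = [rng.uniform(-1.0, 1.0) for _ in range(40)]
y = [a * a + 0.05 * rng.gauss(0.0, 1.0) for a in x]

original = multi_scale_rank_hsic(x, y)
# Strictly increasing transformations leave the ranks, and so the statistic, unchanged.
transformed = multi_scale_rank_hsic([math.exp(a) for a in x], [b ** 3 for b in y])
```

In an actual test, the per-bandwidth statistics would be calibrated (for example by permutation) and then combined. How the paper aggregates across bandwidths is not stated in the abstract, so that step is left out here.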