2 research outputs found

    A Non-parametric Semi-supervised Discretization Method

    Semi-supervised classification methods aim to exploit both labelled and unlabelled examples to train a predictive model. Most of these approaches make assumptions about the distribution of classes. This article first proposes a new semi-supervised discretization method which adopts a very weakly informative prior on the data. This method discretizes the numerical domain of a continuous input variable while preserving the information relevant to the prediction of classes. Then, an in-depth comparison of this semi-supervised method with the original supervised MODL approach is presented. We demonstrate that the semi-supervised approach is asymptotically equivalent to the supervised approach, improved by a post-optimization of the locations of the interval bounds.

    An algorithm for discretization of real value attributes based on interval similarity

    Extent: 8p.
    Discretization algorithms for real value attributes have important uses in many areas, such as intelligence and machine learning. The algorithms related to the Chi2 algorithm (including the modified Chi2 algorithm and the extended Chi2 algorithm) are well-known discretization algorithms that exploit techniques from probability and statistics. In this paper these algorithms are analyzed and their drawbacks are pointed out. Based on this analysis, a new modified algorithm based on interval similarity is proposed. The new algorithm defines an interval similarity function, which serves as a new merging criterion in the discretization process. In addition, two important parameters of the discretization process (the condition parameter α and the tiny move parameter c), as well as the extent of discrepancy between pairs of adjacent intervals, are given in the form of functions. Theoretical analysis and experimental results show that the presented algorithm is effective.
    Li Zou, Deqin Yan, Hamid Reza Karimi, and Peng Sh
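
    For orientation, the merging step shared by the Chi2 family of algorithms can be sketched as below. This is a minimal illustration of the classic chi-square criterion for two adjacent intervals. It is not the interval similarity function proposed in the paper, which the abstract does not define. The function names, the representation of an interval as a list of per-class counts, and the choice to skip cells with zero expected count are all assumptions made for the example.

```python
def chi2_adjacent(a, b):
    """Pearson chi-square statistic for two adjacent intervals.

    `a` and `b` are lists of per-class example counts (an assumed
    representation). A low value means the two intervals have similar
    class distributions and are candidates for merging.
    """
    col_totals = [x + y for x, y in zip(a, b)]
    n = sum(col_totals)
    if n == 0:
        return 0.0
    stat = 0.0
    for row in (a, b):
        row_total = sum(row)
        for observed, col_total in zip(row, col_totals):
            expected = row_total * col_total / n
            # Assumption: cells with zero expected count are skipped.
            if expected > 0:
                stat += (observed - expected) ** 2 / expected
    return stat


def merge_lowest_pair(intervals):
    """Merge the adjacent pair of intervals with the smallest statistic."""
    scores = [chi2_adjacent(intervals[i], intervals[i + 1])
              for i in range(len(intervals) - 1)]
    i = min(range(len(scores)), key=scores.__getitem__)
    merged = [x + y for x, y in zip(intervals[i], intervals[i + 1])]
    return intervals[:i] + [merged] + intervals[i + 2:]
```

    In a full bottom-up discretizer this merge would be repeated until the smallest statistic exceeds a significance threshold; a similarity-based variant would replace `chi2_adjacent` with its own merging criterion.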