18 research outputs found

    Visualising computational intelligence through converting data into formal concepts

    Global Entropy Based Greedy Algorithm for discretization

    Discretization is a crucial step, not only for summarizing continuous attributes but also for improving the performance of classifiers that require discrete values as input. In this thesis, I propose a supervised discretization method, the Global Entropy Based Greedy algorithm, which is based on Information Entropy Minimization. Experimental results on well-known benchmark datasets show that the proposed method outperforms state-of-the-art methods. To further improve the proposed method, a new stopping criterion based on the rate of change of entropy was also explored. The experimental analysis indicates that a threshold on the rate of decrease of entropy could be more effective than a fixed number of intervals for classifiers such as C5.0.
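The abstract gives no pseudocode, so the following is only a minimal sketch of the general idea it describes: cut points are added greedily to minimize the class entropy of the whole partition, and the search stops once the relative decrease in entropy falls below a threshold. The function names and the `min_rel_decrease` parameter are assumptions made for illustration, not the thesis's actual algorithm or settings.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def partition_entropy(values, labels, cuts):
    """Weighted class entropy of the intervals induced by the cut points."""
    bins = {}
    for v, y in zip(values, labels):
        idx = sum(v > c for c in cuts)  # index of the interval containing v
        bins.setdefault(idx, []).append(y)
    n = len(values)
    return sum(len(b) / n * entropy(b) for b in bins.values())


def greedy_discretize(values, labels, min_rel_decrease=0.1):
    """Greedily add the cut that most lowers the global entropy.

    Stops when the relative entropy decrease of the best remaining cut
    drops below min_rel_decrease (a hypothetical threshold).
    """
    distinct = sorted(set(values))
    # Candidate cuts are midpoints between adjacent distinct values.
    candidates = [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]
    cuts = []
    current = entropy(labels)
    while candidates and current > 0:
        best = min(
            candidates,
            key=lambda c: partition_entropy(values, labels, cuts + [c]),
        )
        new = partition_entropy(values, labels, cuts + [best])
        if (current - new) / current < min_rel_decrease:
            break  # entropy is no longer decreasing fast enough
        cuts = sorted(cuts + [best])
        candidates.remove(best)
        current = new
    return cuts
```

For example, `greedy_discretize([1, 2, 3, 10, 11, 12], list("aaabbb"))` places a single cut between the two classes and then stops, because the resulting intervals are already pure.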

    Mining Association Rule Summarization Techniques to prognosis Diabetic unknown patterns

    Early detection of patients at elevated risk of developing diabetes mellitus is essential to improved prevention and overall clinical management of these patients. Data mining now plays an important role in disease prediction in the healthcare industry. Data mining is the process of selecting, exploring, and modeling large amounts of data to discover unknown patterns or relationships useful to the data analyst. Medical data mining has great potential for uncovering hidden patterns in data sets from the medical domain. These patterns can be used for faster and better clinical decision making in preventive and suggestive medicine. However, raw medical data are widely distributed, heterogeneous in nature, and too voluminous for ordinary processing. Data mining and statistics together can work better towards finding hidden patterns and structures in data. In this paper, two major data mining techniques, viz. FP-Growth and Apriori, are applied to a diabetes dataset, and association rules are generated by both algorithms.
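To make the association-rule step concrete, here is a minimal Apriori sketch: frequent itemsets are grown level by level, and rules are then derived from them by confidence. It is an illustration under stated assumptions, not the paper's implementation; the function names, thresholds, and the toy transactions in the usage note are made up, and FP-Growth (which finds the same frequent itemsets using a prefix tree) is not shown.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for all itemsets with support >= min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        # Fraction of transactions that contain every item of the itemset.
        return sum(itemset <= t for t in transactions) / n

    level = {frozenset([i]) for t in transactions for i in t}
    frequent = {}
    k = 1
    while level:
        kept = {s: sup for s in level if (sup := support(s)) >= min_support}
        frequent.update(kept)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into candidates of size k.
        candidates = {a | b for a in kept for b in kept if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        level = {
            c for c in candidates
            if all(frozenset(s) in kept for s in combinations(c, k - 1))
        }
    return frequent


def association_rules(frequent, min_confidence):
    """Return (antecedent, consequent, confidence) triples."""
    rules = []
    for itemset, sup in frequent.items():
        for r in range(1, len(itemset)):
            for ante in combinations(itemset, r):
                ante = frozenset(ante)
                confidence = sup / frequent[ante]
                if confidence >= min_confidence:
                    rules.append((ante, itemset - ante, confidence))
    return rules
```

A usage example on placeholder data: with `toy = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]`, calling `apriori(toy, 0.5)` keeps the three single items and the three pairs, and `association_rules(..., 0.7)` then yields one rule per direction of each pair.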

    Knowledge in Imperfect Data

    The Unreasonable Effectiveness of Patches in Deep Convolutional Kernels Methods

    A recent line of work showed that various forms of convolutional kernel methods can be competitive with standard supervised deep convolutional networks on datasets like CIFAR-10, obtaining accuracies in the range of 87-90% while being more amenable to theoretical analysis. In this work, we highlight the importance of a data-dependent feature extraction step that is key to obtaining good performance in convolutional kernel methods. This step typically corresponds to a whitened dictionary of patches, and gives rise to data-driven convolutional kernel methods. We extensively study its effect, demonstrating that it is the key ingredient for the high performance of these methods. Specifically, we show that one of the simplest instances of such kernel methods, based on a single layer of image patches followed by a linear classifier, already obtains classification accuracies on CIFAR-10 in the same range as previous, more sophisticated convolutional kernel methods. We scale this method to the challenging ImageNet dataset, showing that such a simple approach can exceed all existing non-learned representation methods. This is a new baseline for object recognition without representation learning methods, and it initiates the investigation of convolutional kernel models on ImageNet. We conduct experiments to analyze the dictionary that we used, and our ablations show that it exhibits low-dimensional properties.