    Evolving Large-Scale Data Stream Analytics based on Scalable PANFIS

    Many distributed machine learning frameworks have recently been built to speed up large-scale data learning. However, most distributed machine learning in these frameworks still uses an offline algorithm model that cannot cope with data stream problems. In fact, large-scale data are mostly generated by non-stationary data streams whose patterns evolve over time. To address this problem, we propose a novel Evolving Large-scale Data Stream Analytics framework based on a Scalable Parsimonious Network based on Fuzzy Inference System (Scalable PANFIS), where the PANFIS evolving algorithm is distributed over the worker nodes in the cloud to learn large-scale data streams. The Scalable PANFIS framework incorporates an active learning (AL) strategy and two model fusion methods. AL accelerates the distributed learning process to generate an initial evolving large-scale data stream model (initial model), whereas the two model fusion methods aggregate the initial model to generate the final model. The final model represents the update of current large-scale data knowledge, which can be used to infer future data. The framework is validated through extensive experiments that measure the accuracy and running time of four combinations of Scalable PANFIS and other Spark-based built-in algorithms. The results indicate that Scalable PANFIS with AL trains almost two times faster than Scalable PANFIS without AL. The results also show that the rule merging and voting mechanisms generally yield similar accuracy among Scalable PANFIS algorithms, and that both are generally better than Spark-based algorithms. In terms of running time, Scalable PANFIS training outperforms all Spark-based algorithms when classifying numerous benchmark datasets. Comment: 20 pages, 5 figures

    ART and ARTMAP Neural Networks for Applications: Self-Organizing Learning, Recognition, and Prediction

    ART and ARTMAP neural networks for adaptive recognition and prediction have been applied to a variety of problems. Applications include parts design retrieval at the Boeing Company, automatic mapping from remote sensing satellite measurements, medical database prediction, and robot vision. This chapter features a self-contained introduction to ART and ARTMAP dynamics and a complete algorithm for applications. Computational properties of these networks are illustrated by means of remote sensing and medical database examples. The basic ART and ARTMAP networks feature winner-take-all (WTA) competitive coding, which groups inputs into discrete recognition categories. WTA coding in these networks enables fast learning, which allows the network to encode important rare cases but may lead to inefficient category proliferation with noisy training inputs. This problem is partially solved by ART-EMAP, which uses WTA coding for learning but distributed category representations for test-set prediction. In medical database prediction problems, which often feature inconsistent training input predictions, the ARTMAP-IC network further improves ARTMAP performance with distributed prediction, category instance counting, and a new search algorithm. A recently developed family of ART models (dART and dARTMAP) retains stable coding, recognition, and prediction, but allows arbitrarily distributed category representation during learning as well as performance. National Science Foundation (IRI 94-01659, SBR 93-00633); Office of Naval Research (N00014-95-1-0409, N00014-95-0657)

    An Overview of Classifier Fusion Methods

    A number of classifier fusion methods have recently been developed, opening an alternative approach that may improve classification performance. As there is little theory of information fusion itself, we are currently faced with different methods designed for different problems and producing different results. This paper gives an overview of classifier fusion methods and attempts to identify new trends that may dominate this area of research in the future. A taxonomy of fusion methods, which tries to bring some order into the existing “pudding of diversities”, is also provided.

    ART Neural Networks for Remote Sensing: Vegetation Classification from Landsat TM and Terrain Data

    A new methodology for automatic mapping from Landsat Thematic Mapper (TM) and terrain data, based on the fuzzy ARTMAP neural network, is developed. System capabilities are tested on a challenging remote sensing classification problem, using spectral and terrain features for vegetation classification in the Cleveland National Forest. After training at the pixel level, system performance is tested at the stand level, using sites not seen during training. Results are compared to those of maximum likelihood classifiers, as well as back propagation neural networks and K Nearest Neighbor algorithms. ARTMAP dynamics are fast, stable, and scalable, overcoming common limitations of back propagation, which did not give satisfactory performance. Best results are obtained using a hybrid system based on a convex combination of fuzzy ARTMAP and maximum likelihood predictions. A prototype remote sensing example introduces each aspect of data processing and fuzzy ARTMAP classification. The example shows how the network automatically constructs a minimal number of recognition categories to meet accuracy criteria. A voting strategy improves prediction and assigns confidence estimates by training the system several times on different orderings of an input set. National Science Foundation (IRI 94-01659, SBR 93-00633); Office of Naval Research (N00014-95-1-0409, N00014-95-0657)
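
    The voting strategy described in this abstract (train several times on different orderings of the same input set, then pool the predictions) can be sketched as follows. This is a minimal illustration, not the authors' implementation: `train_voters`, `vote`, and the stand-in learner `train_condensed_nn` are hypothetical names, and the order-sensitive learner is a simple condensed nearest-neighbour rule on 1-D inputs rather than fuzzy ARTMAP. The confidence estimate is assumed to be the fraction of voters that agree with the winning label.

```python
import random
from collections import Counter


def train_condensed_nn(samples):
    """Hypothetical order-sensitive learner (stand-in for fuzzy ARTMAP).

    Keeps a sample as a prototype only when the prototypes stored so far
    misclassify it, so the result depends on presentation order.
    """
    protos = []

    def predict(x):
        # Label of the nearest stored prototype (1-D inputs).
        return min(protos, key=lambda p: abs(p[0] - x))[1]

    for x, y in samples:
        if not protos or predict(x) != y:
            protos.append((x, y))
    return predict


def train_voters(train_fn, samples, n_voters=5, seed=0):
    """Train n_voters models, each on a different random ordering."""
    rng = random.Random(seed)
    voters = []
    for _ in range(n_voters):
        order = list(samples)
        rng.shuffle(order)
        voters.append(train_fn(order))
    return voters


def vote(voters, x):
    """Return the majority label and the fraction of voters that chose it."""
    counts = Counter(v(x) for v in voters)
    label, n = counts.most_common(1)[0]
    return label, n / len(voters)


data = [(0.0, "a"), (0.1, "a"), (0.2, "a"), (0.8, "b"), (0.9, "b"), (1.0, "b")]
voters = train_voters(train_condensed_nn, data)
label, confidence = vote(voters, 0.05)
```

    Inputs near a class boundary are where the orderings tend to disagree, which is what lets the vote fraction serve as a rough confidence estimate.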

    A survey on utilization of data mining approaches for dermatological (skin) diseases prediction

    Due to recent technological advances, large volumes of medical data are obtained. These data contain valuable information, so data mining techniques can be used to extract useful patterns. This paper introduces data mining and its various techniques and surveys the available literature on medical data mining. We focus mainly on the application of data mining to skin diseases. A categorization is provided based on the different data mining techniques, and the utility of the various data mining methodologies is highlighted. In general, association mining is suitable for extracting rules. It has been used especially in cancer diagnosis. Classification is a robust method in medical mining. In this paper, we summarize the different uses of classification in dermatology. It is one of the most important methods for diagnosis of erythemato-squamous diseases. Methods used for this task include Neural Networks, Genetic Algorithms, and fuzzy classification. Clustering is a useful method in medical image mining. The purpose of clustering techniques is to find structure in the given data by finding similarities between data according to data characteristics. Clustering has some applications in dermatology. Besides introducing different mining methods, we investigate some challenges that exist in mining skin data.

    ARTMAP-IC and Medical Diagnosis: Instance Counting and Inconsistent Cases

    For complex database prediction problems such as medical diagnosis, the ARTMAP-IC neural network adds distributed prediction and category instance counting to the basic fuzzy ARTMAP system. For the ARTMAP match tracking algorithm, which controls search following a predictive error, a new version facilitates prediction with sparse or inconsistent data. Compared to the original match tracking algorithm (MT+), the new algorithm (MT-) better approximates the real-time network differential equations and further compresses memory without loss of performance. Simulations examine predictive accuracy on four medical databases: Pima Indian diabetes, breast cancer, heart disease, and gall bladder removal. ARTMAP-IC results are equal to or better than those of logistic regression, K nearest neighbor (KNN), the ADAP perceptron, multisurface pattern separation, CLASSIT, instance-based (IBL), and C4. ARTMAP dynamics are fast, stable, and scalable. A voting strategy improves prediction by training the system several times on different orderings of an input set. Voting, instance counting, and distributed representations combine to form confidence estimates for competing predictions. National Science Foundation (IRI 94-01659); Office of Naval Research (N00014-95-J-0409, N00014-95-0657)

    ART Neural Networks for Remote Sensing Image Analysis

    ART and ARTMAP neural networks for adaptive recognition and prediction have been applied to a variety of problems, including automatic mapping from remote sensing satellite measurements, parts design retrieval at the Boeing Company, medical database prediction, and robot vision. This paper features a self-contained introduction to ART and ARTMAP dynamics. An application of these networks to image processing is illustrated by means of a remote sensing example. The basic ART and ARTMAP networks feature winner-take-all (WTA) competitive coding, which groups inputs into discrete recognition categories. WTA coding in these networks enables fast learning, which allows the network to encode important rare cases but may lead to inefficient category proliferation with noisy training inputs. This problem is partially solved by ART-EMAP, which uses WTA coding for learning but distributed category representations for test-set prediction. Recently developed ART models (dART and dARTMAP) retain stable coding, recognition, and prediction, but allow arbitrarily distributed category representation during learning as well as performance.

    Fuzzy ART

    Adaptive Resonance Theory (ART) models are real-time neural networks for category learning, pattern recognition, and prediction. Unsupervised fuzzy ART and supervised fuzzy ARTMAP synthesize fuzzy logic and ART networks by exploiting the formal similarity between the computations of fuzzy subsethood and the dynamics of ART category choice, search, and learning. Fuzzy ART self-organizes stable recognition categories in response to arbitrary sequences of analog or binary input patterns. It generalizes the binary ART 1 model, replacing the set-theoretic intersection (∩) with the fuzzy intersection (∧), or component-wise minimum. A normalization procedure called complement coding leads to a symmetric theory in which the fuzzy intersection and the fuzzy union (∨), or component-wise maximum, play complementary roles. Complement coding preserves individual feature amplitudes while normalizing the input vector, and prevents a potential category proliferation problem. Adaptive weights start equal to one and can only decrease in time. A geometric interpretation of fuzzy ART represents each category as a box that increases in size as weights decrease. A matching criterion controls search, determining how close an input and a learned representation must be for a category to accept the input as a new exemplar. A vigilance parameter (ρ) sets the matching criterion and determines how finely or coarsely an ART system will partition inputs. High vigilance creates fine categories, represented by small boxes. Learning stops when boxes cover the input space. With fast learning, fixed vigilance, and an arbitrary input set, learning stabilizes after just one presentation of each input. A fast-commit slow-recode option allows rapid learning of rare events yet buffers memories against recoding by noisy inputs. Fuzzy ARTMAP unites two fuzzy ART networks to solve supervised learning and prediction problems. A Minimax Learning Rule controls ARTMAP category structure, conjointly minimizing predictive error and maximizing code compression. Low vigilance maximizes compression but may therefore cause very different inputs to make the same prediction. When this coarse grouping strategy causes a predictive error, an internal match tracking control process increases vigilance just enough to correct the error. ARTMAP automatically constructs a minimal number of recognition categories, or "hidden units," to meet accuracy criteria. An ARTMAP voting strategy improves prediction by training the system several times using different orderings of the input set. Voting assigns confidence estimates to competing predictions given small, noisy, or incomplete training sets. ARPA benchmark simulations illustrate fuzzy ARTMAP dynamics. The chapter also compares fuzzy ARTMAP to Salzberg's Nested Generalized Exemplar (NGE) and to Simpson's Fuzzy Min-Max Classifier (FMMC), and concludes with a summary of ART and ARTMAP applications. Advanced Research Projects Agency (ONR N00014-92-J-4015); National Science Foundation (IRI-90-00530); Office of Naval Research (N00014-91-J-4100)
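
    The unsupervised fuzzy ART steps named in this abstract (complement coding, the fuzzy intersection as component-wise minimum, vigilance-controlled matching, and weights that can only decrease) can be sketched as below. This is a minimal illustration under stated assumptions, not the chapter's code: the class name `FuzzyART`, the helper `complement_code`, and the default parameter values are hypothetical. The choice function |I ∧ w| / (α + |w|), the match criterion |I ∧ w| / |I| ≥ ρ, and the update w ← β(I ∧ w) + (1 − β)w follow the commonly published fuzzy ART formulation, which the abstract itself does not spell out.

```python
import numpy as np


def complement_code(a):
    """Complement coding: map a in [0, 1]^M to I = (a, 1 - a), so |I| = M."""
    a = np.asarray(a, dtype=float)
    return np.concatenate([a, 1.0 - a])


class FuzzyART:
    def __init__(self, rho=0.75, alpha=0.001, beta=1.0):
        self.rho = rho      # vigilance: higher values give finer categories
        self.alpha = alpha  # choice parameter (assumed small positive value)
        self.beta = beta    # learning rate; beta = 1 is fast learning
        self.weights = []   # one weight vector per committed category

    def train_one(self, a):
        """Present one input; return the index of the category that learns it."""
        i = complement_code(a)
        # Category choice: rank committed categories by the choice function.
        scores = [np.minimum(i, w).sum() / (self.alpha + w.sum())
                  for w in self.weights]
        for j in np.argsort(scores)[::-1]:
            w = self.weights[j]
            # Matching criterion: accept only if the match meets vigilance.
            if np.minimum(i, w).sum() / i.sum() >= self.rho:
                # Component-wise minimum means weights can only decrease.
                self.weights[j] = (self.beta * np.minimum(i, w)
                                   + (1.0 - self.beta) * w)
                return int(j)
        # No category passed: commit a new one. Its weights start at one,
        # so fast commitment leaves w = I ∧ 1 = I.
        self.weights.append(i.copy())
        return len(self.weights) - 1


net = FuzzyART(rho=0.75)
first = net.train_one([0.1, 0.1])
second = net.train_one([0.9, 0.9])
third = net.train_one([0.15, 0.1])
```

    In this run the third input falls close to the first, so it joins the first category and enlarges that category's box, while the distant second input fails the vigilance test and commits a new category.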

    ARTMAP Neural Networks for Information Fusion and Data Mining: Map Production and Target Recognition Methodologies

    The Sensor Exploitation Group of MIT Lincoln Laboratory incorporated an early version of the ARTMAP neural network as the recognition engine of a hierarchical system for fusion and data mining of registered geospatial images. The Lincoln Lab system has been successfully fielded, but is limited to target/non-target identifications and does not produce whole maps. Procedures defined here extend these capabilities by means of a mapping method that learns to identify and distribute arbitrarily many target classes. This new spatial data mining system is designed particularly to cope with the highly skewed class distributions of typical mapping problems. Specification of canonical algorithms and a benchmark testbed has enabled the evaluation of candidate recognition networks as well as pre- and post-processing and feature selection options. The resulting mapping methodology sets a standard for a variety of spatial data mining tasks. In particular, training pixels are drawn from a region that is spatially distinct from the mapped region, which could feature an output class mix that is substantially different from that of the training set. The system recognition component, default ARTMAP, with its fully specified set of canonical parameter values, has become the a priori system of choice among this family of neural networks for a wide variety of applications. Air Force Office of Scientific Research (F49620-01-1-0397, F49620-01-1-0423); Office of Naval Research (N00014-01-1-0624)
