70,576 research outputs found

    ARTMAP Neural Networks for Information Fusion and Data Mining: Map Production and Target Recognition Methodologies

    Full text link
    The Sensor Exploitation Group of MIT Lincoln Laboratory incorporated an early version of the ARTMAP neural network as the recognition engine of a hierarchical system for fusion and data mining of registered geospatial images. The Lincoln Lab system has been successfully fielded, but is limited to target I non-target identifications and does not produce whole maps. Procedures defined here extend these capabilities by means of a mapping method that learns to identify and distribute arbitrarily many target classes. This new spatial data mining system is designed particularly to cope with the highly skewed class distributions of typical mapping problems. Specification of canonical algorithms and a benchmark testbed has enabled the evaluation of candidate recognition networks as well as pre- and post-processing and feature selection options. The resulting mapping methodology sets a standard for a variety of spatial data mining tasks. In particular, training pixels are drawn from a region that is spatially distinct from the mapped region, which could feature an output class mix that is substantially different from that of the training set. The system recognition component, default ARTMAP, with its fully specified set of canonical parameter values, has become the a priori system of choice among this family of neural networks for a wide variety of applications.Air Force Office of Scientific Research (F49620-01-1-0397, F49620-01-1-0423); Office of Naval Research (N00014-01-1-0624

    Data mining based cyber-attack detection

    Get PDF

    ARTMAP Neural Networks for Information Fusion and Data Mining: Map Production and Target Recognition Methodologies

    Full text link
    The Sensor Exploitation Group of MIT Lincoln Laboratory incorporated an early version of the ARTMAP neural network as the recognition engine of a hierarchical system for fusion and data mining of registered geospatial images. The Lincoln Lab system has been successfully fielded, but is limited to target I non-target identifications and does not produce whole maps. Procedures defined here extend these capabilities by means of a mapping method that learns to identify and distribute arbitrarily many target classes. This new spatial data mining system is designed particularly to cope with the highly skewed class distributions of typical mapping problems. Specification of canonical algorithms and a benchmark testbed has enabled the evaluation of candidate recognition networks as well as pre- and post-processing and feature selection options. The resulting mapping methodology sets a standard for a variety of spatial data mining tasks. In particular, training pixels are drawn from a region that is spatially distinct from the mapped region, which could feature an output class mix that is substantially different from that of the training set. The system recognition component, default ARTMAP, with its fully specified set of canonical parameter values, has become the a priori system of choice among this family of neural networks for a wide variety of applications.Air Force Office of Scientific Research (F49620-01-1-0397, F49620-01-1-0423); Office of Naval Research (N00014-01-1-0624

    Improving acoustic vehicle classification by information fusion

    No full text
    We present an information fusion approach for ground vehicle classification based on the emitted acoustic signal. Many acoustic factors can contribute to the classification accuracy of working ground vehicles. Classification relying on a single feature set may lose some useful information if its underlying sound production model is not comprehensive. To improve classification accuracy, we consider an information fusion diagram, in which various aspects of an acoustic signature are taken into account and emphasized separately by two different feature extraction methods. The first set of features aims to represent internal sound production, and a number of harmonic components are extracted to characterize the factors related to the vehicleā€™s resonance. The second set of features is extracted based on a computationally effective discriminatory analysis, and a group of key frequency components are selected by mutual information, accounting for the sound production from the vehicleā€™s exterior parts. In correspondence with this structure, we further put forward a modifiedBayesian fusion algorithm, which takes advantage of matching each specific feature set with its favored classifier. To assess the proposed approach, experiments are carried out based on a data set containing acoustic signals from different types of vehicles. Results indicate that the fusion approach can effectively increase classification accuracy compared to that achieved using each individual features set alone. The Bayesian-based decision level fusion is found fusion is found to be improved than a feature level fusion approac

    Deep fusion of multi-channel neurophysiological signal for emotion recognition and monitoring

    Get PDF
    How to fuse multi-channel neurophysiological signals for emotion recognition is emerging as a hot research topic in community of Computational Psychophysiology. Nevertheless, prior feature engineering based approaches require extracting various domain knowledge related features at a high time cost. Moreover, traditional fusion method cannot fully utilise correlation information between different channels and frequency components. In this paper, we design a hybrid deep learning model, in which the 'Convolutional Neural Network (CNN)' is utilised for extracting task-related features, as well as mining inter-channel and inter-frequency correlation, besides, the 'Recurrent Neural Network (RNN)' is concatenated for integrating contextual information from the frame cube sequence. Experiments are carried out in a trial-level emotion recognition task, on the DEAP benchmarking dataset. Experimental results demonstrate that the proposed framework outperforms the classical methods, with regard to both of the emotional dimensions of Valence and Arousal

    Machine Learning Methods for Attack Detection in the Smart Grid

    Get PDF
    Attack detection problems in the smart grid are posed as statistical learning problems for different attack scenarios in which the measurements are observed in batch or online settings. In this approach, machine learning algorithms are used to classify measurements as being either secure or attacked. An attack detection framework is provided to exploit any available prior knowledge about the system and surmount constraints arising from the sparse structure of the problem in the proposed approach. Well-known batch and online learning algorithms (supervised and semi-supervised) are employed with decision and feature level fusion to model the attack detection problem. The relationships between statistical and geometric properties of attack vectors employed in the attack scenarios and learning algorithms are analyzed to detect unobservable attacks using statistical learning methods. The proposed algorithms are examined on various IEEE test systems. Experimental analyses show that machine learning algorithms can detect attacks with performances higher than the attack detection algorithms which employ state vector estimation methods in the proposed attack detection framework.Comment: 14 pages, 11 Figure

    Decision Stream: Cultivating Deep Decision Trees

    Full text link
    Various modifications of decision trees have been extensively used during the past years due to their high efficiency and interpretability. Tree node splitting based on relevant feature selection is a key step of decision tree learning, at the same time being their major shortcoming: the recursive nodes partitioning leads to geometric reduction of data quantity in the leaf nodes, which causes an excessive model complexity and data overfitting. In this paper, we present a novel architecture - a Decision Stream, - aimed to overcome this problem. Instead of building a tree structure during the learning process, we propose merging nodes from different branches based on their similarity that is estimated with two-sample test statistics, which leads to generation of a deep directed acyclic graph of decision rules that can consist of hundreds of levels. To evaluate the proposed solution, we test it on several common machine learning problems - credit scoring, twitter sentiment analysis, aircraft flight control, MNIST and CIFAR image classification, synthetic data classification and regression. Our experimental results reveal that the proposed approach significantly outperforms the standard decision tree learning methods on both regression and classification tasks, yielding a prediction error decrease up to 35%

    Consistency in Models for Distributed Learning under Communication Constraints

    Full text link
    Motivated by sensor networks and other distributed settings, several models for distributed learning are presented. The models differ from classical works in statistical pattern recognition by allocating observations of an independent and identically distributed (i.i.d.) sampling process amongst members of a network of simple learning agents. The agents are limited in their ability to communicate to a central fusion center and thus, the amount of information available for use in classification or regression is constrained. For several basic communication models in both the binary classification and regression frameworks, we question the existence of agent decision rules and fusion rules that result in a universally consistent ensemble. The answers to this question present new issues to consider with regard to universal consistency. Insofar as these models present a useful picture of distributed scenarios, this paper addresses the issue of whether or not the guarantees provided by Stone's Theorem in centralized environments hold in distributed settings.Comment: To appear in the IEEE Transactions on Information Theor

    Machine Learning and Integrative Analysis of Biomedical Big Data.

    Get PDF
    Recent developments in high-throughput technologies have accelerated the accumulation of massive amounts of omics data from multiple sources: genome, epigenome, transcriptome, proteome, metabolome, etc. Traditionally, data from each source (e.g., genome) is analyzed in isolation using statistical and machine learning (ML) methods. Integrative analysis of multi-omics and clinical data is key to new biomedical discoveries and advancements in precision medicine. However, data integration poses new computational challenges as well as exacerbates the ones associated with single-omics studies. Specialized computational approaches are required to effectively and efficiently perform integrative analysis of biomedical data acquired from diverse modalities. In this review, we discuss state-of-the-art ML-based approaches for tackling five specific computational challenges associated with integrative analysis: curse of dimensionality, data heterogeneity, missing data, class imbalance and scalability issues
    • ā€¦
    corecore