2,121 research outputs found

    Adaptive Online Sequential ELM for Concept Drift Tackling

    Get PDF
    A machine learning method needs to adapt to over time changes in the environment. Such changes are known as concept drift. In this paper, we propose concept drift tackling method as an enhancement of Online Sequential Extreme Learning Machine (OS-ELM) and Constructive Enhancement OS-ELM (CEOS-ELM) by adding adaptive capability for classification and regression problem. The scheme is named as adaptive OS-ELM (AOS-ELM). It is a single classifier scheme that works well to handle real drift, virtual drift, and hybrid drift. The AOS-ELM also works well for sudden drift and recurrent context change type. The scheme is a simple unified method implemented in simple lines of code. We evaluated AOS-ELM on regression and classification problem by using concept drift public data set (SEA and STAGGER) and other public data sets such as MNIST, USPS, and IDS. Experiments show that our method gives higher kappa value compared to the multiclassifier ELM ensemble. Even though AOS-ELM in practice does not need hidden nodes increase, we address some issues related to the increasing of the hidden nodes such as error condition and rank values. We propose taking the rank of the pseudoinverse matrix as an indicator parameter to detect underfitting condition.Comment: Hindawi Publishing. Computational Intelligence and Neuroscience Volume 2016 (2016), Article ID 8091267, 17 pages Received 29 January 2016, Accepted 17 May 2016. Special Issue on "Advances in Neural Networks and Hybrid-Metaheuristics: Theory, Algorithms, and Novel Engineering Applications". Academic Editor: Stefan Hauf

    Efficient Diverse Ensemble for Discriminative Co-Tracking

    Full text link
    Ensemble discriminative tracking utilizes a committee of classifiers, to label data samples, which are in turn, used for retraining the tracker to localize the target using the collective knowledge of the committee. Committee members could vary in their features, memory update schemes, or training data, however, it is inevitable to have committee members that excessively agree because of large overlaps in their version space. To remove this redundancy and have an effective ensemble learning, it is critical for the committee to include consistent hypotheses that differ from one-another, covering the version space with minimum overlaps. In this study, we propose an online ensemble tracker that directly generates a diverse committee by generating an efficient set of artificial training. The artificial data is sampled from the empirical distribution of the samples taken from both target and background, whereas the process is governed by query-by-committee to shrink the overlap between classifiers. The experimental results demonstrate that the proposed scheme outperforms conventional ensemble trackers on public benchmarks.Comment: CVPR 2018 Submissio

    Wild Patterns: Ten Years After the Rise of Adversarial Machine Learning

    Get PDF
    Learning-based pattern classifiers, including deep networks, have shown impressive performance in several application domains, ranging from computer vision to cybersecurity. However, it has also been shown that adversarial input perturbations carefully crafted either at training or at test time can easily subvert their predictions. The vulnerability of machine learning to such wild patterns (also referred to as adversarial examples), along with the design of suitable countermeasures, have been investigated in the research field of adversarial machine learning. In this work, we provide a thorough overview of the evolution of this research area over the last ten years and beyond, starting from pioneering, earlier work on the security of non-deep learning algorithms up to more recent work aimed to understand the security properties of deep learning algorithms, in the context of computer vision and cybersecurity tasks. We report interesting connections between these apparently-different lines of work, highlighting common misconceptions related to the security evaluation of machine-learning algorithms. We review the main threat models and attacks defined to this end, and discuss the main limitations of current work, along with the corresponding future challenges towards the design of more secure learning algorithms.Comment: Accepted for publication on Pattern Recognition, 201

    Evaluation methods and decision theory for classification of streaming data with temporal dependence

    Get PDF
    Predictive modeling on data streams plays an important role in modern data analysis, where data arrives continuously and needs to be mined in real time. In the stream setting the data distribution is often evolving over time, and models that update themselves during operation are becoming the state-of-the-art. This paper formalizes a learning and evaluation scheme of such predictive models. We theoretically analyze evaluation of classifiers on streaming data with temporal dependence. Our findings suggest that the commonly accepted data stream classification measures, such as classification accuracy and Kappa statistic, fail to diagnose cases of poor performance when temporal dependence is present, therefore they should not be used as sole performance indicators. Moreover, classification accuracy can be misleading if used as a proxy for evaluating change detectors with datasets that have temporal dependence. We formulate the decision theory for streaming data classification with temporal dependence and develop a new evaluation methodology for data stream classification that takes temporal dependence into account. We propose a combined measure for classification performance, that takes into account temporal dependence, and we recommend using it as the main performance measure in classification of streaming data

    Reservoir of Diverse Adaptive Learners and Stacking Fast Hoeffding Drift Detection Methods for Evolving Data Streams

    Full text link
    The last decade has seen a surge of interest in adaptive learning algorithms for data stream classification, with applications ranging from predicting ozone level peaks, learning stock market indicators, to detecting computer security violations. In addition, a number of methods have been developed to detect concept drifts in these streams. Consider a scenario where we have a number of classifiers with diverse learning styles and different drift detectors. Intuitively, the current 'best' (classifier, detector) pair is application dependent and may change as a result of the stream evolution. Our research builds on this observation. We introduce the \mbox{Tornado} framework that implements a reservoir of diverse classifiers, together with a variety of drift detection algorithms. In our framework, all (classifier, detector) pairs proceed, in parallel, to construct models against the evolving data streams. At any point in time, we select the pair which currently yields the best performance. We further incorporate two novel stacking-based drift detection methods, namely the \mbox{FHDDMS} and \mbox{FHDDMS}_{add} approaches. The experimental evaluation confirms that the current 'best' (classifier, detector) pair is not only heavily dependent on the characteristics of the stream, but also that this selection evolves as the stream flows. Further, our \mbox{FHDDMS} variants detect concept drifts accurately in a timely fashion while outperforming the state-of-the-art.Comment: 42 pages, and 14 figure

    Heart Disease Prediction Using Stacking Model With Balancing Techniques and Dimensionality Reduction

    Get PDF
    Heart disease is a serious worldwide health issue with wide-reaching effects. Since heart disease is one of the leading causes of mortality worldwide, early detection is crucial. Emerging technologies like Machine Learning (ML) are currently being actively used by the biomedical, healthcare, and health prediction industries. PaRSEL, a new stacking model is proposed in this research, that combines four classifiers, Passive Aggressive Classifier (PAC), Ridge Classifier (RC), Stochastic Gradient Descent Classifier (SGDC), and eXtreme Gradient Boosting (XGBoost), at the base layer, and LogitBoost is deployed for the final predictions at the meta layer. The imbalanced and irrelevant features in the data increase the complexity of the classification models. The dimensionality reduction and data balancing approaches are considered very important for lowering costs and increasing the accuracy of the model. In PaRSEL, three dimensionality reduction techniques, Recursive Feature Elimination (RFE), Linear Discriminant Analysis (LDA), and Factor Analysis (FA), are used to reduce the dimensionality and select the most relevant features for the diagnosis of heart disease. Furthermore, eight balancing techniques, Proximity Weighted Random Affine Shadowsampling (ProWRAS), Localized Randomized Affine Shadowsampling (LoRAS), Random Over Sampling (ROS), Adaptive Synthetic (ADASYN), Synthetic Minority Oversampling Technique (SMOTE), Borderline SMOTE (B-SMOTE), Majority Weighted Minority Oversampling Technique (MWMOTE) and Random Walk Oversampling (RWOS), are used to deal with the imbalanced nature of the dataset. The performance of PaRSEL is compared with the other standalone classifiers using different performance measures like accuracy, F1-score, precision, recall and AUC-ROC score. Our proposed model achieves 97% accuracy, 80% F1-score, precision is greater than 90%, 67% recall, and 98% AUC-ROC score. This shows that PaRSEL outperforms other standalone classifiers in terms of heart disease prediction. Additionally, we deploy SHapley Additive exPlanations (SHAP) on our proposed model. It helps to understand the internal working of the model. It illustrates how much influence a classifier has on the final prediction outcome
    corecore