
    Learning from distributed data sources using random vector functional-link networks

    One of the main characteristics of many real-world big data scenarios is their distributed nature. In a machine learning context, distributed data, together with the requirements of preserving privacy and scaling up to large networks, brings the challenge of designing fully decentralized training protocols. In this paper, we explore the problem of distributed learning when the features of every pattern are spread across multiple agents (as happens, for example, in a distributed database scenario). We propose an algorithm for a particular class of neural networks, known as Random Vector Functional-Link (RVFL) networks, based on the Alternating Direction Method of Multipliers (ADMM) optimization algorithm. The proposed algorithm learns an RVFL network from multiple distributed data sources while restricting communication to the single operation of computing a distributed average. Our experimental simulations show that the algorithm achieves generalization accuracy comparable to a fully centralized solution while remaining extremely efficient.
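    The communication pattern described in this abstract can be illustrated with a toy consensus step: each agent solves its own local RVFL output weights, and the network's only communication is averaging them. This is a minimal numpy sketch on synthetic data, not the paper's ADMM protocol (which iterates such averages with dual-variable updates); all names and parameters here are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def rvfl_features(X, W, b):
    """Random hidden layer shared by all agents (weights stay fixed)."""
    return np.tanh(X @ W + b)

def local_ridge(H, y, lam=0.1):
    """Each agent solves its own regularized least squares on its data."""
    d = H.shape[1]
    return np.linalg.solve(H.T @ H + lam * np.eye(d), H.T @ y)

# Toy setup: 3 agents, one shared random projection, private data shards.
n_features, n_hidden = 4, 16
W = rng.standard_normal((n_features, n_hidden))
b = rng.standard_normal(n_hidden)

true_beta = rng.standard_normal(n_hidden)
shards = []
for _ in range(3):
    X = rng.standard_normal((50, n_features))
    H = rvfl_features(X, W, b)
    y = H @ true_beta + 0.01 * rng.standard_normal(50)
    shards.append((H, y))

# Consensus step: the only communication between agents is a
# distributed average of the locally computed output weights.
betas = [local_ridge(H, y) for H, y in shards]
beta_avg = np.mean(betas, axis=0)
print(beta_avg.shape)  # (16,)
```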

    The Random Hivemind: An Ensemble Deep Learner. A Case Study of Application to the Solar Energetic Particle Prediction Problem

    Deep learning has become a popular trend in recent years in the machine learning community and has occasionally even become synonymous with machine learning itself, thanks to its efficiency, malleability, and ability to operate free of human intervention. However, the hyperparameters passed to a conventional neural network (CoNN) may be rather arbitrary, especially when there is no surefire way to choose hyperparameters for a given dataset. The random hivemind (RH) alleviates this concern by having multiple neural network estimators make decisions based on random permutations of features. The learning rate and the number of epochs may be boosted or attenuated depending on how all features of a given estimator determine the class that the numerical feature data belong to, but all other hyperparameters remain the same across estimators. This allows one to quickly see whether multiple neural networks with the same hyperparameters can make consistent decisions on a given dataset, with random subsets of data chosen to force variation in how each predicts, placing the quality of the data and hyperparameters into focus. The effectiveness of RH is demonstrated by comparing its predictions of dangerous solar energetic particle events (SEPs) to those of both a CoNN and the traditional ensemble deep learning approach used in this application. Our results show that RH outperforms both the CoNN and a committee-based approach, and yields promising results with respect to the "all-clear" prediction of SEPs.
    Comment: 10 pages, 2 figures, 5 tables
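    A minimal sketch of the hivemind idea, with tiny logistic models standing in for the neural-network estimators (an assumption made for brevity): every member shares the same hyperparameters but sees a random subset of the features, and the committee votes. All data and names here are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)

def train_logistic(X, y, lr=0.5, epochs=200):
    """Minimal stand-in for one neural-network estimator."""
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(X @ w)))
        w -= lr * X.T @ (p - y) / len(y)
    return w

# Toy data: the class depends only on features 0 and 2.
X = rng.standard_normal((200, 6))
y = ((X[:, 0] + X[:, 2]) > 0).astype(float)

# Hivemind: identical hyperparameters across members, but each sees a
# random subset of features; the committee decides by majority vote.
n_estimators, subset_size = 7, 3
members = []
for _ in range(n_estimators):
    feats = rng.choice(6, size=subset_size, replace=False)
    members.append((feats, train_logistic(X[:, feats], y)))

def predict(X):
    votes = [(X[:, f] @ w) > 0 for f, w in members]
    return (np.mean(votes, axis=0) > 0.5).astype(float)

acc = (predict(X) == y).mean()
print(round(acc, 2))
```

    Members whose random subset misses both informative features vote at chance, so consistent committee decisions are evidence that the data and shared hyperparameters are sound, which is the diagnostic use the abstract describes.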

    Methods to Address Extreme Class Imbalance in Machine Learning Based Network Intrusion Detection Systems

    Despite the considerable academic interest in using machine learning methods to detect cyber attacks and malicious network traffic, there is little evidence that modern organizations employ such systems. Due to the targeted nature of attacks and cybercriminals' constantly changing behavior, valid observations of attack traffic suitable for training a classifier are extremely rare. Rare positive cases, combined with the fact that the overwhelming majority of network traffic is benign, create an extreme class imbalance problem. Using publicly available datasets, this research examines the class imbalance problem by using small samples of the attack observations to create multiple training sets that reflect a realistic class imbalance. A variety of techniques to alleviate the imbalance are examined, including undersampling the majority class and three techniques to oversample the minority attack observations by creating new synthetic observations. We test these methods on four of the most popular machine learning classifiers: two single-model classifiers, artificial neural networks and support vector machines, and two ensemble methods, gradient boosting and random forests. We find that undersampling generally outperforms oversampling techniques and that the ensemble methods outperform the single models. We show that the apparent superiority of the ensemble methods may be illusory due to the “laboratory conditions” of using well-crafted public datasets. By introducing an element of noise into the training data, we show that neural networks' robustness to noise makes them the preferred approach in real-world settings where the more sophisticated ensemble methods fail. We also present a technique in which neural networks are used to select features from the noisy dataset that improve the performance of random forests and gradient boosting, allowing for the creation of an improved ensemble classifier.
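    The two resampling families this abstract compares can be sketched with numpy: random undersampling of the benign majority, and SMOTE-style oversampling that synthesizes new minority points by interpolation. Note that this simplified oversampler interpolates between random minority pairs rather than true k-nearest neighbors as in the original SMOTE, and the data below are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)

# Imbalanced toy data: 1000 benign flows vs 20 attack observations.
X_maj = rng.standard_normal((1000, 5))
X_min = rng.standard_normal((20, 5)) + 3.0

# Undersampling: randomly shrink the majority class to the minority size.
idx = rng.choice(len(X_maj), size=len(X_min), replace=False)
X_maj_under = X_maj[idx]

# SMOTE-style oversampling: synthesize new minority observations by
# interpolating between an existing sample and another minority sample.
def oversample(X, n_new):
    out = []
    for _ in range(n_new):
        i, j = rng.choice(len(X), size=2, replace=False)
        out.append(X[i] + rng.random() * (X[j] - X[i]))
    return np.array(out)

X_min_over = np.vstack([X_min, oversample(X_min, 980)])
print(X_maj_under.shape, X_min_over.shape)  # (20, 5) (1000, 5)
```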

    Prediction Error-based Classification for Class-Incremental Learning

    Class-incremental learning (CIL) is a particularly challenging variant of continual learning, where the goal is to learn to discriminate between all classes presented in an incremental fashion. Existing approaches often suffer from excessive forgetting and imbalance of the scores assigned to classes that have not been seen together during training. In this study, we introduce a novel approach, Prediction Error-based Classification (PEC), which differs from traditional discriminative and generative classification paradigms. PEC computes a class score by measuring the prediction error of a model trained to replicate the outputs of a frozen random neural network on data from that class. The method can be interpreted as approximating a classification rule based on Gaussian Process posterior variance. PEC offers several practical advantages, including sample efficiency, ease of tuning, and effectiveness even when data are presented one class at a time. Our empirical results show that PEC performs strongly in single-pass-through-data CIL, outperforming other rehearsal-free baselines in all cases, and rehearsal-based methods with moderate replay buffer size in most cases, across multiple benchmarks.
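    A toy illustration of the PEC scoring rule: one "student" per class is trained to replicate a frozen random teacher network on that class's data, and a test point is assigned to the class whose student replicates the teacher best (lowest prediction error). This numpy sketch uses linear students for brevity, unlike the paper's neural networks, and all data are synthetic.

```python
import numpy as np

rng = np.random.default_rng(3)

d_in, d_out = 8, 4
# Frozen random "teacher" network shared across all classes.
Wt = rng.standard_normal((d_in, d_out))
teacher = lambda X: np.tanh(X @ Wt)

def fit_student(X):
    """Linear student trained to replicate the teacher on one class."""
    T = teacher(X)
    W, *_ = np.linalg.lstsq(X, T, rcond=None)
    return W

# Data arrives one class at a time (the class-incremental setting).
X0 = rng.standard_normal((100, d_in)) - 2.0
X1 = rng.standard_normal((100, d_in)) + 2.0
students = [fit_student(X0), fit_student(X1)]

def classify(x):
    # Class score = replication error; lower error wins.
    errs = [np.sum((x @ W - teacher(x)) ** 2) for W in students]
    return int(np.argmin(errs))

x_test = rng.standard_normal((1, d_in)) + 2.0
print(classify(x_test))
```

    Because each student is trained in isolation, adding a new class never touches the old students, which is why the rule sidesteps forgetting and score imbalance.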

    Object Recognition from very few Training Examples for Enhancing Bicycle Maps

    In recent years, data-driven methods have shown great success in extracting information about the infrastructure of urban areas. These algorithms are usually trained on large datasets consisting of thousands or millions of labeled training examples. While large datasets have been published for cars, very little labeled data is available for cyclists, although the appearance, point of view, and positioning of the relevant objects differ. Unfortunately, labeling data is costly and requires a huge amount of work. In this paper, we therefore address the problem of learning with very few labels. The aim is to recognize particular traffic signs in crowdsourced data in order to collect information of interest to cyclists. We propose a system for object recognition that is trained with only 15 examples per class on average. To achieve this, we combine the advantages of convolutional neural networks and random forests to learn a patch-wise classifier. In the next step, we map the random forest to a neural network and transform the classifier into a fully convolutional network. This significantly accelerates the processing of full images and allows bounding boxes to be predicted. Finally, we integrate data from the Global Positioning System (GPS) to localize the predictions on the map. In comparison to Faster R-CNN and other networks for object recognition or algorithms for transfer learning, we considerably reduce the required amount of labeled data. We demonstrate good performance on the recognition of traffic signs for cyclists as well as their localization in maps.
    Comment: Submitted to IV 2018. This research was supported by the German Research Foundation (DFG) within Priority Research Programme 1894, "Volunteered Geographic Information: Interpretation, Visualization and Social Computing".
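    The distinctive step of mapping a random forest to a neural network can be illustrated on decision stumps: each split becomes a hidden unit with a steep sigmoid, and the leaf values become output weights, so the network reproduces the forest while being differentiable. This is a minimal numpy sketch with hypothetical stumps, not the paper's full patch-wise construction.

```python
import numpy as np

# One stump per tree: (feature, threshold, left-leaf value, right-leaf value).
stumps = [(0, 0.0, -1.0, 1.0), (1, 0.5, -1.0, 1.0)]

def forest_predict(x):
    """Plain forest of stumps: average the chosen leaf values."""
    return np.mean([vr if x[f] > t else vl for f, t, vl, vr in stumps])

def nn_predict(x, sharpness=50.0):
    """Same forest mapped to a network: each split is a hidden unit
    with a steep sigmoid; leaf values act as output-layer weights."""
    out = 0.0
    for f, t, vl, vr in stumps:
        h = 1.0 / (1.0 + np.exp(-sharpness * (x[f] - t)))  # ≈ hard split
        out += vl + h * (vr - vl)
    return out / len(stumps)

x = np.array([1.0, -1.0])
print(forest_predict(x), round(nn_predict(x), 3))
```

    Away from the thresholds the two predictors agree almost exactly; the soft version can then be fine-tuned end to end and slid over full images convolutionally.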

    Machine Learning and Integrative Analysis of Biomedical Big Data.

    Recent developments in high-throughput technologies have accelerated the accumulation of massive amounts of omics data from multiple sources: genome, epigenome, transcriptome, proteome, metabolome, etc. Traditionally, data from each source (e.g., genome) is analyzed in isolation using statistical and machine learning (ML) methods. Integrative analysis of multi-omics and clinical data is key to new biomedical discoveries and advancements in precision medicine. However, data integration poses new computational challenges as well as exacerbates the ones associated with single-omics studies. Specialized computational approaches are required to effectively and efficiently perform integrative analysis of biomedical data acquired from diverse modalities. In this review, we discuss state-of-the-art ML-based approaches for tackling five specific computational challenges associated with integrative analysis: curse of dimensionality, data heterogeneity, missing data, class imbalance, and scalability issues.
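    As a deliberately simple example of one integration strategy these challenges bear on, early integration concatenates per-modality feature matrices after standardizing each, so that differing scales and dimensionalities (the data-heterogeneity problem) do not let one modality dominate. A numpy sketch with hypothetical data; the review covers far more sophisticated approaches.

```python
import numpy as np

rng = np.random.default_rng(4)

# Toy cohort: the same 30 patients measured in two omics modalities
# with very different feature counts and measurement scales.
genome = rng.standard_normal((30, 100))              # e.g. SNP features
transcriptome = rng.standard_normal((30, 500)) * 10  # larger scale

def zscore(X):
    """Standardize each feature so modalities are comparable."""
    return (X - X.mean(axis=0)) / X.std(axis=0)

# Early integration: standardize per modality, then concatenate
# along the feature axis to form one matrix for downstream ML.
X_integrated = np.hstack([zscore(genome), zscore(transcriptome)])
print(X_integrated.shape)  # (30, 600)
```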

    Significant Gene Array Analysis and Cluster-Based Machine Learning for Disease Class Prediction

    Gene expression analysis has been of major interest to biostatisticians for many decades. Such studies are necessary for understanding disease risk assessment and prediction, so that medical professionals and scientists alike may learn how to better create treatment plans to lessen symptoms and perhaps even find cures. In this study, we investigate various gene expression analyses and machine learning techniques for disease class prediction, assess the predictive validity of these models, and uncover differentially expressed (DE) genes for their relevant pathology datasets. Multiple gene expression datasets, obtained using the Affymetrix U133A platform (GPL96), are used to test model accuracies. Significance Analysis of Microarrays (SAM) is used to identify potential disease biomarkers, followed by these predictive models: (a) random forest, (b) random forest with Gene eXpression Network Analysis (GXNA), (c) RF++, (d) LASSO, and (e) Bayesian neural networks. One goal of this study is to find clusters of co-expressed genes and identify the effect of cluster-based classification on gene expression/microarray data. The other goal is to determine the usefulness of Automatic Relevance Determination in Bayesian neural networks.
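    A simplified sketch of the SAM-style scoring step: a moderated t-statistic whose denominator includes a small constant s0 to stabilize genes with tiny variance (real SAM also calibrates significance by permutation to control the false discovery rate). The data and parameters below are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(5)

# Toy expression matrix: 40 samples x 200 genes, two disease classes;
# by construction the first 10 genes are truly differentially expressed.
y = np.array([0] * 20 + [1] * 20)
X = rng.standard_normal((40, 200))
X[y == 1, :10] += 2.0

def sam_scores(X, y, s0=0.1):
    """Moderated t-statistic per gene (simplified SAM-style score)."""
    a, b = X[y == 0], X[y == 1]
    num = b.mean(axis=0) - a.mean(axis=0)
    pooled = np.sqrt(a.var(axis=0) / len(a) + b.var(axis=0) / len(b))
    return num / (pooled + s0)

scores = sam_scores(X, y)
top = np.argsort(-np.abs(scores))[:10]  # candidate biomarkers
print(sorted(top))
```

    The top-ranked genes would then feed the downstream classifiers (random forest, LASSO, Bayesian neural networks) named above.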