
    Data stream classification using random feature functions and novel method combinations

    Big Data streams are being generated faster, at larger scale, and in more commonplace settings. In this scenario, Hoeffding Trees are an established method for classification. Several extensions exist, including high-performing ensemble setups such as online bagging and leveraging bagging. k-nearest neighbors is also a popular choice, with most extensions dealing with the inherent performance limitations over a potentially infinite stream. At the same time, gradient descent methods are becoming increasingly popular, owing in part to the successes of deep learning. Although deep neural networks can learn incrementally, they have so far proved too sensitive to hyper-parameter choices and initial conditions to be considered an effective 'off-the-shelf' solution for data streams. In this work, we look at combinations of Hoeffding trees, nearest neighbor, and gradient descent methods with a streaming preprocessing approach in the form of a random feature functions filter for additional predictive power. We further extend the investigation to implementing methods on GPUs, which we test on some large real-world datasets, and show the benefits of using GPUs for data-stream learning due to their high scalability. Our empirical evaluation yields positive results for the novel approaches that we experiment with, highlights important issues, and sheds light on promising future directions in approaches to data-stream classification.
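    As a concrete illustration of the preprocessing step, below is a minimal sketch of a random feature functions filter, assuming a random-Fourier-style map with cosine feature functions; the class name, dimensions, and feature choice are illustrative assumptions rather than the paper's exact filter.

```python
import numpy as np

class RandomFeatureFilter:
    """Streaming preprocessing filter: projects each incoming instance
    through fixed random nonlinear feature functions to give a downstream
    incremental learner additional predictive power."""

    def __init__(self, n_input, n_features=200, gamma=1.0, seed=0):
        rng = np.random.default_rng(seed)
        # Random weights and phases are drawn once, then frozen for the
        # whole stream, so the mapping is stationary and cheap per instance.
        self.W = rng.normal(0.0, np.sqrt(2.0 * gamma), (n_input, n_features))
        self.b = rng.uniform(0.0, 2.0 * np.pi, n_features)

    def transform(self, x):
        # cos() acts as the random nonlinear feature function.
        return np.cos(x @ self.W + self.b)

# Usage in a stream loop: expand each instance, then feed the result to
# any incremental learner (Hoeffding tree, kNN, SGD-based model).
filt = RandomFeatureFilter(n_input=10)
x = np.random.rand(10)
z = filt.transform(x)  # 200-dimensional random-feature representation
```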

    Doctor of Philosophy

    The goal of machine learning is to develop efficient algorithms that use training data to create models that generalize well to unseen data. Learning algorithms can use labeled data, unlabeled data, or both. Supervised learning algorithms learn a model using labeled data only. Unsupervised learning methods learn the internal structure of a dataset using only unlabeled data. Lastly, semisupervised learning is the task of finding a model using both labeled and unlabeled data. In this research work, we contribute to both supervised and semisupervised learning. We contribute to supervised learning by proposing an efficient high-dimensional space coverage scheme based on the disjunctive normal form. We use conjunctions of a set of half-spaces to create a set of convex polytopes; the disjunction of these polytopes can provide the desired coverage of the space. Unlike traditional methods based on neural networks, we do not initialize the model parameters randomly. As a result, our model minimizes the risk of poor local minima, and higher learning rates can be used, which leads to faster convergence. We contribute to semisupervised learning by proposing two unsupervised loss functions that form the basis of a novel semisupervised learning method. The first loss function is called Mutual-Exclusivity. The motivation of this method is the observation that an optimal decision boundary lies between the manifolds of different classes, where there are no or very few samples. Decision boundaries can be pushed away from training samples by maximizing their margin, and it is not necessary to know the class labels of the samples to maximize the margin. The second loss is named Transformation/Stability and is based on the fact that the prediction of a classifier for a data sample should not change with respect to transformations and perturbations applied to that data sample. In addition, internal variations of a learning system should have little to no effect on the output. The proposed loss minimizes the variation in the predictions of the network for a specific data sample. We also show that the same technique can be used to improve the robustness of a learning model with respect to adversarial examples.
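    To make the two unsupervised losses concrete, here is a minimal sketch, assuming softmax outputs collected over several stochastic passes; the Mutual-Exclusivity form shown is an illustrative variant, not necessarily the dissertation's exact formulation.

```python
import numpy as np

def transformation_stability_loss(preds):
    """Transformation/Stability-style loss (sketch): `preds` holds the
    classifier's softmax outputs for several randomly transformed or
    perturbed versions of ONE unlabeled sample, shape (n_passes,
    n_classes). Summed pairwise squared differences penalize any
    variation in the prediction."""
    n = preds.shape[0]
    return sum(np.sum((preds[i] - preds[j]) ** 2)
               for i in range(n) for j in range(i + 1, n))

def mutual_exclusivity_loss(p):
    """Mutual-Exclusivity-style loss (sketch): pushes a softmax output
    p (shape (n_classes,)) toward a confident, near one-hot prediction,
    which moves the decision boundary away from the unlabeled sample.
    Illustrative variant only."""
    return np.sum(p * (1.0 - p))

# Example: three stochastic passes over the same unlabeled sample.
preds = np.array([[0.7, 0.2, 0.1],
                  [0.6, 0.3, 0.1],
                  [0.8, 0.1, 0.1]])
print(transformation_stability_loss(preds))  # small when passes agree
print(mutual_exclusivity_loss(preds[0]))     # small when p is one-hot
```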

    Designing robust forecasting ensembles of Data-Driven Models with a Multi-Objective Formulation: An application to Home Energy Management Systems

    This work proposes a procedure for the multi-objective design of a robust forecasting ensemble of data-driven models. Starting with a data-selection algorithm, a multi-objective genetic algorithm (MOGA) is then executed, performing topology and feature selection as well as parameter estimation. From the set of non-dominated or preferential models, a smaller subset is chosen to form the ensemble. Prediction intervals for the ensemble are obtained using the covariance method. The procedure is illustrated in the design of four different models required for energy management systems. Excellent results were obtained with this methodology, outperforming the existing alternatives. Future research will incorporate a robustness criterion into the MOGA and embed the prediction intervals in predictive control techniques.
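    As a rough illustration of how a covariance-based prediction interval can be computed from ensemble member forecasts, here is a minimal sketch; the function name and the exact variance formula are assumptions, since the abstract does not spell out the paper's covariance method.

```python
import numpy as np

def ensemble_prediction_interval(member_preds, z=1.96):
    """Covariance-method-style prediction interval (sketch).
    `member_preds` has shape (n_models, n_samples): each row is one
    ensemble member's forecast. The ensemble forecast is the mean of
    the members; the interval half-width follows from the covariance
    of the members' forecasts."""
    mean = member_preds.mean(axis=0)
    n_models = member_preds.shape[0]
    cov = np.cov(member_preds)          # (n_models, n_models) covariance
    var_mean = cov.sum() / n_models**2  # variance of the ensemble mean
    half_width = z * np.sqrt(var_mean)  # ~95% interval for z = 1.96
    return mean - half_width, mean + half_width

# Three members forecasting two time steps.
preds = np.array([[1.0, 1.2],
                  [1.1, 1.3],
                  [0.9, 1.1]])
low, high = ensemble_prediction_interval(preds)
```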

    Universal Activation Function For Machine Learning

    This article proposes a Universal Activation Function (UAF) that achieves near optimal performance in quantification, classification, and reinforcement learning (RL) problems. For any given problem, the optimization algorithms are able to evolve the UAF into a suitable activation function by tuning the UAF's parameters. For CIFAR-10 classification with VGG-8, the UAF converges to a Mish-like activation function, which has near optimal performance of F1 = 0.9017 ± 0.0040 when compared to other activation functions. For the quantification of simulated 9-gas mixtures in 30 dB signal-to-noise ratio (SNR) environments, the UAF converges to the identity function, which has a near optimal root mean square error of 0.4888 ± 0.0032 µM. In the BipedalWalker-v2 RL environment, the UAF achieves the reward of 250 in 961 ± 193 epochs, the lowest number of epochs among the compared activation functions. Furthermore, the UAF converges to a new activation function in the BipedalWalker-v2 RL environment.
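    A minimal sketch of a five-parameter universal activation function is given below. The exact parameterization is defined in the article; the form used here is our assumption of it, so treat the formula, the parameter names A-E, and the defaults as illustrative only.

```python
import numpy as np

def uaf(x, A=1.0, B=0.0, C=0.0, D=0.0, E=0.0):
    """Sketch of a Universal Activation Function with five trainable
    parameters (ASSUMED form, not necessarily the article's exact one):
        f(x) = softplus(A*(x + B) + C*x**2) - softplus(D*(x - B)) + E
    Tuning A..E lets an optimizer morph the shape toward identity-like,
    Mish-like, or other activation functions during training."""
    # np.logaddexp(0, t) is a numerically stable softplus, ln(1 + e^t).
    return (np.logaddexp(0.0, A * (x + B) + C * x**2)
            - np.logaddexp(0.0, D * (x - B)) + E)

x = np.linspace(-3.0, 3.0, 7)
print(uaf(x))                 # default parameter values
print(uaf(x, A=2.0, D=1.0))   # a different evolved shape
```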

    A comparative study of surrogate musculoskeletal models using various neural network configurations

    Thesis (M.S.), School of Computing and Engineering, University of Missouri--Kansas City, 2013. Advisor: Reza R. Derakhshani. Includes bibliographic references (pages 85-88).
    The central idea in musculoskeletal modeling is to be able to predict body-level information (e.g. muscle forces) as well as tissue-level information (tissue-level stress, strain, etc.). To analyze such models in a computationally efficient way, surrogate models have been introduced that concurrently predict both body-level and tissue-level information using multi-body and finite-element analysis, respectively. However, this kind of surrogate model is not an optimal solution, as it relies on finite-element models, which are computation-intensive and involve complex meshing methods, especially during real-time movement simulations. An alternative surrogate modeling method is the use of artificial neural networks in place of finite-element models. The ultimate objective of this research is to predict the tissue-level stresses experienced by the cartilage and ligaments during movement, and to achieve concurrent simulation of muscle force and tissue stress using various surrogate neural network models, where stresses obtained from finite-element models provide the frame of reference. Over the last decade, neural networks have been successfully implemented in several biomechanical modeling applications. Their adaptive ability to learn from examples, simple implementation techniques, and fast simulation times make neural networks versatile and robust when compared to other techniques. The neural network models are trained with reaction forces from multi-body models and stresses from finite-element models obtained at the elements of interest. Several configurations of static and dynamic neural networks were modeled, and accuracies close to 93% were achieved, where the correlation coefficient is the chosen measure of goodness. Using neural networks, simulation time was reduced by a factor of nearly 40,000 compared to the finite-element models. This study also confirms the theoretical expectation that special network configurations, including average committee, stacked generalization, and negative correlation learning, provide considerably better results than individual networks themselves.
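    As an illustration of the simplest of those configurations, here is a minimal sketch of an average committee of networks for a surrogate regression task; the data is synthetic and the hyper-parameters are assumptions, since the thesis trains on multi-body reaction forces and finite-element stresses.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

# Average committee (sketch): several independently initialized networks
# learn the same input -> output mapping; their predictions are averaged,
# which reduces the variance of any single network.
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, (500, 4))                     # stand-in for reaction forces
y = np.sin(X).sum(axis=1) + 0.05 * rng.normal(size=500)  # stand-in for tissue stress

committee = [
    MLPRegressor(hidden_layer_sizes=(20,), max_iter=2000, random_state=s).fit(X, y)
    for s in range(5)
]

def committee_predict(X_new):
    # The committee output is the plain average of the member predictions.
    return np.mean([m.predict(X_new) for m in committee], axis=0)

print(committee_predict(X[:3]))
```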