11 research outputs found

    Adaptation in P300 brain-computer interfaces: A two-classifier cotraining approach

    Get PDF
    10.1109/TBME.2010.2058804, IEEE Transactions on Biomedical Engineering, 57(12), 2927-2935

    Doctor of Philosophy

    Get PDF
    Machine learning is the science of building predictive models from data that automatically improve based on past experience. To learn these models, traditional learning algorithms require labeled data. They also require that the entire dataset fit in the memory of a single machine. Labeled data are available or can be acquired for small and moderately sized datasets, but curating large datasets can be prohibitively expensive. Similarly, massive datasets are usually too large to fit into the memory of a single machine. An alternative is to distribute the dataset over multiple machines. Distributed learning, however, poses new challenges, as most existing machine learning techniques are inherently sequential. Additionally, these distributed approaches have to be designed with the resource limitations of real-world settings in mind, prime among them being inter-machine communication. With the advent of big datasets, machine learning algorithms face new challenges: their design is no longer limited to minimizing some loss function but must additionally consider other resources that become critical when learning at scale. In this thesis, we explore different models and measures for learning under resource budgets. What budgetary constraints are posed by modern datasets? Can we reuse or combine existing machine learning paradigms to address these challenges at scale? How do the cost metrics change when we shift to distributed models for learning? These are some of the questions investigated in this thesis, and their answers hold the key to addressing some of the challenges faced when learning on massive datasets. In the first part of this thesis, we present three different budgeted scenarios that deal with scarcity of labeled data and limited computational resources. The goal is to leverage transfer of information from related domains to learn under budgetary constraints. Our proposed techniques comprise semisupervised transfer, online transfer, and active transfer. In the second part of this thesis, we study distributed learning with limited communication. We present initial sampling-based results and propose communication protocols for learning distributed linear classifiers.
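    The communication-limited setting described above can be illustrated with the simplest baseline protocol, parameter averaging: each machine fits a linear classifier on its local shard and ships only its weight vector, never the raw data. The sketch below is a minimal illustration of that idea on synthetic data, not the specific protocols proposed in the thesis.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic linearly separable data, spread over three simulated machines.
X = rng.normal(size=(3000, 5))
true_w = np.array([1.0, -2.0, 0.5, 0.0, 1.5])
y = np.sign(X @ true_w)  # labels in {-1, +1}

shards = np.array_split(np.arange(len(X)), 3)

# Each machine fits a local least-squares linear classifier; only the
# 5-dimensional weight vector is communicated, never the raw data.
local_weights = [np.linalg.lstsq(X[idx], y[idx], rcond=None)[0] for idx in shards]

# One communication round: average the local weight vectors.
w = np.mean(local_weights, axis=0)
accuracy = np.mean(np.sign(X @ w) == y)
print(f"averaged-model accuracy: {accuracy:.3f}")
```

    A single round of weight averaging already recovers most of the centralized model's accuracy here; the interesting design questions arise when the shards are not identically distributed and more rounds must be budgeted.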

    CM-CASL: Comparison-based Performance Modeling of Software Systems via Collaborative Active and Semisupervised Learning

    Full text link
    Configuration tuning for large software systems is generally challenging due to the complex configuration space and expensive performance evaluation. Most existing approaches follow a two-phase process, first learning a regression-based performance prediction model on available samples and then searching for configurations with satisfactory performance using the learned model. Such regression-based models often suffer from the scarcity of samples due to the enormous time and resources required to run a large software system with a specific configuration. Moreover, previous studies have shown that even a highly accurate regression-based model may fail to discern the relative merit of two configurations, whereas performance comparison is a fundamental strategy for configuration tuning. To address these issues, this paper proposes CM-CASL, a Comparison-based performance Modeling approach for software systems via Collaborative Active and Semisupervised Learning. CM-CASL learns a classification model that compares the performance of two given configurations, and enhances the samples through a collaborative labeling process by both human experts and classifiers using an integration of active and semisupervised learning. Experimental results demonstrate that CM-CASL outperforms two state-of-the-art performance modeling approaches in terms of both classification accuracy and rank accuracy, and thus provides a better performance model for the subsequent work of configuration tuning.
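    The core idea of a comparison-based performance model, a classifier that predicts which of two configurations performs better, can be sketched as follows. The runtime function, knob count, and training loop here are invented for illustration; CM-CASL additionally layers collaborative active and semisupervised labeling on top of such a comparator.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical 3-knob configuration space with a hidden runtime function
# (both invented for illustration).
configs = rng.uniform(0.0, 1.0, size=(400, 3))

def runtime(C):
    return 2.0 * C[:, 0] + C[:, 1] ** 2 - C[:, 2]

# Pairwise training set: label 1 if the first configuration is faster.
i, j = rng.integers(0, len(configs), size=(2, 4000))
Xd = configs[i] - configs[j]               # feature difference of the pair
yd = (runtime(configs[i]) < runtime(configs[j])).astype(float)

# Fit a logistic comparator on configuration differences by gradient descent.
w = np.zeros(3)
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(Xd @ w)))
    w -= 0.1 * Xd.T @ (p - yd) / len(yd)

rank_acc = np.mean(((Xd @ w) > 0) == (yd > 0.5))
print(f"pairwise rank accuracy: {rank_acc:.3f}")
```

    Working on configuration differences is what makes the model a comparator rather than a regressor: it never has to predict absolute runtimes, only their ordering.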

    Solution Path for Manifold Regularized Semisupervised Classification

    Full text link

    W-Air: Enabling personal air pollution monitoring on wearables

    Get PDF
    Accurate, portable, and personal air pollution sensing devices enable quantification of individual exposure to air pollution, personalized health advice, and assistance applications. Wearables (e.g., wristbands, or devices attached to belts or backpacks) are a promising platform for integrating commercial off-the-shelf gas sensors for personal air pollution sensing. Yet previous research lacks comprehensive investigation of the accuracy of air pollution sensing on wearables. In response, we propose W-Air, an accurate personal multi-pollutant monitoring platform for wearables. We discovered that human emissions introduce non-linear interference when low-cost gas sensors are integrated into wearables, an effect overlooked in existing studies. W-Air adopts a sensor-fusion calibration scheme to recover high-fidelity ambient pollutant concentrations from the interference-corrupted readings. It also leverages a neural network with shared hidden layers to boost calibration parameter training with fewer measurements, and utilizes semi-supervised regression for calibration parameter updating with little user intervention. We prototyped W-Air on a wristband with low-cost gas sensors. Evaluations demonstrated that W-Air reports accurate measurements both with and without human interference and is able to automatically learn and adapt to new environments.
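    The semi-supervised calibration update can be illustrated with a self-training regression baseline: fit a calibration curve on a few labeled reference measurements, pseudo-label the remaining raw readings, and refit. The sensor model and constants below are invented for illustration; the paper's actual update rule may differ.

```python
import numpy as np

rng = np.random.default_rng(2)

# Invented low-cost sensor model: raw reading = 0.8 * true + 5 + noise.
true_conc = rng.uniform(10.0, 100.0, size=200)
raw = 0.8 * true_conc + 5.0 + rng.normal(0.0, 1.0, size=200)
Phi = np.c_[raw, np.ones(len(raw))]        # linear calibration features

# Only ten co-located reference measurements are labeled.
labeled = np.arange(10)
coef, *_ = np.linalg.lstsq(Phi[labeled], true_conc[labeled], rcond=None)

# Self-training update: pseudo-label the unlabeled readings with the current
# model, then refit on labeled + pseudo-labeled targets together.
targets = Phi @ coef
targets[labeled] = true_conc[labeled]
coef, *_ = np.linalg.lstsq(Phi, targets, rcond=None)

mae = np.mean(np.abs(Phi @ coef - true_conc))
print(f"mean absolute calibration error: {mae:.2f}")
```

    The appeal of this style of update is exactly what the abstract highlights: the model keeps adapting from unlabeled field readings with little user intervention.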

    Structured Radial Basis Function Network: Modelling Diversity for Multiple Hypotheses Prediction

    Full text link
    Multi-modal regression is important when forecasting nonstationary processes or complex mixtures of distributions. It can be tackled with multiple-hypotheses frameworks, with the difficulty of combining them efficiently in a learning model. A Structured Radial Basis Function Network is presented as an ensemble of multiple hypotheses predictors for regression problems. The predictors are regression models of any type that can form centroidal Voronoi tessellations, which are a function of their losses during training. It is proved that this structured model can efficiently interpolate this tessellation and approximate the multiple-hypotheses target distribution, and that this is equivalent to interpolating the meta-loss of the predictors, the loss being a zero set of the interpolation error. The model has a fixed-point iteration algorithm between the predictors and the centers of the basis functions. Diversity in learning can be controlled parametrically by truncating the tessellation formation with the losses of individual predictors. A closed-form least-squares solution is presented which, to the authors' knowledge, is the fastest solution in the literature for multiple hypotheses and structured predictions. Superior generalization performance and computational efficiency are achieved using only two-layer neural networks as predictors, with diversity control as a key component of success. A gradient-descent approach is introduced which is loss-agnostic regarding the predictors. The expected value of the loss of the structured model with Gaussian basis functions is computed, finding that correlation between predictors is not an appropriate tool for diversification. The experiments show outperformance of the top competitors in the literature.
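    The closed-form least-squares step at the heart of such a model is easy to sketch for a plain Gaussian RBF network: with the centers held fixed, the output weights solve a linear least-squares problem. The toy target, center placement, and width below are illustrative assumptions, not the structured model itself.

```python
import numpy as np

# Toy 1-D regression target.
x = np.linspace(0.0, 1.0, 200)
y = np.sin(2.0 * np.pi * x)

# Gaussian radial basis functions on ten fixed, evenly spaced centers.
centers = np.linspace(0.0, 1.0, 10)
width = 0.1
Phi = np.exp(-((x[:, None] - centers[None, :]) ** 2) / (2.0 * width ** 2))

# With the centers fixed, the output weights have a closed-form
# least-squares solution via the pseudoinverse.
w = np.linalg.pinv(Phi) @ y
rmse = np.sqrt(np.mean((Phi @ w - y) ** 2))
print(f"RBF fit RMSE: {rmse:.4f}")
```

    The structured network alternates a step like this with an update of the centers themselves, which is where the fixed-point iteration between predictors and basis-function centers comes in.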

    Multimodal Video Analysis and Modeling

    Get PDF
    From recalling long-forgotten experiences based on a familiar scent or a piece of music, to lip-reading-aided conversation in noisy environments, to travel sickness caused by a mismatch between the signals from vision and the vestibular system, human perception manifests countless examples of subtle and effortless joint use of the multiple senses provided to us by evolution. Emulating such multisensory (or multimodal, i.e., comprising multiple types of input modes or modalities) processing computationally offers tools for more effective, efficient, or robust accomplishment of many multimedia tasks using evidence from the multiple input modalities. Information from the modalities can also be analyzed for patterns and connections across them, opening up interesting applications not feasible with a single modality, such as prediction of some aspects of one modality based on another. In this dissertation, multimodal analysis techniques are applied to selected video tasks with accompanying modalities. More specifically, all the tasks involve some type of analysis of videos recorded by non-professional videographers using mobile devices. Fusion of information from multiple modalities is applied to recording environment classification from video and audio, as well as to sport type classification from a set of multi-device videos, corresponding audio, and recording device motion sensor data. The environment classification combines support vector machine (SVM) classifiers trained on various global visual low-level features with audio event histogram based environment classification using k nearest neighbors (k-NN). Rule-based fusion schemes with genetic algorithm (GA)-optimized modality weights are compared to training an SVM classifier to perform the multimodal fusion. A comprehensive selection of fusion strategies is compared for the task of classifying the sport type of a set of recordings from a common event. These include fusion prior to, simultaneously with, and after classification; various approaches for using modality quality estimates; and fusing soft confidence scores as well as crisp single-class predictions. Additionally, different strategies are examined for aggregating the decisions on single videos into a collective prediction from the set of videos recorded concurrently with multiple devices. In both tasks, multimodal analysis shows a clear advantage over separate classification of the modalities. Another part of the work investigates cross-modal pattern analysis and audio-based video editing. This study examines the feasibility of automatically timing shot cuts of multi-camera concert recordings according to music-related cutting patterns learnt from professional concert videos. Cut timing is a crucial part of the automated creation of multi-camera mashups, where shots from multiple recording devices at a common event are alternated with the aim of mimicking a professionally produced video. In the framework, separate statistical models are formed for typical patterns of beat-quantized cuts in short segments, differences in beats between consecutive cuts, and relative deviation of cuts from exact beat times. Based on music meter and audio change point analysis of a new recording, the models can be used for synthesizing cut times. In a user study, the proposed framework clearly outperforms a baseline automatic method with comparably advanced audio analysis and wins 48.2% of comparisons against hand-edited videos.
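    The late-fusion strategies compared above reduce, in their simplest form, to a weighted combination of per-modality class scores. The scores and weights below are invented for illustration; in the thesis the modality weights are GA-optimized rather than hand-set.

```python
import numpy as np

# Per-modality class scores for four hypothetical sport classes
# (all numbers invented for illustration).
video_scores = np.array([0.10, 0.60, 0.20, 0.10])    # e.g. SVM on visual features
audio_scores = np.array([0.05, 0.25, 0.60, 0.10])    # e.g. k-NN on audio events
motion_scores = np.array([0.20, 0.50, 0.20, 0.10])   # e.g. motion-sensor classifier

# Hand-set modality weights standing in for the GA-optimized ones.
weights = np.array([0.5, 0.3, 0.2])
fused = weights @ np.stack([video_scores, audio_scores, motion_scores])
predicted = int(np.argmax(fused))
print("fused scores:", fused, "-> predicted class", predicted)
```

    Fusing soft scores like this retains information that crisp single-class votes discard: here the audio modality alone would pick class 2, but the weighted soft scores settle on class 1.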

    ADVANCED REPRESENTATION LEARNING STRATEGIES FOR BIG DATA ANALYSIS

    Get PDF
    With the fast technological advancement in data storage and machine learning, big data analytics has become a core component of various practical applications ranging from industrial automation to medical diagnosis and from cyber-security to space exploration. Recent studies show that every day, more than 1.8 billion photos/images are posted on social media, and 720 thousand hours of video are uploaded to YouTube. Thus, to handle this large amount of visual data efficiently, image/video classification, object detection/recognition, and segmentation tasks have gathered a lot of attention over the past decade. Consequently, researchers in this domain have proposed various feature extraction, feature learning, and feature encoding algorithms for improving the generalization performance of the aforesaid tasks. For example, the generalization performance of image classification models mainly depends on the choice of data representation. These models aim at building comprehensive representation learning (RL) strategies to encode the relationship between the input and output attributes of raw big data. Existing RL strategies can be divided into three general categories: statistical approaches (e.g., probabilistic analysis and correlation-based measures), unsupervised learning (e.g., autoencoders), and supervised learning (e.g., deep convolutional neural networks (DCNNs)). Among these categories, the unsupervised and supervised learning strategies using artificial neural networks (ANNs) have been widely adopted. In this direction, several auxiliary ideas have been proposed over the past decade to improve the learning capability of ANNs. For instance, the Moore-Penrose (MP) inverse is exploited to refine the parameters (weights and biases) of a trained network. However, the existing MP inverse-based RL methods have an important limitation. The representations learned through the MP inverse-based strategies suffer from loosely-connected feature coding, resulting in poor representations of objects that lack discriminative power. To address this issue, this dissertation proposes a set of eight novel MP inverse-based RL algorithms. The first part of this dissertation, from Chapter 4 to Chapter 7, is dedicated to proposing novel width-growth models based on subnet neural networks (SNNs) for representation learning and image classification. In this part, a novel feature learning algorithm, coined Wi-HSNN, is proposed, followed by an improved batch-by-batch learning algorithm, called OS-HSNN. Then, two novel SNNs are introduced to detect extreme outliers for one-class classification (OCC). Finally, a semi-supervised SNN, named SS-HSNN, is introduced to extend the strategy from the supervised learning domain to the semi-supervised learning domain. The second part of this thesis, subsuming Chapter 8 and Chapter 9, focuses on improving the performance of existing multilayer neural networks by harnessing the MP inverse. Here, a novel weight optimization strategy is proposed to improve the performance of multilayer extreme learning machines (ELMs), where the MP inverse is used to feed back the classification imprecision information from the output layer to the hidden layers. Then, a novel fast retraining framework is proposed to enhance the efficiency of transfer learning of DCNNs. The effectiveness of the proposed subnet- and retraining-based algorithms has been evaluated on several widely used image classification datasets, such as ImageNet and Places-365. Furthermore, we validated the performance of the proposed strategies in some extended domains, such as ship-target detection, food image classification, camera model identification, and misinformation identification. The experimental results illustrate the superiority of the proposed algorithms.
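    The Moore-Penrose-inverse refinement that the proposed algorithms build on can be sketched in its simplest, ELM-style form: freeze a random hidden layer and solve the output weights in closed form with the pseudoinverse. The data, layer sizes, and activation below are illustrative assumptions, not the Wi-HSNN/OS-HSNN architectures themselves.

```python
import numpy as np

rng = np.random.default_rng(4)

# Two Gaussian blobs with one-hot targets (toy data).
X = np.vstack([rng.normal(-1.0, 0.5, size=(100, 2)),
               rng.normal(1.0, 0.5, size=(100, 2))])
T = np.zeros((200, 2))
T[:100, 0] = 1.0
T[100:, 1] = 1.0

# ELM-style random hidden layer: fixed random weights, tanh activation.
W_in = rng.normal(size=(2, 50))
b = rng.normal(size=50)
H = np.tanh(X @ W_in + b)

# The Moore-Penrose inverse solves the output weights in closed form.
beta = np.linalg.pinv(H) @ T
train_acc = np.mean(np.argmax(H @ beta, axis=1) == np.argmax(T, axis=1))
print(f"training accuracy after MP-inverse solve: {train_acc:.3f}")
```

    The dissertation's schemes extend this basic operation, e.g., feeding the output-layer imprecision information back through the hidden layers of a multilayer ELM.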