
    Integrating Informativeness, Representativeness and Diversity in Pool-Based Sequential Active Learning for Regression

    In many real-world machine learning applications, unlabeled samples are easy to obtain, but labeling them is expensive and/or time-consuming. Active learning is a common approach for reducing this labeling effort: it selects the most useful samples to label, so that a better machine learning model can be trained from the same number of labeled samples. This paper considers active learning for regression (ALR) problems. Three essential criteria -- informativeness, representativeness, and diversity -- have been proposed for ALR; however, very few approaches in the literature have considered all three simultaneously. We propose three new ALR approaches, with different strategies for integrating the three criteria. Extensive experiments on 12 datasets in various domains demonstrated their effectiveness. (Comment: Int'l Joint Conf. on Neural Networks (IJCNN), Glasgow, UK, July 2020)
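To make the selection criteria concrete, here is a minimal pool-based sketch of two of them. This is an illustrative toy, not the paper's algorithms: representativeness is proxied by sample density (mean distance to the rest of the pool) and diversity by distance to the nearest already-selected sample; informativeness usually requires a trained regression model (e.g., prediction variance) and is omitted for self-containment. The function name and weights `alpha`/`beta` are assumptions for the sketch.

```python
import numpy as np

def select_batch(pool, labeled_idx, k, alpha=1.0, beta=1.0):
    """Greedily pick k pool samples by combining two ALR criteria.
    - representativeness: low mean distance to all pool samples (dense region)
    - diversity: high distance to the nearest already-selected sample
    """
    selected = list(labeled_idx)
    # pairwise Euclidean distances over the pool
    dists = np.linalg.norm(pool[:, None, :] - pool[None, :, :], axis=-1)
    density = dists.mean(axis=1)          # lower = more representative
    picks = []
    for _ in range(k):
        if selected:
            div = dists[:, selected].min(axis=1)   # distance to chosen set
        else:
            div = np.ones(len(pool))
        score = beta * div - alpha * density
        score[selected] = -np.inf          # never re-pick a sample
        best = int(np.argmax(score))
        picks.append(best)
        selected.append(best)
    return picks
```

Each greedy step trades off sitting in a dense region of the pool against staying far from everything already labeled; the paper's contribution is precisely in how (and whether) such terms are integrated with informativeness.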

    Offline EEG-based driver drowsiness estimation using enhanced batch-mode active learning (EBMAL) for regression

    © 2016 IEEE. There are many important regression problems in real-world brain-computer interface (BCI) applications, e.g., driver drowsiness estimation from EEG signals. This paper considers offline analysis: given a pool of unlabeled EEG epochs recorded during driving, how do we optimally select a small number of them to label so that an accurate regression model can be built from them to label the rest? Active learning is a promising solution to this problem, but interestingly, to the best of our knowledge, it has not been used for regression problems in BCI so far. This paper proposes a novel enhanced batch-mode active learning (EBMAL) approach for regression, which improves upon a baseline active learning algorithm by increasing the reliability, representativeness and diversity of the selected samples to achieve better regression performance. We validate its effectiveness using driver drowsiness estimation from EEG signals; EBMAL is, however, a general approach that can also be applied to many other offline regression problems beyond BCI.
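The three enhancements named in the abstract can be sketched in a batch-selection toy. This is not the paper's EBMAL procedure, only an illustration of the idea under assumed choices: reliability is approximated by dropping the sparsest (outlier-like) samples, and representativeness plus diversity by clustering the remainder into `batch_size` groups and querying the sample nearest each centroid. All names and parameters here are hypothetical.

```python
import numpy as np

def ebmal_style_batch(pool, batch_size, outlier_frac=0.1, iters=20, seed=0):
    """Illustrative EBMAL-style batch selection (toy, not the paper's method):
    (1) reliability: drop the sparsest samples as likely outliers;
    (2) representativeness + diversity: k-means the rest into batch_size
        clusters and query the sample nearest each centroid."""
    rng = np.random.default_rng(seed)
    d = np.linalg.norm(pool[:, None] - pool[None, :], axis=-1)
    density = d.mean(axis=1)
    keep = np.argsort(density)[: int(len(pool) * (1 - outlier_frac))]
    X = pool[keep]
    centers = X[rng.choice(len(X), batch_size, replace=False)]
    for _ in range(iters):                       # plain Lloyd k-means
        assign = np.argmin(
            np.linalg.norm(X[:, None] - centers[None], axis=-1), axis=1)
        for c in range(batch_size):
            if np.any(assign == c):
                centers[c] = X[assign == c].mean(axis=0)
    # nearest real sample to each centroid -> one diverse query per cluster
    picks = [int(keep[np.argmin(np.linalg.norm(X - c, axis=1))])
             for c in centers]
    return sorted(set(picks))
```

One query per cluster keeps the batch spread out (diversity) while centroid proximity keeps each query typical of its region (representativeness); the outlier filter plays the role of the reliability criterion.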

    Application of Deep Learning Algorithms to Speech Emotion Recognition

    This thesis presents the results of research carried out on Speech Emotion Recognition (SER). The problem is a specialization of Emotion Recognition to audio data: it concerns identifying the emotions expressed in the audio tracks under examination, and it is a task of high complexity both for human beings and for automatic systems. The complexity is mainly due to the fact that different subjects may perceive different emotions in the same track; for automatic systems, which use Machine Learning or Deep Learning to classify the data, the difficulties have multiple distinct sources, described throughout the document. In this work, SER is addressed using what are currently the best-known Deep Learning techniques for multi-class classification. Deep Learning is a subfield of Machine Learning, with which it shares the core characteristic of using algorithms capable of learning and improving autonomously. These algorithms use deep models of varying complexity to label the data, which are represented by a set of features considered indicative of the class the audio track belongs to. A model's depth is given by the number of internal layers of the network, which is greater than one. The aim of the work is to evaluate the main Deep Learning technologies and methodologies applied to this problem; to that end, several deep models were trained, including recurrent networks, convolutional networks, and multi-classifier systems, adopting various techniques to avoid the classic problems of training deep networks, followed by an analysis of the collected data on the performance of these models in classifying the dataset used for training.

    Affective Speech Recognition

    Speech, as a medium of interaction, carries two different streams of information. Whereas one stream carries explicit messages, the other contains implicit information about the speakers themselves. Affective speech recognition is a set of theories and tools that aim to automate unfolding the part of the implicit stream that has to do with human emotion. Affective speech recognition finds application in human-computer interaction: a machine that is able to recognize human emotion can engage the user in a more effective interaction. This thesis proposes a set of analyses and methodologies that advance automatic recognition of affect from speech. The proposed solution spans two dimensions of the problem: speech signal processing, and statistical learning. At the speech signal processing dimension, extraction of speech low-level descriptors is discussed, and a set of descriptors that exploit the spectrum of the signal are proposed, which have been shown to be particularly practical for capturing affective qualities of speech. Moreover, considering the non-stationary property of the speech signal, a measure of dynamicity is further proposed that captures that property of speech by quantifying changes of the signal over time. Furthermore, based on the proposed set of low-level descriptors, it is shown that individual human beings differ in conveying emotions, and that the parts of the spectrum that hold the affective information differ from one person to another. Therefore, the concept of an emotion profile is proposed, which formalizes those differences by taking into account factors such as cultural and gender-specific differences, as well as distinctions tied to individual human beings. At the statistical learning dimension, variable selection is performed to identify the speech features that are most imperative to extracting affective information. In doing so, low-level descriptors are distinguished from statistical functionals, and the effectiveness of each of the two is studied both jointly and independently. The major importance of variable selection as a standalone component of a solution lies in real-time application of affective speech recognition. Although thousands of speech features are commonly used to tackle this problem in theory, extracting that many features in real time is unrealistic, especially for mobile applications. Results of the conducted investigations show that the required number of speech features is far less than the number commonly used in the literature on the problem. At the core of an affective speech recognition solution is a statistical model that uses speech features to recognize emotions. Such a model comes with a set of parameters that are estimated through a learning process. Proposed in this thesis is a learning algorithm, developed based on the notion of the Hilbert-Schmidt independence criterion and named max-dependence regression, that maximizes the dependence between predicted and actual values of affective qualities. Pearson's correlation coefficient is commonly used as the measure of goodness of fit in the affective computing literature; max-dependence regression is therefore proposed to make the learning and hypothesis-testing criteria consistent with one another. Results of this research show that doing so yields higher prediction accuracy. Lastly, sparse representation for affective speech datasets is considered. For this purpose, the application of a dictionary learning algorithm based on the Hilbert-Schmidt independence criterion is proposed. Dictionary learning is used to identify the most important bases of the data in order to improve the generalization capability of the proposed solution to affective speech recognition. Based on the dictionary learning approach of choice, fusion of feature vectors is proposed, and it is shown that sparse representation leads to higher generalization capability for affective speech recognition.
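The quantity at the heart of max-dependence regression is the empirical Hilbert-Schmidt independence criterion (HSIC). A minimal sketch of the standard biased estimator with Gaussian kernels follows; the thesis may use different kernels or normalization, and the bandwidth `sigma` is an assumed choice.

```python
import numpy as np

def hsic(x, y, sigma=1.0):
    """Empirical HSIC between two 1-D samples with Gaussian kernels:
    HSIC = tr(K H L H) / (n - 1)^2, where K, L are the kernel Gram
    matrices of x and y, and H is the centering matrix."""
    x = np.asarray(x, float)
    y = np.asarray(y, float)
    n = len(x)

    def gram(v):
        sq = (v[:, None] - v[None, :]) ** 2
        return np.exp(-sq / (2 * sigma ** 2))

    K, L = gram(x), gram(y)
    H = np.eye(n) - np.ones((n, n)) / n   # centering matrix
    return np.trace(K @ H @ L @ H) / (n - 1) ** 2
```

A learner in the max-dependence-regression spirit would tune its parameters to maximize `hsic(predictions, targets)` rather than to minimize squared error; unlike Pearson's correlation, HSIC with a characteristic kernel is sensitive to nonlinear dependence as well.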