18,120 research outputs found

    A Subband-Based SVM Front-End for Robust ASR

    Full text link
    This work proposes a novel support vector machine (SVM) based robust automatic speech recognition (ASR) front-end that operates on an ensemble of the subband components of high-dimensional acoustic waveforms. The key issues of selecting the appropriate SVM kernels for classification in frequency subbands and the combination of individual subband classifiers using ensemble methods are addressed. The proposed front-end is compared with state-of-the-art ASR front-ends in terms of robustness to additive noise and linear filtering. Experiments performed on the TIMIT phoneme classification task demonstrate the benefits of the proposed subband based SVM front-end: it outperforms the standard cepstral front-end in the presence of noise and linear filtering for signal-to-noise ratio (SNR) below 12-dB. A combination of the proposed front-end with a conventional front-end such as MFCC yields further improvements over the individual front ends across the full range of noise levels

    Tweet Acts: A Speech Act Classifier for Twitter

    Get PDF
    Speech acts are a way to conceptualize speech as action. This holds true for communication on any platform, including social media platforms such as Twitter. In this paper, we explored speech act recognition on Twitter by treating it as a multi-class classification problem. We created a taxonomy of six speech acts for Twitter and proposed a set of semantic and syntactic features. We trained and tested a logistic regression classifier using a data set of manually labelled tweets. Our method achieved a state-of-the-art performance with an average F1 score of more than 0.700.70. We also explored classifiers with three different granularities (Twitter-wide, type-specific and topic-specific) in order to find the right balance between generalization and overfitting for our task.Comment: ICWSM'16, May 17-20, Cologne, Germany. In Proceedings of the 10th AAAI Conference on Weblogs and Social Media (ICWSM 2016). Cologne, German

    Speaker-independent emotion recognition exploiting a psychologically-inspired binary cascade classification schema

    No full text
    In this paper, a psychologically-inspired binary cascade classification schema is proposed for speech emotion recognition. Performance is enhanced because commonly confused pairs of emotions are distinguishable from one another. Extracted features are related to statistics of pitch, formants, and energy contours, as well as spectrum, cepstrum, perceptual and temporal features, autocorrelation, MPEG-7 descriptors, Fujisakis model parameters, voice quality, jitter, and shimmer. Selected features are fed as input to K nearest neighborhood classifier and to support vector machines. Two kernels are tested for the latter: Linear and Gaussian radial basis function. The recently proposed speaker-independent experimental protocol is tested on the Berlin emotional speech database for each gender separately. The best emotion recognition accuracy, achieved by support vector machines with linear kernel, equals 87.7%, outperforming state-of-the-art approaches. Statistical analysis is first carried out with respect to the classifiers error rates and then to evaluate the information expressed by the classifiers confusion matrices. © Springer Science+Business Media, LLC 2011

    A Machine Learning Approach For Opinion Holder Extraction In Arabic Language

    Full text link
    Opinion mining aims at extracting useful subjective information from reliable amounts of text. Opinion mining holder recognition is a task that has not been considered yet in Arabic Language. This task essentially requires deep understanding of clauses structures. Unfortunately, the lack of a robust, publicly available, Arabic parser further complicates the research. This paper presents a leading research for the opinion holder extraction in Arabic news independent from any lexical parsers. We investigate constructing a comprehensive feature set to compensate the lack of parsing structural outcomes. The proposed feature set is tuned from English previous works coupled with our proposed semantic field and named entities features. Our feature analysis is based on Conditional Random Fields (CRF) and semi-supervised pattern recognition techniques. Different research models are evaluated via cross-validation experiments achieving 54.03 F-measure. We publicly release our own research outcome corpus and lexicon for opinion mining community to encourage further research

    In-the-wild Facial Expression Recognition in Extreme Poses

    Full text link
    In the computer research area, facial expression recognition is a hot research problem. Recent years, the research has moved from the lab environment to in-the-wild circumstances. It is challenging, especially under extreme poses. But current expression detection systems are trying to avoid the pose effects and gain the general applicable ability. In this work, we solve the problem in the opposite approach. We consider the head poses and detect the expressions within special head poses. Our work includes two parts: detect the head pose and group it into one pre-defined head pose class; do facial expression recognize within each pose class. Our experiments show that the recognition results with pose class grouping are much better than that of direct recognition without considering poses. We combine the hand-crafted features, SIFT, LBP and geometric feature, with deep learning feature as the representation of the expressions. The handcrafted features are added into the deep learning framework along with the high level deep learning features. As a comparison, we implement SVM and random forest to as the prediction models. To train and test our methodology, we labeled the face dataset with 6 basic expressions.Comment: Published on ICGIP201
    corecore