A Subband-Based SVM Front-End for Robust ASR
This work proposes a novel support vector machine (SVM) based robust
automatic speech recognition (ASR) front-end that operates on an ensemble of
the subband components of high-dimensional acoustic waveforms. The key issues
of selecting the appropriate SVM kernels for classification in frequency
subbands and the combination of individual subband classifiers using ensemble
methods are addressed. The proposed front-end is compared with state-of-the-art
ASR front-ends in terms of robustness to additive noise and linear filtering.
Experiments performed on the TIMIT phoneme classification task demonstrate the
benefits of the proposed subband-based SVM front-end: it outperforms the
standard cepstral front-end in the presence of noise and linear filtering for
signal-to-noise ratios (SNR) below 12 dB. A combination of the proposed
front-end with a conventional front-end such as MFCC yields further
improvements over the individual front-ends across the full range of noise
levels.
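The abstract above does not say which ensemble method combines the subband classifiers. As one possible illustration only, the sketch below averages per-class scores across subbands (a simple sum rule) and picks the highest-scoring class. The function name `combine_subband_scores`, the phoneme labels, and the score values are all hypothetical and not taken from the paper.

```python
from typing import Dict, List


def combine_subband_scores(subband_scores: List[Dict[str, float]]) -> str:
    """Average per-class scores over all subbands and return the top class.

    Each dict maps a class label to the score produced by one subband
    classifier. This is a generic sum-rule combiner, assumed here for
    illustration; it is not the paper's stated method.
    """
    if not subband_scores:
        raise ValueError("need at least one subband classifier output")

    totals: Dict[str, float] = {}
    for scores in subband_scores:
        for label, score in scores.items():
            totals[label] = totals.get(label, 0.0) + score

    n_subbands = len(subband_scores)
    averaged = {label: total / n_subbands for label, total in totals.items()}
    return max(averaged, key=averaged.get)


# Hypothetical outputs of three subband classifiers for two phoneme classes.
example_scores = [
    {"aa": 0.6, "iy": 0.4},
    {"aa": 0.2, "iy": 0.8},
    {"aa": 0.9, "iy": 0.1},
]
predicted = combine_subband_scores(example_scores)
```

Averaging lets a confident subband outweigh a noisy one, which is one reason score-level fusion is often preferred over hard voting when individual bands are corrupted.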
Tweet Acts: A Speech Act Classifier for Twitter
Speech acts are a way to conceptualize speech as action. This holds true for
communication on any platform, including social media platforms such as
Twitter. In this paper, we explored speech act recognition on Twitter by
treating it as a multi-class classification problem. We created a taxonomy of
six speech acts for Twitter and proposed a set of semantic and syntactic
features. We trained and tested a logistic regression classifier using a data
set of manually labelled tweets. Our method achieved a state-of-the-art
performance with an average F1 score of more than . We also explored
classifiers with three different granularities (Twitter-wide, type-specific and
topic-specific) in order to find the right balance between generalization and
overfitting for our task.
Comment: ICWSM'16, May 17-20, Cologne, Germany. In Proceedings of the 10th
AAAI Conference on Weblogs and Social Media (ICWSM 2016). Cologne, Germany
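The abstract above reports an average F1 score for a multi-class classifier but does not say how the average is taken. A minimal sketch, assuming an unweighted (macro) average over classes, is shown below. The function name `macro_f1` and the labels `"a"` and `"b"` are hypothetical placeholders, not the paper's speech act taxonomy.

```python
from typing import List, Sequence


def macro_f1(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Unweighted mean of per-class F1 scores.

    Per-class F1 is computed as 2*TP / (2*TP + FP + FN); a class with no
    true or predicted instances contributes 0.0.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    labels = sorted(set(y_true) | set(y_pred))
    per_class: List[float] = []
    for label in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
        denom = 2 * tp + fp + fn
        per_class.append(2 * tp / denom if denom else 0.0)
    return sum(per_class) / len(per_class)


# Toy labels: class "a" has F1 = 2/3, class "b" has F1 = 4/5.
score = macro_f1(["a", "a", "b", "b"], ["a", "b", "b", "b"])
```

A support-weighted average would give a different number when classes are imbalanced, so the averaging scheme matters when comparing reported scores.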
Speaker-independent emotion recognition exploiting a psychologically-inspired binary cascade classification schema
In this paper, a psychologically-inspired binary cascade classification schema is proposed for speech emotion recognition. Performance is enhanced because commonly confused pairs of emotions are distinguishable from one another. Extracted features are related to statistics of pitch, formants, and energy contours, as well as spectrum, cepstrum, perceptual and temporal features, autocorrelation, MPEG-7 descriptors, Fujisaki's model parameters, voice quality, jitter, and shimmer. Selected features are fed as input to a k-nearest neighbor classifier and to support vector machines. Two kernels are tested for the latter: linear and Gaussian radial basis function. The recently proposed speaker-independent experimental protocol is tested on the Berlin emotional speech database for each gender separately. The best emotion recognition accuracy, achieved by support vector machines with a linear kernel, equals 87.7%, outperforming state-of-the-art approaches. Statistical analysis is first carried out with respect to the classifiers' error rates and then to evaluate the information expressed by the classifiers' confusion matrices. © Springer Science+Business Media, LLC 2011
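The general idea of a binary cascade, as described in the abstract above, is that each stage makes one two-way decision and passes the sample down until a single emotion remains. The sketch below illustrates only that control flow. The grouping of emotions, the two-element feature vector (treated here as arousal and valence), and the threshold rules are all assumptions; in the paper each decision would come from a trained classifier, and its actual cascade layout is not given in the abstract.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Union


@dataclass
class Node:
    """One binary stage: decide() returning True selects the left branch."""
    decide: Callable[[Sequence[float]], bool]
    left: Union["Node", str]
    right: Union["Node", str]


def classify(node: Union[Node, str], features: Sequence[float]) -> str:
    """Walk the cascade until a leaf (an emotion label) is reached."""
    while isinstance(node, Node):
        node = node.left if node.decide(features) else node.right
    return node


# Hypothetical cascade: stage 1 splits on arousal (features[0]),
# stage 2 splits each group on valence (features[1]).
# The lambdas stand in for trained binary classifiers.
cascade = Node(
    decide=lambda f: f[0] > 0.5,
    left=Node(lambda f: f[1] > 0.5, "happiness", "anger"),
    right=Node(lambda f: f[1] > 0.5, "neutral", "sadness"),
)

label = classify(cascade, [0.9, 0.2])
```

Because each stage only has to separate two groups, a pair of emotions that a flat multi-class model tends to confuse can be given its own dedicated binary decision.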
A Machine Learning Approach For Opinion Holder Extraction In Arabic Language
Opinion mining aims at extracting useful subjective information from reliable
amounts of text. Opinion holder recognition is a task that has not yet been
considered for the Arabic language. This task essentially requires a deep
understanding of clause structures. Unfortunately, the lack of a robust,
publicly available Arabic parser further complicates the research. This paper
presents pioneering research on opinion holder extraction in Arabic news that
is independent of any lexical parser. We investigate constructing a
comprehensive feature set to compensate for the lack of structural parsing
output. The proposed feature set is adapted from previous work on English and
coupled with our proposed semantic field and named entity features. Our feature
analysis is based on Conditional Random Fields (CRF) and semi-supervised
pattern recognition techniques. Different research models are evaluated via
cross-validation experiments, achieving an F-measure of 54.03. We publicly
release our own research outcome corpus and lexicon for the opinion mining
community to encourage further research.
In-the-wild Facial Expression Recognition in Extreme Poses
In the computer research area, facial expression recognition is a hot
research problem. In recent years, the research has moved from the lab
environment to in-the-wild circumstances. This is challenging, especially
under extreme poses. Current expression detection systems try to avoid pose
effects in order to remain generally applicable. In this work, we take the
opposite approach: we consider the head poses and detect the expressions
within specific head poses. Our work includes two parts: detecting the
head pose and assigning it to one of the pre-defined head pose classes, and
performing facial expression recognition within each pose class. Our
experiments show that the recognition results with pose class grouping are
much better than those of direct recognition without considering poses. We
combine the hand-crafted features (SIFT, LBP and geometric features) with deep
learning features as the representation of the expressions. The hand-crafted
features are added into the deep learning framework along with the high-level
deep learning features. As a comparison, we implement SVM and random forest as
the prediction models. To train and test our methodology, we labeled the face
dataset with 6 basic expressions.
Comment: Published on ICGIP201