36 research outputs found
Syllable classification using static matrices and prosodic features
In this paper we explore the usefulness of prosodic features for
syllable classification. In order to do this, we represent the
syllable as a static analysis unit such that its acoustic-temporal
dynamics could be merged into a set of features that the SVM
classifier will consider as a whole. In the first part of our
experiment we used MFCC as features for classification,
obtaining a maximum accuracy of 86.66%. The second part of
our study tests whether the prosodic information is
complementary to the cepstral information for syllable
classification. The results obtained show that combining the
two types of information does improve the classification, but
further analysis is necessary for a more successful
combination of the two types of features.
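The abstract above does not say how a variable-length syllable is turned into a static unit. The sketch below is only one plausible reading, not the authors' method: it assumes each syllable arrives as a list of per-frame feature vectors (for example MFCC frames), averages the frames inside a fixed number of equal time segments, and flattens the result. The function name `to_static_vector` and the parameter `n_segments` are hypothetical.

```python
def to_static_vector(frames, n_segments=3):
    """Average variable-length frames into n_segments blocks and flatten them.

    Assumption: `frames` is a list of equal-length feature vectors, one per
    analysis frame. The output length is always n_segments * len(frames[0]),
    so syllables of different durations map to vectors of the same size.
    """
    if len(frames) < n_segments:
        raise ValueError("need at least n_segments frames")
    dim = len(frames[0])
    static = []
    for s in range(n_segments):
        # Integer boundaries split the syllable into contiguous, non-empty blocks.
        start = s * len(frames) // n_segments
        end = (s + 1) * len(frames) // n_segments
        block = frames[start:end]
        for d in range(dim):
            static.append(sum(f[d] for f in block) / len(block))
    return static


# Two toy "syllables" of different durations, with 2 features per frame.
short_syllable = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
long_syllable = [[1.0, 2.0], [1.0, 2.0], [3.0, 4.0],
                 [3.0, 4.0], [5.0, 6.0], [5.0, 6.0]]

short_vec = to_static_vector(short_syllable)
long_vec = to_static_vector(long_syllable)
```

Both toy syllables map to vectors of the same length, which is what a classifier such as an SVM needs in order to consider the syllable as a whole.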
The application of support vector machine for speech classification
Classical statistical classification algorithms assume that the probability distribution models are known. However, in many real-life applications, such as speech recognition, there is not enough information about the probability distribution function. This is a very common scenario and poses a serious restriction on classification. Support Vector Machines (SVMs) can help in such situations because they are distribution-free algorithms that originated from statistical learning theory and Structural Risk Minimization (SRM). In the most basic approach, SVMs use linearly separating hyperplanes to classify with maximal margin. In practice, however, the classification problem requires a constrained nonlinear approach during the learning stage, and a quadratic problem has to be solved. When the classes are not linearly separable due to overlap, the SVM algorithm transforms the original input space into a higher-dimensional feature space, where the new features are potentially linearly separable. In this paper we present a study on the performance of these classifiers when applied to speech classification and provide computational results on phonemes from the TIMIT database.
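For reference, the quadratic problem and feature-space mapping mentioned in the abstract above are usually written in the standard soft-margin form below. This is the textbook formulation, not necessarily the exact variant used in the paper; the symbols $w$, $b$, $\xi_i$, $C$, $\phi$, $\alpha_i$ and $K$ are notation introduced here.

```latex
\begin{aligned}
\min_{w,\,b,\,\xi}\quad & \tfrac{1}{2}\lVert w\rVert^{2} + C\sum_{i=1}^{n}\xi_i \\
\text{s.t.}\quad & y_i\bigl(w^{\top}\phi(x_i) + b\bigr) \ge 1 - \xi_i,
\qquad \xi_i \ge 0,\quad i = 1,\dots,n, \\[4pt]
f(x) &= \operatorname{sign}\Bigl(\sum_{i=1}^{n}\alpha_i\, y_i\, K(x_i, x) + b\Bigr),
\qquad K(x, x') = \phi(x)^{\top}\phi(x').
\end{aligned}
```

Here $\phi$ maps the input space into the higher-dimensional feature space, the slack variables $\xi_i$ absorb class overlap, and the kernel $K$ lets the decision function be evaluated without computing $\phi$ explicitly.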
Smart Phone Based Data Mining for Human Activity Recognition
Automatic activity recognition systems aim to capture the state of the user and its environment by exploiting heterogeneous sensors, and permit continuous monitoring of numerous physiological signals when these sensors are attached to the subject's body. This can be immensely useful in healthcare applications, for automatic and intelligent daily activity monitoring for elderly people. In this paper, we present a novel data analytic scheme for intelligent Human Activity Recognition (AR) using smartphone inertial sensors, based on an information-theory-based feature ranking algorithm and classifiers based on random forests, ensemble learning and lazy learning. Extensive experiments with a publicly available database of human activity recorded with smartphone inertial sensors show that the proposed approach can indeed lead to the development of intelligent and automatic real-time human activity monitoring for eHealth application scenarios for the elderly, the disabled and people with special needs.
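The abstract above does not name the ranking criterion it uses. As an illustration only, the sketch below ranks features by information gain, one common information-theoretic criterion, and assumes the sensor features have already been discretized into categories. The function names and the toy data are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())


def information_gain(feature_values, labels):
    """Reduction in label entropy after splitting on a discrete feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def rank_features(columns, labels):
    """Return feature names sorted from most to least informative."""
    gains = {name: information_gain(vals, labels)
             for name, vals in columns.items()}
    return sorted(gains, key=gains.get, reverse=True)


# Toy data: one feature separates the activities perfectly, the other does not.
activities = ["walk", "walk", "sit", "sit"]
features = {
    "noisy": ["a", "b", "a", "b"],
    "informative": ["hi", "hi", "lo", "lo"],
}
ranking = rank_features(features, activities)
```

A ranking like this would typically be used to keep only the top-scoring features before training a classifier such as a random forest.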
A Subband-Based SVM Front-End for Robust ASR
This work proposes a novel support vector machine (SVM) based robust
automatic speech recognition (ASR) front-end that operates on an ensemble of
the subband components of high-dimensional acoustic waveforms. The key issues
of selecting the appropriate SVM kernels for classification in frequency
subbands and the combination of individual subband classifiers using ensemble
methods are addressed. The proposed front-end is compared with state-of-the-art
ASR front-ends in terms of robustness to additive noise and linear filtering.
Experiments performed on the TIMIT phoneme classification task demonstrate the
benefits of the proposed subband based SVM front-end: it outperforms the
standard cepstral front-end in the presence of noise and linear filtering for
signal-to-noise ratios (SNR) below 12 dB. A combination of the proposed
front-end with a conventional front-end such as MFCC yields further
improvements over the individual front-ends across the full range of noise
levels.
Human activity recognition on smartphones for mobile context awareness
Activity-Based Computing [1] aims to capture the state of the user and its environment
by exploiting heterogeneous sensors in order to provide adaptation to
exogenous computing resources. When these sensors are attached to the subject’s
body, they permit continuous monitoring of numerous physiological signals. This
has appealing use in healthcare applications, e.g. the exploitation of Ambient Intelligence
(AmI) in daily activity monitoring for elderly people. In this paper,
we present a system for human physical Activity Recognition (AR) using smartphone
inertial sensors. As these mobile phones are limited in terms of energy and
computing power, we propose a novel hardware-friendly approach for multiclass
classification. This method adapts the standard Support Vector Machine (SVM)
and exploits fixed-point arithmetic. In addition to the clear computational advantages
of fixed-point arithmetic, it is easy to show the regularization effect of the
number of bits and then the connections with the Statistical Learning Theory. A
comparison with the traditional SVM shows a significant improvement in terms
of computational costs while maintaining similar accuracy, which can contribute
to developing more sustainable systems for AmI.
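To make the fixed-point idea in the abstract above concrete, here is a minimal sketch of evaluating a linear decision function with integer arithmetic. It is not the authors' method, which modifies the SVM learning algorithm itself; it only illustrates fixed-point inference, assuming a plain linear model and a signed format with `frac_bits` fractional bits. All names and numeric values are hypothetical.

```python
def to_fixed(x, frac_bits):
    """Quantize a float to a signed integer with frac_bits fractional bits."""
    return int(round(x * (1 << frac_bits)))


def fixed_decision(weights, bias, features, frac_bits=8):
    """Evaluate sign(w.x + b) using integer arithmetic after quantization."""
    w_q = [to_fixed(w, frac_bits) for w in weights]
    x_q = [to_fixed(x, frac_bits) for x in features]
    # Each product carries 2 * frac_bits fractional bits, so scale the bias to match.
    b_q = to_fixed(bias, 2 * frac_bits)
    acc = sum(w * x for w, x in zip(w_q, x_q)) + b_q
    return 1 if acc >= 0 else -1


def float_decision(weights, bias, features):
    """Floating-point reference for comparison."""
    score = sum(w * x for w, x in zip(weights, features)) + bias
    return 1 if score >= 0 else -1


# Hypothetical model parameters and two input samples.
weights = [0.75, -1.25, 0.5]
bias = -0.1
sample_pos = [1.0, 0.25, -0.5]
sample_neg = [0.0, 1.0, 0.0]

fixed_pos = fixed_decision(weights, bias, sample_pos)
fixed_neg = fixed_decision(weights, bias, sample_neg)
```

Lowering `frac_bits` coarsens the representable weights, which gives an intuition for the abstract's remark that the number of bits acts as a form of regularization.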
Voice-Based Gender Recognition Model Using FRT and Light GBM
Voice-based gender recognition is vital in many computer-aided voice analysis applications such as Human-Computer Interaction and fraudulent call identification. A powerful feature is needed to train a machine learning model to classify a voice signal as male or female. This work proposes the use of a gradient boosting model in conjunction with a novel Cumulative Point Index (CPI) feature computed by the Forward Rajan Transform (FRT) for gender recognition from voice signals. First, voice signals are preprocessed to remove nonsignificant silence periods and are then framed and windowed so that each frame can be treated as stationary. Then the CPI is computed using the first coefficients of the FRT and concatenated to form a feature set, which is used to train the Light Gradient Boosting Machine (LightGBM) to recognize the gender. This approach provides better accuracy and faster training compared with state-of-the-art techniques. Experimental results show the superiority of the FRTCPI over other standard features used in the literature. It is also shown that the proposed features, in combination with LightGBM, provide a better accuracy of 95.26% with a lower computational time of 2.25 s on large, challenging datasets such as Speech Accent Archive, Voice Gender Dataset, Common Voice, and the Texas Instruments/Massachusetts Institute of Technology (TIMIT) corpus.
Energy efficient smartphone-based activity recognition using fixed-point arithmetic
In this paper we propose a novel energy efficient approach for the recognition of human activities using smartphones as wearable sensing devices, targeting
assisted living applications such as remote patient activity monitoring for the disabled
and the elderly. The method exploits fixed-point arithmetic to propose a modified
multiclass Support Vector Machine (SVM) learning algorithm that better preserves
the smartphone battery lifetime compared with the conventional floating-point
based formulation while maintaining comparable system accuracy levels. Experiments
show comparative results between this approach and the traditional SVM in terms of
recognition performance and battery consumption, highlighting the advantages of the
proposed method.