36 research outputs found

    Syllable classification using static matrices and prosodic features

    In this paper we explore the usefulness of prosodic features for syllable classification. To do this, we represent the syllable as a static analysis unit so that its acoustic-temporal dynamics can be merged into a set of features that the SVM classifier considers as a whole. In the first part of our experiment we used MFCCs as features for classification, obtaining a maximum accuracy of 86.66%. The second part of our study tests whether prosodic information is complementary to cepstral information for syllable classification. The results show that combining the two types of information does improve classification, but further analysis is necessary for a more successful combination of the two types of features.

    The application of support vector machine for speech classification

    For the classical statistical classification algorithms the probability distribution models are known. However, in many real-life applications, such as speech recognition, there is not enough information about the probability distribution function. This is a very common scenario and poses a very serious restriction in classification. Support Vector Machines (SVMs) can help in such situations because they are distribution-free algorithms that originated from statistical learning theory and Structural Risk Minimization (SRM). In the most basic approach, SVMs use linearly separating hyperplanes to create a classification with maximal margins. In practice, however, the classification problem requires a constrained nonlinear approach during the learning stages, and a quadratic problem has to be solved. When the classes are not linearly separable due to overlap, the SVM algorithm transforms the original input space into a higher-dimensional feature space, where the new features are potentially linearly separable. In this paper we present a study on the performance of these classifiers when applied to speech classification and provide computational results on phonemes from the TIMIT database.
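
The mapping into a higher-dimensional feature space mentioned above can be illustrated with a minimal sketch. It assumes a homogeneous degree-2 polynomial kernel on 2-D inputs, which is a textbook choice and not necessarily the kernel used in the paper; the names `phi` and `poly2_kernel` are hypothetical. The sketch checks that the kernel equals an inner product in an explicit feature space, and that XOR-style points, which no line separates in the input space, are separated by the sign of one coordinate after the mapping.

```python
import math


def dot(a, b):
    """Plain inner product of two equal-length sequences."""
    return sum(p * q for p, q in zip(a, b))


def phi(x):
    """Explicit degree-2 feature map for a 2-D input (illustrative choice)."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)


def poly2_kernel(x, y):
    """Homogeneous polynomial kernel of degree 2: (x . y)^2."""
    return dot(x, y) ** 2


# The kernel computes the feature-space inner product without building phi.
x, y = (1.0, 2.0), (3.0, -1.0)
explicit = dot(phi(x), phi(y))
implicit = poly2_kernel(x, y)

# XOR-style toy data: not linearly separable in the 2-D input space.
points = [(1.0, 1.0), (-1.0, -1.0), (1.0, -1.0), (-1.0, 1.0)]
labels = [1, 1, -1, -1]

# In feature space, the middle coordinate sqrt(2)*x1*x2 separates the classes.
predicted = [1 if phi(p)[1] > 0 else -1 for p in points]
```

An SVM never needs `phi` explicitly; it only evaluates the kernel between pairs of samples, which is why very high-dimensional feature spaces remain tractable.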

    Smart Phone Based Data Mining for Human Activity Recognition

    Automatic activity recognition systems aim to capture the state of the user and its environment by exploiting heterogeneous sensors attached to the subject's body, which permit continuous monitoring of numerous physiological signals. This can be immensely useful in healthcare applications, for automatic and intelligent daily activity monitoring for elderly people. In this paper, we present a novel data analytic scheme for intelligent Human Activity Recognition (AR) using smartphone inertial sensors, based on an information-theory-based feature ranking algorithm and classifiers based on random forests, ensemble learning and lazy learning. Extensive experiments with a publicly available database of human activity recorded with smartphone inertial sensors show that the proposed approach can indeed lead to the development of intelligent and automatic real-time human activity monitoring for eHealth application scenarios for the elderly, the disabled and people with special needs.
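
As a rough illustration of information-theoretic feature ranking, the sketch below scores discrete features by information gain (the reduction in label entropy after splitting on the feature). The abstract does not say which measure the authors use, so information gain is an assumption here, and the feature names and toy data are invented for illustration only.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())


def information_gain(feature_values, labels):
    """Label entropy minus the entropy remaining after splitting on the feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def rank_features(columns, labels):
    """Return (name, score) pairs sorted from most to least informative."""
    scores = {name: information_gain(values, labels)
              for name, values in columns.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical, already-discretized sensor features for four samples.
labels = ["walk", "walk", "sit", "sit"]
columns = {
    "accel_energy": ["high", "high", "low", "low"],  # predicts the label exactly
    "gyro_bin": ["a", "b", "a", "b"],                # carries no label information
}
ranking = rank_features(columns, labels)
```

A ranking like this would typically be used to keep only the top-scoring features before training a classifier such as a random forest.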

    A Subband-Based SVM Front-End for Robust ASR

    This work proposes a novel support vector machine (SVM) based robust automatic speech recognition (ASR) front-end that operates on an ensemble of the subband components of high-dimensional acoustic waveforms. The key issues of selecting the appropriate SVM kernels for classification in frequency subbands and the combination of individual subband classifiers using ensemble methods are addressed. The proposed front-end is compared with state-of-the-art ASR front-ends in terms of robustness to additive noise and linear filtering. Experiments performed on the TIMIT phoneme classification task demonstrate the benefits of the proposed subband-based SVM front-end: it outperforms the standard cepstral front-end in the presence of noise and linear filtering for signal-to-noise ratios (SNR) below 12 dB. A combination of the proposed front-end with a conventional front-end such as MFCC yields further improvements over the individual front-ends across the full range of noise levels.

    Human activity recognition on smartphones for mobile context awareness

    Activity-Based Computing [1] aims to capture the state of the user and its environment by exploiting heterogeneous sensors in order to provide adaptation to exogenous computing resources. When these sensors are attached to the subject’s body, they permit continuous monitoring of numerous physiological signals. This has appealing uses in healthcare applications, e.g. the exploitation of Ambient Intelligence (AmI) in daily activity monitoring for elderly people. In this paper, we present a system for human physical Activity Recognition (AR) using smartphone inertial sensors. As these mobile phones are limited in terms of energy and computing power, we propose a novel hardware-friendly approach for multiclass classification. This method adapts the standard Support Vector Machine (SVM) and exploits fixed-point arithmetic. In addition to the clear computational advantages of fixed-point arithmetic, it is easy to show the regularization effect of the number of bits and, in turn, the connections with Statistical Learning Theory. A comparison with the traditional SVM shows a significant improvement in terms of computational costs while maintaining similar accuracy, which can contribute to developing more sustainable systems for AmI.
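
To make the idea of fixed-point arithmetic concrete, here is a minimal sketch of an integer-only linear decision function. It is not the authors' modified multiclass learning algorithm: it only shows prediction for a single binary linear classifier, and the 8 fractional bits, the weights, and the input vector are all assumed values chosen for illustration.

```python
FRACTIONAL_BITS = 8            # assumed precision; fewer bits means coarser values
SCALE = 1 << FRACTIONAL_BITS   # 256: one unit of the integer representation


def to_fixed(value, scale=SCALE):
    """Quantize a float to an integer with the given scale factor."""
    return int(round(value * scale))


def linear_decision_float(weights, bias, x):
    """Reference floating-point decision value w . x + b."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias


def linear_decision_fixed(weights_q, bias_q, x_q):
    """Integer-only decision value; the result is scaled by SCALE**2."""
    acc = bias_q  # the bias must be pre-scaled by SCALE**2 to match the products
    for w, xi in zip(weights_q, x_q):
        acc += w * xi
    return acc


# Hypothetical trained parameters and one input sample.
weights = [0.75, -1.25, 0.5]
bias = -0.1
x = [0.9, 0.2, 0.4]

weights_q = [to_fixed(w) for w in weights]
bias_q = to_fixed(bias, SCALE * SCALE)
x_q = [to_fixed(v) for v in x]

score_float = linear_decision_float(weights, bias, x)
score_fixed = linear_decision_fixed(weights_q, bias_q, x_q)

# Only the sign is needed for the class label, so no division is required.
label_float = 1 if score_float >= 0 else -1
label_fixed = 1 if score_fixed >= 0 else -1
```

Because the label depends only on the sign of the accumulator, the integer path avoids floating-point operations entirely; lowering `FRACTIONAL_BITS` trades precision for cheaper arithmetic.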

    Voice-Based Gender Recognition Model Using FRT and Light GBM

    Voice-based gender recognition is vital in many computer-aided voice analysis applications such as Human-Computer Interaction and fraudulent call identification. A powerful feature is needed to train a machine learning model to classify a speaker's gender as male or female from a voice signal. This work proposes the use of a gradient boosting model in conjunction with a novel Cumulative Point Index (CPI) feature computed by the Forward Rajan Transform (FRT) for gender recognition from voice signals. First, voice signals are preprocessed to remove nonsignificant silence periods and are then framed and windowed so that each segment can be treated as stationary. CPI is then computed using the first coefficients of the FRT and concatenated to form a feature set, which is used to train the Light Gradient Boosting Machine (LightGBM) to recognize the gender. This approach provides better accuracy and faster training compared with state-of-the-art techniques. Experimental results show the primacy of the FRTCPI over other standard features used in the literature. It is also shown that the proposed features, in combination with LightGBM, provide an accuracy of 95.26% with a lower computation time of 2.25 s on large, challenging datasets such as Speech Accent Archive, Voice Gender Dataset, Common Voice, and the Texas Instruments/Massachusetts Institute of Technology corpus.
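
The framing and windowing step can be sketched as follows. The abstract does not state the frame length, the overlap, or the window type, so the 20-sample frames, 10-sample hop, and Hamming window below are assumptions; the FRT and CPI computations are not shown.

```python
import math


def hamming(n):
    """Hamming window of length n: 0.54 - 0.46*cos(2*pi*i/(n-1))."""
    if n == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def frame_signal(signal, frame_len, hop):
    """Split a signal into overlapping frames, dropping any incomplete tail."""
    return [signal[start:start + frame_len]
            for start in range(0, len(signal) - frame_len + 1, hop)]


def window_frames(frames):
    """Multiply every frame sample-by-sample with a Hamming window."""
    if not frames:
        return []
    window = hamming(len(frames[0]))
    return [[s * w for s, w in zip(frame, window)] for frame in frames]


# Toy signal: 100 samples of a sine wave standing in for a voice recording.
signal = [math.sin(2 * math.pi * 5 * t / 100) for t in range(100)]
frames = frame_signal(signal, frame_len=20, hop=10)
windowed = window_frames(frames)
```

Short frames are used because speech is only approximately stationary over brief intervals, and the tapered window reduces the edge discontinuities that framing introduces.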

    Blind recognition of space-time block code in MISO system


    Automated Cough Assessment on a Mobile Platform


    Energy efficient smartphone-based activity recognition using fixed-point arithmetic

    In this paper we propose a novel energy-efficient approach for the recognition of human activities using smartphones as wearable sensing devices, targeting assisted living applications such as remote patient activity monitoring for the disabled and the elderly. The method exploits fixed-point arithmetic to propose a modified multiclass Support Vector Machine (SVM) learning algorithm, which better preserves the smartphone battery lifetime than the conventional floating-point formulation while maintaining comparable system accuracy levels. Experiments show comparative results between this approach and the traditional SVM in terms of recognition performance and battery consumption, highlighting the advantages of the proposed method.