1,276 research outputs found

    Glottal Source Cepstrum Coefficients Applied to NIST SRE 2010

    This paper presents a novel feature set for speaker recognition based on glottal estimate information. An iterative algorithm is used to derive the vocal tract and glottal source estimates from the speech signal. To test the importance of glottal source information in speaker characterization, the feature set was tested in the 2010 NIST Speaker Recognition Evaluation (NIST SRE10). The proposed system uses glottal estimate parameter templates and classical cepstral information to build a model for each speaker involved in the recognition process. The open-source ALIZE software [1] was used to create the GMM models for both background and target speakers. Compared to using mel-frequency cepstrum coefficients (MFCC), the misclassification rate for NIST SRE 2010 fell from 29.43% to 27.15% when glottal source features were used.
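
    To put the two misclassification rates quoted above in perspective, the short sketch below computes the absolute and relative reduction. The helper name `relative_reduction` is hypothetical and is not taken from the paper; only the two percentages come from the abstract.

```python
def relative_reduction(before: float, after: float) -> float:
    """Fraction by which `after` is lower than `before`."""
    if before == 0:
        raise ValueError("baseline must be non-zero")
    return (before - after) / before


# Misclassification rates quoted in the abstract, in percent.
mfcc_rate = 29.43
glottal_rate = 27.15

absolute_drop = mfcc_rate - glottal_rate  # in percentage points
relative_drop = relative_reduction(mfcc_rate, glottal_rate)

print(f"absolute: {absolute_drop:.2f} points, relative: {relative_drop:.1%}")
```

    The improvement is about 2.28 percentage points, which corresponds to a relative reduction of roughly 7.7% of the MFCC baseline error.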

    Mechanical and durability performance of lightweight concrete brick with palm oil fuel ash (POFA)

    Lightweight building materials such as precast roof and wall panels have been widely used in the construction industry, because lightweight materials can benefit the economy and society in terms of manufacturing, transportation and handling costs. One of the most preferred lightweight materials is expanded polystyrene (EPS). EPS consists of 98% air and 2% polystyrene, so its density is very low, which helps reduce the mass of building materials. Numerous studies have shown that EPS contributes significantly to reducing brick density. EPS has been used as an aggregate replacement in concrete. However, the presence of EPS in concrete reduces its strength. For this reason, researchers have extended their work to improving the strength of EPS concrete and brick by adding pozzolanic materials such as fly ash, rice husk ash and silica fume [1-4]. The ability of these pozzolanic materials to enhance the strength of brick or concrete has been proven.

    Numerical simulation analysis on water jet pressure distribution at various nozzle aperture

    A low-velocity water jet is required by a small-scale Unmanned Underwater Vehicle (UUV) to control its position, either to remain static or to perform slow and steady locomotion. However, water jet performance is influenced by the size of the nozzle aperture. By studying the pressure distribution around the nozzle area, the water jet velocity can be determined and characterized. In this study, the ejection pressure was fixed at 23.37 Pa according to the constant actuation. Simulations were conducted using ANSYS Fluent software. The results show that the water jet velocity and dynamic pressure are higher for larger nozzle aperture sizes at constant pressure. The total pressure and dynamic pressure had the lowest pressure drop at a certain nozzle aperture size but became constant when the nozzle was wider. This finding is useful in designing UUVs powered by a contractile water jet thruster.

    Voice Based Biometric System Feature Extraction Using MFCC and LPC Technique

    Interest in using biometric technologies for person authentication in security systems has grown rapidly. Voice is one of the most promising and mature biometric modalities for secure access control. This paper gives an experimental overview of techniques used for feature extraction in speaker recognition. Research in speaker recognition has evolved from short-time features reflecting the spectral properties of speech (low-level or physical traits) to high-level features (behavioral traits) such as prosody, phonetic information and conversational patterns. We first give a brief overview of the relation between speech processing and voice biometrics and then describe some feature extraction techniques. We have performed feature extraction experiments with the MFCC and LPC techniques.

    GLOTTAL EXCITATION EXTRACTION OF VOICED SPEECH - JOINTLY PARAMETRIC AND NONPARAMETRIC APPROACHES

    The goal of this dissertation is to develop methods to recover glottal flow pulses, which contain biometric information about the speaker. The excitation information estimated from an observed speech utterance is modeled as the source of an inverse problem. Windowed linear prediction analysis and inverse filtering are first used to deconvolve the speech signal to obtain a rough estimate of the glottal flow pulses. Linear prediction and its inverse filtering can largely eliminate the vocal-tract response, which is usually modeled as an infinite impulse response filter. Remaining vocal-tract components that reside in the estimate after inverse filtering are next removed by maximum-phase and minimum-phase decomposition, which is implemented by applying the complex cepstrum to the initial estimate of the glottal pulses. The additive and residual errors from inverse filtering can be suppressed by higher-order statistics, which is the method used to calculate cepstrum representations. Some features directly provided by the glottal source's cepstrum representation, as well as fitting parameters for estimated pulses, are used to form feature patterns that were applied to a minimum-distance classifier to realize a speaker identification system with very limited subjects.
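
    The first stage described above, windowed linear prediction followed by inverse filtering, can be sketched as follows. This is a minimal illustration under stated assumptions, not the dissertation's implementation: it assumes the autocorrelation method with a Hamming window, and the function names `lpc_coefficients` and `inverse_filter` are hypothetical. The later stages (complex-cepstrum decomposition and higher-order statistics) are not covered.

```python
import numpy as np


def lpc_coefficients(frame, order):
    """Autocorrelation-method LPC (assumed variant).

    Returns the inverse filter A(z) as [1, -a1, ..., -ap].
    """
    windowed = frame * np.hamming(len(frame))
    n = len(windowed)
    # Autocorrelation of the windowed frame at lags 0..order.
    r = np.array([np.dot(windowed[:n - k], windowed[k:]) for k in range(order + 1)])
    # Toeplitz normal equations R a = r[1:], solved directly.
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    a = np.linalg.solve(R, r[1:])
    return np.concatenate(([1.0], -a))


def inverse_filter(frame, inv_filter):
    """Apply the FIR filter A(z) to remove the estimated all-pole response."""
    return np.convolve(frame, inv_filter)[:len(frame)]
```

    Calling `inverse_filter(frame, lpc_coefficients(frame, order))` on a voiced frame yields a residual that serves only as a rough excitation estimate; the choice of `order` is left to the caller and depends on the sampling rate.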

    Speaker Recognition Systems: A Tutorial

    This paper gives an overview of speaker recognition systems. Speaker recognition is the task of automatically recognizing who is speaking by identifying an unknown speaker among several reference speakers using speaker-specific information included in speech waves. The different classifications of speaker recognition and the speech processing techniques required for performing the recognition task are discussed. The basic modules of a speaker recognition system are outlined and discussed. Some of the techniques required to implement each module of the system are discussed and others are mentioned. The methods are also compared with one another. Finally, the paper concludes with a few research trends in speaker recognition for the years to come.

    An Interactive and Efficient Voice Processing For Home Automation System

    Home networking has evolved from linked personal computers to a more complex system that encompasses advanced security and automation applications. Once reserved for high-end luxury homes, home networks are now a regular feature in residences. These networks allow users to consolidate heating, air conditioning, lighting, appliances, entertainment, intercom, telecommunication, surveillance and security systems into an easy-to-operate unified network. Interactive applications operated by voice recognition, for example integrated door security systems and the ability to control home appliances, are key features of home automation networks. This interactive capability depends on high-quality voice processing technology, including acoustic echo cancellation, low signal distortion and noise reduction techniques. A home automation system must also be scalable to allow future evolution, flexible to support field upgrades, interactive, easy to use, cost-efficient and reliable. This article introduces some of the voice quality performance issues and design challenges unique to home automation systems. It discusses home automation network applications that rely on voice processing, and examines some of the critical features and functionality that can help ease design complexity and cost to deliver enhanced performance.

    Physiologically-Motivated Feature Extraction Methods for Speaker Recognition

    Speaker recognition has received a great deal of attention from the speech community, and significant gains in robustness and accuracy have been obtained over the past decade. However, the features used for identification are still primarily representations of overall spectral characteristics, and thus the models are primarily phonetic in nature, differentiating speakers based on overall pronunciation patterns. This creates difficulties in terms of the amount of enrollment data and the complexity of the models required to cover the phonetic space, especially in tasks such as identification where enrollment and testing data may not have similar phonetic coverage. This dissertation introduces new features based on vocal source characteristics intended to capture physiological information related to the laryngeal excitation energy of a speaker. These features, including RPCC, GLFCC and TPCC, represent the unique characteristics of speech production not represented in current state-of-the-art speaker identification systems. The proposed features are evaluated through three experimental paradigms: cross-lingual speaker identification, cross song-type avian speaker identification and mono-lingual speaker identification. The experimental results show that the proposed features provide information about speaker characteristics that is significantly different in nature from the phonetically focused information present in traditional spectral features. The incorporation of the proposed glottal source features offers significant overall improvement to the robustness and accuracy of speaker identification tasks.