
    Identity verification in virtual education through biometric analysis based on keystroke dynamics

    Virtual education has become one of the tools most widely used by students at all educational levels, not only because of its convenience and flexibility, but also because it can expand educational coverage. These benefits bring security and reliability problems when evaluating the student's knowledge, because traditional identity verification strategies, such as the combination of username and password, do not guarantee that the student enrolled in the course is the one who actually takes the exam. A system with a different type of verification strategy is therefore needed to differentiate valid users from impostors. This study proposes a new verification system based on distances computed among Gaussian Mixture Models created from different writing tasks. The proposed approach is evaluated in two modalities, namely intrusive verification and non-intrusive verification. The intrusive mode yields a false positive rate of around 16 %, while the non-intrusive mode yields a false positive rate of 12 %. In addition, the proposed strategy for non-intrusive verification is compared with a work previously reported in the literature, and the results show that our approach reduces the equal error rate by about 24.3 %. The implemented strategy needs no additional hardware; only the computer keyboard is required to complete the user verification, which makes the system attractive, flexible, and practical for virtual education platforms.
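The abstract above reports a false positive rate and an equal error rate (EER). As a minimal sketch of how such figures are typically computed from verification scores, the Python below scans candidate thresholds and returns the point where the false acceptance and false rejection rates are closest. The function names, the toy scores, and the convention that a higher score means "more likely genuine" are assumptions for illustration only; the paper works with distances between Gaussian Mixture Models, which would need to be negated or otherwise mapped to fit this convention.

```python
def error_rates(genuine, impostor, threshold):
    """Return (FAR, FRR) when scores >= threshold are accepted as genuine."""
    # False acceptance rate: fraction of impostor attempts that are accepted.
    far = sum(s >= threshold for s in impostor) / len(impostor)
    # False rejection rate: fraction of genuine attempts that are rejected.
    frr = sum(s < threshold for s in genuine) / len(genuine)
    return far, frr


def equal_error_rate(genuine, impostor):
    """Approximate the EER by trying every observed score as a threshold."""
    best = None
    for t in sorted(set(genuine) | set(impostor)):
        far, frr = error_rates(genuine, impostor, t)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            # Report the midpoint of FAR and FRR at the closest crossing.
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]


# Toy similarity scores, made up for illustration (not data from the paper).
genuine_scores = [0.9, 0.8, 0.75, 0.6, 0.4]
impostor_scores = [0.7, 0.5, 0.3, 0.2, 0.1]

eer, threshold = equal_error_rate(genuine_scores, impostor_scores)
print(f"EER ~ {eer:.2f} at threshold {threshold}")
```

With so few scores the estimate is coarse; a real evaluation would use many more trials and usually interpolate between thresholds.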

    A Speech Intelligibility Estimation Method Based on Hidden Markov Model

    This paper proposes a speech intelligibility estimation method based on the hidden Markov model (HMM), which is widely used for speech recognition. The HMM-based method is a form of non-intrusive speech quality measurement, meaning that it operates without a reference speech signal. The log-likelihood score of the HMM is converted to a normalized intelligibility score. We estimate the speech intelligibility of standard digital speech coders. The experimental results show that the proposed HMM-based method performs better than the conventional non-intrusive speech intelligibility evaluation tool.

    Occupancy estimation in smart buildings using audio-processing techniques

    In the past few years, several case studies have shown that using occupancy information in buildings leads to energy-efficient and low-cost HVAC operation. Widely presented techniques for occupancy estimation rely on temperature, humidity, CO2 concentration, image cameras, motion sensors, and passive infrared (PIR) sensors. So far, few studies in the literature have used audio and speech processing as an indoor occupancy prediction technique. With rapid advances in audio and speech processing technologies, it is now more feasible and attractive to integrate an audio-based signal processing component into smart buildings. In this work, we propose to use audio processing techniques (i.e., speaker recognition and background audio energy estimation) to estimate room occupancy (i.e., the number of people inside a room). Theoretical analysis and simulation results demonstrate the accuracy and effectiveness of the proposed occupancy estimation technique. Based on the occupancy estimate, smart buildings will adjust thermostat settings and HVAC operation, thus achieving greater quality of service and drastic cost savings.

    Speech Enhancement for Automatic Analysis of Child-Centered Audio Recordings

    Analysis of child-centred daylong naturalistic audio recordings has become a de facto research protocol in the scientific study of child language development. Researchers increasingly use these recordings to understand the linguistic environment a child encounters in her routine interactions with the world. The recordings are captured by a microphone that the child wears throughout the day. Being naturalistic, they contain many unwanted sounds from everyday life, which degrade the performance of speech analysis tasks. The purpose of this thesis is to investigate the utility of speech enhancement (SE) algorithms in the automatic analysis of such recordings. To this end, several classical signal processing and modern machine learning-based SE methods were employed 1) as a denoiser for speech corrupted with additive noise sampled from real-life child-centred daylong recordings and 2) as a front-end for the downstream speech processing tasks of addressee classification (infant- vs. adult-directed speech) and automatic syllable count estimation from speech. The downstream tasks were conducted on data derived from a set of geographically, culturally, and linguistically diverse child-centred daylong audio recordings. Denoising performance was evaluated through objective quality metrics (spectral distortion and instrumental intelligibility) and through downstream task performance. Finally, the objective evaluation results were compared with the downstream task results to determine whether objective metrics can serve as a reasonable proxy for selecting an SE front-end for a downstream task. The results show that a recently proposed Long Short-Term Memory (LSTM)-based progressive learning architecture provides the largest performance gains in the downstream tasks compared with the other SE methods and the baseline results. Classical signal processing-based SE methods also lead to competitive performance. The comparison of objective assessment and downstream task performance revealed no predictive relationship between task-independent objective metrics and downstream task performance.

    A non-intrusive method for estimating binaural speech intelligibility from noise-corrupted signals captured by a pair of microphones

    A non-intrusive method is introduced to predict binaural speech intelligibility in noise directly from signals captured using a pair of microphones. The approach combines signal processing techniques for blind source separation and localisation with an intrusive objective intelligibility measure (OIM). Therefore, unlike classic intrusive OIMs, this method requires neither a clean reference speech signal nor knowledge of the source locations. The proposed approach can estimate intelligibility in stationary and fluctuating noise when the noise masker is presented as a point or diffuse source and is spatially separated from the target speech source on a horizontal plane. The performance of the proposed method was evaluated in two rooms. When predicting subjective intelligibility measured as word recognition rate, the method showed reasonable predictive accuracy, with correlation coefficients above 0.82, which is comparable to that of a reference intrusive OIM in most conditions. The proposed approach offers a solution for fast binaural intelligibility prediction and therefore has practical potential for deployment in situations where on-site speech intelligibility is a concern.

    Spatial features of reverberant speech: estimation and application to recognition and diarization

    Distant talking scenarios, such as hands-free calling or teleconference meetings, are essential for natural and comfortable human-machine interaction, and they are increasingly used in multiple contexts. The speech signal acquired in such scenarios is reverberant and affected by additive noise. This distortion degrades the performance of speech recognition and diarization systems, making human-machine interaction troublesome. This thesis proposes a method to non-intrusively estimate room acoustic parameters, paying special attention to a parameter that is highly correlated with speech recognition degradation: the clarity index. In addition, a method to provide information about the estimation accuracy is proposed. An analysis of phoneme recognition performance in multiple reverberant environments is presented, from which a confusability metric for each phoneme is derived. This confusability metric is then employed to improve reverberant speech recognition performance. Room acoustic parameters can also be used in speech recognition to provide robustness against reverberation, and a method that exploits clarity index estimates to perform reverberant speech recognition is introduced. Finally, room acoustic parameters can be used to diarize reverberant speech. A room acoustic parameter is proposed as an additional source of information for single-channel diarization in reverberant environments. In multi-channel environments, the time delay of arrival is a feature commonly used to diarize the input speech; however, the computation of this feature is affected by reverberation. A method is presented to model the time delay of arrival robustly so that speaker diarization is performed more accurately.
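For context, the clarity index mentioned in the abstract above is conventionally defined from a room impulse response $h(t)$ as an early-to-late energy ratio. The form below is the standard textbook definition, with the boundary $t_e$ commonly set to 50 ms for speech ($C_{50}$). Whether the thesis uses exactly this variant is an assumption, and the thesis estimates the parameter non-intrusively, that is, without access to $h(t)$.

```latex
C_{t_e} = 10 \log_{10} \left(
  \frac{\int_{0}^{t_e} h^{2}(t)\,dt}{\int_{t_e}^{\infty} h^{2}(t)\,dt}
\right) \ \text{dB}
```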