3,650 research outputs found

    Human abnormal behavior impact on speaker verification systems

    Get PDF
    Human behavior plays a major role in improving human-machine communication. The performance must be affected by abnormal behavior as systems are trained using normal utterances. The abnormal behavior is often associated with a change in the human emotional state. Different emotional states cause physiological changes in the human body that affect the vocal tract. Fear, anger, or even happiness we recognize as a deviation from a normal behavior. The whole spectrum of human-machine application is susceptible to behavioral changes. Abnormal behavior is a major factor, especially for security applications such as verification systems. Face, fingerprint, iris, or speaker verification is a group of the most common approaches to biometric authentication today. This paper discusses human normal and abnormal behavior and its impact on the accuracy and effectiveness of automatic speaker verification (ASV). The support vector machines classifier inputs are Mel-frequency cepstral coefficients and their dynamic changes. For this purpose, the Berlin Database of Emotional Speech was used. Research has shown that abnormal behavior has a major impact on the accuracy of verification, where the equal error rate increase to 37 %. This paper also describes a new design and application of the ASV system that is much more immune to the rejection of a target user with abnormal behavior.Web of Science6401274012

    A Novel Approach for Speech to Text Recognition System Using Hidden Markov Model

    Get PDF
    Speech recognition is the application of sophisticated algorithms which involve the transforming of the human voice to text. Speech identification is essential as it utilizes by several biometric identification systems and voice-controlled automation systems. Variations in recording equipment, speakers, situations, and environments make speech recognition a tough undertaking. Three major phases comprise speech recognition: speech pre-processing, feature extraction, and speech categorization. This work presents a comprehensive study with the objectives of comprehending, analyzing, and enhancing these models and approaches, such as Hidden Markov Models and Artificial Neural Networks, employed in the voice recognition system for feature extraction and classification

    Security and privacy problems in voice assistant applications: A survey

    Get PDF
    Voice assistant applications have become omniscient nowadays. Two models that provide the two most important functions for real-life applications (i.e., Google Home, Amazon Alexa, Siri, etc.) are Automatic Speech Recognition (ASR) models and Speaker Identification (SI) models. According to recent studies, security and privacy threats have also emerged with the rapid development of the Internet of Things (IoT). The security issues researched include attack techniques toward machine learning models and other hardware components widely used in voice assistant applications. The privacy issues include technical-wise information stealing and policy-wise privacy breaches. The voice assistant application takes a steadily growing market share every year, but their privacy and security issues never stopped causing huge economic losses and endangering users' personal sensitive information. Thus, it is important to have a comprehensive survey to outline the categorization of the current research regarding the security and privacy problems of voice assistant applications. This paper concludes and assesses five kinds of security attacks and three types of privacy threats in the papers published in the top-tier conferences of cyber security and voice domain

    A survey on artificial intelligence-based acoustic source identification

    Get PDF
    The concept of Acoustic Source Identification (ASI), which refers to the process of identifying noise sources has attracted increasing attention in recent years. The ASI technology can be used for surveillance, monitoring, and maintenance applications in a wide range of sectors, such as defence, manufacturing, healthcare, and agriculture. Acoustic signature analysis and pattern recognition remain the core technologies for noise source identification. Manual identification of acoustic signatures, however, has become increasingly challenging as dataset sizes grow. As a result, the use of Artificial Intelligence (AI) techniques for identifying noise sources has become increasingly relevant and useful. In this paper, we provide a comprehensive review of AI-based acoustic source identification techniques. We analyze the strengths and weaknesses of AI-based ASI processes and associated methods proposed by researchers in the literature. Additionally, we did a detailed survey of ASI applications in machinery, underwater applications, environment/event source recognition, healthcare, and other fields. We also highlight relevant research directions

    Microphone smart device fingerprinting from video recordings

    Get PDF
    This report aims at summarizing the on-going research activity carried out by DG-JRC in the framework of the institutional project Authors and Victims Identification of Child Abuse on-line, concerning the use of microphone fingerprinting for source device classification. Starting from an exhaustive study of the State of Art regarding the matter, this report describes a feasibility study about the adoption of microphone fingerprinting for source identification of video recordings. A set of operational scenarios have been established in collaboration with EUROPOL law enforcers, according to investigators needs. A critical analysis of the obtained results has demonstrated the feasibility of microphone fingerprinting and it has suggested a set of recommendations, both in terms of usability and future researches in the field.JRC.E.3-Cyber and Digital Citizens' Securit

    Vulnerability assessment in the use of biometrics in unsupervised environments

    Get PDF
    Mención Internacional en el título de doctorIn the last few decades, we have witnessed a large-scale deployment of biometric systems in different life applications replacing the traditional recognition methods such as passwords and tokens. We approached a time where we use biometric systems in our daily life. On a personal scale, the authentication to our electronic devices (smartphones, tablets, laptops, etc.) utilizes biometric characteristics to provide access permission. Moreover, we access our bank accounts, perform various types of payments and transactions using the biometric sensors integrated into our devices. On the other hand, different organizations, companies, and institutions use biometric-based solutions for access control. On the national scale, police authorities and border control measures use biometric recognition devices for individual identification and verification purposes. Therefore, biometric systems are relied upon to provide a secured recognition where only the genuine user can be recognized as being himself. Moreover, the biometric system should ensure that an individual cannot be identified as someone else. In the literature, there are a surprising number of experiments that show the possibility of stealing someone’s biometric characteristics and use it to create an artificial biometric trait that can be used by an attacker to claim the identity of the genuine user. There were also real cases of people who successfully fooled the biometric recognition system in airports and smartphones [1]–[3]. That urges the necessity to investigate the potential threats and propose countermeasures that ensure high levels of security and user convenience. Consequently, performing security evaluations is vital to identify: (1) the security flaws in biometric systems, (2) the possible threats that may target the defined flaws, and (3) measurements that describe the technical competence of the biometric system security. Identifying the system vulnerabilities leads to proposing adequate security solutions that assist in achieving higher integrity. This thesis aims to investigate the vulnerability of fingerprint modality to presentation attacks in unsupervised environments, then implement mechanisms to detect those attacks and avoid the misuse of the system. To achieve these objectives, the thesis is carried out in the following three phases. In the first phase, the generic biometric system scheme is studied by analyzing the vulnerable points with special attention to the vulnerability to presentation attacks. The study reviews the literature in presentation attack and the corresponding solutions, i.e. presentation attack detection mechanisms, for six biometric modalities: fingerprint, face, iris, vascular, handwritten signature, and voice. Moreover, it provides a new taxonomy for presentation attack detection mechanisms. The proposed taxonomy helps to comprehend the issue of presentation attacks and how the literature tried to address it. The taxonomy represents a starting point to initialize new investigations that propose novel presentation attack detection mechanisms. In the second phase, an evaluation methodology is developed from two sources: (1) the ISO/IEC 30107 standard, and (2) the Common Evaluation Methodology by the Common Criteria. The developed methodology characterizes two main aspects of the presentation attack detection mechanism: (1) the resistance of the mechanism to presentation attacks, and (2) the corresponding threat of the studied attack. The first part is conducted by showing the mechanism's technical capabilities and how it influences the security and ease-of-use of the biometric system. The second part is done by performing a vulnerability assessment considering all the factors that affect the attack potential. Finally, a data collection is carried out, including 7128 fingerprint videos of bona fide and attack presentation. The data is collected using two sensing technologies, two presentation scenarios, and considering seven attack species. The database is used to develop dynamic presentation attack detection mechanisms that exploit the fingerprint spatio-temporal features. In the final phase, a set of novel presentation attack detection mechanisms is developed exploiting the dynamic features caused by the natural fingerprint phenomena such as perspiration and elasticity. The evaluation results show an efficient capability to detect attacks where, in some configurations, the mechanisms are capable of eliminating some attack species and mitigating the rest of the species while keeping the user convenience at a high level.En las últimas décadas, hemos asistido a un despliegue a gran escala de los sistemas biométricos en diferentes aplicaciones de la vida cotidiana, sustituyendo a los métodos de reconocimiento tradicionales, como las contraseñas y los tokens. Actualmente los sistemas biométricos ya forman parte de nuestra vida cotidiana: es habitual emplear estos sistemas para que nos proporcionen acceso a nuestros dispositivos electrónicos (teléfonos inteligentes, tabletas, ordenadores portátiles, etc.) usando nuestras características biométricas. Además, accedemos a nuestras cuentas bancarias, realizamos diversos tipos de pagos y transacciones utilizando los sensores biométricos integrados en nuestros dispositivos. Por otra parte, diferentes organizaciones, empresas e instituciones utilizan soluciones basadas en la biometría para el control de acceso. A escala nacional, las autoridades policiales y de control fronterizo utilizan dispositivos de reconocimiento biométrico con fines de identificación y verificación individual. Por lo tanto, en todas estas aplicaciones se confía en que los sistemas biométricos proporcionen un reconocimiento seguro en el que solo el usuario genuino pueda ser reconocido como tal. Además, el sistema biométrico debe garantizar que un individuo no pueda ser identificado como otra persona. En el estado del arte, hay un número sorprendente de experimentos que muestran la posibilidad de robar las características biométricas de alguien, y utilizarlas para crear un rasgo biométrico artificial que puede ser utilizado por un atacante con el fin de reclamar la identidad del usuario genuino. También se han dado casos reales de personas que lograron engañar al sistema de reconocimiento biométrico en aeropuertos y teléfonos inteligentes [1]–[3]. Esto hace que sea necesario investigar estas posibles amenazas y proponer contramedidas que garanticen altos niveles de seguridad y comodidad para el usuario. En consecuencia, es vital la realización de evaluaciones de seguridad para identificar (1) los fallos de seguridad de los sistemas biométricos, (2) las posibles amenazas que pueden explotar estos fallos, y (3) las medidas que aumentan la seguridad del sistema biométrico reduciendo estas amenazas. La identificación de las vulnerabilidades del sistema lleva a proponer soluciones de seguridad adecuadas que ayuden a conseguir una mayor integridad. Esta tesis tiene como objetivo investigar la vulnerabilidad en los sistemas de modalidad de huella dactilar a los ataques de presentación en entornos no supervisados, para luego implementar mecanismos que permitan detectar dichos ataques y evitar el mal uso del sistema. Para lograr estos objetivos, la tesis se desarrolla en las siguientes tres fases. En la primera fase, se estudia el esquema del sistema biométrico genérico analizando sus puntos vulnerables con especial atención a los ataques de presentación. El estudio revisa la literatura sobre ataques de presentación y las soluciones correspondientes, es decir, los mecanismos de detección de ataques de presentación, para seis modalidades biométricas: huella dactilar, rostro, iris, vascular, firma manuscrita y voz. Además, se proporciona una nueva taxonomía para los mecanismos de detección de ataques de presentación. La taxonomía propuesta ayuda a comprender el problema de los ataques de presentación y la forma en que la literatura ha tratado de abordarlo. Esta taxonomía presenta un punto de partida para iniciar nuevas investigaciones que propongan novedosos mecanismos de detección de ataques de presentación. En la segunda fase, se desarrolla una metodología de evaluación a partir de dos fuentes: (1) la norma ISO/IEC 30107, y (2) Common Evaluation Methodology por el Common Criteria. La metodología desarrollada considera dos aspectos importantes del mecanismo de detección de ataques de presentación (1) la resistencia del mecanismo a los ataques de presentación, y (2) la correspondiente amenaza del ataque estudiado. Para el primer punto, se han de señalar las capacidades técnicas del mecanismo y cómo influyen en la seguridad y la facilidad de uso del sistema biométrico. Para el segundo aspecto se debe llevar a cabo una evaluación de la vulnerabilidad, teniendo en cuenta todos los factores que afectan al potencial de ataque. Por último, siguiendo esta metodología, se lleva a cabo una recogida de datos que incluye 7128 vídeos de huellas dactilares genuinas y de presentación de ataques. Los datos se recogen utilizando dos tecnologías de sensor, dos escenarios de presentación y considerando siete tipos de instrumentos de ataque. La base de datos se utiliza para desarrollar y evaluar mecanismos dinámicos de detección de ataques de presentación que explotan las características espacio-temporales de las huellas dactilares. En la fase final, se desarrolla un conjunto de mecanismos novedosos de detección de ataques de presentación que explotan las características dinámicas causadas por los fenómenos naturales de las huellas dactilares, como la transpiración y la elasticidad. Los resultados de la evaluación muestran una capacidad eficiente de detección de ataques en la que, en algunas configuraciones, los mecanismos son capaces de eliminar completamente algunos tipos de instrumentos de ataque y mitigar el resto de los tipos manteniendo la comodidad del usuario en un nivel alto.Programa de Doctorado en Ingeniería Eléctrica, Electrónica y Automática por la Universidad Carlos III de MadridPresidente: Cristina Conde Vila.- Secretario: Mariano López García.- Vocal: Farzin Derav

    Emotion-aware voice interfaces based on speech signal processing

    Get PDF
    Voice interfaces (VIs) will become increasingly widespread in current daily lives as AI techniques progress. VIs can be incorporated into smart devices like smartphones, as well as integrated into autos, home automation systems, computer operating systems, and home appliances, among other things. Current speech interfaces, however, are unaware of users’ emotional states and hence cannot support real communication. To overcome these limitations, it is necessary to implement emotional awareness in future VIs. This thesis focuses on how speech signal processing (SSP) and speech emotion recognition (SER) can enable VIs to gain emotional awareness. Following an explanation of what emotion is and how neural networks are implemented, this thesis presents the results of several user studies and surveys. Emotions are complicated, and they are typically characterized using category and dimensional models. They can be expressed verbally or nonverbally. Although existing voice interfaces are unaware of users’ emotional states and cannot support natural conversations, it is possible to perceive users’ emotions by speech based on SSP in future VIs. One section of this thesis, based on SSP, investigates mental restorative effects on humans and their measures from speech signals. SSP is less intrusive and more accessible than traditional measures such as attention scales or response tests, and it can provide a reliable assessment for attention and mental restoration. SSP can be implemented into future VIs and utilized in future HCI user research. The thesis then moves on to present a novel attention neural network based on sparse correlation features. The detection accuracy of emotions in the continuous speech was demonstrated in a user study utilizing recordings from a real classroom. In this section, a promising result will be shown. In SER research, it is unknown if existing emotion detection methods detect acted emotions or the genuine emotion of the speaker. Another section of this thesis is concerned with humans’ ability to act on their emotions. In a user study, participants were instructed to imitate five fundamental emotions. The results revealed that they struggled with this task; nevertheless, certain emotions were easier to replicate than others. A further study concern is how VIs should respond to users’ emotions if SER techniques are implemented in VIs and can recognize users’ emotions. The thesis includes research on ways for dealing with the emotions of users. In a user study, users were instructed to make sad, angry, and terrified VI avatars happy and were asked if they would like to be treated the same way if the situation were reversed. According to the results, the majority of participants tended to respond to these unpleasant emotions with neutral emotion, but there is a difference among genders in emotion selection. For a human-centered design approach, it is important to understand what the users’ preferences for future VIs are. In three distinct cultures, a questionnaire-based survey on users’ attitudes and preferences for emotion-aware VIs was conducted. It was discovered that there are almost no gender differences. Cluster analysis found that there are three fundamental user types that exist in all cultures: Enthusiasts, Pragmatists, and Sceptics. As a result, future VI development should consider diverse sorts of consumers. In conclusion, future VIs systems should be designed for various sorts of users as well as be able to detect the users’ disguised or actual emotions using SER and SSP technologies. Furthermore, many other applications, such as restorative effects assessments, can be included in the VIs system
    • …
    corecore