5 research outputs found

    Biometric system verification close to "real world" conditions

    Proceedings of Joint COST 2101 and 2102 International Conference, BioID_MultiComm 2009, Madrid, Spain. The final publication is available at Springer via http://dx.doi.org/10.1007/978-3-642-04391-8_31

    In this paper we present an autonomous biometric device developed in the framework of a national project. This system is able to capture speech, hand-geometry, online signature and face, and can open a door when the user is positively verified. Nevertheless, the main purpose is to acquire a database without supervision (databases are normally collected in the presence of a supervisor who tells the user what to do in front of the device, which is an unrealistic situation). This system will permit us to explain the main differences between what we call "real conditions" and "laboratory conditions".

    This work has been supported by FEDER and MEC, TEC2006-13141-C03/TCM, and COST-2102.

    Anchor model fusion for emotion recognition in speech

    Proceedings of Joint COST 2101 and 2102 International Conference, BioID_MultiComm 2009, Madrid (Spain). The final publication is available at Springer via http://dx.doi.org/10.1007/978-3-642-04391-8_7

    In this work, a novel method for system fusion in emotion recognition in speech is presented. The proposed approach, namely Anchor Model Fusion (AMF), exploits the characteristic behaviour of the scores of a speech utterance among different emotion models by mapping them to a back-end anchor-model feature space, followed by an SVM classifier. Experiments are presented on three different databases: Ahumada III, with speech obtained from real forensic cases, and SUSAS Actual and SUSAS Simulated. Results comparing AMF with a simple sum-fusion scheme after normalization show a significant performance improvement of the proposed technique for two of the three experimental set-ups, without degrading performance in the third one.

    This work has been financed under project TEC2006-13170-C02-01.
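
The two fusion schemes compared in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy scores, and the choice of z-score normalization within a single utterance are all assumptions made for illustration, and the SVM back-end is left out. The sketch only shows the sum-fusion baseline and how the scores of several subsystems could be stacked into one anchor-model style feature vector.

```python
from statistics import mean, pstdev


def z_normalize(scores):
    """Scale a list of scores to zero mean and unit variance (assumed normalization)."""
    mu = mean(scores)
    sigma = pstdev(scores)
    if sigma == 0:
        return [0.0 for _ in scores]
    return [(s - mu) / sigma for s in scores]


def sum_fusion(system_scores):
    """Baseline: normalize each subsystem's scores, then add them per emotion model."""
    normalized = [z_normalize(scores) for scores in system_scores]
    return [sum(column) for column in zip(*normalized)]


def anchor_vector(system_scores):
    """Stack all subsystem scores into one feature vector for a back-end classifier."""
    return [s for scores in system_scores for s in scores]


# Hypothetical scores of one utterance against three emotion models,
# produced by two hypothetical subsystems. These are not data from the paper.
system_a = [1.0, 2.0, 3.0]
system_b = [10.0, 30.0, 20.0]

fused = sum_fusion([system_a, system_b])         # one fused score per emotion model
features = anchor_vector([system_a, system_b])   # input for a classifier such as an SVM
```

In the sum-fusion baseline the decision is taken directly from `fused`, whereas in an anchor-model scheme a classifier would be trained on vectors like `features` collected over many utterances.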

    Comparative Analysis of Arabic Vowels using Formants and an Automatic Speech Recognition System

    Arabic, the world's second most spoken language in terms of number of speakers, has not received much attention from the traditional speech processing research community. This study is specifically concerned with the analysis of vowels in Modern Standard Arabic. The first and second formant values of these vowels are investigated, and the differences and similarities between the vowels are explored using consonant-vowel-consonant (CVC) utterances. For this purpose, a Hidden Markov Model (HMM) based recognizer is built to classify the vowels, and the performance of the recognizer is analyzed to help understand the similarities and dissimilarities between the phonetic features of the vowels. The vowels are also analyzed in both the time and frequency domains, and the consistent findings of the analysis are expected to enable future Arabic speech processing tasks such as vowel and speech recognition and classification.

    Towards An Intelligent Fuzzy Based Multimodal Two Stage Speech Enhancement System

    This thesis presents a novel two-stage multimodal speech enhancement system, making use of both visual and audio information to filter speech, and explores the extension of this system with the use of fuzzy logic to demonstrate proof of concept for an envisaged autonomous, adaptive, and context-aware multimodal system. The design of the proposed cognitively inspired framework is scalable, meaning that the techniques used in individual parts of the system can be upgraded and that there is scope for the initial framework presented here to be expanded. In the proposed system, the concept of single-modality two-stage filtering is extended to include the visual modality. Noisy speech information received by a microphone array is first pre-processed by visually derived Wiener filtering employing the novel use of the Gaussian Mixture Regression (GMR) technique, making use of associated visual speech information extracted using a state-of-the-art Semi Adaptive Appearance Models (SAAM) based lip tracking approach. This pre-processed speech is then enhanced further by audio-only beamforming using a state-of-the-art Transfer Function Generalised Sidelobe Canceller (TFGSC) approach. The resulting system is designed to function in challenging noisy speech environments and is evaluated using speech sentences from different speakers in the GRID corpus and a range of noise recordings. Both objective and subjective test results (employing the widely used Perceptual Evaluation of Speech Quality (PESQ) measure, a composite objective measure, and subjective listening tests) show that this initial system is capable of delivering very encouraging results with regard to filtering speech mixtures in difficult reverberant speech environments. Some limitations of this initial framework are identified, and the extension of this multimodal system is explored, with the development of a fuzzy logic based framework and the implementation of a proof-of-concept demonstration.
Results show that this proposed autonomous, adaptive, and context-aware multimodal framework is capable of delivering very positive results in difficult noisy speech environments, with cognitively inspired use of audio and visual information, depending on environmental conditions. Finally, some concluding remarks are made along with proposals for future work.
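
The first stage described above rests on Wiener filtering, which can be sketched for a single frame as follows. This is a generic textbook form of the Wiener gain, not the thesis implementation: the function names and the numeric values are hypothetical, and the clean-speech power estimate is simply taken as given here, whereas the thesis derives it from visual features via GMR. The beamforming stage is not shown.

```python
def wiener_gain(speech_power, noise_power):
    """Wiener gain for one frequency bin: S / (S + N)."""
    total = speech_power + noise_power
    return speech_power / total if total > 0 else 0.0


def wiener_filter(noisy_magnitudes, speech_power_est, noise_power_est):
    """Attenuate each frequency bin of a noisy frame by its Wiener gain."""
    return [
        wiener_gain(s, n) * x
        for x, s, n in zip(noisy_magnitudes, speech_power_est, noise_power_est)
    ]


# Hypothetical values for three frequency bins of one frame (not thesis data).
noisy = [2.0, 4.0, 1.0]        # magnitudes of the noisy speech
speech_est = [3.0, 1.0, 0.0]   # assumed clean-speech power estimate per bin
noise_est = [1.0, 1.0, 2.0]    # assumed noise power estimate per bin

filtered = wiener_filter(noisy, speech_est, noise_est)
```

Bins where the estimated speech power dominates pass almost unchanged, while bins estimated to contain only noise are suppressed.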

    Application of CBIR techniques for the purpose of biometric identification based on human gait

    Intense progress in information and communication technology has enabled the application of biometric technology in identity management. Human gait, as a biometric modality, has great potential for practical application owing to its noninvasive and nonintrusive nature. Surveillance technology is especially fertile ground for recognition based on human gait. These properties have caused a spike in academic interest in this biometric modality in recent years, which in turn has resulted in the development of a large number of approaches to human gait recognition. Nevertheless, the practical application of biometric technology based on human gait still trails well-established modalities such as fingerprint, face or voice. The main reason is the lack of an approach that would enable stable use in realistic conditions. The goal of this work is to propose a new approach to human gait recognition that would result in a robust and affordable biometric system. Initially, a comprehensive review of the research area and existing research was carried out, which served as the basis for the proposed approach. The approach is based on the idea that a human gait sequence can be represented as a single 2D still image. Using images opens the possibility of applying generic Content Based Image Retrieval (CBIR) techniques for the final recognition. This shifts the problem from the spatio-temporal domain to the spatial domain, specifically the domain of 2D still images, which is well researched and offers a large number of proven solutions. For acquisition, the approach relies on a new human-computer interaction technology, Microsoft Kinect.
As proof of concept, a modular laboratory prototype was developed, together with an environment for testing and evaluation. The scientific foundation of the proposed approach was tested through a series of experiments. The experiments were organized to investigate the influence of different factors on the final recognition performance. Based on the results obtained, it is concluded that the proposed approach is highly robust and achieves high recognition precision.
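
The central idea of this abstract, collapsing a gait sequence into one still image and then recognizing the subject by image retrieval, can be sketched as follows. The abstract does not say how the image is built or which CBIR features are used, so everything here is an assumption: the sketch averages binary silhouette frames pixel by pixel (in the spirit of the well-known Gait Energy Image) and retrieves the nearest gallery image by summed absolute pixel difference. All names and toy frames are hypothetical.

```python
def sequence_to_image(frames):
    """Average equally sized binary silhouette frames pixel by pixel."""
    n = len(frames)
    rows, cols = len(frames[0]), len(frames[0][0])
    return [
        [sum(frame[r][c] for frame in frames) / n for c in range(cols)]
        for r in range(rows)
    ]


def distance(img_a, img_b):
    """Sum of absolute pixel differences between two images."""
    return sum(
        abs(a - b)
        for row_a, row_b in zip(img_a, img_b)
        for a, b in zip(row_a, row_b)
    )


def retrieve(query, gallery):
    """Return the gallery label whose image is closest to the query image."""
    return min(gallery, key=lambda label: distance(query, gallery[label]))


# Toy 2x2 "silhouette" sequences for two hypothetical subjects.
gallery = {
    "subject_a": sequence_to_image([[[1, 0], [1, 0]], [[1, 1], [1, 0]]]),
    "subject_b": sequence_to_image([[[0, 1], [0, 1]], [[0, 1], [1, 1]]]),
}
query = sequence_to_image([[[1, 0], [1, 0]], [[1, 0], [1, 1]]])

match = retrieve(query, gallery)
```

Once every sequence is reduced to a single image, any image-retrieval feature or distance could replace the raw pixel difference used here.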