
    Speeded Up Robust Features Descriptor for Iris Recognition Systems

    Biometric systems have gained significant attention for several applications. Iris identification is one of the most sophisticated biometric techniques for effective and confident authentication. Current iris identification systems offer accurate and reliable results based on near-infrared (NIR) images when the images are captured at a fixed distance in a restricted area with user cooperation. However, for color eye images obtained under visible wavelength (VW) light without user cooperation, the efficiency of iris recognition degrades because of noise such as blurred eye images, eyelashes, occlusion and reflection. This work aims to use the Speeded Up Robust Features (SURF) descriptor to retrieve iris characteristics in both NIR and visible-spectrum iris images. The approach is applied and evaluated on the CASIA v1 and IITD v1 databases as NIR iris images and on UBIRIS v1 as color images. The evaluation results showed high accuracy rates of 98.1% on CASIA v1, 98.2% on IITD v1 and 83% on UBIRIS v1 when compared with other methods.
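    As a rough illustration of the SURF matching step described above, the sketch below detects SURF keypoints in two pre-segmented grayscale iris images and counts ratio-test matches. It assumes opencv-contrib-python built with the non-free xfeatures2d module; the image paths, threshold values and the omission of segmentation/normalization are placeholders rather than the paper's actual pipeline.

```python
# Minimal sketch of SURF-based iris matching (not the paper's exact pipeline).
# Assumes opencv-contrib-python with the non-free xfeatures2d module enabled,
# and that both inputs are already segmented/normalized grayscale iris images.
import cv2

def surf_match_count(iris_a, iris_b, hessian_threshold=400, ratio=0.75):
    """Count ratio-test SURF matches between two grayscale iris images."""
    surf = cv2.xfeatures2d.SURF_create(hessianThreshold=hessian_threshold)
    _, des_a = surf.detectAndCompute(iris_a, None)
    _, des_b = surf.detectAndCompute(iris_b, None)
    if des_a is None or des_b is None:
        return 0
    matches = cv2.BFMatcher(cv2.NORM_L2).knnMatch(des_a, des_b, k=2)
    good = []
    for pair in matches:
        # Lowe's ratio test keeps only distinctive correspondences.
        if len(pair) == 2 and pair[0].distance < ratio * pair[1].distance:
            good.append(pair[0])
    return len(good)

# Usage (placeholder file names): a larger match count suggests the same
# subject; the accept/reject threshold would be tuned on a development set.
probe = cv2.imread("iris_probe.png", cv2.IMREAD_GRAYSCALE)
gallery = cv2.imread("iris_gallery.png", cv2.IMREAD_GRAYSCALE)
print(surf_match_count(probe, gallery))
```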

    Open-set Speaker Identification

    This study is motivated by the growing need for effective extraction of intelligence and evidence from audio recordings in the fight against crime, a need made ever more apparent with the recent expansion of criminal and terrorist organisations. The main focus is to enhance the open-set speaker identification process within speaker identification systems, which is affected by noisy audio data obtained in uncontrolled environments such as streets, restaurants or other places of business. Consequently, two investigations are initially carried out: the effects of environmental noise on the accuracy of open-set speaker recognition, thoroughly covering relevant conditions in the considered application areas such as variable training data length, background noise and real-world noise; and the effects of short and varied-duration reference data in open-set speaker recognition. The investigations led to a novel method termed "vowel boosting" to enhance the reliability of speaker identification when operating with speech data of varied duration under uncontrolled conditions. Vowels naturally contain more speaker-specific information; emphasising this natural phenomenon in the speech data therefore enables better identification performance. The traditional state-of-the-art GMM-UBM and i-vector approaches are used to evaluate "vowel boosting". The proposed approach boosts the impact of the vowels on the speaker scores, which improves recognition accuracy for the specific case of open-set identification with short and varied durations of speech material.
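    The following sketch illustrates the general idea behind "vowel boosting" under simple assumptions: per-frame speaker scores (for example GMM-UBM log-likelihood ratios) are averaged with a larger weight on frames labelled as vowels. The vowel labelling, weight value and score type are assumptions for illustration, not the thesis's exact formulation.

```python
# Hedged sketch of the "vowel boosting" idea: frames labelled as vowels get
# extra weight when accumulating per-frame speaker scores. The vowel detector,
# boost factor and score type are illustrative assumptions.
import numpy as np

def boosted_score(frame_scores, vowel_mask, boost=2.0):
    """Weighted average of per-frame scores, emphasising vowel frames.

    frame_scores: per-frame log-likelihood ratios (target model vs. UBM).
    vowel_mask:   boolean array, True where a frame was classified as a vowel.
    boost:        multiplicative weight applied to vowel frames.
    """
    frame_scores = np.asarray(frame_scores, dtype=float)
    weights = np.where(np.asarray(vowel_mask, dtype=bool), boost, 1.0)
    return float(np.sum(weights * frame_scores) / np.sum(weights))

# Open-set decision: accept the best-scoring enrolled speaker only if the
# boosted score clears a threshold, otherwise label the speaker as unknown.
scores = np.random.randn(200) * 0.5 + 0.2   # toy per-frame LLRs
vowels = np.random.rand(200) > 0.6          # toy vowel labels
print(boosted_score(scores, vowels))
```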

    Comparative Study And Analysis Of Quality Based Multibiometric Technique Using Fuzzy Inference System

    Biometrics is the science and technology of measuring and analyzing biological data, i.e. physical or behavioral traits, to uniquely recognize a person. Prior studies of biometric verification systems that fuse several biometric sources have been shown to outperform single-biometric systems. However, a fusion approach that does not consider the quality of the data used can degrade system performance; in some cases the fused system may perform worse than either of the single systems. In order to overcome this limitation, this study proposes a quality-based fusion scheme by designing a fuzzy inference system (FIS) that is able to determine the optimum weights for combining the modalities in the fusion system under changing conditions. For this purpose, fusion systems combining two modalities, speech and lip traits, are experimented with. For the speech signal, Mel Frequency Cepstral Coefficients (MFCC) are used as features, while the region of interest (ROI) of the lip image is employed as the lip features. A support vector machine (SVM) is then used as the classifier of the verification system. For validation, common fusion schemes, i.e. the minimum rule, maximum rule, simple sum rule and weighted sum rule, are compared to the proposed quality-based fusion scheme. From the experimental results at 35 dB speech SNR and 0.8 lip quality density, the EER percentages for the speech, lip, minimum rule, maximum rule, simple sum rule and weighted sum rule systems are 5.9210%, 37.2157%, 33.2676%, 31.1364%, 4.0112% and 14.9023% respectively, compared with 1.9974% for the Sugeno-type FIS and 1.9745% for the Mamdani-type FIS.
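    A hand-rolled sketch of the quality-based weighting idea is shown below: the speech SNR and the lip quality density are mapped, through illustrative triangular membership functions and a small Sugeno-style rule base, to a weight used in a weighted-sum score fusion. The membership functions, rules and output values are assumptions and do not reproduce the study's actual FIS design.

```python
# Minimal hand-rolled sketch of quality-weighted score fusion in the spirit of
# a fuzzy inference system. Membership functions and rules are illustrative.
import numpy as np

def tri(x, a, b, c):
    """Triangular membership function on [a, c] peaking at b."""
    return float(max(min((x - a) / (b - a + 1e-9), (c - x) / (c - b + 1e-9)), 0.0))

def speech_weight(snr_db, lip_quality):
    """Map input qualities to a weight for the speech score (Sugeno-style)."""
    snr_low, snr_high = tri(snr_db, -5, 0, 20), tri(snr_db, 10, 35, 60)
    lip_low, lip_high = tri(lip_quality, 0.0, 0.2, 0.6), tri(lip_quality, 0.3, 0.8, 1.0)
    # Illustrative rules: trust the modality whose quality is higher.
    rules = [
        (min(snr_high, lip_low), 0.9),   # clean speech, poor lip  -> high speech weight
        (min(snr_high, lip_high), 0.6),  # both good               -> balanced
        (min(snr_low, lip_high), 0.1),   # noisy speech, good lip  -> low speech weight
        (min(snr_low, lip_low), 0.5),    # both poor               -> no preference
    ]
    num = sum(w * out for w, out in rules)
    den = sum(w for w, _ in rules) + 1e-9
    return num / den

def fuse(speech_score, lip_score, snr_db, lip_quality):
    w = speech_weight(snr_db, lip_quality)
    return w * speech_score + (1.0 - w) * lip_score

print(fuse(speech_score=0.7, lip_score=0.4, snr_db=35, lip_quality=0.8))
```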

    Fusion of Audio and Visual Information for Implementing Improved Speech Recognition System

    Speech recognition is a very useful technology because of its potential to support applications suited to the various needs of users. This research is an attempt to enhance the performance of a speech recognition system by combining visual features (lip movement) with audio features. The results were calculated using utterances of numerals collected from both male and female participants. Discrete Cosine Transform (DCT) coefficients were used as visual features and Mel Frequency Cepstral Coefficients (MFCC) were used as audio features. Classification was then carried out using a Support Vector Machine (SVM). The results obtained from the combined/fused system were compared with the recognition rates of the two standalone systems (audio only and visual only).
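    The sketch below shows one plausible feature-level realisation of this fusion: utterance-level MFCCs from the audio, a low-frequency block of 2-D DCT coefficients from the lip ROI, concatenated and classified with an SVM. The file names, feature dimensions and concatenation strategy are illustrative assumptions, not the paper's exact setup.

```python
# Hedged sketch of feature-level audio-visual fusion: MFCCs for the audio
# stream, 2-D DCT coefficients of the lip ROI for the visual stream,
# concatenated and fed to an SVM. ROI extraction and frame alignment omitted.
import numpy as np
import librosa
from scipy.fft import dctn
from sklearn.svm import SVC

def audio_features(wav_path, n_mfcc=13):
    y, sr = librosa.load(wav_path, sr=None)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)
    return mfcc.mean(axis=1)                # utterance-level MFCC summary

def visual_features(lip_roi_gray, keep=6):
    coeffs = dctn(lip_roi_gray.astype(float), norm="ortho")
    return coeffs[:keep, :keep].ravel()     # low-frequency DCT block

def fused_vector(wav_path, lip_roi_gray):
    return np.concatenate([audio_features(wav_path), visual_features(lip_roi_gray)])

# Training/usage (placeholder data): X is a matrix of fused vectors and y the
# digit labels 0-9.
# clf = SVC(kernel="rbf").fit(X, y)
# clf.predict(fused_vector("digit_07.wav", lip_roi)[None, :])
```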

    Emotion Recognition from Acted and Spontaneous Speech

    This doctoral thesis deals with emotion recognition from speech signals. The thesis is divided into two main parts. The first part describes the proposed approaches for emotion recognition using two multilingual databases of acted emotional speech. The main contributions of this part are a detailed analysis of a large set of acoustic features, new classification schemes for vocal emotion recognition such as "emotion coupling", and a new method for mapping discrete emotions into a two-dimensional space. The second part is devoted to emotion recognition using multilingual databases of spontaneous emotional speech based on telephone records obtained from real call centers. The knowledge gained from the experiments on emotion recognition from acted speech was exploited to design a new approach for classifying seven spontaneous emotional states. The core of the proposed approach is a complex classification architecture based on the fusion of different systems. The thesis also examines the influence of the speaker's emotional state on gender recognition performance and proposes a system for the automatic identification of successful phone calls in call centers by means of dialogue features between the call participants.
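    As a small illustration of placing discrete emotions in a two-dimensional space, the sketch below assigns valence-arousal coordinates to a handful of emotion labels following the common circumplex convention. The labels and coordinates are illustrative assumptions; the thesis derives its own mapping.

```python
# Hedged sketch of mapping discrete emotion labels to a two-dimensional
# valence-arousal space. Coordinates follow the usual circumplex convention
# and are illustrative, not the thesis's learned mapping.
EMOTION_2D = {
    "anger":     (-0.6,  0.8),
    "fear":      (-0.7,  0.6),
    "sadness":   (-0.7, -0.5),
    "boredom":   (-0.3, -0.7),
    "neutral":   ( 0.0,  0.0),
    "happiness": ( 0.8,  0.6),
    "surprise":  ( 0.3,  0.8),
}

def to_valence_arousal(label):
    """Return (valence, arousal) for a recognised discrete emotion label."""
    return EMOTION_2D[label.lower()]

print(to_valence_arousal("happiness"))   # (0.8, 0.6)
```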

    Reconhecimento biométrico considerando a deformação não linear da íris humana (Biometric recognition considering the non-linear deformation of the human iris)

    Biometric systems that use the information contained in iris texture have received great attention in recent years. The extraordinary variation in iris texture allows the creation of recognition and identification systems with almost zero error rates. However, research in this area generally ignores the problems associated with iris contraction and dilation movements, which can result in significant differences between the enrollment images and the probe image. This work, in addition to developing a traditional iris recognition system comprising the steps of detection, segmentation, normalization, encoding and comparison, quantitatively determines the effect of iris motion on recognition accuracy. In addition, it proposes a new method to reduce the influence of iris dynamics, verified through the decidability and the Equal Error Rate (EER) obtained when comparing iris codes in very different dilation states. The new method uses the Dynamic Time Warping technique to correct and compare the gradient vectors extracted from the iris texture. In this way, the most discriminant features of the probe image and the enrollment image are aligned and compared while accounting for the non-linear distortion of the iris tissue. Experimental results using dynamic images indicate that system performance degrades when images in different contraction states are compared. For direct comparison between strongly contracted and strongly dilated irises, the proposed method improves the decidability from 3.50 to 4.39 and the EER from 9.69% to 3.36%. (Doctoral thesis supported by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior and the Fundação de Amparo a Pesquisa do Estado de Minas Gerais.)
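    The sketch below shows a generic Dynamic Time Warping comparison of two one-dimensional feature profiles, standing in for gradient vectors extracted from the normalized iris; the non-linear stretching caused by pupil contraction or dilation is absorbed by the warping path. The toy profiles and the absolute-difference local cost are assumptions, not the thesis's exact matching procedure.

```python
# Hedged sketch: classic DTW used to compare two 1-D feature sequences so that
# non-linear stretching (e.g. due to pupil dilation) can be absorbed. This is a
# generic DTW, not the thesis's exact gradient-vector matching procedure.
import numpy as np

def dtw_distance(seq_a, seq_b):
    """O(len(a)*len(b)) DTW with absolute-difference local cost."""
    a, b = np.asarray(seq_a, float), np.asarray(seq_b, float)
    n, m = len(a), len(b)
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            acc[i, j] = cost + min(acc[i - 1, j], acc[i, j - 1], acc[i - 1, j - 1])
    return acc[n, m] / (n + m)   # length-normalized alignment cost

# Lower cost -> more similar profiles despite different dilation states.
profile_contracted = np.sin(np.linspace(0.0, 6.28, 120))
profile_dilated = np.sin(np.linspace(0.0, 6.28, 150))   # same pattern, stretched
print(dtw_distance(profile_contracted, profile_dilated))
```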

    Discriminative classifiers for speaker recognition

    Keywords: Speaker Recognition, Speaker Verification, Sparse Kernel Logistic Regression, Support Vector Machine. Magdeburg, Univ., Fak. für Elektrotechnik und Informationstechnik, Diss., 2008, by Marcel Kat