977 research outputs found

    Unfamiliar facial identity registration and recognition performance enhancement

    This thesis studies the robustness of face recognition systems, with specific attention to handling the complexity of image variation and the inherently limited Unique Characteristic Information (UCI) available in an unfamiliar-identity recognition environment. These issues are the main themes in developing a joint understanding of extraction and classification strategies, and they are addressed in two interdependent blocks of research work. The complexity of the image variation problem is built up from factors including viewing geometry, illumination, occlusion and other kinds of intrinsic and extrinsic image variation. Ideally, recognition performance increases whenever the variation is reduced and/or the UCI is increased. However, reducing variation in 2D facial images may result in the loss of important clues or UCI data for a particular face; alternatively, increasing the UCI may also increase the image variation. To reduce the loss of information while reducing or compensating for the variation complexity, a hybrid technique is proposed in this thesis. The technique is derived from three conventional approaches to the variation compensation and feature extraction tasks. In the first research block, transformation, modelling and compensation approaches are combined to deal with the variation complexity. The ultimate aim of this combination is to represent the UCI (transformation) without losing important features through modelling, and to discard (compensation) and reduce the level of variation complexity of a given face image. Experimental results have shown that discarding certain obvious variation enhances the desired information rather than risking the loss of the UCI of interest. The modelling and compensation stages benefit both variation reduction and UCI enhancement. 
Colour, gray-level and edge image information are used to manipulate the UCI, which involves analysis of skin colour, facial texture and feature measurements respectively. The Derivative Linear Binary transformation (DLBT) technique is proposed for consistent feature measurement. Prior knowledge of the input image, including its symmetrical properties, the informative region and the consistency of some features, is fully utilized in preserving the UCI feature information. As a result, the similarity and dissimilarity representations for identity parameters or classes are obtained from the selected UCI representation, which involves derivative feature size and distance measurements, facial texture and skin colour. These are mainly used to support the strategy of unfamiliar identity classification in the second block of the research work. Since all faces share a similar structure, the classification technique should increase the similarity within a class while increasing the dissimilarity between classes. Furthermore, a smaller class places less burden on the identification or recognition processes. The collateral classification strategy of identity representation introduced in this thesis exploits the available collateral UCI to classify the identity parameters of regional appearance, gender and age classes. In this regard, the collateral UCIs are registered in such a way as to collect more identity information. As a result, the performance of unfamiliar identity recognition is improved, owing to the special UCI used for class recognition and possibly to the small size of the class. The experiments were done using data from our own database and an open database, comprising three different regional appearances, two different age groups and two different genders, and incorporating pose and illumination image variations.

    AUTOMATIC LIP-READING OF HEARING IMPAIRED PEOPLE

    The inability to use speech interfaces greatly limits the possibilities for human-machine interaction available to deaf and hearing-impaired people. To solve this problem and to increase the accuracy and reliability of the automatic Russian sign language recognition system, it is proposed to use lip-reading in addition to hand gesture recognition. Deaf and hearing-impaired people use sign language as their main means of communication in everyday life. Sign language is a structured form of hand gestures and lip movements involving visual motions and signs, which is used as a communication system. Since sign language includes not only hand gestures but also lip movements that mimic vocalized pronunciation, it is of interest to investigate how accurately such visual speech can be recognized by a lip-reading system, especially considering that the visual speech of hearing-impaired people is often characterized by hyper-articulation, which should potentially facilitate its recognition. For this purpose, the Thesaurus of Russian Sign Language (TheRusLan), collected at SPIIRAS in 2018–19, was used. The database consists of color optical FullHD video recordings of 13 native Russian sign language signers (11 females and 2 males) from the “Pavlovsk boarding school for the hearing impaired”. Each signer demonstrated 164 phrases 5 times each. This work covers the initial stages of this research, including data collection, data labeling, region-of-interest detection and methods for informative feature extraction. The results of this study can later be used to create assistive technologies for deaf or hearing-impaired people.

    Review of constraints on vision-based gesture recognition for human–computer interaction

    The ability of computers to recognise hand gestures visually is essential for progress in human-computer interaction. Gesture recognition has applications ranging from sign language to medical assistance to virtual reality. However, gesture recognition is extremely challenging, not only because of its diverse contexts, multiple interpretations, and spatio-temporal variations but also because of the complex non-rigid properties of the hand. This study surveys the major constraints on vision-based gesture recognition that occur in detection and pre-processing, representation and feature extraction, and recognition. Current challenges are explored in detail.

    Objects extraction and recognition for camera-based interaction : heuristic and statistical approaches

    In this thesis, heuristic and probabilistic methods are applied to a number of problems in camera-based interaction. The goal is to provide solutions for a vision-based system that can extract and analyze objects of interest in camera images and use that information for various interactions in mobile usage. New methods and new combinations of existing methods are developed for different applications, including text extraction from complex scene images, bar code reading performed by camera phones, and face/facial feature detection and facial expression manipulation. The application-driven problems of camera-based interaction cannot be captured by a uniform and straightforward model that makes very strong simplifications of reality. The solutions we found to be efficient were to first apply heuristic but easy-to-implement approaches to reduce the complexity of the problems and search for possible means, and then to use developed statistical learning approaches to deal with the remaining difficult but well-defined problems and obtain much better accuracy. The process can be evolved in some or all of the stages, and the combination of the approaches is problem-dependent. The contribution of this thesis lies in two aspects: firstly, new features and approaches are proposed, either as heuristics or as statistical means, for concrete applications; secondly, engineering design combining several methods for system optimization is studied. Geometrical characteristics and the alignment of text, texture features of bar codes, and structures of faces can all be extracted as heuristics for object extraction and further recognition. The boosting algorithm is one suitable choice for performing probabilistic learning and achieving the desired accuracy. New feature selection techniques are proposed for constructing the weak learner and applying the boosting output in concrete applications. 
Subspace methods such as manifold learning algorithms are introduced and tailored for facial expression analysis and synthesis. A modified generalized learning vector quantization method is proposed to deal with the blurring of bar code images. Efficient implementations that combine the approaches at a rational joint point are presented and the results are illustrated.

    Pattern Recognition

    Pattern recognition is a very wide research field. It involves factors as diverse as sensors, feature extraction, pattern classification, decision fusion and applications. The signals processed are commonly one-, two- or three-dimensional; the processing may be done in real time or take hours or days; some systems look for one narrow object class, while others search huge databases for entries with at least a small amount of similarity. No single person can claim expertise across the whole field, which develops rapidly, updates its paradigms and encompasses several philosophical approaches. This book reflects this diversity by presenting a selection of recent developments within the area of pattern recognition and related fields. It covers theoretical advances in classification and feature extraction as well as application-oriented works. The authors of these 25 works present and advocate recent achievements of their research related to the field of pattern recognition.

    Facial Expression Analysis under Partial Occlusion: A Survey

    Automatic machine-based Facial Expression Analysis (FEA) has made substantial progress in the past few decades, driven by its importance for applications in psychology, security, health, entertainment and human-computer interaction. The vast majority of completed FEA studies are based on non-occluded faces collected in a controlled laboratory environment. Automatic expression recognition tolerant to partial occlusion remains less understood, particularly in real-world scenarios. In recent years, efforts to investigate techniques for handling partial occlusion in FEA have increased. The context is right for a comprehensive perspective on these developments and the state of the art. This survey provides such a comprehensive review of recent advances in dataset creation, algorithm development, and investigations of the effects of occlusion that are critical for robust performance in FEA systems. It outlines existing challenges in overcoming partial occlusion and discusses possible opportunities for advancing the technology. To the best of our knowledge, it is the first FEA survey dedicated to occlusion and aimed at promoting better informed and benchmarked future work. Comment: Authors' pre-print of the article accepted for publication in ACM Computing Surveys (accepted on 02-Nov-2017).

    A new framework for sign language alphabet hand posture recognition using geometrical features through artificial neural network (part 1)

    Hand pose tracking is essential in sign languages. Automatic recognition of performed hand signs facilitates a number of applications, especially those that help people with speech impairment to communicate with others. This framework, called ASLNN, proposes a new hand posture recognition technique for the American sign language alphabet, based on a neural network that works on geometrical features extracted from hands. A user’s hand is captured by a three-dimensional depth-based sensor camera; the hand is then segmented according to depth analysis features. The proposed system is called depth-based geometrical sign language recognition (DGSLR). DGSLR adopts an easier hand segmentation approach, which can also be used in other segmentation applications. The proposed geometrical feature extraction framework improves recognition accuracy because its features are invariant to hand orientation, in comparison with the discrete cosine transform and moment invariants. The results of the iterations demonstrate that combining the extracted features improved accuracy rates. An artificial neural network is then used to derive the desired outcomes. ASLNN is proficient at hand posture recognition and provides accuracy of up to 96.78%, which will be discussed in an additional paper by these authors in this journal.

    Ubiquitous Technologies for Emotion Recognition

    Emotions play a very important role in how we think and behave. As such, the emotions we feel every day can compel us to act and influence the decisions and plans we make about our lives. Being able to measure, analyze, and better comprehend how or why our emotions may change is thus of much relevance to understanding human behavior and its consequences. Despite the great efforts made in the past in the study of human emotions, it is only now, with the advent of wearable, mobile, and ubiquitous technologies, that we can aim to sense and recognize emotions continuously and in real time. This book brings together the latest experiences, findings, and developments regarding ubiquitous sensing, modeling, and the recognition of human emotions.