551 research outputs found

    New Method for Optimization of License Plate Recognition system with Use of Edge Detection and Connected Component

    Full text link
    License Plate recognition plays an important role on the traffic monitoring and parking management systems. In this paper, a fast and real time method has been proposed which has an appropriate application to find tilt and poor quality plates. In the proposed method, at the beginning, the image is converted into binary mode using adaptive threshold. Then, by using some edge detection and morphology operations, plate number location has been specified. Finally, if the plat has tilt, its tilt is removed away. This method has been tested on another paper data set that has different images of the background, considering distance, and angel of view so that the correct extraction rate of plate reached at 98.66%.Comment: 3rd IEEE International Conference on Computer and Knowledge Engineering (ICCKE 2013), October 31 & November 1, 2013, Ferdowsi Universit Mashha

    Computer vision methods for unconstrained gesture recognition in the context of sign language annotation

    Get PDF
    Cette thèse porte sur l'étude des méthodes de vision par ordinateur pour la reconnaissance de gestes naturels dans le contexte de l'annotation de la Langue des Signes. La langue des signes (LS) est une langue gestuelle développée par les sourds pour communiquer. Un énoncé en LS consiste en une séquence de signes réalisés par les mains, accompagnés d'expressions du visage et de mouvements du haut du corps, permettant de transmettre des informations en parallèles dans le discours. Même si les signes sont définis dans des dictionnaires, on trouve une très grande variabilité liée au contexte lors de leur réalisation. De plus, les signes sont souvent séparés par des mouvements de co-articulation. Cette extrême variabilité et l'effet de co-articulation représentent un problème important dans les recherches en traitement automatique de la LS. Il est donc nécessaire d'avoir de nombreuses vidéos annotées en LS, si l'on veut étudier cette langue et utiliser des méthodes d'apprentissage automatique. Les annotations de vidéo en LS sont réalisées manuellement par des linguistes ou experts en LS, ce qui est source d'erreur, non reproductible et extrêmement chronophage. De plus, la qualité des annotations dépend des connaissances en LS de l'annotateur. L'association de l'expertise de l'annotateur aux traitements automatiques facilite cette tâche et représente un gain de temps et de robustesse. Le but de nos recherches est d'étudier des méthodes de traitement d'images afin d'assister l'annotation des corpus vidéo: suivi des composantes corporelles, segmentation des mains, segmentation temporelle, reconnaissance de gloses. Au cours de cette thèse nous avons étudié un ensemble de méthodes permettant de réaliser l'annotation en glose. Dans un premier temps, nous cherchons à détecter les limites de début et fin de signe. Cette méthode d'annotation nécessite plusieurs traitements de bas niveau afin de segmenter les signes et d'extraire les caractéristiques de mouvement et de forme de la main. D'abord nous proposons une méthode de suivi des composantes corporelles robuste aux occultations basée sur le filtrage particulaire. Ensuite, un algorithme de segmentation des mains est développé afin d'extraire la région des mains même quand elles se trouvent devant le visage. Puis, les caractéristiques de mouvement sont utilisées pour réaliser une première segmentation temporelle des signes qui est par la suite améliorée grâce à l'utilisation de caractéristiques de forme. En effet celles-ci permettent de supprimer les limites de segmentation détectées en milieu des signes. Une fois les signes segmentés, on procède à l'extraction de caractéristiques visuelles pour leur reconnaissance en termes de gloses à l'aide de modèles phonologiques. Nous avons évalué nos algorithmes à l'aide de corpus internationaux, afin de montrer leur avantages et limitations. L'évaluation montre la robustesse de nos méthodes par rapport à la dynamique et le grand nombre d'occultations entre les différents membres. L'annotation résultante est indépendante de l'annotateur et représente un gain de robustese important.This PhD thesis concerns the study of computer vision methods for the automatic recognition of unconstrained gestures in the context of sign language annotation. Sign Language (SL) is a visual-gestural language developed by deaf communities. Continuous SL consists on a sequence of signs performed one after another involving manual and non-manual features conveying simultaneous information. Even though standard signs are defined in dictionaries, we find a huge variability caused by the context-dependency of signs. In addition signs are often linked by movement epenthesis which consists on the meaningless gesture between signs. The huge variability and the co-articulation effect represent a challenging problem during automatic SL processing. It is necessary to have numerous annotated video corpus in order to train statistical machine translators and study this language. Generally the annotation of SL video corpus is manually performed by linguists or computer scientists experienced in SL. However manual annotation is error-prone, unreproducible and time consuming. In addition de quality of the results depends on the SL annotators knowledge. Associating annotator knowledge to image processing techniques facilitates the annotation task increasing robustness and speeding up the required time. The goal of this research concerns on the study and development of image processing technique in order to assist the annotation of SL video corpus: body tracking, hand segmentation, temporal segmentation, gloss recognition. Along this PhD thesis we address the problem of gloss annotation of SL video corpus. First of all we intend to detect the limits corresponding to the beginning and end of a sign. This annotation method requires several low level approaches for performing temporal segmentation and for extracting motion and hand shape features. First we propose a particle filter based approach for robustly tracking hand and face robust to occlusions. Then a segmentation method for extracting hand when it is in front of the face has been developed. Motion is used for segmenting signs and later hand shape is used to improve the results. Indeed hand shape allows to delete limits detected in the middle of a sign. Once signs have been segmented we proceed to the gloss recognition using lexical description of signs. We have evaluated our algorithms using international corpus, in order to show their advantages and limitations. The evaluation has shown the robustness of the proposed methods with respect to high dynamics and numerous occlusions between body parts. Resulting annotation is independent on the annotator and represents a gain on annotation consistency

    Computational Models for the Automatic Learning and Recognition of Irish Sign Language

    Get PDF
    This thesis presents a framework for the automatic recognition of Sign Language sentences. In previous sign language recognition works, the issues of; user independent recognition, movement epenthesis modeling and automatic or weakly supervised training have not been fully addressed in a single recognition framework. This work presents three main contributions in order to address these issues. The first contribution is a technique for user independent hand posture recognition. We present a novel eigenspace Size Function feature which is implemented to perform user independent recognition of sign language hand postures. The second contribution is a framework for the classification and spotting of spatiotemporal gestures which appear in sign language. We propose a Gesture Threshold Hidden Markov Model (GT-HMM) to classify gestures and to identify movement epenthesis without the need for explicit epenthesis training. The third contribution is a framework to train the hand posture and spatiotemporal models using only the weak supervision of sign language videos and their corresponding text translations. This is achieved through our proposed Multiple Instance Learning Density Matrix algorithm which automatically extracts isolated signs from full sentences using the weak and noisy supervision of text translations. The automatically extracted isolated samples are then utilised to train our spatiotemporal gesture and hand posture classifiers. The work we present in this thesis is an important and significant contribution to the area of natural sign language recognition as we propose a robust framework for training a recognition system without the need for manual labeling

    Hand gesture recognition using Kinect.

    Get PDF
    Hand gesture recognition (HGR) is an important research topic because some situations require silent communication with sign languages. Computational HGR systems assist silent communication, and help people learn a sign language. In this thesis. a novel method for contact-less HGR using Microsoft Kinect for Xbox is described, and a real-time HCR system is implemented with Microsoft Visual Studio 2010. Two different scenarios for HGR are provided: the Popular Gesture with nine gestures, and the Numbers with nine gestures. The system allows the users to select a scenario, and it is able to detect hand gestures made by users. to identify fingers, and to recognize the meanings of gestures, and to display the meanings and pictures on screen. The accuracy of the HGR system is from 84% to 99% with single hand gestures, and from 90% to 100% if both hands perform the same gesture at the same time. Because the depth sensor of Kinect is an infrared camera, the lighting conditions. signers\u27 skin colors and clothing, and background have little impact on the performance of this system. The accuracy and the robustness make this system a versatile component that can be integrated in a variety of applications in daily life

    Towards Sign language recognition system in Human-Computer interactions

    Get PDF
    This paper presents a modeling approach for Sign language recognition which is introduced through a dialogue between deaf people and signing avatar. By this modeling approach, the system can be configurable as the scenario and the vocabulary can be modified or changed. Tow important elements are included to these modeling: the context and the prediction. The objective is to improve the reliability of sign language recognition system compared to the classic systems which generally do not use semantic concept

    Automatic Sign Language Recognition from Image Data

    Get PDF
    Tato práce se zabývá problematikou automatického rozpoznávání znakového jazyka z obrazových dat. Práce představuje pět hlavních přínosů v oblasti tvorby systému pro rozpoznávání, tvorby korpusů, extrakci příznaků z rukou a obličeje s využitím metod pro sledování pozice a pohybu rukou (tracking) a modelování znaků s využitím menších fonetických jednotek (sub-units). Metody využité v rozpoznávacím systému byly využity i k tvorbě vyhledávacího nástroje "search by example", který dokáže vyhledávat ve videozáznamech podle obrázku ruky. Navržený systém pro automatické rozpoznávání znakového jazyka je založen na statistickém přístupu s využitím skrytých Markovových modelů, obsahuje moduly pro analýzu video dat, modelování znaků a dekódování. Systém je schopen rozpoznávat jak izolované, tak spojité promluvy. Veškeré experimenty a vyhodnocení byly provedeny s vlastními korpusy UWB-06-SLR-A a UWB-07-SLR-P, první z nich obsahuje 25 znaků, druhý 378. Základní extrakce příznaků z video dat byla provedena na nízkoúrovňových popisech obrazu. Lepších výsledků bylo dosaženo s příznaky získaných z popisů vyšší úrovně porozumění obsahu v obraze, které využívají sledování pozice rukou a metodu pro segmentaci rukou v době překryvu s obličejem. Navíc, využitá metoda dokáže interpolovat obrazy s obličejem v době překryvu a umožňuje tak využít metody pro extrakci příznaků z obličeje, které by během překryvu nefungovaly, jako např. metoda active appearance models (AAM). Bylo porovnáno několik různých metod pro extrakci příznaků z rukou, jako např. local binary patterns (LBP), histogram of oriented gradients (HOG), vysokoúrovnové lingvistické příznaky a nové navržená metoda hand shape radial distance function (hRDF). Bylo také zkoumáno využití menších fonetických jednotek, než jsou celé znaky, tzv. sub-units. Pro první krok tvorby těchto jednotek byl navržen iterativní algoritmus, který tyto jednotky automaticky vytváří analýzou existujících dat. Bylo ukázáno, že tento koncept je vhodný pro modelování a rozpoznávání znaků. Kromě systému pro rozpoznávání je v práci navržen a představen systém "search by example", který funguje jako vyhledávací systém pro videa se záznamy znakového jazyka a může být využit například v online slovnících znakového jazyka, kde je v současné době složité či nemožné v takovýchto datech vyhledávat. Tento nástroj využívá metody, které byly použity v rozpoznávacím systému. Výstupem tohoto vyhledávacího nástroje je seřazený seznam videí, které obsahují stejný nebo podobný tvar ruky, které zadal uživatel, např. přes webkameru.Katedra kybernetikyObhájenoThis thesis addresses several issues of automatic sign language recognition, namely the creation of vision based sign language recognition framework, sign language corpora creation, feature extraction, making use of novel hand tracking with face occlusion handling, data-driven creation of sub-units and "search by example" tool for searching in sign language corpora using hand images as a search query. The proposed sign language recognition framework, based on statistical approach incorporating hidden Markov models (HMM), consists of video analysis, sign modeling and decoding modules. The framework is able to recognize both isolated signs and continuous utterances from video data. All experiments and evaluations were performed on two own corpora, UWB-06-SLR-A and UWB-07-SLR-P, the first containing 25 signs and second 378. As a baseline feature descriptors, low level image features are used. It is shown that better performance is gained by higher level features that employ hand tracking, which resolve occlusions of hands and face. As a side effect, the occlusion handling method interpolates face area in the frames during the occlusion and allows to use face feature descriptors that fail in such a case, for instance features extracted from active appearance models (AAM) tracker. Several state-of-the-art appearance-based feature descriptors were compared for tracked hands, such as local binary patterns (LBP), histogram of oriented gradients (HOG), high-level linguistic features or newly proposed hand shape radial distance function (denoted as hRDF) that enhances the feature description of hand-shape like concave regions. The concept of sub-units, that uses HMM models based on linguistic units smaller than whole sign and covers inner structures of the signs, was investigated in the proposed iterative method that is a first required step for data-driven construction of sub-units, and shows that such a concept is suitable for sign modeling and recognition tasks. Except of experiments in the sign language recognition, additional tool \textit{search by example} was created and evaluated. This tool is a search engine for sign language videos. Such a system can be incorporated into an online sign language dictionary where it is difficult to search in the sign language data. This proposed tool employs several methods which were examined in the sign language recognition task and allows to search in the video corpora based on an user-given query that consists of one or multiple images of hands. As a result, an ordered list of videos that contain the same or similar hand configurations is returned

    Pictures in Your Mind: Using Interactive Gesture-Controlled Reliefs to Explore Art

    Get PDF
    Tactile reliefs offer many benefits over the more classic raised line drawings or tactile diagrams, as depth, 3D shape, and surface textures are directly perceivable. Although often created for blind and visually impaired (BVI) people, a wider range of people may benefit from such multimodal material. However, some reliefs are still difficult to understand without proper guidance or accompanying verbal descriptions, hindering autonomous exploration. In this work, we present a gesture-controlled interactive audio guide (IAG) based on recent low-cost depth cameras that can be operated directly with the hands on relief surfaces during tactile exploration. The interactively explorable, location-dependent verbal and captioned descriptions promise rapid tactile accessibility to 2.5D spatial information in a home or education setting, to online resources, or as a kiosk installation at public places. We present a working prototype, discuss design decisions, and present the results of two evaluation studies: the first with 13 BVI test users and the second follow-up study with 14 test users across a wide range of people with differences and difficulties associated with perception, memory, cognition, and communication. The participant-led research method of this latter study prompted new, significant and innovative developments
    corecore