63 research outputs found

    3D Hand gesture recognition using a ZCam and an SVM-SMO classifier

    Get PDF
    The increasing number of new and complex computer-based applications has generated a need for a more natural interface between human users and computer-based applications. This problem can be solved by using hand gestures, one of the most natural means of communication between human beings. The difficulty in deploying a computer vision-based gesture application in a non-controlled environment can be solved by using new hardware which can capture 3D information. However, researchers and others still need complete solutions to perform reliable gesture recognition in such an environment. This paper presents a complete solution for the one-hand 3D gesture recognition problem, implements a solution, and proves its reliability. The solution is complete because it focuses both on the 3D gesture recognition and on understanding the scene being presented (so the user does not need to inform the system that he or she is about to initiate a new gesture). The selected approach models the gestures as a sequence of hand poses. This reduces the problem to one of recognizing the series of hand poses and building the gestures from this information. Additionally, the need to perform the gesture recognition in real time resulted in using a simple feature set that makes the required processing as streamlined as possible. Finally, the hand gesture recognition system proposed here was successfully implemented in two applications, one developed by a completely independent team and one developed as part of this research. The latter effort resulted in a device driver that adds 3D gestures to an open-source, platform-independent multi-touch framework called Sparsh-U

    Computational Models for the Automatic Learning and Recognition of Irish Sign Language

    Get PDF
    This thesis presents a framework for the automatic recognition of Sign Language sentences. In previous sign language recognition works, the issues of; user independent recognition, movement epenthesis modeling and automatic or weakly supervised training have not been fully addressed in a single recognition framework. This work presents three main contributions in order to address these issues. The first contribution is a technique for user independent hand posture recognition. We present a novel eigenspace Size Function feature which is implemented to perform user independent recognition of sign language hand postures. The second contribution is a framework for the classification and spotting of spatiotemporal gestures which appear in sign language. We propose a Gesture Threshold Hidden Markov Model (GT-HMM) to classify gestures and to identify movement epenthesis without the need for explicit epenthesis training. The third contribution is a framework to train the hand posture and spatiotemporal models using only the weak supervision of sign language videos and their corresponding text translations. This is achieved through our proposed Multiple Instance Learning Density Matrix algorithm which automatically extracts isolated signs from full sentences using the weak and noisy supervision of text translations. The automatically extracted isolated samples are then utilised to train our spatiotemporal gesture and hand posture classifiers. The work we present in this thesis is an important and significant contribution to the area of natural sign language recognition as we propose a robust framework for training a recognition system without the need for manual labeling

    Human-Centric Machine Vision

    Get PDF
    Recently, the algorithms for the processing of the visual information have greatly evolved, providing efficient and effective solutions to cope with the variability and the complexity of real-world environments. These achievements yield to the development of Machine Vision systems that overcome the typical industrial applications, where the environments are controlled and the tasks are very specific, towards the use of innovative solutions to face with everyday needs of people. The Human-Centric Machine Vision can help to solve the problems raised by the needs of our society, e.g. security and safety, health care, medical imaging, and human machine interface. In such applications it is necessary to handle changing, unpredictable and complex situations, and to take care of the presence of humans

    Identification of Pecan Weevils Through Image Processing

    Get PDF
    The Pecan Weevil attacks the pecan nut, causes significant financial loss and can cause total crop failure. A traditional way of controlling this insect is by setting traps in the pecan orchard and regularly checking them for weevils. The objective of this study is to develop a recognition system that can serve in a wireless imaging network for monitoring pecan weevils. Recognition methods used in this study are based on template matching. The training set consisted of 205 pecan weevils and the testing set included 30 randomly selected pecan weevils and 75 other insects which typically exist in a pecan habitat. Five recognition methods, namely, Zernike moments, Region properties, Normalized cross-correlation, String matching, and Fourier descriptors methods were used in this recognition system. It was found that no single method was sufficiently robust to yield the desired recognition rate, especially in varying data sets. It was also found that region-based shape representation methods were better suited inBiosystems and Agricultural Engineerin

    Recording behaviour of indoor-housed farm animals automatically using machine vision technology: a systematic review

    Get PDF
    Large-scale phenotyping of animal behaviour traits is time consuming and has led to increased demand for technologies that can automate these procedures. Automated tracking of animals has been successful in controlled laboratory settings, but recording from animals in large groups in highly variable farm settings presents challenges. The aim of this review is to provide a systematic overview of the advances that have occurred in automated, high throughput image detection of farm animal behavioural traits with welfare and production implications. Peer-reviewed publications written in English were reviewed systematically following Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. After identification, screening, and assessment for eligibility, 108 publications met these specifications and were included for qualitative synthesis. Data collected from the papers included camera specifications, housing conditions, group size, algorithm details, procedures, and results. Most studies utilized standard digital colour video cameras for data collection, with increasing use of 3D cameras in papers published after 2013. Papers including pigs (across production stages) were the most common (n = 63). The most common behaviours recorded included activity level, area occupancy, aggression, gait scores, resource use, and posture. Our review revealed many overlaps in methods applied to analysing behaviour, and most studies started from scratch instead of building upon previous work. Training and validation sample sizes were generally small (mean±s.d. groups = 3.8±5.8) and in data collection and testing took place in relatively controlled environments. To advance our ability to automatically phenotype behaviour, future research should build upon existing knowledge and validate technology under commercial settings and publications should explicitly describe recording conditions in detail to allow studies to be reproduced

    Action Recognition in Videos: from Motion Capture Labs to the Web

    Full text link
    This paper presents a survey of human action recognition approaches based on visual data recorded from a single video camera. We propose an organizing framework which puts in evidence the evolution of the area, with techniques moving from heavily constrained motion capture scenarios towards more challenging, realistic, "in the wild" videos. The proposed organization is based on the representation used as input for the recognition task, emphasizing the hypothesis assumed and thus, the constraints imposed on the type of video that each technique is able to address. Expliciting the hypothesis and constraints makes the framework particularly useful to select a method, given an application. Another advantage of the proposed organization is that it allows categorizing newest approaches seamlessly with traditional ones, while providing an insightful perspective of the evolution of the action recognition task up to now. That perspective is the basis for the discussion in the end of the paper, where we also present the main open issues in the area.Comment: Preprint submitted to CVIU, survey paper, 46 pages, 2 figures, 4 table

    Rethinking Pen Input Interaction: Enabling Freehand Sketching Through Improved Primitive Recognition

    Get PDF
    Online sketch recognition uses machine learning and artificial intelligence techniques to interpret markings made by users via an electronic stylus or pen. The goal of sketch recognition is to understand the intention and meaning of a particular user's drawing. Diagramming applications have been the primary beneficiaries of sketch recognition technology, as it is commonplace for the users of these tools to rst create a rough sketch of a diagram on paper before translating it into a machine understandable model, using computer-aided design tools, which can then be used to perform simulations or other meaningful tasks. Traditional methods for performing sketch recognition can be broken down into three distinct categories: appearance-based, gesture-based, and geometric-based. Although each approach has its advantages and disadvantages, geometric-based methods have proven to be the most generalizable for multi-domain recognition. Tools, such as the LADDER symbol description language, have shown to be capable of recognizing sketches from over 30 different domains using generalizable, geometric techniques. The LADDER system is limited, however, in the fact that it uses a low-level recognizer that supports only a few primitive shapes, the building blocks for describing higher-level symbols. Systems which support a larger number of primitive shapes have been shown to have questionable accuracies as the number of primitives increase, or they place constraints on how users must input shapes (e.g. circles can only be drawn in a clockwise motion; rectangles must be drawn starting at the top-left corner). This dissertation allows for a significant growth in the possibility of free-sketch recognition systems, those which place little to no drawing constraints on users. In this dissertation, we describe multiple techniques to recognize upwards of 18 primitive shapes while maintaining high accuracy. We also provide methods for producing confidence values and generating multiple interpretations, and explore the difficulties of recognizing multi-stroke primitives. In addition, we show the need for a standardized data repository for sketch recognition algorithm testing and propose SOUSA (sketch-based online user study application), our online system for performing and sharing user study sketch data. Finally, we will show how the principles we have learned through our work extend to other domains, including activity recognition using trained hand posture cues
    • 

    corecore