3D Hand gesture recognition using a ZCam and an SVM-SMO classifier
The increasing number of new and complex computer-based applications has generated a need for a more natural interface between human users and these applications. Hand gestures, one of the most natural means of communication between human beings, can provide such an interface. The difficulty of deploying a computer vision-based gesture application in an uncontrolled environment can be addressed with new hardware that captures 3D information. However, researchers and practitioners still lack complete solutions for reliable gesture recognition in such environments.
This paper presents a complete solution for the one-hand 3D gesture recognition problem, implements a solution, and proves its reliability. The solution is complete because it focuses both on the 3D gesture recognition and on understanding the scene being presented (so the user does not need to inform the system that he or she is about to initiate a new gesture). The selected approach models the gestures as a sequence of hand poses. This reduces the problem to one of recognizing the series of hand poses and building the gestures from this information. Additionally, the need to perform the gesture recognition in real time resulted in using a simple feature set that makes the required processing as streamlined as possible.
Finally, the hand gesture recognition system proposed here was successfully implemented in two applications, one developed by a completely independent team and one developed as part of this research. The latter effort resulted in a device driver that adds 3D gestures to Sparsh-UI, an open-source, platform-independent multi-touch framework.
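The pipeline above reduces gesture recognition to per-frame hand-pose classification. A minimal sketch of that idea, not the paper's actual code, follows: the 6-dimensional depth-derived features and the two pose classes are invented for illustration, and scikit-learn's SVC is used because its underlying libsvm solver is an SMO-type algorithm, in the spirit of the SVM-SMO classifier named in the title.

```python
# Hedged sketch: per-frame hand-pose classification with an SVM.
# The feature layout and pose classes are invented, not the paper's.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Two synthetic hand-pose classes (e.g. "open palm" vs "fist"),
# each a 6-dimensional feature vector per frame.
open_palm = rng.normal(loc=0.0, scale=0.3, size=(50, 6))
fist = rng.normal(loc=2.0, scale=0.3, size=(50, 6))

X = np.vstack([open_palm, fist])
y = np.array([0] * 50 + [1] * 50)  # 0 = open palm, 1 = fist

# SVC is trained with an SMO-type solver (libsvm).
clf = SVC(kernel="rbf", C=1.0).fit(X, y)

# A gesture is then modelled as the sequence of per-frame pose labels.
frames = np.vstack([rng.normal(0.0, 0.3, 6), rng.normal(2.0, 0.3, 6)])
pose_sequence = clf.predict(frames)
print(pose_sequence)  # → [0 1]
```

The gesture layer would consume `pose_sequence` rather than raw frames, which is what keeps the per-frame processing streamlined enough for real time.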
Computational Models for the Automatic Learning and Recognition of Irish Sign Language
This thesis presents a framework for the automatic recognition of sign language sentences. Previous sign language recognition work has not fully addressed user-independent recognition, movement epenthesis modeling, and automatic or weakly supervised training within a single recognition framework. This work presents three main contributions to address these issues.
The first contribution is a technique for user-independent hand posture recognition. We present a novel eigenspace Size Function feature, which we implement to perform user-independent recognition of sign language hand postures.
The second contribution is a framework for the classification and spotting of spatiotemporal gestures that appear in sign language. We propose a Gesture Threshold Hidden Markov Model (GT-HMM) to classify gestures and to identify movement epenthesis without the need for explicit epenthesis training.
The third contribution is a framework to train the hand posture and spatiotemporal models using only the weak supervision of sign language videos and their corresponding text translations. This is achieved through our proposed Multiple Instance Learning Density Matrix algorithm, which automatically extracts isolated signs from full sentences using the weak and noisy supervision of text translations. The automatically extracted isolated samples are then used to train our spatiotemporal gesture and hand posture classifiers.
The work we present in this thesis is a significant contribution to the area of natural sign language recognition, as we propose a robust framework for training a recognition system without the need for manual labeling.
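The GT-HMM idea, accepting a candidate segment as a gesture only when a dedicated gesture model outscores a catch-all threshold model and otherwise treating it as movement epenthesis, can be sketched with toy discrete HMMs. Everything below (states, symbols, probabilities) is invented for illustration; it is not the thesis implementation.

```python
# Toy illustration of threshold-model gesture spotting: a sequence is
# accepted as a gesture only if the gesture HMM scores higher than a
# generic "threshold" HMM; otherwise it is movement epenthesis.
import numpy as np

def forward_log_likelihood(pi, A, B, obs):
    """Scaled forward algorithm: log P(obs | HMM) for discrete symbols."""
    alpha = pi * B[:, obs[0]]
    log_like = np.log(alpha.sum())
    alpha = alpha / alpha.sum()
    for t in obs[1:]:
        alpha = (alpha @ A) * B[:, t]
        s = alpha.sum()
        log_like += np.log(s)
        alpha = alpha / s
    return log_like

# A left-to-right two-state gesture model emitting symbol 0 then 1.
gesture = dict(
    pi=np.array([1.0, 0.0]),
    A=np.array([[0.7, 0.3],
                [0.0, 1.0]]),
    B=np.array([[0.90, 0.05, 0.05],
                [0.05, 0.90, 0.05]]),
)
# An ergodic threshold model with uniform emissions.
threshold = dict(
    pi=np.array([0.5, 0.5]),
    A=np.array([[0.5, 0.5],
                [0.5, 0.5]]),
    B=np.full((2, 3), 1.0 / 3.0),
)

def is_gesture(obs):
    score = lambda m: forward_log_likelihood(m["pi"], m["A"], m["B"], obs)
    return bool(score(gesture) > score(threshold))

print(is_gesture([0, 0, 1, 1]))  # → True  (matches the gesture model)
print(is_gesture([2, 2, 2, 2]))  # → False (movement epenthesis)
```

In the thesis the threshold model is built from the states of the trained gesture models, which is what removes the need for explicit epenthesis training; the uniform model above is a stand-in for that construction.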
Human-Centric Machine Vision
Algorithms for processing visual information have evolved greatly in recent years, providing efficient and effective solutions that cope with the variability and complexity of real-world environments. These achievements have led to the development of Machine Vision systems that go beyond typical industrial applications, where environments are controlled and tasks are very specific, towards innovative solutions that address the everyday needs of people. Human-Centric Machine Vision can help solve problems raised by the needs of our society, e.g. security and safety, health care, medical imaging, and human-machine interfaces. Such applications must handle changing, unpredictable, and complex situations, and account for the presence of humans.
Identification of Pecan Weevils Through Image Processing
The pecan weevil attacks the pecan nut, causing significant financial loss and, in some cases, total crop failure. A traditional way of controlling this insect is by setting traps in the pecan orchard and regularly checking them for weevils. The objective of this study is to develop a recognition system that can serve in a wireless imaging network for monitoring pecan weevils. Recognition methods used in this study are based on template matching. The training set consisted of 205 pecan weevils, and the testing set included 30 randomly selected pecan weevils and 75 other insects that typically exist in a pecan habitat. Five recognition methods, namely Zernike moments, region properties, normalized cross-correlation, string matching, and Fourier descriptors, were used in this recognition system. It was found that no single method was sufficiently robust to yield the desired recognition rate, especially across varying data sets. It was also found that region-based shape representation methods were better suited to this application.
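Of the five methods compared, normalized cross-correlation (NCC) is the most direct form of template matching: the template is slid over the image and the position with the highest correlation coefficient is kept. A minimal sketch on synthetic data (not the study's insect images) illustrates the idea:

```python
# NCC template matching on synthetic data: find where a known
# template (standing in for a weevil image) appears in a larger image.
import numpy as np

def ncc(patch, template):
    """Normalized cross-correlation coefficient of two equal-size arrays."""
    p = patch - patch.mean()
    t = template - template.mean()
    denom = np.sqrt((p * p).sum() * (t * t).sum())
    return 0.0 if denom == 0 else float((p * t).sum() / denom)

def match_template(image, template):
    """Exhaustive sliding-window search; returns best position and score."""
    th, tw = template.shape
    best_score, best_pos = -1.0, (0, 0)
    for i in range(image.shape[0] - th + 1):
        for j in range(image.shape[1] - tw + 1):
            score = ncc(image[i:i + th, j:j + tw], template)
            if score > best_score:
                best_score, best_pos = score, (i, j)
    return best_pos, best_score

rng = np.random.default_rng(1)
template = rng.random((5, 5))
image = rng.random((20, 20)) * 0.1
image[7:12, 3:8] = template  # embed the template at row 7, column 3

pos, score = match_template(image, template)
print(pos, round(score, 3))  # → (7, 3) 1.0
```

Because NCC subtracts the mean and normalizes by the energy of both windows, it is insensitive to uniform brightness and contrast changes, which matters for field-deployed trap cameras.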
Recording behaviour of indoor-housed farm animals automatically using machine vision technology: a systematic review
Large-scale phenotyping of animal behaviour traits is time consuming and has led to increased demand for technologies that can automate these procedures. Automated tracking of animals has been successful in controlled laboratory settings, but recording from animals in large groups in highly variable farm settings presents challenges. The aim of this review is to provide a systematic overview of the advances that have occurred in automated, high-throughput image detection of farm animal behavioural traits with welfare and production implications. Peer-reviewed publications written in English were reviewed systematically following Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. After identification, screening, and assessment for eligibility, 108 publications met these specifications and were included for qualitative synthesis. Data collected from the papers included camera specifications, housing conditions, group size, algorithm details, procedures, and results. Most studies utilized standard digital colour video cameras for data collection, with increasing use of 3D cameras in papers published after 2013. Papers including pigs (across production stages) were the most common (n = 63). The most common behaviours recorded included activity level, area occupancy, aggression, gait scores, resource use, and posture. Our review revealed many overlaps in methods applied to analysing behaviour, and most studies started from scratch instead of building upon previous work. Training and validation sample sizes were generally small (mean±s.d. groups = 3.8±5.8), and data collection and testing took place in relatively controlled environments. To advance our ability to automatically phenotype behaviour, future research should build upon existing knowledge and validate technology under commercial settings, and publications should explicitly describe recording conditions in detail to allow studies to be reproduced.
Action Recognition in Videos: from Motion Capture Labs to the Web
This paper presents a survey of human action recognition approaches based on visual data recorded from a single video camera. We propose an organizing framework that highlights the evolution of the area, with techniques moving from heavily constrained motion capture scenarios towards more challenging, realistic, "in the wild" videos. The proposed organization is based on the representation used as input for the recognition task, emphasizing the hypotheses assumed and thus the constraints imposed on the type of video that each technique is able to address. Making these hypotheses and constraints explicit makes the framework particularly useful for selecting a method for a given application. Another advantage of the proposed organization is that it allows the newest approaches to be categorized seamlessly alongside traditional ones, while providing an insightful perspective on the evolution of the action recognition task to date. That perspective is the basis for the discussion at the end of the paper, where we also present the main open issues in the area.
Comment: Preprint submitted to CVIU; survey paper, 46 pages, 2 figures, 4 tables
Rethinking Pen Input Interaction: Enabling Freehand Sketching Through Improved Primitive Recognition
Online sketch recognition uses machine learning and artificial intelligence techniques to interpret markings made by users via an electronic stylus or pen. The goal of sketch recognition is to understand the intention and meaning of a particular user's drawing. Diagramming applications have been the primary beneficiaries of sketch recognition technology, as it is commonplace for users of these tools to first create a rough sketch of a diagram on paper before translating it into a machine-understandable model, using computer-aided design tools, which can then be used to perform simulations or other meaningful tasks.
Traditional methods for performing sketch recognition can be broken down into three distinct categories: appearance-based, gesture-based, and geometric-based. Although each approach has its advantages and disadvantages, geometric-based methods have proven to be the most generalizable for multi-domain recognition. Tools such as the LADDER symbol description language have been shown to be capable of recognizing sketches from over 30 different domains using generalizable, geometric techniques. The LADDER system is limited, however, by the fact that it uses a low-level recognizer that supports only a few primitive shapes, the building blocks for describing higher-level symbols. Systems that support a larger number of primitive shapes have been shown to have questionable accuracy as the number of primitives increases, or they place constraints on how users must input shapes (e.g. circles can only be drawn in a clockwise motion; rectangles must be drawn starting at the top-left corner).
This dissertation significantly advances the possibilities of free-sketch recognition systems, those which place little to no drawing constraints on users. We describe multiple techniques to recognize upwards of 18 primitive shapes while maintaining high accuracy. We also provide methods for producing confidence values and generating multiple interpretations, and explore the difficulties of recognizing multi-stroke primitives. In addition, we show the need for a standardized data repository for sketch recognition algorithm testing and propose SOUSA (sketch-based online user study application), our online system for performing and sharing user study sketch data. Finally, we show how the principles we have learned through our work extend to other domains, including activity recognition using trained hand posture cues.
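Geometric primitive recognition of this kind typically fits each candidate shape to the stroke's points and picks the shape with the smallest residual, which is what frees the user from drawing-direction constraints. The sketch below is a hedged illustration, not the dissertation's recognizer, and covers only two of its many primitives: a total-least-squares line fit versus an algebraic (Kasa) circle fit.

```python
# Classify a stroke as "line" or "circle" by comparing fit residuals.
# Only two primitives are shown; a real free-sketch recognizer tests many.
import numpy as np

def line_error(pts):
    """RMS perpendicular distance to the total-least-squares line."""
    centered = pts - pts.mean(axis=0)
    _, s, _ = np.linalg.svd(centered, full_matrices=False)
    return s[-1] / np.sqrt(len(pts))

def circle_error(pts):
    """RMS radial deviation from the algebraic (Kasa) best-fit circle,
    i.e. the least-squares solution of x^2 + y^2 + Dx + Ey + F = 0."""
    x, y = pts[:, 0], pts[:, 1]
    A = np.column_stack([x, y, np.ones_like(x)])
    b = -(x**2 + y**2)
    (D, E, F), *_ = np.linalg.lstsq(A, b, rcond=None)
    cx, cy = -D / 2.0, -E / 2.0
    r = np.sqrt(max(cx**2 + cy**2 - F, 0.0))  # guard degenerate fits
    return np.sqrt(np.mean((np.hypot(x - cx, y - cy) - r) ** 2))

def classify_stroke(pts):
    return "line" if line_error(pts) < circle_error(pts) else "circle"

t = np.linspace(0, 2 * np.pi, 50)
circle_pts = np.column_stack([np.cos(t), np.sin(t)])
line_pts = np.column_stack([t, 2 * t + 1])

print(classify_stroke(line_pts))    # → line
print(classify_stroke(circle_pts))  # → circle
```

Neither fit depends on stroke direction or starting point, so circles drawn counter-clockwise or rectangles started from any corner can be handled the same way once more primitives are added.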