A CNN Based Framework for Unistroke Numeral Recognition in Air-Writing
Air-writing refers to virtually writing linguistic characters through hand
gestures in three-dimensional space with six degrees of freedom. This paper
proposes a generic video camera-aided convolutional neural network (CNN) based
air-writing framework. Gestures are performed using a marker of fixed color in
front of a generic video camera, followed by color-based segmentation to
identify the marker and track the trajectory of the marker tip. A pre-trained
CNN is then used to classify the gesture. The recognition accuracy is further
improved using transfer learning with the newly acquired data. The performance
of the system varies significantly on the illumination condition due to
color-based segmentation. In a less fluctuating illumination condition, the
system is able to recognize isolated unistroke numerals of multiple languages.
The proposed framework has achieved 97.7%, 95.4% and 93.7% recognition rates in
person independent evaluations on English, Bengali and Devanagari numerals,
respectively.
Comment: Accepted in The International Conference on Frontiers of Handwriting Recognition (ICFHR) 201
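The color-based segmentation and tip-tracking step described above can be sketched as follows. This is a minimal NumPy illustration, not the paper's actual pipeline: the HSV ranges, frame contents, and function names are invented for the example, and a real system would use a camera feed and a full vision library.

```python
import numpy as np

def segment_marker(frame_hsv, lower, upper):
    """Binary mask of pixels whose HSV values fall inside [lower, upper]."""
    return np.all((frame_hsv >= lower) & (frame_hsv <= upper), axis=-1)

def marker_centroid(mask):
    """Centroid (row, col) of the segmented marker, or None if absent."""
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        return None
    return ys.mean(), xs.mean()

# Synthetic 8x8 "frame": a red-ish marker blob centred near (2, 5).
frame = np.zeros((8, 8, 3))
frame[1:4, 4:7] = [5, 200, 200]          # hue ~5, saturated, bright
lower = np.array([0, 100, 100])
upper = np.array([10, 255, 255])

mask = segment_marker(frame, lower, upper)
print(marker_centroid(mask))             # appended to the trajectory each frame
```

Collecting the centroid frame by frame yields the marker-tip trajectory that is then rasterized and passed to the CNN classifier.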
Collaborative robot control with hand gestures
Dual-degree master's programme with the Université Libre de Tunis.
This thesis focuses on hand gesture recognition, proposing an architecture to control a collaborative robot in real time using vision-based hand detection, tracking, and gesture recognition, enabling interaction with an application via hand gestures. The first stage of our system detects and tracks a bare hand against a cluttered background using skin detection and contour comparison. The second stage recognizes hand gestures using a machine learning algorithm. Finally, an interface has been developed to control the robot.
Our hand gesture recognition system consists of two parts. In the first part, for every frame captured from a camera we extract the keypoints from every training image using a machine learning algorithm and assemble the keypoints from every image into a keypoint map. This map is treated as input for our processing algorithm, which uses several methods to recognize the fingers of each hand.
In the second part, we use a 3D camera with infrared capabilities to obtain a 3D model of the hand for use in our system. We then track and recognize the fingers of each hand, which makes it possible to count the extended fingers and to distinguish each finger pattern.
An interface to control the robot has been built on the previous steps, providing real-time processing and a dynamic 3D representation.
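The abstract does not detail how the extended fingers are counted. One common geometric approach with a 3D camera is to threshold each fingertip's distance from the palm centre; the joint positions, threshold, and function name below are illustrative assumptions, not the thesis's implementation:

```python
import numpy as np

def count_extended_fingers(palm, tips, threshold=0.09):
    """Count fingertips farther than `threshold` metres from the palm centre."""
    palm = np.asarray(palm, dtype=float)
    return sum(np.linalg.norm(np.asarray(t, dtype=float) - palm) > threshold
               for t in tips)

palm = [0.0, 0.0, 0.0]
tips = [
    [0.10, 0.02, 0.00],   # extended index  (distance ~0.102 m)
    [0.11, 0.00, 0.01],   # extended middle (distance ~0.110 m)
    [0.04, 0.01, 0.00],   # curled ring
    [0.03, 0.00, 0.01],   # curled little
    [0.05, 0.05, 0.00],   # curled thumb    (distance ~0.071 m)
]
print(count_extended_fingers(palm, tips))  # 2
```

A production system would normalize the threshold by hand size rather than hard-coding a metric distance.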
To Draw or Not to Draw: Recognizing Stroke-Hover Intent in Gesture-Free Bare-Hand Mid-Air Drawing Tasks
Over the past several decades, technological advancements have introduced new modes of communication
with computers, marking a shift away from traditional mouse-and-keyboard interfaces.
While touch-based interactions are in widespread use today, recent developments in computer
vision, body-tracking stereo cameras, and augmented and virtual reality now enable communicating
with computers through spatial input in physical 3D space. These techniques are
being integrated into design-critical tasks such as sketching and modeling through sophisticated
methodologies and the use of specialized instrumented devices. One of the prime challenges in
design research is to make this spatial interaction with the computer as intuitive as possible for the
users.
Drawing curves in mid-air with fingers is a fundamental task with applications to 3D sketching,
geometric modeling, handwriting recognition, and authentication. Sketching, in general, is a
crucial mode for effective idea communication between designers. Mid-air curve input is typically
accomplished through instrumented controllers, specific hand postures, or pre-defined hand gestures,
in the presence of depth and motion sensing cameras. The user may use any of these modalities
to express the intention to start or stop sketching. However, apart from suffering from issues such as
lack of robustness, the use of such gestures, specific postures, or instrumented
controllers for design-specific tasks places an additional cognitive load on the user.
To address the problems associated with different mid-air curve input modalities, the presented
research discusses the design, development, and evaluation of data driven models for intent recognition
in non-instrumented, gesture-free, bare-hand mid-air drawing tasks.
The research is motivated by a behavioral study that demonstrates the need for such an approach
due to the lack of robustness and intuitiveness while using hand postures and instrumented
devices. The main objective is to study how users move during mid-air sketching, develop qualitative
insights regarding such movements, and consequently implement a computational approach to
determine when the user intends to draw in mid-air without the use of an explicit mechanism (such
as an instrumented controller or a specified hand posture). By recording the user's hand trajectory,
the idea is to classify each recorded point as either hover or stroke. The resulting model allows for
the classification of points along the user's spatial trajectory.
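As a toy illustration of the stroke-hover decision, the sketch below thresholds instantaneous speed, on the intuition that deliberate drawing is slower than transit motion. The threshold, data, and function names are invented; the thesis learns this decision from geometric and temporal features rather than applying a fixed rule.

```python
import numpy as np

def classify_points(points, timestamps, speed_threshold=0.25):
    """Label each trajectory point: slow, deliberate motion -> 'stroke',
    fast transit -> 'hover'. A hand-tuned stand-in for a learned model."""
    pts = np.asarray(points, dtype=float)
    t = np.asarray(timestamps, dtype=float)
    vel = np.gradient(pts, t, axis=0)          # finite-difference velocity
    speed = np.linalg.norm(vel, axis=1)
    return ["stroke" if s < speed_threshold else "hover" for s in speed]

# Fast transit toward the drawing area, then slow, careful stroking.
points = [[0.0, 0.0], [1.0, 0.0], [2.0, 0.0],
          [2.1, 0.0], [2.2, 0.0], [2.3, 0.0]]
labels = classify_points(points, timestamps=[0, 1, 2, 3, 4, 5])
print(labels)  # ['hover', 'hover', 'hover', 'stroke', 'stroke', 'stroke']
```

Speed alone is a weak signal in practice, which is precisely why the research pursues richer geometric and temporal features and learned classifiers.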
Drawing inspiration from the way users sketch in mid-air, this research first establishes the necessity
of an alternative approach for processing bare-hand mid-air curves in a continuous fashion.
Further, this research presents a novel drawing-intent recognition workflow for every recorded
drawing point, using three different approaches. We begin with recording mid-air drawing data
and developing a classification model based on the extracted geometric properties of the recorded
data. The main goal behind developing this model is to identify drawing intent from critical geometric
and temporal features. In the second approach, we explore the variations in prediction
quality of the model by improving the dimensionality of data used as mid-air curve input. Finally,
in the third approach, we seek to understand the drawing intention from mid-air curves using
sophisticated dimensionality-reduction neural networks such as autoencoders. Lastly, the broader
implications of this research are discussed, along with potential development areas in the design
and research of mid-air interactions.
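On the autoencoder approach: the abstract specifies no architecture, but an autoencoder with linear activations is known to recover the PCA subspace at its optimum, so the dimensionality-reduction idea can be illustrated in closed form. The data, dimensions, and variable names below are synthetic assumptions, not the thesis's curves or models.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy mid-air "curve windows": 200 samples of 10-D points lying near a
# 2-D subspace, plus a little sensor noise.
latent = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 10))
X = latent @ mixing + 0.01 * rng.normal(size=(200, 10))
Xc = X - X.mean(axis=0)

# The optimal linear autoencoder projects onto the top-k principal components.
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
k = 2
encode = Vt[:k].T        # 10 -> 2 bottleneck
decode = Vt[:k]          # 2 -> 10 reconstruction
recon = Xc @ encode @ decode

err = np.mean((recon - Xc) ** 2) / np.mean(Xc ** 2)
print(err)  # near zero: 10-D curve data compresses to 2-D with little loss
```

A nonlinear autoencoder generalizes this by replacing the two matrices with trained neural networks, capturing curved low-dimensional structure that PCA cannot.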
Deep Belief Networks for Recognizing Handwriting Captured by Leap Motion Controller
The Leap Motion controller is an input device that can track hand and finger positions quickly and precisely. In some gaming environments, a need may arise to capture letters written in the air with the Leap Motion, which cannot currently be done directly. In this paper, we propose an approach to capture and recognize which letter has been drawn by the user with the Leap Motion. This approach is based on Deep Belief Networks (DBN) with Resilient Backpropagation (Rprop) fine-tuning. To assess the performance of our proposed approach, we conduct experiments involving 30,000 samples of handwritten capital letters, 8,000 of which are to be recognized. Our experiments indicate that DBN with Rprop achieves an accuracy of 99.71%, which is better than DBN with Backpropagation or a Multi-Layer Perceptron (MLP), whether with Backpropagation or with Rprop. Our experiments also show that Rprop makes the fine-tuning process significantly faster and results in much more accurate recognition compared to ordinary Backpropagation. The time needed to recognize a letter is on the order of 5,000 microseconds, which is excellent even for an online gaming experience.
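Rprop's speed advantage comes from adapting a separate step size per weight using only the sign of successive gradients: the step grows when the sign agrees between iterations and shrinks when it flips. A minimal sketch on a toy quadratic follows; the DBN itself is omitted, and all constants and names are illustrative rather than the paper's settings.

```python
import numpy as np

def rprop_minimize(grad_fn, w, steps=60, eta_plus=1.2, eta_minus=0.5,
                   step_init=0.1, step_max=1.0, step_min=1e-6):
    """Minimal Rprop: per-weight step sizes adapt from gradient-sign agreement."""
    w = w.astype(float).copy()
    step = np.full_like(w, step_init)
    prev_g = np.zeros_like(w)
    for _ in range(steps):
        g = grad_fn(w)
        agree = prev_g * g
        step = np.where(agree > 0, np.minimum(step * eta_plus, step_max), step)
        step = np.where(agree < 0, np.maximum(step * eta_minus, step_min), step)
        w -= np.sign(g) * step       # only the gradient's sign is used
        prev_g = g
    return w

# Toy objective f(w) = sum((w - 3)^2), gradient 2*(w - 3); minimum at w = 3.
w = rprop_minimize(lambda w: 2.0 * (w - 3.0), np.array([0.0, 10.0]))
print(w)  # approaches [3., 3.]
```

Because the magnitude of the gradient never enters the update, Rprop is insensitive to the vanishing or exploding gradient scales that slow ordinary Backpropagation in deep networks.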
Accessible options for deaf people in e-Learning platforms: technology solutions for sign language translation
This paper presents a study of potential technology solutions for enhancing the communication process for deaf people on e-learning platforms through translation of Sign Language (SL). Considering SL in its global scope as a spatial-visual language that is not limited to gestures or hand/forearm movement but also includes non-manual markers such as facial expressions, it is necessary to ascertain whether existing technology solutions can be effective options for SL integration on e-learning platforms. Thus, we aim to present a list of potential technology options for the recognition, translation and presentation of SL (and their potential problems) through the analysis of assistive technologies, methods and techniques, and ultimately to contribute to the development of the state of the art and ensure the digital inclusion of deaf people in e-learning platforms. The analysis shows that some interesting technology solutions are under research and development for digital platforms in general, but some critical challenges must still be solved, and an effective integration of these technologies in e-learning platforms in particular is still missing.
A Pervasive Middleware for Activity Recognition with Smartphones
Title from PDF of title page, viewed on August 28, 2015. Thesis advisor: Yugyung Lee. Includes vita and bibliographic references (pages 61-67). Thesis (M.S.)--School of Computing and Engineering, University of Missouri--Kansas City, 2015.
Activity Recognition (AR) is an important research topic in pervasive computing. With the rapid increase in the use of pervasive devices, huge volumes of sensor data are generated from diverse devices on a daily basis. Analysis of this sensor data is a significant area of research for AR. There are several devices and techniques available for AR, but the increasing number of sensor devices and the volume of data demand new approaches for adaptive, lightweight and accurate AR. We propose a new middleware called the Pervasive Middleware for Activity Recognition (PEMAR) to address these problems. We implemented PEMAR on a Big Data platform incorporating machine-learning techniques to make it adaptive and accurate for the AR of sensor data. The middleware is composed of the following: (1) filtering and segmentation to detect different activities; (2) a human-centered adaptive approach to create accurate personal models, leveraging existing impersonal models; (3) an activity library to serve different mobile applications; and (4) Activity Recognition services to accurately perform AR. We evaluated the recognition accuracy of PEMAR using a generated dataset (15 activities, 50 subjects) and the USC Human Activity Dataset (12 activities, 14 subjects) and observed better accuracy for personally trained AR compared to impersonally trained AR. We tested the applicability and adaptivity of PEMAR using several motion-based applications.
Contents: Introduction -- Related work -- Middleware for gesture recognition -- Implementation and applications -- Results and evaluation -- Conclusion and future work
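The filtering-and-segmentation stage of activity-recognition middleware is commonly built on overlapping sliding windows over the raw sensor stream, with simple per-window features feeding the classifier. The sketch below illustrates that pattern; the window size, step, and feature choice are invented for the example, since PEMAR's actual parameters are not given in the abstract.

```python
import numpy as np

def sliding_windows(signal, size, step):
    """Split a 1-D sensor stream into fixed-size, overlapping windows."""
    return np.array([signal[i:i + size]
                     for i in range(0, len(signal) - size + 1, step)])

stream = np.arange(10.0)            # stand-in for accelerometer magnitudes
windows = sliding_windows(stream, size=4, step=2)
print(windows.shape)                # (4, 4): four half-overlapping windows
features = windows.mean(axis=1)     # one simple feature per window
print(features)                     # [1.5 3.5 5.5 7.5]
```

Each window (or its feature vector) is then labeled by the personal or impersonal model, so segmentation quality directly bounds recognition accuracy.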