End-to-End Multiview Gesture Recognition for Autonomous Car Parking System
Hand gestures can be among the most intuitive human-machine interaction media.
Early approaches to hand gesture recognition used device-based methods. These
methods use mechanical or optical sensors attached to a glove, or markers, which hinder
natural human-machine communication. Vision-based methods, on the other hand, are
not restrictive and allow more spontaneous communication without the need for an
intermediary between human and machine. Vision-based gesture recognition has therefore
been a popular research area for the past thirty years.
Hand gesture recognition finds application in many areas, particularly the automotive
industry, where advanced automotive human-machine interface (HMI) designers are
using gesture recognition to improve driver and vehicle safety. However, technology advances
go beyond active/passive safety and into convenience and comfort. In this context,
one of America's big three automakers has partnered with the Centre for Pattern Analysis
and Machine Intelligence (CPAMI) at the University of Waterloo to investigate expanding
its product segment through machine learning, providing increased driver convenience
and comfort with the particular application of hand gesture recognition for autonomous
car parking.
In this thesis, we leverage state-of-the-art deep learning and optimization techniques
to develop a vision-based multiview dynamic hand gesture recognizer for a self-parking system.
We propose a 3DCNN gesture model architecture that we train on a publicly available
hand gesture database. We apply transfer learning methods to fine-tune the pre-trained
gesture model on custom-made data, which significantly improves the proposed system's
performance in real-world environments. We adapt the architecture of the end-to-end solution
to expand the state-of-the-art video classifier from a single image as input (fed by a
monocular camera) to a multiview 360° feed provided by a six-camera module. Finally, we
optimize the proposed solution to run on a resource-limited embedded platform (Nvidia
Jetson TX2) used by automakers for vehicle-based features, without sacrificing the
accuracy, robustness, or real-time functionality of the system.
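The 3DCNN at the heart of such a pipeline convolves learned filters over both space and time. As a hedged sketch (the clip shape, filter count, and single-layer depth below are illustrative toy values, not the thesis's actual architecture), a naive 3D convolution over a short video clip can be written as:

```python
import numpy as np

def conv3d(volume, kernels, stride=1):
    """Valid 3D convolution: volume (C, T, H, W), kernels (K, C, kT, kH, kW)."""
    C, T, H, W = volume.shape
    K, _, kT, kH, kW = kernels.shape
    oT = (T - kT) // stride + 1
    oH = (H - kH) // stride + 1
    oW = (W - kW) // stride + 1
    out = np.zeros((K, oT, oH, oW))
    for k in range(K):
        for t in range(oT):
            for y in range(oH):
                for x in range(oW):
                    patch = volume[:, t*stride:t*stride+kT,
                                      y*stride:y*stride+kH,
                                      x*stride:x*stride+kW]
                    out[k, t, y, x] = np.sum(patch * kernels[k])
    return out

# Toy clip: 3 channels, 8 frames of 16x16 pixels
clip = np.random.rand(3, 8, 16, 16)
filters = np.random.rand(4, 3, 3, 3, 3)          # 4 spatio-temporal 3x3x3 filters
features = np.maximum(conv3d(clip, filters), 0)  # ReLU activation
print(features.shape)  # (4, 6, 14, 14)
```

Unlike a 2D convolution applied frame by frame, the 3x3x3 kernels here mix information across neighbouring frames, which is what lets such a model capture the motion component of a dynamic gesture.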
Various Approaches of Support vector Machines and combined Classifiers in Face Recognition
In this paper we survey the approaches used in face recognition from 2001 to 2012. Over the last decade, face recognition has been applied in many fields, such as the security sector and identity authentication, and accurate, fast recognition is now required. Face recognition technology has reached a mature stage because research in this field has been conducted continuously. We review some extensions of the Support Vector Machine (SVM) that give impressive performance in face recognition. We also review papers on combined-classifier approaches, which are likewise an active research area in pattern recognition.
A Real Time Hand Gesture Recognition System Based on DFT and SVM
Vision-based hand gesture recognition provides a more natural and powerful means of human-computer interaction. A fast hand detection process and an effective feature extraction process are presented. The proposed hand gesture recognition algorithm comprises four main steps. First, the CamShift algorithm is used to track skin color after a morphological closing operation. Second, to extract features, a boundary extraction algorithm (BEA) is used to extract the boundary of the hand. Third, because Fourier descriptors are invariant to the starting point of the boundary, deformation, and rotation, the boundary is transformed with the Fourier transform. Finally, the outline features, which form nonlinear non-separable data, are classified using an SVM. Experimental results show an average accuracy of 93.4% and demonstrate the feasibility of the proposed system.
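The invariance properties of Fourier descriptors mentioned in the abstract can be sketched as follows. This is a minimal illustration on a synthetic contour, not the paper's exact normalization: the DC term is dropped for translation invariance, magnitudes are taken for rotation and starting-point invariance, and coefficients are normalized by the first harmonic for scale invariance.

```python
import numpy as np

def fourier_descriptors(boundary, n_coeffs=8):
    """boundary: (N, 2) ordered contour points. Returns descriptors that are
    invariant to translation, rotation, scale, and starting point."""
    z = boundary[:, 0] + 1j * boundary[:, 1]   # complex contour representation
    coeffs = np.fft.fft(z)
    coeffs[0] = 0.0                            # drop DC -> translation invariance
    mags = np.abs(coeffs)                      # magnitudes -> rotation / start-point invariance
    mags /= mags[1]                            # normalize -> scale invariance
    return mags[2:2 + n_coeffs]

# Toy closed contour with one extra harmonic, and a transformed copy of it
t = np.linspace(0, 2 * np.pi, 64, endpoint=False)
contour = np.stack([np.cos(t) + 0.2 * np.cos(3 * t),
                    np.sin(t) + 0.2 * np.sin(3 * t)], axis=1)
theta = 0.7
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
moved = 2.5 * contour @ R.T + np.array([3.0, -1.0])  # scale, rotate, translate
moved = np.roll(moved, 10, axis=0)                   # different starting point

d1 = fourier_descriptors(contour)
d2 = fourier_descriptors(moved)
print(np.allclose(d1, d2, atol=1e-6))  # True
```

The two descriptor vectors match even though the second contour was scaled, rotated, translated, and traversed from a different starting point, which is exactly the property that makes these features suitable inputs for the SVM classifier.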
Data Leakage and Evaluation Issues in Micro-Expression Analysis
Micro-expressions have drawn increasing interest lately due to various
potential applications. The task is, however, difficult as it incorporates many
challenges from the fields of computer vision, machine learning and emotional
sciences. Due to the spontaneous and subtle characteristics of
micro-expressions, the available training and testing data are limited, which
makes evaluation complex. We show that data leakage and fragmented evaluation
protocols are common issues in the micro-expression literature. We find that fixing
data leaks can drastically reduce model performance, in some cases even making
the models perform similarly to a random classifier. To this end, we go through
common pitfalls, propose a new standardized evaluation protocol using facial
action units with over 2000 micro-expression samples, and provide an open
source library that implements the evaluation protocols in a standardized
manner. Code will be available at https://github.com/tvaranka/meb
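One common source of the data leakage described above is letting samples from the same subject appear in both the training and the test fold. A subject-wise (leave-one-subject-out) split avoids this; the sketch below is a minimal illustration, not the paper's exact protocol.

```python
import numpy as np

def leave_one_subject_out(subject_ids):
    """Yield (train_idx, test_idx) pairs in which no subject appears in both
    the training and the test fold, preventing identity leakage."""
    subject_ids = np.asarray(subject_ids)
    for subject in np.unique(subject_ids):
        test_mask = subject_ids == subject
        yield np.where(~test_mask)[0], np.where(test_mask)[0]

# Toy data: 6 samples from 3 subjects
subjects = ["s1", "s1", "s2", "s2", "s3", "s3"]
for train_idx, test_idx in leave_one_subject_out(subjects):
    train_subj = {subjects[i] for i in train_idx}
    test_subj = {subjects[i] for i in test_idx}
    print(sorted(test_subj), train_subj & test_subj)  # overlap is always set()
```

A naive random split over samples would very likely place clips of the same person on both sides of the split, letting the model exploit identity cues instead of expression cues and inflating the reported accuracy.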
Comparative evaluation of techniques for static and dynamic gesture recognition with a focus on accuracy and performance
Dissertation (master's) - Universidade Federal de Santa Catarina, Centro Tecnológico, Graduate Program in Computer Science, Florianópolis, 2010. It is common for human beings to use gestures as a form of expression, as a complement to speech or as a self-contained form of communication. In the field of Human-Computer Interaction, this behavior can be adopted in the construction of alternative interfaces, with the goal of easing the relationship between the human and computational elements. Currently, several gesture recognition techniques are described in the literature; however, these techniques are validated in isolation, which makes comparing them difficult. To narrow this gap, this work presents a comparison of established techniques for recognizing static gestures (postures) and dynamic gestures (trajectories). These techniques are set up to evaluate a common dataset, acquired with an instrumented glove and a motion tracker, producing results in terms of accuracy and performance. Specifically for trajectories, the evaluation considers well-known techniques (neural networks and hidden Markov models) and a new heuristic based on deterministic finite automata, conceived and developed by the authors. The results show that the classifier based on an SVM (Support Vector Machine) achieved the best generalization, with the best recognition rates for postures. For trajectories, in turn, the classifier based on a neural network produced the best results. In terms of performance, all methods were fast enough to be used interactively. Finally, this work identifies and discusses a set of relevant criteria that should be observed during the construction, training, and evaluation of the classifiers, and their relationship to the final results.
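A trajectory heuristic based on deterministic finite automata can be illustrated with a toy sketch. The direction quantization and the "right-then-up" pattern below are invented for illustration only; the dissertation's actual automaton construction is not reproduced here.

```python
# Hypothetical sketch: recognizing a quantized 2D gesture trajectory with a DFA.

def quantize(points):
    """Map consecutive 2D points to direction symbols R/L/U/D."""
    symbols = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        dx, dy = x1 - x0, y1 - y0
        if abs(dx) >= abs(dy):
            symbols.append("R" if dx >= 0 else "L")
        else:
            symbols.append("U" if dy >= 0 else "D")
    return symbols

# DFA accepting "one or more R, then one or more U"
TRANSITIONS = {("start", "R"): "right", ("right", "R"): "right",
               ("right", "U"): "up", ("up", "U"): "up"}
ACCEPTING = {"up"}

def matches(points):
    """Run the symbol stream through the DFA; unknown transitions reject."""
    state = "start"
    for sym in quantize(points):
        state = TRANSITIONS.get((state, sym), "reject")
    return state in ACCEPTING

print(matches([(0, 0), (1, 0), (2, 0), (2, 1), (2, 2)]))  # True
print(matches([(0, 0), (0, 1), (1, 1)]))                  # False
```

Compared with a neural network or an HMM, such an automaton runs in a single pass over the symbol stream with constant memory, which is consistent with the dissertation's finding that all evaluated methods were fast enough for interactive use.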
Handshape recognition using principal component analysis and convolutional neural networks applied to sign language
Handshape recognition is an important problem in computer vision with significant societal impact. It is not an easy task, however, since hands are naturally deformable objects. Handshape recognition still presents open problems, such as low accuracy and low speed, and despite a large number of proposed approaches, no solution has resolved them. In this thesis, a new image dataset for Irish Sign Language (ISL) recognition is introduced. A deeper study using only 2D images is presented on Principal Component Analysis (PCA) in two stages. A comparison is carried out between approaches that do not need features (known as end-to-end) and feature-based approaches.
The dataset was collected by filming six human subjects performing ISL handshapes and movements. Frames were extracted from the videos. Afterwards, redundant images were filtered out with an iterative image selection process that selects the images that keep the dataset diverse.
The accuracy of PCA can be improved using blurred images and interpolation. Interpolation is only feasible with a small number of points. For this reason, two-stage PCA is proposed: PCA is applied to another PCA space. This makes the interpolation feasible and improves the accuracy in recognising a shape at translations and rotations unseen in the training stage.
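Two-stage PCA, i.e. fitting a second PCA in the space spanned by a first PCA, can be sketched as follows. The dimensions and random data are illustrative, and the thesis's blurring and interpolation steps are omitted.

```python
import numpy as np

def pca_fit(X, n_components):
    """Return the mean and top principal axes of X (n_samples, n_features)."""
    mean = X.mean(axis=0)
    U, S, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:n_components]

def pca_transform(X, mean, axes):
    """Project centered data onto the principal axes."""
    return (X - mean) @ axes.T

rng = np.random.default_rng(0)
images = rng.normal(size=(200, 256))   # e.g. 200 flattened 16x16 hand images

# Stage 1: coarse reduction of raw pixels
m1, a1 = pca_fit(images, 32)
stage1 = pca_transform(images, m1, a1)

# Stage 2: PCA applied to the first PCA space
m2, a2 = pca_fit(stage1, 8)
stage2 = pca_transform(stage1, m2, a2)
print(stage2.shape)  # (200, 8)
```

The point of the second stage is that the final space is small enough (here 8 dimensions) for interpolation between training samples to become tractable, which is what the thesis exploits to handle unseen translations and rotations.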
Finally, classification is done with two different approaches: (1) end-to-end approaches and (2) feature-based approaches. For (1), Convolutional Neural Networks (CNNs) and other classifiers are tested directly on raw pixels, whereas for (2), PCA is mostly used to extract features and, again, different algorithms are tested for classification. Finally, results are presented showing accuracy and speed for (1) and (2), and how blurring affects the accuracy.