
    Hand Pointing Detection Using Live Histogram Template of Forehead Skin

    Hand pointing detection has multiple applications in fields such as virtual reality and the control of devices in smart homes. In this paper, we propose a novel approach to detect the pointing vector in the 2D space of a room. After background subtraction, the face and forehead are detected. In the second step, an H-S plane histogram of the forehead skin in HSV space is calculated. Using this histogram template of the user's skin and the back-projection method, skin areas are detected. The contours of the hand are then extracted using the Freeman chain code algorithm. The next step is finding fingertips: points on the hand contour that are fingertip candidates can be found among the convexity defects between the convex hull and the contour. We introduce a novel method for finding the fingertip based on special points on the contour and their relationships. Our approach detects hand-pointing vectors in live video from a common webcam with 94% TP and 85% TN. Comment: Accepted for oral presentation in DSP201
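
    The histogram back-projection and convexity-defect steps described above map directly onto standard OpenCV calls. Below is a minimal sketch of those two stages only, not the authors' exact pipeline: the histogram bin counts, the back-projection threshold, and the defect-depth cutoff are illustrative assumptions.

```python
import cv2
import numpy as np

def skin_backprojection(frame_bgr, forehead_roi_bgr):
    """Detect skin pixels via back projection of a forehead H-S histogram."""
    hsv = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HSV)
    roi_hsv = cv2.cvtColor(forehead_roi_bgr, cv2.COLOR_BGR2HSV)
    # H-S histogram of the forehead patch acts as the live skin template.
    hist = cv2.calcHist([roi_hsv], [0, 1], None, [30, 32], [0, 180, 0, 256])
    cv2.normalize(hist, hist, 0, 255, cv2.NORM_MINMAX)
    backproj = cv2.calcBackProject([hsv], [0, 1], hist, [0, 180, 0, 256], scale=1)
    _, mask = cv2.threshold(backproj, 50, 255, cv2.THRESH_BINARY)  # assumed cutoff
    return mask

def fingertip_candidates(mask):
    """Fingertip candidates from convexity defects of the largest contour."""
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
    if not contours:
        return []
    hand = max(contours, key=cv2.contourArea)
    hull = cv2.convexHull(hand, returnPoints=False)
    defects = cv2.convexityDefects(hand, hull)
    tips = []
    if defects is not None:
        for s, e, f, depth in defects[:, 0]:
            if depth > 8000:  # deep defects separate fingers (tunable)
                tips.append(tuple(hand[s][0]))  # defect start lies near a fingertip
    return tips
```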

    Multi-sensor fusion for human-robot interaction in crowded environments

    For the challenges associated with the ageing population, robot assistants are becoming a promising solution. Human-Robot Interaction (HRI) allows a robot to understand the intention of humans in an environment and react accordingly. This thesis proposes HRI techniques to facilitate the transition of robots from lab-based research to real-world environments. The HRI aspects addressed in this thesis are illustrated in the following scenario: an elderly person, engaged in conversation with friends, wishes to attract a robot's attention. This composite task consists of many problems. The robot must detect and track the subject in a crowded environment. To engage with the user, it must track their hand movement. Knowledge of the subject's gaze would ensure that the robot doesn't react to the wrong person. Understanding the subject's group participation would enable the robot to respect existing human-human interaction. Many existing solutions to these problems are too constrained for natural HRI in crowded environments. Some require initial calibration or static backgrounds. Others deal poorly with occlusions, illumination changes, or real-time operation requirements. This work proposes algorithms that fuse multiple sensors to remove these restrictions and increase accuracy over the state of the art. The main contributions of this thesis are: a hand and body detection method, with a probabilistic algorithm for their real-time association when multiple users and hands are detected in crowded environments; an RGB-D sensor-fusion hand tracker, which increases position and velocity accuracy by combining a depth-image-based hand detector with Monte-Carlo updates using colour images (sketched below); a sensor-fusion gaze estimation system, combining IR and depth cameras on a mobile robot to give better accuracy than traditional visual methods, without the constraints of traditional IR techniques; and a group detection method, based on sociological concepts of static and dynamic interactions, which incorporates real-time gaze estimates to enhance detection accuracy. Open Access
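
    The RGB-D hand tracker in the contributions combines a depth-based detection with Monte-Carlo (particle) updates weighted by the colour image. A toy sketch of that fusion idea follows; the random-walk motion model, particle count, and the skin_prob_map likelihood (e.g. a back-projection map) are assumptions, not the thesis' actual models.

```python
import numpy as np

class FusionHandTracker:
    """Toy particle-filter tracker: a depth-based hand detection seeds the
    state; Monte-Carlo updates reweight particles by a colour likelihood."""

    def __init__(self, n_particles=200, motion_std=5.0):
        self.n = n_particles
        self.motion_std = motion_std
        self.particles = None  # (n, 2) pixel positions

    def init_from_depth(self, detection_xy):
        """Seed all particles at a hand position found in the depth image."""
        self.particles = np.tile(np.asarray(detection_xy, float), (self.n, 1))

    def step(self, skin_prob_map):
        """One predict/update/resample cycle against a colour-image skin map."""
        # Predict: random-walk motion model (a velocity model would be richer).
        self.particles += np.random.normal(0.0, self.motion_std,
                                           self.particles.shape)
        h, w = skin_prob_map.shape
        xs = np.clip(self.particles[:, 0], 0, w - 1).astype(int)
        ys = np.clip(self.particles[:, 1], 0, h - 1).astype(int)
        # Update: weight each particle by the colour (skin) likelihood.
        weights = skin_prob_map[ys, xs].astype(float) + 1e-9
        weights /= weights.sum()
        estimate = np.average(self.particles, axis=0, weights=weights)
        # Resample to concentrate particles on high-likelihood regions.
        idx = np.random.choice(self.n, size=self.n, p=weights)
        self.particles = self.particles[idx]
        return estimate  # fused (x, y) hand position estimate
```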

    Safe Driving using Vision-based Hand Gesture Recognition System in Non-uniform Illumination Conditions

    Nowadays, there is tremendous growth in in-car interfaces for driver safety and comfort, but controlling these devices while driving requires the driver's attention. One solution to reduce the number of glances at these interfaces is to design an advanced driver assistance system (ADAS). A vision-based, touch-less hand gesture recognition system is proposed here for in-car human-machine interfaces (HMI). The performance of such systems is unreliable under ambient illumination conditions, which change over the course of the day. Thus, the main focus of this work was to design a system that is robust to changing lighting conditions. For this purpose, a homomorphic filter with adaptive-thresholding binarization is used. Gray-level edge-based segmentation further ensures that the system generalizes to users of different skin tones and background colors. The work was validated on selected gestures from the Cambridge Hand Gesture Database captured under five sets of non-uniform illumination conditions that closely resemble in-car illumination, yielding an overall system accuracy of 91%, an average frame-by-frame accuracy of 81.38%, and a latency of 3.78 milliseconds. A prototype of the proposed system was implemented on a Raspberry Pi 3 together with an Android application, demonstrating its suitability for non-critical in-car interfaces such as infotainment systems.
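
    A homomorphic filter separates slowly varying illumination from reflectance detail by high-pass filtering the log-image in the frequency domain, after which adaptive thresholding binarises the result locally. A minimal sketch follows, assuming a Gaussian high-frequency-emphasis transfer function and illustrative parameter values; the input filename is hypothetical.

```python
import cv2
import numpy as np

def homomorphic_filter(gray, sigma=30, gamma_l=0.5, gamma_h=1.5):
    """Suppress illumination (low frequencies), keep reflectance detail."""
    img = np.log1p(gray.astype(np.float32))            # log-domain image
    spec = np.fft.fftshift(np.fft.fft2(img))           # centred spectrum
    rows, cols = gray.shape
    u = np.arange(rows) - rows / 2
    v = np.arange(cols) - cols / 2
    d2 = u[:, None] ** 2 + v[None, :] ** 2
    # Gaussian high-frequency emphasis: attenuate lows, boost highs.
    H = (gamma_h - gamma_l) * (1 - np.exp(-d2 / (2 * sigma ** 2))) + gamma_l
    out = np.fft.ifft2(np.fft.ifftshift(spec * H)).real
    out = np.expm1(out)                                # back from log domain
    return cv2.normalize(out, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)

gray = cv2.imread("gesture_frame.png", cv2.IMREAD_GRAYSCALE)  # hypothetical input
filtered = homomorphic_filter(gray)
# Adaptive thresholding copes with residual local illumination variation.
binary = cv2.adaptiveThreshold(filtered, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
                               cv2.THRESH_BINARY, 11, 2)
```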

    Classification of Humans into Ayurvedic Prakruti Types using Computer Vision

    Ayurveda, a 5000-year-old Indian medical science, holds that the universe, and hence humans, are made up of five elements: ether, fire, water, earth, and air. The three Doshas (Tridosha), Vata, Pitta, and Kapha, originate from combinations of these elements. Every person has a unique combination of Tridosha elements, which constitutes that person's 'Prakruti'. Prakruti governs the physiological and psychological tendencies of all living beings, as well as the way they interact with the environment. This balance influences physiological features like the texture and colour of the skin, hair and eyes, the length of the fingers, the shape of the palm, body frame, strength of digestion and many more, as well as psychological features like a person's nature (introverted, extroverted, calm, excitable, intense, laid-back) and their reaction to stress and disease. All these features are coded in the constituents at the time of a person's creation and do not change throughout their lifetime. Ayurvedic doctors analyse the Prakruti of a person either by assessing the physical features manually and/or by examining the nature of their heartbeat (pulse). Based on this analysis, they diagnose, prevent and cure disease in patients by prescribing precision medicine. This project focuses on identifying the Prakruti of a person by analysing facial features like hair, eyes, nose, lips and skin colour using facial recognition techniques from computer vision. This is the first research of its kind in this problem area that attempts to bring image processing into the domain of Ayurveda.
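
    As a rough illustration of how such a pipeline could be assembled, the sketch below detects a face with a stock OpenCV Haar cascade and builds a crude descriptor (face-box aspect ratio plus mean Lab skin colour) for a conventional classifier. This is an assumption-laden stand-in for the project's richer hair/eye/nose/lip features; the training arrays X and y are hypothetical labelled data.

```python
import cv2
import numpy as np
from sklearn.ensemble import RandomForestClassifier

face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def face_features(bgr):
    """Crude per-face descriptor: face-box aspect ratio plus mean skin
    colour in Lab space (a stand-in for richer facial descriptors)."""
    gray = cv2.cvtColor(bgr, cv2.COLOR_BGR2GRAY)
    faces = face_cascade.detectMultiScale(gray, 1.1, 5)
    if len(faces) == 0:
        return None
    x, y, w, h = max(faces, key=lambda f: f[2] * f[3])  # largest face
    crop = bgr[y:y + h, x:x + w]
    lab = cv2.cvtColor(crop, cv2.COLOR_BGR2LAB)
    return np.array([w / h, *lab.reshape(-1, 3).mean(axis=0)])

# X: feature rows for labelled photos; y: Vata/Pitta/Kapha labels (hypothetical)
# clf = RandomForestClassifier(n_estimators=200).fit(X, y)
# prakruti = clf.predict([face_features(photo_bgr)])
```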

    Gesture Recognition Based on Computer Vision on a Standalone System

    Our project uses computer vision methods for gesture recognition: a camera interfaced to a system captures real-time images, and after further processing the system is able to recognize and interpret the gesture shown. The project mainly targets hand gestures; after extracting information, we aim to present it as audio or in some visual form. We use adaptive background subtraction with Haar classifiers to implement segmentation, then convex hull and convexity defects, along with other feature-extraction algorithms, to interpret the gesture. First, this is implemented on a PC or laptop; then, to produce a standalone system, we perform all these steps on a system dedicated solely to the specified task. For this we have chosen the BeagleBone Black as our platform. The development board comes with an ARM Cortex-A8 processor supported by the NEON SIMD engine for video and image processing. It runs at a maximum clock frequency of 1 GHz. It is a 32-bit processor but can also operate in Thumb mode, i.e., with 16-bit instructions. The board supports Ubuntu and Android with some modification. Our first task is to interface a camera to the board so that it can capture images and store them as matrices, followed by modifying the installed operating system for our purpose and implementing all the above processes, so that we end up with a system which can perform gesture recognition.
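
    A minimal sketch of that processing loop, using OpenCV's MOG2 adaptive background subtractor and convexity-defect finger counting, is shown below. It is an illustrative reconstruction, not the project's actual code; the area and defect-depth thresholds are assumed.

```python
import cv2

# Adaptive background subtraction isolates the moving hand; convexity
# defects between fingers give a simple finger count per frame.
subtractor = cv2.createBackgroundSubtractorMOG2(history=300, detectShadows=False)

cap = cv2.VideoCapture(0)  # webcam interfaced to the board or PC
while True:
    ok, frame = cap.read()
    if not ok:
        break
    mask = subtractor.apply(frame)
    mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, None, iterations=2)
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    if contours:
        hand = max(contours, key=cv2.contourArea)
        if cv2.contourArea(hand) > 3000:  # ignore small blobs (assumed)
            hull = cv2.convexHull(hand, returnPoints=False)
            defects = cv2.convexityDefects(hand, hull)
            if defects is not None:
                # Deep defects correspond to gaps between extended fingers.
                fingers = sum(1 for s, e, f, d in defects[:, 0] if d > 10000) + 1
                print("fingers:", fingers)
    cv2.imshow("mask", mask)
    if cv2.waitKey(1) == 27:  # Esc to quit
        break
cap.release()
```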

    Fotofacesua: sistema de gestão fotográfica da Universidade de Aveiro

    Nowadays, automation is present in basically every computational system. With the rise of machine learning algorithms over the years, the need for human intervention in a system has dropped considerably. Yet in universities, companies and even governmental institutions there are still systems that have not been automated. One such case is profile photo management, which still requires human intervention to check whether an image follows the institution's set of criteria that are mandatory for submitting a new photo. FotoFaces is a system for updating the profile photos of collaborators at the University of Aveiro that allows a collaborator to submit a new photo and, automatically, through a set of image processing algorithms, decide whether the photo meets a set of predefined criteria. One of the main advantages of this system is that it can be used in any institution and adapted to different needs by simply changing the algorithms or criteria considered. This dissertation describes improvements implemented in the existing system, as well as new features in terms of the available algorithms. The main contributions to the system are the following: sunglasses detection, hat detection and background analysis. For the first two, it was necessary to create and label a new database to train, validate and test a deep transfer learning network used to detect sunglasses and hats. In addition, several tests were performed varying the parameters of the network and applying machine learning and pre-processing techniques to the input images. Finally, the background analysis consists of the implementation and testing of two existing algorithms from the literature, one low-level and the other deep-learning based. Overall, the results obtained in improving the existing algorithms, as well as the performance of the new image processing modules, allowed the creation of a more robust (improved production-version algorithms) and versatile (new algorithms added to the system) profile photo update system. Mestrado em Engenharia Eletrónica e Telecomunicações
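
    A minimal sketch of the deep transfer learning step, as one might set it up in PyTorch: a pretrained MobileNetV2 backbone is frozen and a new two-class head is trained on the labelled sunglasses (or hat) database. The backbone choice, learning rate, and training loop are assumptions, since the dissertation does not specify them here.

```python
import torch
import torch.nn as nn
from torchvision import models

# Pretrained backbone; only the new classification head is trained.
model = models.mobilenet_v2(weights=models.MobileNet_V2_Weights.DEFAULT)
for p in model.features.parameters():
    p.requires_grad = False  # freeze the convolutional feature extractor
model.classifier[1] = nn.Linear(model.last_channel, 2)  # e.g. sunglasses / none

optimizer = torch.optim.Adam(model.classifier.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()

def train_step(images, labels):
    """One gradient step on a batch from the labelled database."""
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
    return loss.item()
```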

    Development of new intelligent autonomous robotic assistant for hospitals

    Continuous technological development in modern societies has increased the quality of life and average life-span of people. This imposes an extra burden on the current healthcare infrastructure, which also creates the opportunity for developing new, autonomous, assistive robots to help alleviate this extra workload. The research question explored the extent to which a prototypical robotic platform can be created and how it may be implemented in a hospital environment with the aim of assisting hospital staff with daily tasks, such as guiding patients and visitors, following patients to ensure safety, and making deliveries to and from rooms and workstations. In terms of major contributions, this thesis outlines five domains of the development of an actual robotic assistant prototype. Firstly, a comprehensive schematic design is presented in which mechanical, electrical, motor-control and kinematics solutions are examined in detail. Next, a new method is proposed for assessing the intrinsic properties of different flooring types using machine learning to classify mechanical vibrations (sketched below). Thirdly, the technical challenge of enabling the robot to simultaneously map and localise itself in a dynamic environment is addressed, whereby leg detection is introduced to ensure that, whilst mapping, the robot is able to distinguish between people and the background. The fourth contribution is the integration of geometric collision prediction into stabilised dynamic navigation methods, thus optimising the navigation ability to update real-time path planning in a dynamic environment. Lastly, the problem of detecting gaze at long distances is addressed by means of a new eye-tracking hardware solution which combines infra-red eye tracking and depth sensing. The research serves both to provide a template for the development of comprehensive mobile assistive-robot solutions, and to address some of the inherent challenges currently present in introducing autonomous assistive robots into hospital environments. Open Access
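
    The flooring-classification contribution, classifying mechanical vibrations with machine learning, lends itself to a compact sketch: spectral features are computed from accelerometer windows recorded while driving and fed to an off-the-shelf classifier. The feature set, sampling rate, and class labels below are illustrative assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def vibration_features(signal, fs=1000):
    """Simple time/frequency features of one accelerometer window."""
    spectrum = np.abs(np.fft.rfft(signal * np.hanning(len(signal))))
    freqs = np.fft.rfftfreq(len(signal), 1 / fs)
    centroid = (freqs * spectrum).sum() / (spectrum.sum() + 1e-12)
    return np.array([signal.std(),           # vibration intensity
                     np.abs(signal).max(),   # peak amplitude
                     centroid,               # spectral centroid
                     spectrum[:50].sum(),    # low-band energy
                     spectrum[50:].sum()])   # high-band energy

# X: feature rows from labelled drives over e.g. carpet/tile/linoleum; y: labels
# clf = RandomForestClassifier(n_estimators=100).fit(X, y)
# floor_type = clf.predict([vibration_features(window)])
```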

    Computer vision methods for unconstrained gesture recognition in the context of sign language annotation

    This PhD thesis concerns the study of computer vision methods for the automatic recognition of unconstrained gestures in the context of sign language annotation. Sign Language (SL) is a visual-gestural language developed by deaf communities. Continuous SL consists of a sequence of signs performed one after another, involving manual and non-manual features that convey simultaneous information. Even though standard signs are defined in dictionaries, there is huge variability caused by the context-dependency of signs. In addition, signs are often linked by movement epenthesis, the meaningless transitional gesture between signs. This variability and the co-articulation effect represent a challenging problem for automatic SL processing. Numerous annotated video corpora are therefore needed in order to train statistical machine translation systems and study this language. Generally, the annotation of SL video corpora is performed manually by linguists or computer scientists experienced in SL. However, manual annotation is error-prone, unreproducible and time-consuming, and the quality of the results depends on the annotator's knowledge of SL. Combining annotator knowledge with image processing techniques facilitates the annotation task, increasing robustness and reducing the required time. The goal of this research is the study and development of image processing techniques to assist the annotation of SL video corpora: body tracking, hand segmentation, temporal segmentation and gloss recognition. In this thesis we address the problem of gloss annotation of SL video corpora. First of all, we aim to detect the limits corresponding to the beginning and end of a sign. This annotation method requires several low-level approaches for performing temporal segmentation and for extracting motion and hand-shape features. First, we propose a particle-filter-based approach for tracking the hands and face that is robust to occlusions. Then, a segmentation method is developed for extracting the hand even when it is in front of the face. Motion features are used for a first temporal segmentation of signs (see the sketch below), which is later improved using hand-shape features; indeed, hand shape allows boundaries detected in the middle of a sign to be discarded. Once signs have been segmented, we proceed to gloss recognition using a lexical (phonological) description of signs. We have evaluated our algorithms on international corpora in order to show their advantages and limitations. The evaluation has shown the robustness of the proposed methods with respect to high dynamics and numerous occlusions between body parts. The resulting annotation is independent of the annotator and represents a significant gain in annotation consistency.
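
    The temporal segmentation step, finding candidate sign boundaries from hand-motion features before refining them with hand shape, can be illustrated with a simple speed-thresholding sketch over a tracked hand trajectory. The thresholds are assumptions, and the shape-based filtering of spurious mid-sign boundaries described in the thesis is omitted here.

```python
import numpy as np

def segment_signs(hand_positions, motion_thresh=2.0, min_len=5):
    """Candidate sign segments from a tracked hand trajectory.

    Frames where hand speed stays below the threshold are treated as
    rest/transition (candidate boundaries); runs of faster motion of at
    least min_len frames are returned as candidate signs."""
    pos = np.asarray(hand_positions, float)              # (T, 2) pixel track
    speed = np.linalg.norm(np.diff(pos, axis=0), axis=1)  # per-frame speed
    moving = speed > motion_thresh
    segments, start = [], None
    for t, m in enumerate(moving):
        if m and start is None:
            start = t                                     # segment opens
        elif not m and start is not None:
            if t - start >= min_len:
                segments.append((start, t))               # segment closes
            start = None
    if start is not None and len(moving) - start >= min_len:
        segments.append((start, len(moving)))             # trailing segment
    return segments
```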