
    Gesture passwords: concepts, methods and challenges

    Biometrics are a convenient alternative to traditional forms of access control such as passwords and pass-cards since they rely solely on user-specific traits. Unlike alphanumeric passwords, biometrics cannot be given or told to another person, and unlike pass-cards, they are always “on-hand.” Perhaps the most well-known biometrics with these properties are face, speech, iris, and gait. This dissertation proposes a new biometric modality: gestures. A gesture is a short body motion that contains static anatomical information and changing behavioral (dynamic) information. This work considers both full-body gestures, such as a large wave of the arms, and hand gestures, such as a subtle curl of the fingers and palm. For access control, a specific gesture can be selected as a “password” and used for identification and authentication of a user. If this particular motion were somehow compromised, the user could readily select a new motion as a “password,” effectively changing and renewing the behavioral aspect of the biometric. This thesis describes a novel framework for acquiring, representing, and evaluating gesture passwords for the purpose of general access control. The framework uses depth sensors, such as the Kinect, to record gesture information, from which depth maps or pose features are estimated. First, various distance measures, such as the log-Euclidean distance between feature covariance matrices and distances based on feature sequence alignment via dynamic time warping, are used to compare two gestures and to train a classifier that either authenticates or identifies a user. In authentication, this framework yields an equal error rate on the order of 1-2% for body and hand gestures in non-adversarial scenarios. Next, through a novel decomposition of gestures into posture, build, and dynamic components, the relative importance of each component is studied. The dynamic portion of a gesture is shown to have the largest impact on biometric performance, with its removal causing a significant increase in error. In addition, the effects of two types of threats are investigated: one due to self-induced degradations (personal effects and the passage of time) and the other due to spoof attacks. For body gestures, both spoof attacks (with only the dynamic component) and self-induced degradations increase the equal error rate, as expected. Further, the benefits of adding additional sensor viewpoints to this modality are empirically evaluated. Finally, a novel framework that leverages deep convolutional neural networks for learning a user-specific “style” representation from a set of known gestures is proposed and compared to a similar representation for gesture recognition. This deep convolutional neural network yields significantly improved performance over prior methods. A byproduct of this work is the creation and release of multiple publicly available, user-centric (as opposed to gesture-centric) datasets based on both body and hand gestures.
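
As a rough illustration of the distance-based comparison step described above, the following sketch computes the log-Euclidean distance between the pose-feature covariance matrices of two gesture recordings. It is a hedged, minimal example, not the dissertation's code: the feature layout, recording lengths, regularization, and decision threshold are assumptions.

```python
# Minimal sketch (not the dissertation's implementation) of comparing two gestures
# via the log-Euclidean distance between their pose-feature covariance matrices.
import numpy as np
from scipy.linalg import logm

def covariance_descriptor(features):
    """Covariance matrix of per-frame pose features (rows = frames, cols = features)."""
    return np.cov(features, rowvar=False)

def log_euclidean_distance(cov_a, cov_b, eps=1e-6):
    """Frobenius norm of the difference between matrix logarithms of two covariances."""
    d = cov_a.shape[0]
    # Small regularization keeps the matrices strictly positive definite before logm.
    log_a = np.real(logm(cov_a + eps * np.eye(d)))
    log_b = np.real(logm(cov_b + eps * np.eye(d)))
    return np.linalg.norm(log_a - log_b, ord="fro")

# Hypothetical gesture recordings: 20 pose features per frame, different lengths.
rng = np.random.default_rng(0)
enrolled = rng.normal(size=(120, 20))   # enrollment sample
probe = rng.normal(size=(90, 20))       # authentication attempt
score = log_euclidean_distance(covariance_descriptor(enrolled),
                               covariance_descriptor(probe))
# A threshold on such scores (e.g., chosen at the equal error rate on a
# validation set) would accept or reject the probe gesture.
```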

    Learning at the Ends: From Hand to Tool Affordances in Humanoid Robots

    One of the open challenges in designing robots that operate successfully in the unpredictable human environment is how to make them able to predict what actions they can perform on objects, and what their effects will be, i.e., the ability to perceive object affordances. Since modeling all possible world interactions is unfeasible, learning from experience is required, posing the challenge of collecting a large amount of experiences (i.e., training data). Typically, a manipulative robot operates on external objects by using its own hands (or similar end-effectors), but in some cases the use of tools may be desirable. Nevertheless, it is reasonable to assume that while a robot can collect many sensorimotor experiences using its own hands, this cannot happen for all possible human-made tools. Therefore, in this paper we investigate the developmental transition from hand to tool affordances: which sensorimotor skills a robot has acquired with its bare hands can be employed for tool use? By employing a visual and motor imagination mechanism to represent different hand postures compactly, we propose a probabilistic model to learn hand affordances, and we show how this model can generalize to estimate the affordances of previously unseen tools, ultimately supporting planning, decision-making and tool selection tasks in humanoid robots. We present experimental results with the iCub humanoid robot, and we publicly release the collected sensorimotor data in the form of a hand posture affordances dataset. Comment: dataset available at https://vislab.isr.tecnico.ulisboa.pt/; IEEE International Conference on Development and Learning and on Epigenetic Robotics (ICDL-EpiRob 2017).
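
To make the idea of a learned probabilistic affordance model more concrete, the sketch below fits a simple Gaussian Naive Bayes model that predicts a discrete effect from a compact posture/tool descriptor and an action code. It is only a hedged illustration of the general technique, with invented descriptors, labels, and data; it is not the model or the representation used in the paper.

```python
# Hedged sketch of a probabilistic affordance model: predict the effect of an
# action from a compact posture/tool descriptor. Illustration only, with
# invented data; not the iCub pipeline from the paper.
import numpy as np
from sklearn.naive_bayes import GaussianNB

rng = np.random.default_rng(1)

# Hypothetical training set: 200 trials, each with a 10-D descriptor of the hand
# posture (or tool) concatenated with a one-hot action code (push/pull/tap).
descriptors = rng.normal(size=(200, 10))
actions = rng.integers(0, 3, size=200)
X_train = np.hstack([descriptors, np.eye(3)[actions]])

# Hypothetical effect labels observed after each trial
# (e.g., 0 = no motion, 1 = object slides, 2 = object falls).
y_train = rng.integers(0, 3, size=200)

model = GaussianNB().fit(X_train, y_train)

# Generalization to an unseen tool: describe it in the same descriptor space
# and query the distribution over effects for a "push" action (code 0).
unseen_tool = rng.normal(size=(1, 10))
query = np.hstack([unseen_tool, np.eye(3)[[0]]])
effect_probabilities = model.predict_proba(query)
```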

    An Efficient Deep Convolutional Neural Network Model For Yoga Pose Recognition Using Single Images

    Pose recognition deals with designing algorithms to locate human body joints in a 2D/3D space and run inference on the estimated joint locations for predicting poses. Yoga poses include some very complex postures, which impose various challenges on computer vision algorithms, such as occlusion, inter-class similarity, intra-class variability, and viewpoint complexity. This paper presents YPose, an efficient deep convolutional neural network (CNN) model to recognize yoga asanas from RGB images. The proposed model consists of four steps: (a) first, the region of interest (ROI) is extracted from the original images using segmentation-based approaches; (b) second, these refined images are passed to a CNN architecture based on an EfficientNet backbone for feature extraction; (c) third, dense refinement blocks, adapted from the architecture of densely connected networks, are added to learn more diversified features; and (d) fourth, global average pooling and fully connected layers are applied for the classification of the multi-level hierarchy of yoga poses. The proposed model has been tested on the Yoga-82 dataset, a publicly available benchmark dataset for yoga pose recognition. Experimental results show that the proposed model achieves state-of-the-art results on this dataset, obtaining an accuracy of 93.28%, an improvement over the earlier state of the art (79.35%) by a margin of approximately 13.9 percentage points. The code will be made publicly available.
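
A minimal sketch of a classifier in the spirit of steps (b)-(d) is shown below, assuming a Keras/TensorFlow setup: an EfficientNet backbone, a small densely connected refinement block, global average pooling, and a softmax head over 82 classes (the finest level of Yoga-82). Layer widths, block counts, and training details are illustrative assumptions, not the authors' released code.

```python
# Hedged sketch of a YPose-style classifier: EfficientNet backbone, a densely
# connected refinement block, global average pooling, and a softmax head.
import tensorflow as tf
from tensorflow.keras import layers

def dense_refinement_block(x, growth=128, layers_per_block=3):
    """Densely connected 3x3 convolutions whose outputs are concatenated."""
    features = [x]
    for _ in range(layers_per_block):
        h = layers.Concatenate()(features) if len(features) > 1 else features[0]
        h = layers.Conv2D(growth, 3, padding="same", activation="relu")(h)
        features.append(h)
    return layers.Concatenate()(features)

def build_ypose_like_model(num_classes=82, input_shape=(224, 224, 3)):
    inputs = tf.keras.Input(shape=input_shape)          # ROI-cropped RGB image
    backbone = tf.keras.applications.EfficientNetB0(
        include_top=False, weights="imagenet", input_tensor=inputs)
    x = dense_refinement_block(backbone.output)
    x = layers.GlobalAveragePooling2D()(x)
    outputs = layers.Dense(num_classes, activation="softmax")(x)
    return tf.keras.Model(inputs, outputs)

model = build_ypose_like_model()
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```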

    Extraction of biomedical indicators from gait videos

    Gait has been an extensively investigated topic in recent years. Through the analysis of gait it is possible to detect pathologies, which makes this analysis very important to assess anomalies and, consequently, to help in the diagnosis and rehabilitation of patients. There are some systems for analyzing gait, but they usually either rely on subjective evaluations or are used in specialized laboratories with complex equipment, which makes them very expensive and inaccessible. However, there has been a significant effort to make simpler and more accurate systems for gait analysis and classification available. This dissertation reviews recent gait analysis and classification systems, presents a new database with videos of 21 subjects simulating 4 different pathologies as well as normal gait, and also presents a web application that allows the user to remotely access an automatic classification system and thus obtain the predicted classification and corresponding heatmaps for a given input. The classification system is based on gait representation images such as the Gait Energy Image (GEI) and the Skeleton Gait Energy Image (SEI), which are used as input to a VGG-19 Convolutional Neural Network (CNN) that performs the classification; it is therefore a vision-based system. To sum up, the developed web application aims to show the usefulness of the classification system, making it possible for anyone to access it.
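
The Gait Energy Image used as network input is essentially the pixel-wise average of aligned binary silhouettes over a gait cycle. The short sketch below illustrates that computation; silhouette extraction, alignment, and the frame size are assumed here and are not taken from the dissertation.

```python
# Hedged sketch: computing a Gait Energy Image (GEI) from aligned binary
# silhouette frames. Silhouette extraction and alignment are assumed to be done
# beforehand; shapes are illustrative.
import numpy as np

def gait_energy_image(silhouettes):
    """silhouettes: array of shape (num_frames, height, width) with values in {0, 1}.
    Returns the per-pixel average intensity over the gait cycle (the GEI)."""
    silhouettes = np.asarray(silhouettes, dtype=np.float32)
    return silhouettes.mean(axis=0)

# Example with a synthetic sequence of 30 binary frames of size 128x88
# (a size commonly used for gait silhouettes).
frames = (np.random.default_rng(2).random((30, 128, 88)) > 0.5).astype(np.float32)
gei = gait_energy_image(frames)   # 128x88 grayscale image in [0, 1]
# The GEI (and the analogous Skeleton GEI) is then resized and fed to the
# VGG-19 classifier described above.
```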

    Human shape modelling for carried object detection and segmentation

    Detecting carried objects is one of the requirements for developing systems that reason about activities involving people and objects. This thesis presents novel methods to detect and segment carried objects in surveillance videos. The contributions are divided into three main chapters. In the first, we introduce our carried object detector, which detects a generic class of objects. We formulate carried object detection as a contour classification problem: moving object contours are classified into two classes, carried object and person. A probability mask for the person's contours is generated based on an ensemble of contour exemplars (ECE) of walking/standing humans in different viewing directions. Contours that do not fall within the generated hypothesis mask are considered candidates for carried object contours. Then, a region is assigned to each carried object candidate contour using Biased Normalized Cut (BNC), with a probability obtained by a weighted function of its overlap with the person's contour hypothesis mask and the segmented foreground. Finally, carried objects are detected by applying a Non-Maximum Suppression (NMS) method that eliminates low-scoring carried object candidates. The second contribution presents an approach to detect carried objects with an innovative method for extracting features from foreground regions based on their local contours and superpixel information. Initially, a moving object in a video frame is segmented into multi-scale superpixels. Then, human-like regions in the foreground area are identified by matching a set of features extracted from superpixels against a codebook of local shapes. Here, human-like regions play the same role as the person probability mask in our first method (ECE). Our second carried object detector benefits from the novel feature descriptor to produce a more accurate probability map. The complement of the superpixels' matching probabilities to human-like regions in the foreground is taken as a carried object probability map. At the end, each group of neighboring superpixels with a high carried object probability and strong edge support is merged to form a carried object. Finally, in the third contribution we present a method to detect and segment carried objects. The proposed method adopts the new superpixel-based descriptor to identify carried-object-like candidate regions using human shape modeling. Using spatio-temporal information about the candidate regions, the consistency of recurring carried object candidates viewed over time is obtained and serves to detect carried objects. Last, the detected carried object regions are refined by integrating information about their appearance and location over time with a spatio-temporal extension of GrabCut; this final stage is used to accurately segment carried objects in frames. Our methods are fully automatic and make minimal assumptions about the person, the carried objects, and the videos. We evaluate the aforementioned methods using two available datasets, PETS 2006 and i-Lids AVSS, and compare our detector and segmentation methods against a state-of-the-art detector. Experimental evaluation on the two datasets demonstrates that both our carried object detection and segmentation methods significantly outperform competing algorithms.
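
The final Non-Maximum Suppression stage mentioned in the first contribution is a standard greedy procedure; a generic sketch is given below, assuming axis-aligned candidate boxes with confidence scores. The thresholds and box representation are assumptions, not the thesis' exact formulation.

```python
# Hedged sketch of greedy Non-Maximum Suppression over carried-object candidates.
# Boxes are (x1, y1, x2, y2); scores are candidate confidences. Thresholds are
# illustrative.
import numpy as np

def nms(boxes, scores, iou_threshold=0.5, score_threshold=0.3):
    boxes, scores = np.asarray(boxes, float), np.asarray(scores, float)
    # Discard low-score candidates, then process the rest best-first.
    keep_mask = scores >= score_threshold
    boxes, scores = boxes[keep_mask], scores[keep_mask]
    order = scores.argsort()[::-1]
    kept = []
    while order.size > 0:
        i = order[0]
        kept.append(i)
        # Intersection-over-union of the best box with the remaining ones.
        x1 = np.maximum(boxes[i, 0], boxes[order[1:], 0])
        y1 = np.maximum(boxes[i, 1], boxes[order[1:], 1])
        x2 = np.minimum(boxes[i, 2], boxes[order[1:], 2])
        y2 = np.minimum(boxes[i, 3], boxes[order[1:], 3])
        inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        area_r = (boxes[order[1:], 2] - boxes[order[1:], 0]) * \
                 (boxes[order[1:], 3] - boxes[order[1:], 1])
        iou = inter / (area_i + area_r - inter)
        order = order[1:][iou < iou_threshold]
    return boxes[kept], scores[kept]
```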

    Analysis of the hands in egocentric vision: A survey

    Egocentric vision (a.k.a. first-person vision, FPV) applications have thrived over the past few years, thanks to the availability of affordable wearable cameras and large annotated datasets. The position of the wearable camera (usually mounted on the head) allows recording exactly what the camera wearers have in front of them, in particular hands and manipulated objects. This intrinsic advantage enables the study of the hands from multiple perspectives: localizing hands and their parts within the images; understanding what actions and activities the hands are involved in; and developing human-computer interfaces that rely on hand gestures. In this survey, we review the literature that focuses on the hands in egocentric vision, categorizing the existing approaches into localization (where are the hands or parts of them?), interpretation (what are the hands doing?), and application (e.g., systems that use egocentric hand cues to solve a specific problem). Moreover, a list of the most prominent datasets with hand-based annotations is provided.

    Sensor fusion in smart camera networks for ambient intelligence

    This short report introduces the topics of PhD research that was conducted in 2008-2013 and defended in July 2013. The PhD thesis covers sensor fusion theory, gathers it into a framework with design rules for fusion-friendly design of vision networks, and elaborates on the rules through fusion experiments performed with four distinct applications of Ambient Intelligence.

    Development of a video-based work pose entry system for ergonomic posture assessment

    Thesis (Master's) -- Seoul National University Graduate School: College of Engineering, Department of Industrial Engineering, 2022. 8. 윤명환. Work-related musculoskeletal disorders are a crucial problem for worker safety and workplace productivity. The purpose of this study is to propose and develop a video-based work pose entry system for the ergonomic postural assessment methods Rapid Upper Limb Assessment (RULA) and Rapid Entire Body Assessment (REBA). This study developed a work pose entry system using the YOLOv3 algorithm for human tracking and the SPIN approach for 3D human pose estimation. The work pose entry system takes a 2D video and the scores of a few evaluation items as input, and outputs a final RULA or REBA score and the corresponding action level. A validation experiment was conducted with 20 evaluators, who were classified into two groups, experienced and novice, based on their level of knowledge of or experience with ergonomics and musculoskeletal disorders. Participants were asked to manually evaluate the working postures in 20 working videos taken at an automobile assembly plant, recording their scores on an Excel worksheet. Scores were also generated by the work pose entry system based on the individual items that need to be entered, and the results of the manual evaluation were compared with the results from the work pose entry system. Descriptive statistics and the Mann-Whitney U test showed that using the proposed work pose entry system decreased the difference and the standard deviation between the groups. The findings also showed that experienced evaluators tend to score higher than novice evaluators. Fisher's exact test was also conducted on the evaluation items that are entered into the work pose entry system, and the results showed that even items that may seem obvious can be perceived differently between the groups. The work pose entry system developed in this study can contribute to increasing the consistency of ergonomic risk assessment and reducing the time and effort of ergonomic practitioners during the process. Directions for future research on developing work pose entry systems for ergonomic posture assessment using computer vision are also suggested.
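
To illustrate the score calculation step, the sketch below converts estimated 3D joint positions into an upper-arm flexion angle and maps it onto the standard RULA upper-arm base score. The keypoint names, the vertical-trunk approximation, and the omission of adjustments (shoulder raised, abduction, arm supported) are simplifying assumptions; this is not the system's actual implementation.

```python
# Hedged sketch: deriving an upper-arm flexion angle from 3D keypoints (e.g.,
# from a SPIN-style pose estimator) and mapping it to the RULA upper-arm base
# score. Keypoint layout and the trunk approximation are illustrative.
import numpy as np

def angle_between(v1, v2):
    cosine = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2))
    return np.degrees(np.arccos(np.clip(cosine, -1.0, 1.0)))

def rula_upper_arm_score(shoulder, elbow, hip):
    """RULA upper-arm base score from 3D positions of shoulder, elbow and hip."""
    upper_arm = elbow - shoulder            # direction of the upper arm
    trunk_down = hip - shoulder             # approximate (downward) trunk axis
    flexion = angle_between(upper_arm, trunk_down)
    if flexion <= 20:
        return 1                            # within ~20 degrees of the trunk line
    elif flexion <= 45:
        return 2
    elif flexion <= 90:
        return 3
    return 4

# Example with made-up keypoints (metres, y pointing up).
shoulder = np.array([0.0, 1.4, 0.0])
elbow    = np.array([0.25, 1.15, 0.20])
hip      = np.array([0.0, 0.9, 0.0])
score = rula_upper_arm_score(shoulder, elbow, hip)
```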