339 research outputs found

    Hand posture recognition using modified Ensemble of Shape Functions and Global Radius-based Surface Descriptor

    Get PDF
    This paper presents an approach to recognition of static hand gestures based on data acquired from 3D cameras and point cloud descriptors: Ensemble of Shape Functions and Global Radius-based Surface Descriptor. We describe the recognition algorithm consisting of: hand segmentation, noise removal and downsampling of point cloud, dividing point cloud bounding box to cells, feature extraction and normalization, gesture classification. Modifications of the descriptors are proposed in order to increase hand posture recognition rates and decrease quantity of used features as well as computational cost of the algorithm. The experiments performed on four challenging datasets using cross-validation tests prove the usefulness of our approach

    Human Action Recognition via Fused Kinematic Structure and Surface Representation

    Get PDF
    Human action recognition from visual data has remained a challenging problem in the field of computer vision and pattern recognition. This dissertation introduces a new methodology for human action recognition using motion features extracted from kinematic structure, and shape features extracted from surface representation of human body. Motion features are used to provide sufficient information about human movement, whereas shape features are used to describe the structure of silhouette. These features are fused at the kernel level using Multikernel Learning (MKL) technique to enhance the overall performance of human action recognition. In fact, there are advantages in using multiple types of features for human action recognition, especially, if the features are complementary to each other (e.g. kinematic/motion features and shape features). For instance, challenging problems such as inter-class similarity among actions and performance variation, which cannot be resolved easily by using a single type of feature, can be handled by fusing multiple types of features. This dissertation presents a new method for representing the human body surface provided by depth map (3-D) using spherical harmonics representation. The advantage of using the spherical harmonics representation is to represent the whole body surface into a nite series of spherical harmonics coefficients. Furthermore, these series can be used to describe the pose of the body using the phase information encoded inside the coefficients. Another method for detecting/tracking distal limb segments using the kinematic structure is developed. The advantage of using the distal limb segments is to extract discriminative features that can provide sufficient and compact information to recognize human actions. Our experimental results show that the aforementioned methods for human action description are complementary to each other. Hence, combining both features can enhance the robustness of action recognition. In this context, a framework to fuse multiple features using MKL technique is developed. The experimental results show that this framework is promising in incorporating multiple features in different domain for automated recognition of human action

    Human shape modelling for carried object detection and segmentation

    Get PDF
    La détection des objets transportés est un des prérequis pour développer des systèmes qui cherchent à comprendre les activités impliquant des personnes et des objets. Cette thèse présente de nouvelles méthodes pour détecter et segmenter les objets transportés dans des vidéos de surveillance. Les contributions sont divisées en trois principaux chapitres. Dans le premier chapitre, nous introduisons notre détecteur d’objets transportés, qui nous permet de détecter un type générique d’objets. Nous formulons la détection d’objets transportés comme un problème de classification de contours. Nous classifions le contour des objets mobiles en deux classes : objets transportés et personnes. Un masque de probabilités est généré pour le contour d’une personne basé sur un ensemble d’exemplaires (ECE) de personnes qui marchent ou se tiennent debout de différents points de vue. Les contours qui ne correspondent pas au masque de probabilités généré sont considérés comme des candidats pour être des objets transportés. Ensuite, une région est assignée à chaque objet transporté en utilisant la Coupe Biaisée Normalisée (BNC) avec une probabilité obtenue par une fonction pondérée de son chevauchement avec l’hypothèse du masque de contours de la personne et du premier plan segmenté. Finalement, les objets transportés sont détectés en appliquant une Suppression des Non-Maxima (NMS) qui élimine les scores trop bas pour les objets candidats. Le deuxième chapitre de contribution présente une approche pour détecter des objets transportés avec une méthode innovatrice pour extraire des caractéristiques des régions d’avant-plan basée sur leurs contours locaux et l’information des super-pixels. Initiallement, un objet bougeant dans une séquence vidéo est segmente en super-pixels sous plusieurs échelles. Ensuite, les régions ressemblant à des personnes dans l’avant-plan sont identifiées en utilisant un ensemble de caractéristiques extraites de super-pixels dans un codebook de formes locales. Ici, les régions ressemblant à des humains sont équivalentes au masque de probabilités de la première méthode (ECE). Notre deuxième détecteur d’objets transportés bénéficie du nouveau descripteur de caractéristiques pour produire une carte de probabilité plus précise. Les compléments des super-pixels correspondants aux régions ressemblant à des personnes dans l’avant-plan sont considérés comme une carte de probabilité des objets transportés. Finalement, chaque groupe de super-pixels voisins avec une haute probabilité d’objets transportés et qui ont un fort support de bordure sont fusionnés pour former un objet transporté. Finalement, dans le troisième chapitre, nous présentons une méthode pour détecter et segmenter les objets transportés. La méthode proposée adopte le nouveau descripteur basé sur les super-pixels pour iii identifier les régions ressemblant à des objets transportés en utilisant la modélisation de la forme humaine. En utilisant l’information spatio-temporelle des régions candidates, la consistance des objets transportés récurrents, vus dans le temps, est obtenue et sert à détecter les objets transportés. Enfin, les régions d’objets transportés sont raffinées en intégrant de l’information sur leur apparence et leur position à travers le temps avec une extension spatio-temporelle de GrabCut. Cette étape finale sert à segmenter avec précision les objets transportés dans les séquences vidéo. Nos méthodes sont complètement automatiques, et font des suppositions minimales sur les personnes, les objets transportés, et les les séquences vidéo. Nous évaluons les méthodes décrites en utilisant deux ensembles de données, PETS 2006 et i-Lids AVSS. Nous évaluons notre détecteur et nos méthodes de segmentation en les comparant avec l’état de l’art. L’évaluation expérimentale sur les deux ensembles de données démontre que notre détecteur d’objets transportés et nos méthodes de segmentation surpassent de façon significative les algorithmes compétiteurs.Detecting carried objects is one of the requirements for developing systems that reason about activities involving people and objects. This thesis presents novel methods to detect and segment carried objects in surveillance videos. The contributions are divided into three main chapters. In the first, we introduce our carried object detector which allows to detect a generic class of objects. We formulate carried object detection in terms of a contour classification problem. We classify moving object contours into two classes: carried object and person. A probability mask for person’s contours is generated based on an ensemble of contour exemplars (ECE) of walking/standing humans in different viewing directions. Contours that are not falling in the generated hypothesis mask are considered as candidates for carried object contours. Then, a region is assigned to each carried object candidate contour using Biased Normalized Cut (BNC) with a probability obtained by a weighted function of its overlap with the person’s contour hypothesis mask and segmented foreground. Finally, carried objects are detected by applying a Non-Maximum Suppression (NMS) method which eliminates the low score carried object candidates. The second contribution presents an approach to detect carried objects with an innovative method for extracting features from foreground regions based on their local contours and superpixel information. Initially, a moving object in a video frame is segmented into multi-scale superpixels. Then human-like regions in the foreground area are identified by matching a set of extracted features from superpixels against a codebook of local shapes. Here the definition of human like regions is equivalent to a person’s probability map in our first proposed method (ECE). Our second carried object detector benefits from the novel feature descriptor to produce a more accurate probability map. Complement of the matching probabilities of superpixels to human-like regions in the foreground are considered as a carried object probability map. At the end, each group of neighboring superpixels with a high carried object probability which has strong edge support is merged to form a carried object. Finally, in the third contribution we present a method to detect and segment carried objects. The proposed method adopts the new superpixel-based descriptor to identify carried object-like candidate regions using human shape modeling. Using spatio-temporal information of the candidate regions, consistency of recurring carried object candidates viewed over time is obtained and serves to detect carried objects. Last, the detected carried object regions are refined by integrating information of their appearances and their locations over time with a spatio-temporal extension of GrabCut. This final stage is used to accurately segment carried objects in frames. Our methods are fully automatic, and make minimal assumptions about a person, carried objects and videos. We evaluate the aforementioned methods using two available datasets PETS 2006 and i-Lids AVSS. We compare our detector and segmentation methods against a state-of-the-art detector. Experimental evaluation on the two datasets demonstrates that both our carried object detection and segmentation methods significantly outperform competing algorithms

    Biometric Systems

    Get PDF
    Because of the accelerating progress in biometrics research and the latest nation-state threats to security, this book's publication is not only timely but also much needed. This volume contains seventeen peer-reviewed chapters reporting the state of the art in biometrics research: security issues, signature verification, fingerprint identification, wrist vascular biometrics, ear detection, face detection and identification (including a new survey of face recognition), person re-identification, electrocardiogram (ECT) recognition, and several multi-modal systems. This book will be a valuable resource for graduate students, engineers, and researchers interested in understanding and investigating this important field of study

    Gesture passwords: concepts, methods and challenges

    Full text link
    Biometrics are a convenient alternative to traditional forms of access control such as passwords and pass-cards since they rely solely on user-specific traits. Unlike alphanumeric passwords, biometrics cannot be given or told to another person, and unlike pass-cards, are always “on-hand.” Perhaps the most well-known biometrics with these properties are: face, speech, iris, and gait. This dissertation proposes a new biometric modality: gestures. A gesture is a short body motion that contains static anatomical information and changing behavioral (dynamic) information. This work considers both full-body gestures such as a large wave of the arms, and hand gestures such as a subtle curl of the fingers and palm. For access control, a specific gesture can be selected as a “password” and used for identification and authentication of a user. If this particular motion were somehow compromised, a user could readily select a new motion as a “password,” effectively changing and renewing the behavioral aspect of the biometric. This thesis describes a novel framework for acquiring, representing, and evaluating gesture passwords for the purpose of general access control. The framework uses depth sensors, such as the Kinect, to record gesture information from which depth maps or pose features are estimated. First, various distance measures, such as the log-euclidean distance between feature covariance matrices and distances based on feature sequence alignment via dynamic time warping, are used to compare two gestures, and train a classifier to either authenticate or identify a user. In authentication, this framework yields an equal error rate on the order of 1-2% for body and hand gestures in non-adversarial scenarios. Next, through a novel decomposition of gestures into posture, build, and dynamic components, the relative importance of each component is studied. The dynamic portion of a gesture is shown to have the largest impact on biometric performance with its removal causing a significant increase in error. In addition, the effects of two types of threats are investigated: one due to self-induced degradations (personal effects and the passage of time) and the other due to spoof attacks. For body gestures, both spoof attacks (with only the dynamic component) and self-induced degradations increase the equal error rate as expected. Further, the benefits of adding additional sensor viewpoints to this modality are empirically evaluated. Finally, a novel framework that leverages deep convolutional neural networks for learning a user-specific “style” representation from a set of known gestures is proposed and compared to a similar representation for gesture recognition. This deep convolutional neural network yields significantly improved performance over prior methods. A byproduct of this work is the creation and release of multiple publicly available, user-centric (as opposed to gesture-centric) datasets based on both body and hand gestures

    Hand pose recognition using a consumer depth camera

    Get PDF
    [no abstract

    Advances in Monocular Exemplar-based Human Body Pose Analysis: Modeling, Detection and Tracking

    Get PDF
    Esta tesis contribuye en el análisis de la postura del cuerpo humano a partir de secuencias de imágenes adquiridas con una sola cámara. Esta temática presenta un amplio rango de potenciales aplicaciones en video-vigilancia, video-juegos o aplicaciones biomédicas. Las técnicas basadas en patrones han tenido éxito, sin embargo, su precisión depende de la similitud del punto de vista de la cámara y de las propiedades de la escena entre las imágenes de entrenamiento y las de prueba. Teniendo en cuenta un conjunto de datos de entrenamiento capturado mediante un número reducido de cámaras fijas, paralelas al suelo, se han identificado y analizado tres escenarios posibles con creciente nivel de dificultad: 1) una cámara estática paralela al suelo, 2) una cámara de vigilancia fija con un ángulo de visión considerablemente diferente, y 3) una secuencia de video capturada con una cámara en movimiento o simplemente una sola imagen estática

    On the Feasibility of Interoperable Schemes in Hand Biometrics

    Get PDF
    Personal recognition through hand-based biometrics has attracted the interest of many researchers in the last twenty years. A significant number of proposals based on different procedures and acquisition devices have been published in the literature. However, comparisons between devices and their interoperability have not been thoroughly studied. This paper tries to fill this gap by proposing procedures to improve the interoperability among different hand biometric schemes. The experiments were conducted on a database made up of 8,320 hand images acquired from six different hand biometric schemes, including a flat scanner, webcams at different wavelengths, high quality cameras, and contactless devices. Acquisitions on both sides of the hand were included. Our experiment includes four feature extraction methods which determine the best performance among the different scenarios for two of the most popular hand biometrics: hand shape and palm print. We propose smoothing techniques at the image and feature levels to reduce interdevice variability. Results suggest that comparative hand shape offers better performance in terms of interoperability than palm prints, but palm prints can be more effective when using similar sensors

    Automated Facial Anthropometry Over 3D Face Surface Textured Meshes

    Get PDF
    The automation of human face measurement means facing mayor technical and technological challenges. The use of 3D scanning technology is widely accepted in the scientific community and it offers the possibility of developing non-invasive measurement techniques. However, the selection of the points that form the basis of the measurements is a task that still requires human intervention. This work introduces digital image processing methods for automatic localization of facial features. The first goal was to examine different ways to represent 3D shapes and to evaluate whether these could be used as representative features of facial attributes, in order to locate them automatically. Based on the above, a non-rigid registration procedure was developed to estimate dense point-to-point correspondence between two surfaces. The method is able to register 3D models of faces in the presence of facial expressions. Finally, a method that uses both shape and appearance information of the surface, was designed for automatic localization of a set of facial features that are the basis for determining anthropometric ratios, which are widely used in fields such as ergonomics, forensics, surgical planning, among othersResumen : La automatización de la medición del rostro humano implica afrontar grandes desafíos técnicos y tecnológicos. Una alternativa de solución que ha encontrado gran aceptación dentro de la comunidad científica, corresponde a la utilización de tecnología de digitalización 3D con lo cual ha sido posible el desarrollo de técnicas de medición no invasivas. Sin embargo, la selección de los puntos que son la base de las mediciones es una tarea que aún requiere de la intervención humana. En este trabajo se presentan métodos de procesamiento digital de imágenes para la localización automática de características faciales. Lo primero que se hizo fue estudiar diversas formas de representar la forma en 3D y cómo estas podían contribuir como características representativas de los atributos faciales con el fin de poder ubicarlos automáticamente. Con base en lo anterior, se desarrolló un método para la estimación de correspondencia densa entre dos superficies a partir de un procedimiento de registro no rígido, el cual se enfocó a modelos de rostros 3D en presencia de expresiones faciales. Por último, se plantea un método, que utiliza tanto información de la forma como de la apariencia de las superficies, para la localización automática de un conjunto de características faciales que son la base para determinar índices antropométricos ampliamente utilizados en campos tales como la ergonomía, ciencias forenses, planeación quirúrgica, entre otrosDoctorad
    • …
    corecore