9 research outputs found

    People tracking and re-identification by face recognition for RGB-D camera networks

    This paper describes a face recognition-based people tracking and re-identification system for RGB-D camera networks. The system tracks people and learns their faces online, so that it can keep track of their identities even after they leave a camera's field of view. For robust re-identification, the system combines a deep neural network-based face representation with a Bayesian inference-based face classification method. The system also provides a predefined people identification capability: it associates the online-learned faces with predefined face images and names, so that people's whereabouts are known, allowing rich human-system interaction. Through experiments, we validate the re-identification and predefined people identification capabilities of the system and show an example of its integration with a mobile robot. The overall system is built as a Robot Operating System (ROS) module, which simplifies integration with the many existing robotic systems and algorithms that use this middleware. The code of this work has been released as open source in order to provide a baseline for future publications in this field.
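The combination the abstract describes, a CNN face descriptor scored by Bayesian inference, can be sketched as a posterior update over known identities. This is an illustrative toy model, not the paper's actual formulation: the Gaussian likelihood on cosine distance, the `sigma` value, and all variable names are assumptions.

```python
import numpy as np

def update_identity_posterior(prior, embedding, gallery, sigma=0.4):
    """Update P(identity | observations) after seeing one face embedding.

    prior    : (K,) current posterior over K known identities
    embedding: (D,) unit-norm face descriptor from a CNN
    gallery  : (K, D) one unit-norm reference descriptor per identity
    """
    # Likelihood of the observation under each identity, modelled here
    # (an assumption) as a Gaussian on the cosine distance.
    cos_dist = 1.0 - gallery @ embedding
    likelihood = np.exp(-0.5 * (cos_dist / sigma) ** 2)
    posterior = prior * likelihood
    return posterior / posterior.sum()

# Toy example: 3 identities in a 4-D embedding space.
rng = np.random.default_rng(0)
gallery = rng.normal(size=(3, 4))
gallery /= np.linalg.norm(gallery, axis=1, keepdims=True)

prior = np.full(3, 1.0 / 3.0)              # uniform before any observation
obs = gallery[1] + 0.05 * rng.normal(size=4)  # noisy view of identity 1
obs /= np.linalg.norm(obs)

posterior = update_identity_posterior(prior, obs, gallery)
assert posterior.argmax() == 1             # mass shifts toward identity 1
```

Repeating the update over the frames of a track is what lets the posterior stay confident even through occasional bad detections, which is the advantage of Bayesian accumulation over per-frame nearest-neighbour matching.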

    Real-time RGB-Depth perception of humans for robots and camera networks

    This thesis deals with robot and camera network perception using RGB-Depth data. The goal is to provide efficient and robust algorithms for interacting with humans. For this reason, special care has been devoted to designing algorithms that can run in real time on consumer computers and embedded boards. The main contribution of this thesis is 3D pose estimation of the human body. We propose two novel algorithms which take advantage of the data stream of an RGB-D camera network, outperforming the state of the art in both single-view and multi-view tests. While the first algorithm works on point cloud data, which remains feasible even without external light, the second performs better: it handles multiple persons with negligible overhead and does not rely on synchronization between the different cameras in the network. The second contribution regards long-term people re-identification in camera networks. This is particularly challenging because, in order to re-identify people across different days, we cannot rely on appearance cues. We address this problem by proposing a face-recognition framework based on a Convolutional Neural Network and a Bayesian inference system to re-assign the correct ID and person name to each new track. The third contribution concerns Ambient Assisted Living. We propose a prototype of an assistive robot which periodically patrols a known environment, reporting unusual events such as people fallen on the ground. To this end, we developed a fast and robust approach which also works in dim scenes and is validated on a new publicly available RGB-D dataset recorded on board our open-source robot prototype. As a further contribution, in order to boost research on these topics and provide the best benefit to the robotics and computer vision community, we have released most of the software implementations of the novel algorithms described in this work under open-source licenses.
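A "person fallen on the ground" check of the kind the patrolling robot above performs can be built from simple point-cloud geometry. The heuristic below, its threshold, and the z-up metric axis convention are all assumptions for illustration, not the thesis's actual method:

```python
import numpy as np

def is_fallen(points, height_thresh=0.5):
    """Heuristic: a person cluster whose vertical extent is small compared
    to its horizontal extent is likely lying on the ground.

    points: (N, 3) point cloud of one tracked person (x, y, z), z up, metres.
    """
    extent = points.max(axis=0) - points.min(axis=0)  # bounding-box size
    vertical = extent[2]
    horizontal = max(extent[0], extent[1])
    return vertical < height_thresh and vertical < horizontal

# A standing person: roughly 1.7 m tall, 0.5 m wide.
standing = np.random.rand(100, 3) * np.array([0.5, 0.3, 1.7])
# The same person lying down: long along x, roughly 0.3 m tall.
lying = np.random.rand(100, 3) * np.array([1.7, 0.5, 0.3])

assert not is_fallen(standing)
assert is_fallen(lying)
```

A geometric test like this is cheap enough to run per tracked cluster on an embedded board, and it does not depend on scene brightness, consistent with the abstract's emphasis on real-time operation and dim scenes.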

    Robust and Efficient Semantic Sensor Registration for Mobile Robotics in Unorganized, Natural Scenes

    Advances in sensing and computing hardware have led to renewed interest in registration algorithms. In particular, the proliferation of 3D LIDAR sensors and RGB-D cameras and their use in robotic systems require efficient, robust, and accurate estimation algorithms for mapping, localization, and tracking tasks. Most modern approaches to autonomous driving require localizing and calibrating multiple LIDAR sensors, both of which are registration tasks. Meanwhile, tasks in the domain of indoor robotics require both localizing the robot and localizing objects of interest in the environment. The registration problem is that of finding the rigid-body transformation between two measurements. These can be consecutive measurements (producing an odometry estimate), measurements from disparate points in time (as in localization and mapping), or measurements from different sensors (as in calibrating multiple sensors on a platform). Semantic detection and segmentation have likewise progressed significantly. Semantic inference on images and point clouds has shown increasing value in vision-based applications. The application of Convolutional Neural Networks (CNNs) has improved the computational efficiency of semantic segmentation techniques, with superior performance on both indoor and outdoor benchmarks. Together with pose estimation techniques, multiple scenes can be segmented and combined to perform semantic mapping or object tracking; nevertheless, most semantic mapping and object tracking research has performed pose estimation first and semantic inference second, rather than joint semantic and metric estimation. This thesis focuses on leveraging semantic inference to enable efficient and robust sensor registration. In robotics, semantic inference is increasingly used for downstream reasoning tasks.
This thesis explores how that inference can be used in upstream tasks such as egomotion estimation, object pose estimation, and multisensor calibration. This work builds on the Iterative Closest Point (ICP) algorithm. Our first contribution explores how probabilistic semantic labels can be used in sensor registration. We present an approach that uses the Expectation Maximization (EM) technique to improve associations in the ICP framework. We also use an M-estimator and optimize directly on the SE(3) manifold to improve robustness. Our results on publicly available indoor and outdoor data sets show that semantics can help improve registration accuracy. For the second contribution, we add informative channels to the semantic ICP framework to aid object-level registration. This includes using sparse kernels to represent intensity and color channels for regularizing the registration problem, and curvature-based alignment to improve object pose estimation. Both of these techniques extend registration algorithms beyond their purely geometric base. The third part presents our contribution on reformulating the registration problem as a mixed integer program (MIP). Most previous approaches to sensor registration use gradient-based optimization techniques; if the cost function is nonconvex, they are prone to getting caught in local minima. The problem is reformulated as a MIP by linearizing the cost function and representing the data association as an integer-valued variable. This thesis focuses on developing robust, accurate registration techniques for mobile robotics applications. It presents results and evaluations in the areas of indoor home robotics and autonomous driving, many on publicly available benchmark data sets.
Sensor registration is a fundamental component of many robotic systems, and the advances proposed in this thesis have the potential to benefit many more aspects of perceptual systems.
PhD, Electrical Engineering: Systems, University of Michigan, Horace H. Rackham School of Graduate Studies
https://deepblue.lib.umich.edu/bitstream/2027.42/155198/1/sparki_1.pd
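The core idea of semantic registration described in this abstract, restricting ICP's data association to points that agree on a semantic label, can be sketched in 2-D. This is a minimal illustration (hard label constraints plus a closed-form Kabsch/SVD solve), not the thesis's probabilistic EM formulation; the toy check gives every point a distinct label so the association is unambiguous and one step recovers the transform exactly.

```python
import numpy as np

def semantic_icp_step(src, src_lbl, dst, dst_lbl):
    """One label-constrained ICP iteration: return (R, t) aligning src to dst."""
    matches = []
    for i, (p, l) in enumerate(zip(src, src_lbl)):
        same = np.where(dst_lbl == l)[0]   # only candidates with the same label
        j = same[np.argmin(np.linalg.norm(dst[same] - p, axis=1))]
        matches.append((i, j))
    a = np.array([src[i] for i, _ in matches])
    b = np.array([dst[j] for _, j in matches])
    ca, cb = a.mean(axis=0), b.mean(axis=0)      # centroids
    H = (a - ca).T @ (b - cb)                    # cross-covariance
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T                               # Kabsch solution
    if np.linalg.det(R) < 0:                     # guard against reflections
        Vt[-1] *= -1
        R = Vt.T @ U.T
    t = cb - R @ ca
    return R, t

# Toy clouds: dst is src rotated 30 degrees and shifted; labels preserved.
theta = np.deg2rad(30)
R_true = np.array([[np.cos(theta), -np.sin(theta)],
                   [np.sin(theta),  np.cos(theta)]])
rng = np.random.default_rng(1)
src = rng.random((50, 2))
lbl = np.arange(50)                              # unique label per point
dst = src @ R_true.T + np.array([1.0, 2.0])

R, t = semantic_icp_step(src, lbl, src if False else dst, lbl)
assert np.allclose(R, R_true) and np.allclose(t, [1.0, 2.0])
```

With coarse labels (a handful of classes) the constraint only prunes implausible matches and the step must still be iterated to convergence, which is where the abstract's EM weighting of probabilistic labels improves on the hard constraint shown here.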

    Face recognition robust to occlusions (Reconnaissance de visage robuste aux occultations)

    Face recognition is an important technology in computer vision, which often acts as an essential component in biometric systems, HCI systems, access control systems, multimedia indexing applications, etc. Partial occlusion, which significantly changes the appearance of part of a face, can not only cause a large deterioration in face recognition performance, but can also raise serious security issues. In this thesis, we focus on the occlusion problem in automatic face recognition in non-controlled environments. Toward this goal, we propose a framework that applies explicit occlusion analysis and processing to improve face recognition under different occlusion conditions. We demonstrate in this thesis that the proposed framework is more efficient than the methods from the literature based on non-explicit occlusion treatment. We identify two new types of facial occlusion, namely sparse occlusion and dynamic occlusion, and present solutions to handle these newly identified occlusion problems in a more advanced surveillance context. Recently, the emerging Kinect sensor has been successfully applied in many computer vision fields. We introduce this new sensor in the context of face recognition, particularly in the presence of occlusions, and demonstrate its efficiency compared with traditional 2D cameras. Finally, we propose two approaches, based on 2D and 3D data, to improve the baseline face recognition techniques. Improving the baseline methods can also have a positive impact on the recognition results when partial occlusion occurs.

    Probabilistic Feature-Based Registration for Interventional Medicine

    The need to compute accurate spatial alignment between multiple representations of patient anatomy is a problem that is fundamental to many applications in computer-integrated interventional medicine. One class of methods for computing such alignments is feature-based registration, which aligns geometric information of the shapes being registered, such as salient landmarks or models of shape surfaces. A popular algorithm for surface-based registration is the Iterative Closest Point (ICP) algorithm, which treats one shape as a cloud of points that is registered to a second shape by iterating between point-correspondence and point-registration phases until convergence. In this dissertation, a class of "most likely point" variants on the ICP algorithm is developed that offers several advantages over ICP, such as high registration accuracy and the ability to confidently assess the quality of a registration outcome. The proposed algorithms are based on a probabilistic interpretation of the registration problem, wherein the point-correspondence and point-registration phases optimize the probability of shape alignment based on feature uncertainty models rather than minimizing the Euclidean distance between the shapes as in ICP. This probabilistic framework is used to model anisotropic errors in the shape measurements and to provide a natural context for incorporating oriented-point data into the registration problem, such as shape surface normals. The proposed algorithms are evaluated through a range of simulation-, phantom-, and clinical-based studies, which demonstrate significant improvement in registration outcomes relative to ICP and state-of-the-art methods.
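The difference between ICP's closest-point match and a "most likely point" match can be shown in a few lines: under an anisotropic Gaussian noise model, the maximum-likelihood correspondent is the point at minimum Mahalanobis distance, which need not be the Euclidean nearest neighbour. The covariance and points below are an invented example, not data from the dissertation.

```python
import numpy as np

def most_likely_point(measurement, model_points, cov):
    """Index of the model point with highest Gaussian likelihood."""
    inv = np.linalg.inv(cov)
    d = model_points - measurement
    mahal = np.einsum('ij,jk,ik->i', d, inv, d)  # squared Mahalanobis distance
    return int(np.argmin(mahal))                 # max likelihood = min distance

def closest_point(measurement, model_points):
    """Classic ICP correspondence: Euclidean nearest neighbour."""
    return int(np.argmin(np.linalg.norm(model_points - measurement, axis=1)))

# Measurement noise assumed 100x larger in variance along x than along y.
cov = np.diag([1.0, 0.01])
model = np.array([[2.0, 0.0],    # far, but only along the noisy x direction
                  [0.0, 0.5]])   # near, but offset along the precise y axis
z = np.array([0.0, 0.0])

assert closest_point(z, model) == 1        # Euclidean: 0.5 beats 2.0
assert most_likely_point(z, model, cov) == 0  # Mahalanobis: 4.0 beats 25.0
```

Swapping this correspondence rule into the ICP loop is the essence of the probabilistic variants the abstract describes; the same likelihood machinery also gives a principled score for judging whether a converged registration is trustworthy.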