97 research outputs found

    Model-Based High-Dimensional Pose Estimation with Application to Hand Tracking

    This thesis presents novel techniques for computer-vision-based, full-DOF human hand motion estimation. Our main contributions are: a robust skin color estimation approach; a novel resolution-independent and memory-efficient representation of hand pose silhouettes, which allows us to compute area-based similarity measures in near-constant time; a set of new segmentation-based similarity measures; a new class of similarity measures that work for nearly arbitrary input modalities; a novel edge-based similarity measure that avoids any problematic thresholding or discretization and can be computed very efficiently in Fourier space; a template hierarchy that minimizes the number of similarity computations needed to find the most likely observed hand pose; and, finally, a novel image space search method, which we naturally combine with our hierarchy. Consequently, matching can be formulated efficiently as a simultaneous template tree traversal and function maximization.

    Calibration-free Pedestrian Partial Pose Estimation Using a High-mounted Kinect

    Applications of human behavior analysis have developed rapidly over the last decades, from entertainment systems to professional ones such as Human Robot Interaction (HRI), Advanced Driver Assistance Systems (ADAS), and Pedestrian Protection Systems (PPS). This thesis addresses the problem of recognizing pedestrians and estimating their body orientation in 3D, on the premise that knowing a person's orientation helps in analyzing and predicting their behavior. A new method is proposed in which a pedestrian detection module and an orientation estimation module are integrated sequentially. For pedestrian detection, a cascade classifier automatically draws a bounding box around each detected pedestrian. Regions are then extracted from a 3D point cloud and passed to a discrete orientation classifier that estimates the orientation of the pedestrian's torso. This classification is based on a coarse, rasterized depth image simulating a virtual camera placed directly above the detected pedestrian, and uses a support vector machine trained to distinguish 10 discrete orientations (30-degree increments). To validate the orientation estimation, a new benchmark database of 764 point clouds was built. For this benchmark, a Microsoft Kinect recorded the point clouds of 30 volunteers, and a marker-based motion capture system (Vicon) provided the ground truth on their orientation. Finally, we demonstrate the improvements brought by our approach: it detects pedestrians with an accuracy of 95.29% and estimates body orientation (within a 30-degree interval) with an accuracy of 88.88%. We hope these results can serve as a starting point for future research.
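
The top-view rasterization step mentioned in this abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the function name `rasterize_top_view`, the grid size, the metric extent, and the choice to keep the maximum height per cell are all assumptions made here for illustration, and heights are assumed to be non-negative.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]  # (x, y, z), with z pointing up (assumed)


def rasterize_top_view(points: Sequence[Point],
                       grid_size: int = 8,
                       extent: float = 1.0) -> List[List[float]]:
    """Project 3D points onto a coarse top-down grid.

    The grid covers [-extent, extent) along x and y. Each cell keeps the
    largest z value that falls into it; empty cells stay at 0.0.
    """
    cell = 2.0 * extent / grid_size
    image = [[0.0] * grid_size for _ in range(grid_size)]
    for x, y, z in points:
        col = math.floor((x + extent) / cell)
        row = math.floor((y + extent) / cell)
        # Points outside the grid are ignored.
        if 0 <= row < grid_size and 0 <= col < grid_size:
            image[row][col] = max(image[row][col], z)
    return image


# Hypothetical points: two near the centre, one in a corner, one out of range.
cloud = [(0.0, 0.0, 1.7), (0.1, 0.1, 1.5), (-0.9, -0.9, 0.4), (5.0, 0.0, 2.0)]
top_view = rasterize_top_view(cloud, grid_size=4, extent=1.0)
```

The flattened grid could then serve as the feature vector for a classifier such as a support vector machine; the coarse resolution keeps that vector small.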

    High-performance hardware accelerators for image processing in space applications

    Mars is a hard place to reach. While there have been many notable successes in getting probes to the Red Planet, the historical record is full of failures. The success rate for actually landing on the Martian surface is even worse, roughly 30%. This low success rate is mainly due to the characteristics of the Martian environment. Strong winds frequently blow in the Martian atmosphere, and they usually push the lander's descent trajectory away from the target one. Moreover, the Martian surface is not an ideal place for a safe landing. It is pitted with many closely spaced craters and huge stones, and is characterized by huge mountains and hills (e.g., Olympus Mons is 648 km in diameter and 27 km tall). For these reasons, a mission failure caused by landing in a huge crater, on big stones, or on a steeply sloped part of the surface is highly probable. In recent years, all space agencies have increased their research efforts to improve the success rate of Mars missions. In particular, the two hottest research topics are active debris removal and guided landing on Mars. The former aims at finding new methods to remove space debris using unmanned spacecraft. These must be able to autonomously detect a piece of debris, analyse it to extract its characteristics in terms of weight, speed and dimensions, and, eventually, rendezvous with it. To perform these tasks, the spacecraft must have strong vision capabilities. In other words, it must be able to take pictures and process them with very complex image processing algorithms in order to detect, track and analyse the debris. The latter aims at increasing the landing point precision (i.e., shrinking the landing ellipse) on Mars. Future space missions will increasingly adopt Video Based Navigation systems to assist the entry, descent and landing (EDL) phase of space modules (e.g., spacecraft), enhancing the precision of automatic EDL navigation systems. 
For instance, recent space exploration missions, e.g., Spirit, Opportunity, and Curiosity, used an EDL procedure that follows a fixed, precomputed descent trajectory to reach a precise landing point. This approach guarantees a maximum landing point precision of 20 km. Comparing this figure with the characteristics of the Martian environment shows why the probability of mission failure remains very high. A very challenging problem is to design an autonomously guided EDL system able to reduce the landing ellipse even further, guaranteeing that the lander avoids dangerous areas of the Martian surface (e.g., huge craters or big stones) that could lead to mission failure. Autonomous behaviour is mandatory, since a manually driven approach is not feasible given the distance between Earth and Mars. Since this distance varies from approximately 56 to 100 million km due to orbital eccentricity, even a signal travelling at the speed of light would, in the best case, need a transmission time of around 31 minutes, thus exceeding the overall duration of the EDL phase. In both applications, algorithms must adapt themselves to the environmental conditions. Since the harsh conditions of Mars (and of space in general) are difficult to predict at design time, these algorithms must be able to automatically tune their internal parameters depending on the current conditions. Real-time performance is another key factor. Since a software implementation of these computationally intensive tasks cannot reach the required performance, these algorithms must be accelerated in hardware. For these reasons, this thesis presents my research work on advanced image processing algorithms for space applications and the associated hardware accelerators. My research activity has focused on both the algorithms and their hardware implementations. 
Concerning the first aspect, I mainly focused my research effort on integrating self-adaptability features into existing algorithms. Concerning the second, I studied and validated a methodology to efficiently develop, verify and validate hardware components aimed at accelerating video-based applications. This approach allowed me to develop and test high-performance hardware accelerators that substantially outperform current state-of-the-art implementations. The thesis is organized in four main chapters. Chapter 2 starts with a brief introduction to the history of digital image processing. The main content of this chapter is a description of space missions in which digital image processing plays a key role. Particular attention is given to the missions on which my research activity has a substantial impact. For these missions, the chapter analyzes and evaluates in depth the state-of-the-art approaches and algorithms. Chapter 3 analyzes and compares the two technologies used to implement high-performance hardware accelerators, i.e., Application Specific Integrated Circuits (ASICs) and Field Programmable Gate Arrays (FPGAs). With this information the reader can understand the main reasons behind the decision of space agencies to use FPGAs instead of ASICs for high-performance hardware accelerators in space missions, even though FPGAs are more sensitive to Single Event Upsets (SEUs, i.e., transient errors induced in hardware components by alpha particles and solar radiation in space). Moreover, this chapter describes in detail the three available space-grade FPGA technologies (i.e., One-time Programmable, Flash-based, and SRAM-based) and the main fault-mitigation techniques against SEUs that are mandatory for employing space-grade FPGAs in actual missions. Chapter 4 describes one of the main contributions of my research work: a library of high-performance hardware accelerators for image processing in space applications. 
The basic idea behind this library is to offer designers a set of validated hardware components able to strongly speed up the basic operations commonly used in an image processing chain. In other words, these components can be used directly as elementary building blocks to create a complex image processing system easily, without spending time on debugging and validation. The library groups the proposed hardware accelerators into IP-core families. The components in a family share the same functionality and input/output interface. This harmonized I/O interface makes it possible to substitute components of the same family inside a complex image processing system without modifying the system's communication infrastructure. In addition to analyzing the internal architecture of the proposed components, this chapter covers the methodology used to develop, verify and validate them. This methodology uses different programming and hardware description languages in order to support the designer from algorithm modelling up to hardware implementation and validation. Chapter 5 presents the proposed complex image processing systems. In particular, it uses a set of actual case studies, associated with the most recent space agency needs, to show how the hardware accelerator components can be assembled to build a complex image processing system. In addition to the hardware accelerators contained in the library, the described complex systems embed innovative ad-hoc hardware components and software routines able to provide high-performance and self-adaptable image processing functionality. 
To demonstrate the benefits of the proposed methodology, each case study concludes with a comparison against current state-of-the-art implementations, highlighting the benefits in terms of performance and self-adaptability to the environmental conditions.

    Automatic Food Intake Assessment Using Camera Phones

    Obesity is becoming an epidemic in most developed countries. The fundamental cause of obesity and overweight is an energy imbalance between calories consumed and calories expended. Monitoring everyday food intake is essential for obesity prevention and management. Existing dietary assessment methods usually require manual recording and recall of food types and portions. The accuracy of the results relies largely on uncertain factors such as the user's memory, food knowledge, and portion estimates, and is therefore often compromised. Accurate and convenient dietary assessment methods are still lacking and are needed by both the general population and the research community. In this thesis, an automatic food intake assessment method using the cameras and inertial measurement units (IMUs) of smart phones was developed to help people foster a healthy lifestyle. With this method, users use their smart phones before and after a meal to capture images or videos of the meal. The smart phone recognizes the food items, calculates the volume of food consumed, and provides the results to the user. The technical objective is to explore the feasibility of image-based food recognition and image-based volume estimation. This thesis comprises five publications that address four specific goals of this work: (1) to develop a prototype system with existing methods in order to review the methods in the literature, find their drawbacks, and explore the feasibility of developing novel methods; (2) based on the prototype system, to investigate new food classification methods that improve recognition accuracy to a level suitable for field application; (3) to design indexing methods for large-scale image databases to facilitate the development of new food image recognition and retrieval algorithms; (4) to develop novel, convenient and accurate food volume estimation methods using only smart phones with cameras and IMUs. A prototype system was implemented to review existing methods. 
An image feature detector and descriptor were developed, and a nearest neighbor classifier was implemented to classify food items. A credit card marker method was introduced for metric-scale 3D reconstruction and volume calculation. To increase recognition accuracy, novel multi-view food recognition algorithms were developed to recognize regularly shaped food items. To further increase the accuracy and make the algorithm applicable to arbitrary food items, new food features and new classifiers were designed. The efficiency of the algorithm was increased by developing a novel image indexing method for large-scale image databases. Finally, the volume calculation was enhanced by reducing the marker and introducing IMUs. Sensor fusion techniques that combine measurements from cameras and IMUs were explored to infer the metric scale of the 3D model as well as to reduce noise from these sensors.
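
The idea of recovering metric scale from a marker of known physical size can be sketched in a few lines. This is only an illustration under stated assumptions, not the method of the thesis: the marker is assumed to be a standard ID-1 payment card (85.60 mm wide), and the function names `metric_scale` and `volume_to_millilitres` are hypothetical.

```python
# Width of an ID-1 card (the usual payment card format), in millimetres.
CARD_WIDTH_MM = 85.60


def metric_scale(card_width_model_units: float) -> float:
    """Millimetres per model unit, given the card's width as measured
    in the (arbitrarily scaled) 3D reconstruction."""
    if card_width_model_units <= 0:
        raise ValueError("measured card width must be positive")
    return CARD_WIDTH_MM / card_width_model_units


def volume_to_millilitres(volume_model_units: float,
                          scale_mm_per_unit: float) -> float:
    """Convert a volume in cubic model units to millilitres.

    Lengths scale linearly, so volumes scale with the cube of the factor;
    1 ml equals 1000 cubic millimetres.
    """
    return volume_model_units * scale_mm_per_unit ** 3 / 1000.0


# Hypothetical measurement: the card spans 8.56 model units, so one unit
# corresponds to roughly 10 mm.
scale = metric_scale(8.56)
food_ml = volume_to_millilitres(250.0, scale)
```

Because the conversion is cubic, a small error in the measured marker width is amplified roughly threefold in the volume estimate, which is one reason fusing additional sensors such as IMUs may help.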

    Fish4Knowledge: Collecting and Analyzing Massive Coral Reef Fish Video Data

    This book gives a start-to-finish overview of the whole Fish4Knowledge project in 18 short chapters, each describing one aspect of the project. The Fish4Knowledge project explored the possibilities of big video data, in this case from undersea video. Recording and analyzing 90 thousand hours of video from ten camera locations, the project gives a 3-year view of fish abundance in several tropical coral reefs off the coast of Taiwan. The research system built a remote recording network, over 100 Tb of storage, supercomputer processing, video target detection and

    Digital Surface Modelling in Developing Countries Using Spaceborne SAR Techniques

    Topographic databases at the national level, in the form of Digital Surface Models (DSMs), are required for a large number of applications, which have been spurred on by the increased use of Geographic Information Systems (GIS). Ground-based (surveying, GPS, etc.) and traditional airborne approaches to generating topographic information are proving to be time-consuming and costly for applications in developing countries. Where these countries are located in the tropical zone, they are affected by the additional problem of cloud cover, which could delay the acquisition of optical imagery for almost 75% of the year. The Caribbean is one such affected territory in need of national digital topographic information for its GIS database development, 3D visualization of landscapes, and the digital ortho-rectification of satellite imagery. Synthetic Aperture Radar (SAR), with its cloud-penetrating and day/night imaging capabilities, is emerging as a possible remote sensing tool for cloud-affected territories. There has been success with airborne single-pass dual-antenna systems (e.g. STAR 3i) and the Shuttle Radar Topography Mission (SRTM). However, the use of these systems in the Caribbean is restrictive and datasets will not be generally available. The launch of imaging radar satellites such as ERS-1, ERS-2, Radarsat-1 and, more recently, Envisat has provided additional opportunities to augment the technologies available for generating medium-accuracy, low-cost topographic information for developing countries using the techniques of Radargrammetry (StereoSAR) and Interferometric SAR (InSAR). The primary aim of this research was to develop, from scratch, a prototype StereoSAR system based on automatic stereo matching and space intersection algorithms to generate medium-accuracy, low-cost DSMs, using various influencing parameters without any recourse to ground control points. 
The result was to be a software package that carries out this process on a personal computer. The DSMs generated from Radarsat-1 and Envisat SAR imagery were compared with a reference surface from airborne InSAR, and conclusions about the quality of the StereoSAR DSMs are presented. Work required to further improve the StereoSAR system is also suggested.