Pose independent target recognition system using pulsed Ladar imagery
Thesis (M. Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2004. Includes bibliographical references (p. 95-97). Although a number of object recognition techniques have been developed to process LADAR-scanned terrain scenes, these techniques have had limited success in target discrimination, in part due to low-resolution data and limits in available computation power. We present a pose-independent Automatic Target Detection and Recognition system that uses data from an airborne 3D imaging Ladar sensor. The system uses geometric shape and size signatures from target models to detect and recognize targets under heavy canopy and camouflage cover in extended terrain scenes. A data-integration method was developed to register multiple scene views to obtain a more complete 3D surface signature of a target. Automatic target detection was performed using the general approach of "3D cueing," which determines and ranks regions of interest within a large-scale scene based on the likelihood that they contain a target. Each region of interest is then passed to an ATR algorithm to accurately identify the target from among a library of target models. Automatic target recognition was performed using spin-image surface matching, a pose-independent algorithm that determines correspondences between a scene and a target of interest. Given a region of interest within a large-scale scene, the ATR algorithm either identifies the target from among a library of 10 target models or reports a "none of the above" outcome. System performance was demonstrated on five measured scenes with targets both out in the open and under heavy canopy cover, where the target occupied between 1% and 10% of the scene by volume. The ATR section of the system was successfully demonstrated on twelve measured data scenes with targets both out in the open and under heavy canopy and camouflage cover.
Correct target identification was also demonstrated for targets with multiple movable parts in arbitrary orientations. The system achieved a high recognition rate (over 99%) along with a low false alarm rate (less than 0.01%). The contributions of this thesis research are: 1) I implemented a novel technique for reconstructing multiple-view 3D Ladar scenes; 2) I demonstrated that spin-image-based detection and recognition is feasible for terrain data collected in the field with a sensor that may be used in a tactical situation; and 3) I demonstrated recognition of articulated objects with multiple movable parts. Immediate benefits of the presented work will be to the area of Automatic Target Recognition of military ground vehicles, where the vehicles of interest may include articulated components with variable position relative to the body and come in many possible configurations. Other application areas include human detection and recognition for Homeland Security, and registration of large or extended terrain scenes.

by Alexandru N. Vasile, M.Eng.
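The spin-image matching at the heart of the ATR stage can be sketched roughly as follows. This is an illustrative Python reconstruction of the classic oriented-point spin-image accumulation, not the thesis's implementation; the bin size, image width, and uniform binning are assumptions chosen for clarity:

```python
import numpy as np

def spin_image(points, p, n, bin_size=0.5, image_width=10):
    """Accumulate a spin image for the oriented basis point (p, n).

    Each surface point x maps to cylindrical coordinates (alpha, beta)
    about the axis through p along normal n, then into a 2D histogram.
    The result is invariant to rotation about n, which is what makes
    the matching pose-independent.
    """
    d = points - p                                 # vectors from the basis point
    beta = d @ n                                   # signed distance along the normal
    alpha = np.sqrt(np.maximum(np.sum(d * d, axis=1) - beta**2, 0.0))

    img = np.zeros((image_width, image_width))
    i = np.floor(image_width / 2 - beta / bin_size).astype(int)  # rows: beta
    j = np.floor(alpha / bin_size).astype(int)                   # cols: alpha
    ok = (i >= 0) & (i < image_width) & (j >= 0) & (j < image_width)
    np.add.at(img, (i[ok], j[ok]), 1)              # unbuffered accumulation
    return img
```

Matching then correlates scene spin images against model spin images to establish point correspondences, from which a rigid pose can be estimated.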
Recognizing human activity using RGBD data
Traditional computer vision algorithms try to understand the world using visible-light cameras. However, there are inherent limitations to this type of data source. First, visible-light images are sensitive to illumination changes and background clutter. Second, the 3D structural information of the scene is lost when projecting the 3D world onto 2D images, and recovering the 3D information from 2D images is a challenging problem. Range sensors, which capture the 3D characteristics of a scene, have existed for over thirty years. However, earlier range sensors were either too expensive, difficult to use in human environments, slow at acquiring data, or provided a poor estimation of distance. Recently, easy access to RGBD data at real-time frame rates has led to a revolution in perception and has inspired much new research using RGBD data. I propose algorithms to detect persons and understand their activities using RGBD data. I demonstrate that solutions to many computer vision problems may be improved with the added depth channel. The 3D structural information may give rise to algorithms with real-time and view-invariant properties in a faster and easier fashion. When both data sources are available, features extracted from the depth channel may be combined with traditional features computed from the RGB channels to generate more robust systems with enhanced recognition abilities, able to deal with more challenging scenarios. As a starting point, the first problem is to find persons of various poses in the scene, whether moving or static. Localizing humans from RGB images is limited by lighting conditions and background clutter; depth images give alternative ways to find the humans in the scene. In the past, detection of humans from range data was usually achieved by tracking, which does not work for indoor person detection.
In this thesis, I propose a model-based approach to detect persons using the structural information embedded in the depth image. I propose a 2D head contour model and a 3D head surface model to look for the head-shoulder part of the person. A segmentation scheme is then proposed to segment the full human body from the background and extract its contour, and I also give a tracking algorithm based on the detection result. I then turn to recognizing human actions and activities, for which I propose two features. The first feature is drawn from the skeletal joint locations estimated from a depth image. It is a compact representation of the human posture called histograms of 3D joint locations (HOJ3D). This representation is view-invariant, and the whole algorithm runs in real time, so it may benefit many applications that need a fast estimate of the posture and action of a human subject. The second feature is a spatio-temporal feature for depth video called the Depth Cuboid Similarity Feature (DCSF). Interest points are extracted using an algorithm that effectively suppresses noise and finds salient human motions, and a DCSF is extracted centered on each interest point to form the description of the video contents. This descriptor can be used to recognize activities with no dependence on skeleton information or pre-processing steps such as motion segmentation, tracking, or even image de-noising or hole-filling, making it flexible and widely applicable. Finally, all the features developed herein are combined to solve a novel problem: first-person human activity recognition using RGBD data. Traditional activity recognition algorithms focus on recognizing activities from a third-person perspective; I propose to recognize activities from a first-person perspective with RGBD data.
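The HOJ3D idea can be illustrated with a simplified sketch. The thesis's actual descriptor aligns a person-centric reference frame for view invariance and uses soft vote weighting; the plain hip-centered spherical binning below is an illustrative approximation with assumed bin counts:

```python
import numpy as np

def hoj3d(joints, hip, n_azimuth=12, n_inclination=6):
    """Simplified histogram of 3D joint locations.

    Each joint's direction from the hip center is converted to spherical
    coordinates and binned into (inclination, azimuth) cells, giving a
    compact, pose-summarizing descriptor of the skeleton.
    """
    v = joints - hip
    r = np.linalg.norm(v, axis=1)
    r = np.where(r == 0, 1e-9, r)                        # guard degenerate joints
    azimuth = np.arctan2(v[:, 1], v[:, 0])               # in [-pi, pi)
    inclination = np.arccos(np.clip(v[:, 2] / r, -1, 1)) # in [0, pi]

    hist = np.zeros((n_inclination, n_azimuth))
    ia = ((azimuth + np.pi) / (2 * np.pi) * n_azimuth).astype(int) % n_azimuth
    ii = np.minimum((inclination / np.pi * n_inclination).astype(int),
                    n_inclination - 1)
    np.add.at(hist, (ii, ia), 1)
    return hist.ravel() / len(joints)                    # normalized descriptor
```

A sequence of such per-frame descriptors can then be fed to a temporal classifier to label the action.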
This task is novel and extremely challenging due to the large amount of camera motion, caused either by self-exploration or by the response to an interaction. I extract 3D optical flow features as motion descriptors, 3D skeletal joint features as posture descriptors, and spatio-temporal features as local appearance descriptors to describe the first-person videos. To address the ego-motion of the camera, I propose an attention mask to guide the recognition procedure and to separate features in the ego-motion region from those in the independent-motion region. The 3D features are very useful for summarizing the discriminative information of the activities. In addition, combining the 3D features with existing 2D features yields more robust recognition results and makes the algorithm capable of dealing with more challenging cases.

Electrical and Computer Engineering
Trajectory Tracking Control of an Autonomous Ground Vehicle
This thesis proposes a solution to the problem of making an autonomous nonholonomic ground vehicle track a specified trajectory while following a reference velocity profile. The proposed strategies have been analyzed, simulated, and eventually implemented and verified on Alice, Team Caltech's entry in the 2007 DARPA Urban Challenge competition for autonomous vehicles. The system architecture of Alice is reviewed and a kinematic vehicle model is derived. Lateral and longitudinal controllers are proposed and analyzed, with emphasis on the nonlinear state-feedback lateral controller. Relevant implementation aspects and contingency management are discussed. Finally, results from simulation and field tests are presented and discussed.
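The kind of kinematic model and lateral control loop the abstract describes can be sketched with the standard kinematic bicycle model and a toy state-feedback steering law. The gains, wheelbase, and steering limit below are placeholders for illustration, not Alice's actual controller:

```python
import math

def step(state, v, delta, L=2.5, dt=0.1):
    """Advance the kinematic bicycle model one time step.

    state = (x, y, theta); v is forward speed, delta the steering angle,
    L the wheelbase. The nonholonomic constraint means the vehicle cannot
    slide sideways: heading changes at rate (v / L) * tan(delta).
    """
    x, y, theta = state
    x += v * math.cos(theta) * dt
    y += v * math.sin(theta) * dt
    theta += (v / L) * math.tan(delta) * dt
    return (x, y, theta)

def lateral_control(cross_track, heading_err, k_y=0.5, k_th=1.5):
    """Toy state-feedback lateral controller with a steering-angle limit."""
    delta = -k_y * cross_track - k_th * heading_err
    return max(-0.5, min(0.5, delta))   # saturate at +/- 0.5 rad
```

Driving along the reference line y = 0, the cross-track error is simply y and the heading error is theta, and the closed loop steers the vehicle back onto the line.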
Classification of non-heat generating outdoor objects in thermal scenes for autonomous robots
We have designed and implemented a physics-based adaptive Bayesian pattern classification model that uses a passive thermal infrared imaging system to automatically characterize non-heat generating objects in unstructured outdoor environments for mobile robots. In the context of this research, non-heat generating objects are defined as objects that are not a source of their own thermal emission, and so exclude people, animals, vehicles, etc. The resulting classification model complements an autonomous bot's situational awareness by providing the ability to classify smaller structures commonly found in the immediate operational environment. Since GPS depends on the availability of satellites, and onboard terrain maps are often unable to include enough detail for the smaller structures found in an operational environment, bots will require the ability to make decisions such as "go through the hedges" or "go around the brick wall." A thermal infrared imaging modality mounted on a small mobile bot is a favorable choice for receiving enough detailed information to automatically interpret objects at close range while unobtrusively traveling alongside pedestrians. The classification of indoor objects and heat generating objects in thermal scenes is a solved problem. A missing and essential piece in the literature has been research involving the automatic characterization of non-heat generating objects in outdoor environments using a thermal infrared imaging modality for mobile bots. Classifying non-heat generating objects in outdoor environments with a thermal infrared imaging system is a complex problem due to the variation of radiance emitted from the objects over the diurnal cycle of solar energy.
The model that we present will allow bots to see beyond vision, autonomously assessing the physical nature of surrounding structures to make decisions without the need for interpretation by humans. Our approach is an application of Bayesian statistical pattern classification where learning involves labeled classes of data (supervised classification), assumes no formal structure regarding the density of the data in the classes (nonparametric density estimation), and makes direct use of prior knowledge regarding an object class's existence in a bot's immediate area of operation when making decisions regarding class assignments for unknown objects. We used a mobile bot to systematically capture thermal infrared imagery for two categories of non-heat generating objects (extended and compact) in several different geographic locations. The extended objects consist of objects that extend beyond the thermal camera's field of view, such as brick walls, hedges, picket fences, and wood walls. The compact objects consist of objects that are within the thermal camera's field of view, such as steel poles and trees. We used these large representative data sets to explore the behavior of thermal-physical features generated from the signals emitted by the classes of objects and to design our Adaptive Bayesian Classification Model. We demonstrate that our novel classification model not only displays exceptional performance in characterizing non-heat generating outdoor objects in thermal scenes, but also outperforms the traditional KNN and Parzen classifiers.
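The ingredients named above (supervised learning, nonparametric density estimation, and priors over which classes exist in the area of operation) combine in a standard way. The sketch below is a generic Parzen-window Bayesian classifier for illustration, not the thesis's adaptive model; the feature vectors, class names, and kernel bandwidth are assumed:

```python
import numpy as np

def parzen_bayes_classify(x, train, labels, priors, h=1.0):
    """Nonparametric Bayesian classification.

    p(x | c) is estimated with a Gaussian Parzen window over the labeled
    training samples of class c, then weighted by a prior reflecting which
    object classes are expected in the bot's area of operation. The class
    with the highest posterior score wins.
    """
    dim = train.shape[1]
    best, best_score = None, -np.inf
    for c in sorted(priors):
        X = train[labels == c]
        d2 = np.sum((X - x) ** 2, axis=1)
        # Gaussian-kernel density estimate of p(x | c)
        density = np.mean(np.exp(-d2 / (2 * h * h))) \
                  / ((2 * np.pi * h * h) ** (dim / 2))
        score = priors[c] * density          # posterior score: prior x likelihood
        if score > best_score:
            best, best_score = c, score
    return best
```

Setting a class's prior to zero removes it from consideration entirely, which is one simple way prior knowledge of the operating area can sharpen classification.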