63 research outputs found

    Appearance modeling for persistent object tracking in wide-area and full motion video

    Object tracking is a core element of computer vision and autonomous systems. As such, single- and multiple-object tracking has been widely investigated, especially for full motion video sequences. The acquisition of wide-area motion imagery (WAMI) from moving airborne platforms is a much more recent sensor innovation with an array of defense and civilian applications, offering a unique combination of dense spatial and temporal coverage unmatched by other sensor systems. Airborne WAMI presents a host of challenges for object tracking, including large data volume, multi-camera arrays, image stabilization, low-resolution targets, target appearance variability, and high background clutter, especially in urban environments. Time-varying, low-frame-rate, large-format imagery poses a range of difficulties for reliable long-term multi-target tracking. The focus of this thesis is the Likelihood of Features Tracking (LOFT) testbed system, an appearance-based (single instance) object tracker designed specifically for WAMI that follows the track-before-detect paradigm. The motivation for tracking using dynamics before detection is to handle large-scale data in an environment where computational cost must be kept to a bare minimum. Searching for an object everywhere in a large frame is not practical: urban scenes contain many similar objects, heavy clutter, and high-rise structures, and an exhaustive search carries a greatly increased computational cost. LOFT bypasses this difficulty by using filtering and dynamics to constrain the search to a more realistic region within the large frame and uses multiple features to discern objects of interest. The objects of interest are provided to the algorithm as input in the form of bounding boxes. The main goal of this work is to present an appearance update modeling strategy that fits LOFT's track-before-detect paradigm and to evaluate the accuracy of the overall system against other state-of-the-art tracking algorithms, both with and without this strategy. The update strategy, which uses various information cues from the Radon transform, was designed with certain performance targets in mind: a minimal increase in computational cost and a considerable increase in the precision and recall rates of the overall system. This has been demonstrated with supporting performance numbers using standard evaluation techniques from the literature. The extension of the LOFT WAMI tracker to include a more detailed appearance model with an update strategy well suited to persistent target tracking is, in the author's opinion, novel. Key engineering contributions have also been made through this work: the core LOFT system has been evaluated as part of several government research and development programs, including the Air Force Research Lab's Command, Control, Communications, Computers, Intelligence, Surveillance and Reconnaissance (C4ISR) Enterprise to the Edge (CETE), the Army Research Lab's Advanced Video Activity Analytics (AVAA), and a proposed fine-grained distributed computing architecture on the cloud for processing at the edge. A simplified version of LOFT was developed for tracking objects in standard videos and entered in the Visual Object Tracking (VOT) Challenge competition held in conjunction with leading computer vision conferences. LOFT incorporating the proposed appearance adaptation module produces significantly better tracking results in aerial WAMI of urban scenes.
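    As a rough illustration of the track-before-detect idea described above, the sketch below predicts the target's next position from simple constant-velocity dynamics, restricts the appearance search to a small window around that prediction, and scores candidate patches with a histogram feature. The window size, stride, and histogram feature are illustrative assumptions and do not reproduce LOFT's filtering or its Radon-transform-based appearance model.

```python
# Minimal sketch of track-before-detect: predict from dynamics, search a small
# window, score candidates by appearance. All parameters are illustrative.
import numpy as np

def predict_search_window(prev_center, velocity, margin=30):
    """Constant-velocity prediction of the next center and a search window around it."""
    cx, cy = prev_center[0] + velocity[0], prev_center[1] + velocity[1]
    return (cx, cy), (cx - margin, cy - margin, cx + margin, cy + margin)

def appearance_likelihood(patch, template_hist, bins=16):
    """Intensity-histogram intersection as a stand-in for a multi-feature score."""
    hist, _ = np.histogram(patch, bins=bins, range=(0, 255), density=True)
    return np.minimum(hist, template_hist).sum()

def track_step(frame, prev_center, velocity, template_hist, box_size=20, stride=4):
    """Search only inside the predicted window instead of the whole WAMI frame."""
    (px, py), (x0, y0, x1, y1) = predict_search_window(prev_center, velocity)
    h, w = frame.shape
    best_score, best_center = -np.inf, (px, py)
    for y in range(max(0, int(y0)), min(h - box_size, int(y1)), stride):
        for x in range(max(0, int(x0)), min(w - box_size, int(x1)), stride):
            patch = frame[y:y + box_size, x:x + box_size]
            score = appearance_likelihood(patch, template_hist)
            if score > best_score:
                best_score, best_center = score, (x + box_size / 2, y + box_size / 2)
    new_velocity = (best_center[0] - prev_center[0], best_center[1] - prev_center[1])
    return best_center, new_velocity, best_score
```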

    Robust real-time tracking in smart camera networks


    Bayesian Model Based Tracking with Application to Cell Segmentation and Tracking

    The goal of this research is to develop a model-based tracking framework with biomedical imaging applications. This is an interdisciplinary area of research with interests in machine vision, image processing, and biology. This thesis presents methods of image modeling, tracking, and data association applied to problems in multi-cellular image analysis, especially, at the current stage, hematopoietic stem cell (HSC) images. The focus of this research is the development of a robust image analysis interface capable of detecting, locating, and tracking individual hematopoietic stem cells (HSCs), which proliferate and differentiate into different blood cell types continuously during their lifetime and are of substantial interest in gene therapy, cancer, and stem-cell research. Such a system could potentially be employed in the future to track different groups of HSCs extracted from bone marrow and to recognize the best candidates based on biomedical and biological criteria. Selected candidates could further be used for bone marrow transplantation (BMT), a medical procedure for the treatment of various otherwise incurable diseases such as leukemia, lymphomas, aplastic anemia, immune deficiency disorders, multiple myeloma, and some solid tumors. Tracking HSCs over time is a localization-based tracking problem, one of the most challenging classes of tracking problems. The proposed cell tracking system consists of three inter-related stages: i) cell detection/localization, ii) association of detected cells, and iii) background estimation/subtraction, which are discussed in detail.
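    The sketch below outlines the three stages in their simplest form for a stack of grayscale frames: a temporal-median background model, thresholded detection of cell blobs, and greedy nearest-neighbour association across frames. The thresholds and the greedy matcher are illustrative assumptions, not the Bayesian model-based method developed in the thesis.

```python
# Simplified three-stage pipeline: background estimation, detection, association.
# Threshold values and the greedy matcher are assumptions for illustration.
import numpy as np
from scipy import ndimage

def estimate_background(frames):
    """Per-pixel temporal median as a simple background model."""
    return np.median(frames, axis=0)

def detect_cells(frame, background, thresh=25.0, min_area=9):
    """Subtract the background, threshold, and return centroids of connected blobs."""
    mask = np.abs(frame.astype(float) - background) > thresh
    labels, n = ndimage.label(mask)
    centroids = []
    for i in range(1, n + 1):
        if (labels == i).sum() >= min_area:
            centroids.append(ndimage.center_of_mass(mask, labels, i))
    return np.array(centroids)          # (row, col) per detected cell

def associate(prev_centroids, new_centroids, max_dist=15.0):
    """Greedy nearest-neighbour association of detections across two frames."""
    matches, used = [], set()
    for i, p in enumerate(prev_centroids):
        if len(new_centroids) == 0:
            break
        d = np.linalg.norm(new_centroids - p, axis=1)
        j = int(np.argmin(d))
        if d[j] < max_dist and j not in used:
            matches.append((i, j))
            used.add(j)
    return matches                      # pairs of (previous index, new index)
```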

    Gait Recognition: Databases, Representations, and Applications

    There has been considerable progress in the automatic recognition of people by the way they walk since its inception almost 20 years ago: there is now a plethora of techniques and data which continue to show that a person's walk is indeed unique. Gait recognition is a behavioural biometric which is available even at a distance from a camera, when other biometrics may be occluded, obscured, or suffering from insufficient image resolution (e.g. a blurred face image or a face occluded by a mask). Since gait recognition does not require subject cooperation, owing to its non-invasive capturing process, it is expected to be applied to criminal investigation from CCTV footage in public and private spaces. This article introduces current progress, the research background, and basic approaches to gait recognition in the first three sections; two important aspects of gait recognition, gait databases and gait feature representations, are described in the following sections. Publicly available gait databases are essential for benchmarking individual approaches, and such databases should contain a sufficient number of subjects as well as covariate factors to enable statistically reliable performance evaluation and robust gait recognition. Gait recognition researchers have therefore built useful gait databases which incorporate subject diversity and/or rich covariate factors. Gait feature representation is another important aspect of effective and efficient gait recognition. We describe the two main approaches to representation: model-free (appearance-based) approaches and model-based approaches. In particular, silhouette-based model-free approaches predominate in recent studies; many have been proposed and are described in detail. Performance evaluation results of such recent gait feature representations are reported on two publicly available gait databases: USF Human ID, with rich covariate factors such as view, surface, bag, shoes, and elapsed time; and OU-ISIR LP, with more than 4,000 subjects. Since gait recognition is suitable for criminal investigation, applications of gait recognition to forensics are addressed, with real criminal cases, in the application section. Finally, several open problems of gait recognition are discussed to indicate future research avenues.
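    As a concrete example of the silhouette-based, model-free representations discussed above, the sketch below averages size-normalised binary silhouettes over a gait cycle (the widely used gait energy image) and compares two such templates with a Euclidean distance. The normalisation and matching choices are simplified assumptions for illustration only.

```python
# Sketch of an average-silhouette gait template; sizes and matching are assumptions.
import numpy as np

def normalise_silhouette(silhouette, out_h=128, out_w=88):
    """Crop a binary silhouette to its bounding box and resample to a fixed size."""
    rows = np.any(silhouette, axis=1)
    cols = np.any(silhouette, axis=0)
    r0, r1 = np.where(rows)[0][[0, -1]]
    c0, c1 = np.where(cols)[0][[0, -1]]
    crop = silhouette[r0:r1 + 1, c0:c1 + 1].astype(float)
    # Nearest-neighbour resampling with plain indexing keeps the sketch dependency-free.
    ri = np.linspace(0, crop.shape[0] - 1, out_h).astype(int)
    ci = np.linspace(0, crop.shape[1] - 1, out_w).astype(int)
    return crop[np.ix_(ri, ci)]

def gait_energy_image(silhouettes):
    """Average the normalised silhouettes of one gait cycle into a single template."""
    return np.mean([normalise_silhouette(s) for s in silhouettes], axis=0)

def match_score(gei_a, gei_b):
    """Euclidean distance between two gait templates (lower = more similar)."""
    return np.linalg.norm(gei_a - gei_b)
```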

    Sea-Surface Object Detection Based on Electro-Optical Sensors: A Review

    Sea-surface object detection is critical for the navigation safety of autonomous ships. Electro-optical (EO) sensors, such as video cameras, complement on-board radar in detecting small sea-surface obstacles. Traditionally, researchers have used horizon detection, background subtraction, and foreground segmentation techniques to detect sea-surface objects. More recently, deep learning-based object detection technologies have gradually been applied to sea-surface object detection. This article provides a comprehensive overview of sea-surface object-detection approaches, comparing the advantages and drawbacks of each technique and covering four essential aspects: EO sensors and image types, traditional object-detection methods, deep learning methods, and maritime dataset collection. In particular, sea-surface object detection based on deep learning methods is thoroughly analyzed and compared, with highly influential public datasets introduced as benchmarks to verify the effectiveness of these approaches.
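    For illustration, the sketch below shows one of the traditional techniques surveyed in the article: locating the sea-sky horizon with edge detection and a Hough transform, then flagging outlier blobs below the horizon as candidate surface objects. The OpenCV calls, thresholds, and the crude foreground rule are assumptions made for this sketch, not a method taken from the review.

```python
# Sketch of horizon detection plus a rough below-horizon foreground mask.
# Input is an 8-bit grayscale frame; all thresholds are illustrative assumptions.
import cv2
import numpy as np

def detect_horizon(gray):
    """Return (rho, theta) of the strongest near-horizontal Hough line, or None."""
    edges = cv2.Canny(gray, 50, 150)
    lines = cv2.HoughLines(edges, 1, np.pi / 180, threshold=120)
    if lines is None:
        return None
    for rho, theta in lines[:, 0]:
        if abs(theta - np.pi / 2) < np.deg2rad(10):   # near-horizontal in image space
            return rho, theta
    return None

def candidate_objects(gray, horizon, diff_thresh=40):
    """Very rough foreground mask restricted to the region below the horizon."""
    rho, theta = horizon
    h, w = gray.shape
    ys = np.arange(h)[:, None]
    xs = np.arange(w)[None, :]
    below = (xs * np.cos(theta) + ys * np.sin(theta)) > rho   # pixels past the line
    sea = gray[below]
    mask = np.zeros_like(gray, dtype=bool)
    mask[below] = np.abs(gray[below].astype(int) - int(np.median(sea))) > diff_thresh
    return mask
```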

    Frequency Domain Decomposition of Digital Video Containing Multiple Moving Objects

    Motion estimation has been dominated by time-domain methods such as block matching and optical flow. However, these methods have problems with multiple moving objects in the video scene, moving backgrounds, noise, and fractional pixel/frame motion. This dissertation proposes a frequency domain method (FDM) that addresses these problems. The methodology introduced here handles multiple moving objects, with or without a moving background, through a 3-D frequency-domain decomposition of digital video as a sum of locally translational motions (or, in the case of the background, a globally translational motion), with high noise rejection. Additionally, via a version of the chirp-Z transform, fractional pixel/frame motion is detected and quantified. Furthermore, images of particular moving objects can be extracted and reconstructed from the frequency domain. Finally, the method can be integrated into a larger system to support motion analysis. The method presented here has been tested with synthetic data, realistic high-fidelity simulations, and actual data from established video archives to verify the claims made for it, all presented here. In addition, a convincing comparison with an up-and-coming spatial-domain method, incremental principal component pursuit (iPCP), is presented, in which the FDM performs markedly better than its competition.
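    The sketch below illustrates the underlying frequency-domain principle for the simplest case of a single global, integer-pixel translation: phase correlation recovers the shift between two frames from the phase of their cross-power spectrum. It is only a hedged illustration of the idea and does not implement the dissertation's 3-D decomposition of multiple motions or its chirp-Z-based sub-pixel refinement.

```python
# Phase correlation for a single integer-pixel translation between two frames.
import numpy as np

def phase_correlation_shift(frame_a, frame_b):
    """Estimate the (dy, dx) shift by which frame_a is displaced relative to frame_b."""
    Fa = np.fft.fft2(frame_a)
    Fb = np.fft.fft2(frame_b)
    cross_power = Fa * np.conj(Fb)
    cross_power /= np.abs(cross_power) + 1e-12        # keep only the phase
    correlation = np.fft.ifft2(cross_power).real
    dy, dx = np.unravel_index(np.argmax(correlation), correlation.shape)
    # Wrap shifts larger than half the frame size back to negative values.
    h, w = frame_a.shape
    if dy > h // 2:
        dy -= h
    if dx > w // 2:
        dx -= w
    return dy, dx

# Usage: b is a shifted copy of a by (5, -3), so the estimate should be (5, -3).
a = np.random.rand(64, 64)
b = np.roll(a, shift=(5, -3), axis=(0, 1))
print(phase_correlation_shift(b, a))
```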

    Efficient Human Activity Recognition in Large Image and Video Databases

    Vision-based human action recognition has attracted considerable interest in recent research for its applications to video surveillance, content-based search, healthcare, and interactive games. Most existing research deals with building informative feature descriptors, designing efficient and robust algorithms, proposing versatile and challenging datasets, and fusing multiple modalities. Often, these approaches build on certain conventions, such as the use of motion cues to determine video descriptors, application of off-the-shelf classifiers, and single-factor classification of videos. In this thesis, we deal with important but overlooked issues such as efficiency, simplicity, and scalability of human activity recognition in different application scenarios: controlled video environments (e.g. indoor surveillance), unconstrained videos (e.g. YouTube), depth or skeletal data (e.g. captured by Kinect), and person images (e.g. Flickr). In particular, we are interested in answering questions such as: (a) is it possible to efficiently recognize human actions in controlled videos without temporal cues? (b) given that large-scale unconstrained video data are often of a high dimension low sample size (HDLSS) nature, how can human actions be recognized efficiently in such data? (c) considering the rich 3D motion information available from depth or motion capture sensors, is it possible to recognize both the actions and the actors using only the motion dynamics of the underlying activities? and (d) can motion information from monocular videos be used to automatically determine saliency regions for recognizing actions in still images?
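    In the spirit of question (c), the sketch below describes a skeletal motion clip by simple statistics of its joint displacements and trains an off-the-shelf classifier on those descriptors. The feature design, classifier choice, and data shapes are illustrative assumptions rather than the methods developed in the thesis.

```python
# Sketch: classify actions from joint motion dynamics with an off-the-shelf SVM.
# Feature design and data shapes are assumptions for illustration only.
import numpy as np
from sklearn.svm import SVC

def motion_dynamics_features(clip):
    """clip: array of shape (frames, joints, 3) with 3-D joint positions."""
    velocity = np.diff(clip, axis=0)                  # per-frame joint displacements
    speed = np.linalg.norm(velocity, axis=2)          # (frames - 1, joints)
    # Mean and standard deviation of each joint's speed as a fixed-length descriptor.
    return np.concatenate([speed.mean(axis=0), speed.std(axis=0)])

def train_action_classifier(clips, labels):
    """Fit a standard SVM on the motion-dynamics descriptors of the training clips."""
    X = np.stack([motion_dynamics_features(c) for c in clips])
    clf = SVC(kernel="rbf", C=1.0)
    clf.fit(X, labels)
    return clf

# Usage with random stand-in data: 20 clips, 30 frames, 20 joints, 2 action classes.
rng = np.random.default_rng(0)
clips = [rng.normal(size=(30, 20, 3)) for _ in range(20)]
labels = [i % 2 for i in range(20)]
clf = train_action_classifier(clips, labels)
print(clf.predict(np.stack([motion_dynamics_features(clips[0])])))
```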