194 research outputs found

    3D city scale reconstruction using wide area motion imagery

    Get PDF
    3D reconstruction is one of the most challenging yet most essential problems in computer vision. It is applied nearly everywhere, from remote sensing to medical imaging and multimedia. Wide area motion imagery (WAMI) is a field that has gained traction in recent years. It uses an airborne, large field-of-view sensor to cover an area typically over a square kilometer in each captured image. This data is particularly valuable for analysis, but the amount of information is overwhelming for any human analyst. Algorithms that efficiently and automatically extract information are therefore needed, and 3D reconstruction plays a critical part, along with detection and tracking. This dissertation presents novel reconstruction algorithms to compute a 3D probabilistic space, a set of experiments to efficiently extract photo-realistic 3D point clouds, and a range of transformations for possible applications of the generated 3D data to filtering, data compression, and mapping. The algorithms have been successfully tested on our own datasets provided by Transparent Sky, and this thesis also proposes methods to evaluate accuracy, completeness, and photo-consistency. The generated data has been successfully used to improve detection and tracking performance, and it enables data compression, extrapolation by generating synthetic images from new points of view, and data augmentation with inferred occlusion areas. Includes bibliographical references
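    The abstract does not spell out how the 3D probabilistic space is maintained, but a common way to accumulate per-view surface evidence is a log-odds occupancy grid. The sketch below is a minimal, hypothetical illustration of that idea, not the dissertation's actual formulation; the per-voxel evidence input is assumed to come from per-view photo-consistency scores.

```python
import numpy as np

class ProbabilisticVoxelSpace:
    """Hypothetical probabilistic voxel space updated one view at a time."""

    def __init__(self, shape, prior=0.5):
        # store log-odds so per-view updates are simple additions
        self.log_odds = np.full(shape, np.log(prior / (1.0 - prior)))

    def update(self, evidence):
        # evidence: per-voxel probability (0..1) that this view saw a surface,
        # e.g. derived from photo-consistency scores (assumed input)
        eps = 1e-6
        evidence = np.clip(evidence, eps, 1.0 - eps)
        self.log_odds += np.log(evidence / (1.0 - evidence))

    def probability(self):
        # convert accumulated log-odds back to occupancy probabilities
        return 1.0 / (1.0 + np.exp(-self.log_odds))
```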

    Selected techniques for vehicle tracking and assessment in wide area motion imagery

    Get PDF
    Title from PDF of title page (University of Missouri--Columbia, viewed on March 28, 2011). The entire thesis text is included in the research.pdf file; the official abstract appears in the short.pdf file; a non-technical public abstract appears in the public.pdf file. Thesis advisors: Dr. K. Palaniappan and Dr. F. Bunyak. M.S. University of Missouri--Columbia 2010. Tracking vehicles in wide area motion imagery (WAMI) is very important for civilian and military surveillance. Tracking in a dataset characterized by very large format video with an extremely wide field of view (covering a few to tens of square miles), very low ground resolution (images taken at about 4000 ft to 5000 ft above ground), and low frame rates (1-10 frames/sec) is a very challenging task. This research describes some of the techniques and approaches taken toward developing a low frame rate automatic and assisted vehicle tracking system, and also develops a performance evaluation system for low frame rate trackers. One approach taken on this challenging dataset is extracting roads from the images using the geo-registered property of the data. This makes the Bayesian car detection algorithms run considerably faster and more efficiently, and the car tracking algorithms can use this a priori knowledge of the roads. The Camshift-based car tracking algorithm has been further modified and improved, customizing it to track cars better in this dataset. The performance evaluation system developed in this research can be used to measure the performance improvement of the tracker as it advances over the coming years; it can also be used for parameter tuning. It supports testing tracker performance using two approaches: one using gaps and one using tracklets. Both frameworks are built on information-theoretic and non-information-theoretic measures. Includes bibliographical references
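    For readers unfamiliar with Camshift, the thesis's tracking baseline, here is a minimal sketch of a Camshift loop built on OpenCV's cv2.CamShift. The clip name, initial box, and histogram settings are placeholders, and the thesis's WAMI-specific modifications are not reproduced here.

```python
import cv2

cap = cv2.VideoCapture("wami_clip.avi")          # hypothetical input clip
ok, frame = cap.read()
x, y, w, h = 300, 200, 16, 16                    # analyst-supplied initial box
roi = frame[y:y + h, x:x + w]
hsv_roi = cv2.cvtColor(roi, cv2.COLOR_BGR2HSV)
# model the target with a hue histogram inside the initial window
roi_hist = cv2.calcHist([hsv_roi], [0], None, [32], [0, 180])
cv2.normalize(roi_hist, roi_hist, 0, 255, cv2.NORM_MINMAX)

track_window = (x, y, w, h)
criteria = (cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 10, 1.0)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    # back-project the model histogram, then let CamShift re-center the window
    back_proj = cv2.calcBackProject([hsv], [0], roi_hist, [0, 180], 1)
    rot_rect, track_window = cv2.CamShift(back_proj, track_window, criteria)
```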

    ClusterNet: Detecting Small Objects in Large Scenes by Exploiting Spatio-Temporal Information

    Full text link
    Object detection in wide area motion imagery (WAMI) has drawn the attention of the computer vision research community for a number of years. WAMI poses a number of unique challenges, including extremely small object sizes, both sparse and densely packed objects, and extremely large search spaces (large video frames). Nearly all state-of-the-art methods in WAMI object detection report that appearance-based classifiers fail on this challenging data and instead rely almost entirely on motion information in the form of background subtraction or frame differencing. In this work, we experimentally verify the failure of appearance-based classifiers in WAMI, such as Faster R-CNN and a heatmap-based fully convolutional neural network (CNN), and propose a novel two-stage spatio-temporal CNN that effectively and efficiently combines appearance and motion information to significantly surpass the state of the art in WAMI object detection. To reduce the large search space, the first stage (ClusterNet) takes in a set of extremely large video frames, combines the motion and appearance information within the convolutional architecture, and proposes regions of objects of interest (ROOBI). A ROOBI can contain anywhere from a single object to clusters of several hundred objects, owing to the large video frame size and varying object density in WAMI. The second stage (FoveaNet) then estimates the centroid locations of all objects in a given ROOBI simultaneously via heatmap estimation. The proposed method exceeds state-of-the-art results on the WPAFB 2009 dataset by 5-16% for moving objects and nearly 50% for stopped objects, and it is the first proposed method in wide area motion imagery to detect completely stationary objects. Comment: Main paper is 8 pages. Supplemental section contains a walk-through of our method (using a qualitative example) and qualitative results for the WPAFB 2009 dataset
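    As a rough illustration of the second stage's output step, the sketch below converts a predicted heatmap into centroid locations via a local-maximum search. The threshold and window size are illustrative values, not the paper's settings.

```python
import numpy as np
from scipy.ndimage import maximum_filter

def heatmap_to_centroids(heatmap, threshold=0.5, window=5):
    # a pixel is a centroid candidate if it is the maximum of its
    # neighborhood and its confidence exceeds the threshold
    local_max = maximum_filter(heatmap, size=window) == heatmap
    ys, xs = np.nonzero(local_max & (heatmap > threshold))
    return list(zip(xs.tolist(), ys.tolist()))
```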

    Accurate, fast, and robust 3D city-scale reconstruction using wide area motion imagery

    Get PDF
    Multi-view stereopsis (MVS) is a core problem in computer vision: given a set of scene views together with known camera poses, it produces a geometric representation of the underlying 3D model. With 3D reconstruction one can determine any object's 3D profile, as well as the 3D coordinates of any point on that profile. 3D reconstruction of objects is a general scientific problem and a core technology in a wide variety of fields, such as computer-aided geometric design (CAGD), computer graphics, computer animation, computer vision, medical imaging, computational science, virtual reality, and digital media. However, although MVS has been studied for decades, many challenges remain in current state-of-the-art algorithms: many still lack accuracy and completeness when tested on city-scale datasets, and most available MVS algorithms require a large amount of execution time and/or specialized hardware and software, which results in high cost. This dissertation addresses these challenges and proposes multiple solutions. More specifically, it proposes multiple novel MVS algorithms to automatically and accurately reconstruct the underlying 3D scenes. With a novel volumetric voxel-based method, one of our algorithms achieves near real-time runtime speed, requires no special hardware or software, and can be deployed on power-constrained embedded systems. With a new camera clustering module and a novel weighted voting-based surface likelihood estimation module, our algorithm generalizes to different datasets and achieves the best performance in terms of accuracy and completeness when compared with existing algorithms. This dissertation also performs the very first quantitative evaluation in terms of precision, recall, and F-score using real-world LiDAR ground-truth data. Last but not least, it proposes an automatic workflow that can stitch multiple point cloud models with limited overlapping areas into one larger 3D model for better geographic coverage. All the results presented in this dissertation have been evaluated on our wide area motion imagery (WAMI) dataset and improve on state-of-the-art performance by a large margin. The generated results have been successfully used in many applications, including city digitization, improved detection and tracking, real-time dynamic shadow detection, 3D change detection, visibility map generation, VR environments, and visualization combined with other information such as building footprints and roads. Includes bibliographical references
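    The precision/recall/F-score evaluation against LiDAR ground truth mentioned above is commonly computed with a distance threshold: a point counts as correct if it lies within some tolerance of the other cloud. A minimal sketch, assuming a tolerance tau in meters (illustrative, not the dissertation's value):

```python
import numpy as np
from scipy.spatial import cKDTree

def point_cloud_fscore(reconstructed, groundtruth, tau=0.5):
    # precision: fraction of reconstructed points within tau of ground truth
    d_rec, _ = cKDTree(groundtruth).query(reconstructed)
    precision = float(np.mean(d_rec < tau))
    # recall: fraction of ground-truth points within tau of the reconstruction
    d_gt, _ = cKDTree(reconstructed).query(groundtruth)
    recall = float(np.mean(d_gt < tau))
    f = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f
```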

    Nonlinear Image Enhancement and Super Resolution for Enhanced Object Tracking

    Get PDF
    Tracking objects such as vehicles and humans in wide area motion imagery (WAMI) is a challenging problem because of the limited pixel area and the low contrast/visibility of the target objects. We propose an approach that makes automatic tracking algorithms more effective by incorporating image enhancement and super resolution as preprocessing steps. The enhancement process includes dynamic range compression and contrast enhancement stages. Dynamic range compression is performed by a neighborhood-based nonlinear intensity transformation, which uses a locally tuned inverse sine nonlinear function to generate various nonlinear curves based on each pixel's neighborhood information. These nonlinear curves are used to select the new intensity value for each pixel. A contrast enhancement technique is then used to maintain or improve the contrast of the original image. Local contrast enhancement using surrounding pixel information increases the number of features a detector can find in the image and therefore improves automatic object detection. Secondly, super resolution is performed on an area surrounding the object of interest to increase the size of the object in pixels. The single-image super resolution process operates in the Fourier phase space, which preserves the local structure of each pixel in order to estimate the interpolated pixels in the high-resolution image. As a result, super resolution increases the sharpness of edges and allows additional tracking features to be extracted. The combination of these two techniques provides the preprocessing needed to increase the effectiveness of tracking algorithms. A quantitative evaluation compares tracking results with and without the proposed techniques. The analysis is based on results of an automatic detection and tracking technique, Gaussian Ringlet Intensity Distribution (GRID), evaluated on wide area motion imagery data.
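    The abstract names a locally tuned inverse sine curve but not its exact parameterization, so the sketch below is only one plausible reading: the arcsine curve's exponent is tuned by a Gaussian-blurred neighborhood mean, brightening dark regions more than bright ones.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def inverse_sine_enhance(image, sigma=15):
    # illustrative parameterization, not the authors' exact curve family
    img = image.astype(np.float64) / 255.0          # normalize to [0, 1]
    local_mean = gaussian_filter(img, sigma)        # neighborhood statistic
    # darker neighborhoods get a smaller exponent, which raises img**q and
    # brightens shadows; bright regions stay close to the identity curve
    q = 0.5 + local_mean
    enhanced = (2.0 / np.pi) * np.arcsin(img ** q)  # maps [0,1] -> [0,1]
    return (255.0 * enhanced).astype(np.uint8)
```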

    Directional Ringlet Intensity Feature Transform for Tracking

    Get PDF
    The challenges facing current intensity-based histogram feature tracking methods in wide area motion imagery include distortions of object structural information and background variations, such as different pavement or ground types. All of these challenges must be met to obtain a robust object tracker that still runs at a speed appropriate for real-time processing. To achieve this, we propose a novel method, the Directional Ringlet Intensity Feature Transform (DRIFT), which employs Kirsch kernel filtering and Gaussian ringlet feature mapping. We evaluated DRIFT on two challenging datasets, Columbus Large Image Format (CLIF) and Large Area Image Recorder (LAIR), to assess its robustness and efficiency. Experimental results show that the proposed approach yields the highest accuracy compared to state-of-the-art object tracking methods.
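    The abstract does not detail how the Gaussian ringlet feature mapping is built, but a common construction uses concentric, Gaussian-weighted ring masks whose weighted intensity histograms are robust to in-plane rotation. A hypothetical sketch of such masks:

```python
import numpy as np

def gaussian_ringlet_masks(size, n_rings=4, sigma=2.0):
    # ring count and sigma are illustrative parameters
    ys, xs = np.mgrid[0:size, 0:size]
    c = (size - 1) / 2.0
    r = np.hypot(xs - c, ys - c)                   # radial distance map
    radii = np.linspace(0, size / 2.0, n_rings)
    # one mask per ring; weights fall off smoothly away from each radius
    return [np.exp(-((r - rad) ** 2) / (2 * sigma ** 2)) for rad in radii]
```

    A per-ring weighted histogram can then be formed with, e.g., np.histogram(patch.ravel(), bins=16, weights=mask.ravel()).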

    Aerial Vehicle Tracking by Adaptive Fusion of Hyperspectral Likelihood Maps

    Full text link
    Hyperspectral cameras can provide unique spectral signatures for consistently distinguishing materials, which can be used to solve surveillance tasks. In this paper, we propose a novel real-time hyperspectral likelihood maps-aided tracking method (HLT) inspired by an adaptive hyperspectral sensor. A moving object tracking system generally consists of registration, object detection, and tracking modules. We focus on the target detection part and remove the need to build any offline classifiers and tune a large number of hyperparameters; instead, we learn a generative target model online for hyperspectral channels ranging from visible to infrared wavelengths. The key idea is that our adaptive fusion method can combine likelihood maps from multiple bands of hyperspectral imagery into a single, more distinctive representation, increasing the margin between the mean values of foreground and background pixels in the fused map. Experimental results show that the HLT not only outperforms all established fusion methods but is on par with current state-of-the-art hyperspectral target tracking frameworks. Comment: Accepted at the International Conference on Computer Vision and Pattern Recognition Workshops, 201
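    The paper's exact fusion rule is not given in the abstract; the sketch below implements the stated goal under assumed inputs, weighting each band's likelihood map by how far it separates the foreground mean from the background mean. It is one plausible reading, not the authors' method.

```python
import numpy as np

def fuse_likelihood_maps(maps, fg_mask):
    # maps: list of 2-D likelihood arrays, one per hyperspectral band
    # fg_mask: boolean mask of current target pixels (assumed available)
    margins = np.array([m[fg_mask].mean() - m[~fg_mask].mean() for m in maps])
    weights = np.clip(margins, 0, None)            # ignore uninformative bands
    total = weights.sum()
    # fall back to uniform weights if no band separates the target
    weights = weights / total if total > 0 else np.full(len(maps), 1.0 / len(maps))
    return sum(w * m for w, m in zip(weights, maps))
```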