656 research outputs found
Design and performance assessment of correlation filters for the detection of objects in high clutter thermal imagery
The research reported in this thesis has examined means of enhancing the performance of the Optimal Trade-off Maximum Average Correlation Height (OT-MACH) filter for target detection in Forward Looking Infra-Red (FLIR) imagery acquired from a helicopter and a border security FLIR camera in northern Kuwait. The data acquired with these FLIR sensors allows real-world evaluation of the comparative performance of the various filters that have been developed in the thesis. The results obtained have been quantified using well-known performance measures such as Peak to Side-lobe Ratio (PSR) and Total Detection Error (TDE). The initial focus was to study the effect of modifying the OT-MACH parameters on the correlation metrics. A new optimisation technique has been presented, which statistically computes the filter's alpha parameter, the parameter that controls the response of the filter to clutter noise. A further modification of the OT-MACH filter, which uses the Difference of Gaussian bandpass filter as a pre-processing stage (named the D-MACH filter), has been described. The D-MACH has been applied to several test images containing single and multiple targets in the scene. Enhanced performance of the modified filter is demonstrated, with improved metrics and fewer false side peaks in the correlation plane, especially when multiple targets are present in the test images.
A further pre-processing technique was investigated using the Rayleigh distribution as a pre-processing filter (named the R-MACH filter). The R-MACH filter has been applied to multiple target types with tests conducted across various image data sets. The filter demonstrated an improvement over the Difference of Gaussian filter in terms of reducing the number of parameters needing to be tuned whilst producing further enhanced correlation plane metrics.
Finally, recommendations for future work have been made to improve the use of the OT-MACH filter in target detection and identification. A novel training image representation is proposed for further investigation, which will minimise the computational intensity of using the MACH filter for unconstrained object recognition.
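The Peak to Side-lobe Ratio used as a performance measure in this abstract can be illustrated with a minimal sketch. It assumes one commonly used definition, (peak - mean of side-lobe) / standard deviation of side-lobe, with the side-lobe taken as everything outside a small square window around the peak. The function name, the `exclude` parameter and the window shape are illustrative assumptions, not details taken from the thesis.

```python
from statistics import mean, pstdev


def peak_to_sidelobe_ratio(plane, exclude=1):
    """PSR = (peak - mean(sidelobe)) / std(sidelobe).

    `plane` is a 2D list of correlation values. The sidelobe region is
    assumed to be every cell outside a (2*exclude+1)-wide square window
    centred on the peak (an illustrative choice).
    """
    rows, cols = len(plane), len(plane[0])
    # Locate the correlation peak.
    pr, pc = max(((r, c) for r in range(rows) for c in range(cols)),
                 key=lambda rc: plane[rc[0]][rc[1]])
    peak = plane[pr][pc]
    # Collect every value outside the exclusion window.
    sidelobe = [plane[r][c]
                for r in range(rows) for c in range(cols)
                if abs(r - pr) > exclude or abs(c - pc) > exclude]
    if not sidelobe:
        raise ValueError("correlation plane too small for this exclusion window")
    sigma = pstdev(sidelobe)
    if sigma == 0:
        return float("inf")  # perfectly flat sidelobe
    return (peak - mean(sidelobe)) / sigma
```

A higher value indicates a sharper, more isolated correlation peak; false side peaks raise the side-lobe mean and spread and therefore lower the ratio.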
Pattern recognition employing spatially variant unconstrained correlation filters
A spatial domain Optimal Trade-off Maximum Average Correlation Height (SPOT-MACH) filter is proposed in this thesis. The proposed technique uses a pre-defined fixed size kernel rather than using estimation techniques. The spatial domain implementation of OT-MACH offers the advantage that it does not have shift invariance imposed on it as the kernel can be modified depending upon its position within the input image. This allows normalization of the kernel and allows inclusion of a space domain non-linearity to improve performance.
The proposed SPOT-MACH filter can be used to maximize the height of the correlation peak in the presence of distortions of the training object and provide resistance to background clutter. One of the major characteristics of the SPOT-MACH filter is that it can be tuned to maximize the height and sharpness of the correlation peak by using trade-offs between distortion tolerance, peak sharpness and the ability to suppress clutter noise.
A number of non-parametric local regression techniques offer a simplified approach to pattern recognition problems which employ linear filtering using low pass filters designed using moving window local approximations. In most of these cases the algorithms search for a region of interest near the point of estimation for various prevailing conditions which fit the required criteria. These estimates are calculated for a defined window size which is determined as being the largest area within which the estimators do not widely vary from the criteria. The only drawback in this approach is that the window size is directly proportional to the required computational resources and would adversely affect the performance of the system if the moving window size is not proportionate to the resources.
The proposed filter employs an optimization technique that uses low-pass filtering to highlight the potential regions of interest in the image and then restricts the movement of the kernel to these regions, which allows target identification with fewer computational resources. A second optimization technique is also proposed, based on an entropy filter which measures the degree of randomness between two changing scenes and returns the area where change has occurred, i.e. where the target object might be present. This approach gives a more accurate region of interest than the low-pass filtering approach.
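A rough sketch of the entropy idea described above: compute the Shannon entropy of the absolute difference between two frames over small blocks, and treat the block with the highest entropy as the candidate region of interest. The block size, the use of non-overlapping blocks and the function names are assumptions made for illustration; the thesis may define its entropy filter differently.

```python
import math
from collections import Counter


def shannon_entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of `values`."""
    counts = Counter(values)
    n = len(values)
    return -sum((k / n) * math.log2(k / n) for k in counts.values())


def change_entropy_map(frame_a, frame_b, block=2):
    """Entropy of |frame_a - frame_b| for each non-overlapping block.

    Frames are 2D lists of integer grey levels with equal dimensions.
    Returns a dict mapping the (row, col) of each block's top-left
    corner to the entropy of the difference values in that block.
    """
    rows, cols = len(frame_a), len(frame_a[0])
    out = {}
    for r0 in range(0, rows - block + 1, block):
        for c0 in range(0, cols - block + 1, block):
            diffs = [abs(frame_a[r][c] - frame_b[r][c])
                     for r in range(r0, r0 + block)
                     for c in range(c0, c0 + block)]
            out[(r0, c0)] = shannon_entropy(diffs)
    return out
```

Blocks where nothing changed have a constant difference and zero entropy, so restricting the correlation kernel to high-entropy blocks skips most of a static background.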
Apart from the software-based optimization approaches, two hardware-based enhancement techniques have also been proposed in this thesis. One of the approaches employs a Field Programmable Gate Array (FPGA) to perform the correlation process using its inbuilt multipliers and look-up tables, and the other uses a Graphics Processing Unit (GPU) to process the input scene in parallel.
A detailed analysis of SPOT-MACH has also been carried out in this thesis by comparing it with popular feature-based techniques such as the Scale Invariant Feature Transform (SIFT), and a comparison matrix has been created.
The proposed filter uses a two-stage approach: speed optimizations, followed by detection of targets from input scenes. Both visible and Forward Looking Infrared (FLIR) imagery data sets have been used to test the performance of the filter.
Object Recognition
Vision-based object recognition tasks are very familiar in our everyday activities, such as keeping a car in the correct lane while driving. We do these tasks effortlessly in real time. In recent decades, with the advancement of computer technology, researchers and application developers have been trying to mimic the human capability for visual recognition. Such a capability would allow machines to free humans from boring or dangerous jobs.
Sonar image interpretation for sub-sea operations
Mine Counter-Measure (MCM) missions are conducted to neutralise underwater
explosives. Automatic Target Recognition (ATR) assists operators by
increasing the speed and accuracy of data review. ATR embedded on vehicles
enables adaptive missions which increase the speed of data acquisition. This
thesis addresses three challenges: the speed of data processing, robustness of
ATR to environmental conditions and the large quantities of data required to
train an algorithm.
The main contribution of this thesis is a novel ATR algorithm. The algorithm
uses features derived from the projection of 3D boxes to produce a set of 2D
templates. The template responses are independent of grazing angle, range
and target orientation. Integer skewed integral images are derived to accelerate
the calculation of the template responses. The algorithm is compared
to the Haar cascade algorithm. For a single model of sonar and cylindrical
targets the algorithm reduces the Probability of False Alarm (PFA) by 80%
at a Probability of Detection (PD) of 85%. The algorithm is trained on target
data from another model of sonar. The PD is only 6% lower even though no
representative target data was used for training.
The second major contribution is an adaptive ATR algorithm that uses local
sea-floor characteristics to address the problem of ATR robustness with
respect to the local environment. A dual-tree wavelet decomposition of the
sea-floor and a Markov Random Field (MRF) based graph-cut algorithm are
used to segment the terrain. A Neural Network (NN) is then trained to filter
ATR results based on the local sea-floor context. It is shown, for the Haar
Cascade algorithm, that the PFA can be reduced by 70% at a PD of 85%.
Speed of data processing is addressed using novel pre-processing techniques.
The standard three class MRF, for sonar image segmentation, is formulated
using graph-cuts. Consequently, a 1.2 million pixel image is segmented in
1.2 seconds. Additionally, local estimation of class models is introduced to
remove range dependent segmentation quality. Finally, an A* graph search
is developed to remove the surface return, a line of saturated pixels often
detected as false alarms by ATR. The A* search identifies the surface return
in 199 of 220 images tested with a runtime of 2.1 seconds. The algorithm is
robust to the presence of ripples and rocks.
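The abstract relies on integral images to speed up template responses. The sketch below shows only the standard, axis-aligned integral image (summed-area table), which lets the sum over any rectangle be read with four lookups. It is not the integer skewed variant derived in the thesis, and the function names are illustrative.

```python
def integral_image(img):
    """Build a summed-area table with one extra row and column of zeros.

    ii[r][c] holds the sum of img over rows [0, r) and columns [0, c).
    """
    rows, cols = len(img), len(img[0])
    ii = [[0] * (cols + 1) for _ in range(rows + 1)]
    for r in range(rows):
        for c in range(cols):
            ii[r + 1][c + 1] = (img[r][c] + ii[r][c + 1]
                                + ii[r + 1][c] - ii[r][c])
    return ii


def box_sum(ii, top, left, bottom, right):
    """Sum of img over rows [top, bottom) and columns [left, right) in O(1)."""
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]
```

Because each box sum costs the same regardless of box size, a template built from a handful of rectangles can be evaluated at every pixel in time that does not grow with the template area.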
Adaptive object segmentation and tracking
Efficient tracking of deformable objects moving with variable velocities is an important current research problem. In this thesis a robust tracking model is proposed for the automatic detection, recognition and tracking of target objects which are subject to variable orientations and velocities and are viewed under variable ambient lighting conditions. The tracking model can be applied to efficiently track fast moving vehicles and other objects in various complex scenarios. The tracking model is evaluated on both colour visible band and infra-red band video sequences acquired from the air by the Sussex police helicopter and other collaborators. The observations made validate the improved performance of the model over existing methods.
The thesis is divided into three major sections. The first section details the development of an enhanced active contour for object segmentation. The second section describes an implementation of a global active contour orientation model. The third section describes the tracking model and assesses its performance on the aerial video sequences.
In the first part of the thesis an enhanced active contour snake model using the difference of Gaussian (DoG) filter is reported and discussed in detail. An acquisition method based on the enhanced active contour is developed and tested as an aid to the proposed tracking system. The active contour model is further enhanced by a disambiguation framework designed to assist multiple object segmentation, which is used to demonstrate that the enhanced model is suitable for robust multiple object segmentation and tracking. The active contour model developed not only facilitates the efficient update of the tracking filter but also decreases the latency involved in tracking targets in real time. The active contour model presented reduces the computational cost by 85% compared to existing active contour models.
The second part of the thesis introduces the global active contour orientation (GACO) technique for statistical measurement of contoured object orientation. It is an overall object orientation measurement method which uses the proposed active contour model along with statistical measurement techniques. The use of the GACO technique, incorporating the active contour model, to measure object orientation angle is discussed in detail. A real-time door surveillance application based on the GACO technique is developed and evaluated on the i-LIDS door surveillance dataset provided by the UK Home Office. The results show that GACO achieves a success rate of 92% on this dataset.
Finally, a combined approach involving the proposed active contour model and an optimal trade-off maximum average correlation height (OT-MACH) filter for tracking is presented. The implementation of methods for controlling the area of support of the OT-MACH filter is discussed in detail. Using the proposed active contour method to define the area of support is shown to significantly improve the OT-MACH filter's ability to track vehicles moving within highly cluttered visible and infra-red band video sequences.
Object Tracking and Mensuration in Surveillance Videos
This thesis focuses on tracking and mensuration in surveillance videos. The
first part of the thesis discusses several object tracking approaches based on the
different properties of tracking targets. For airborne videos, where the targets are
usually small and with low resolutions, an approach of building motion models for
foreground/background is proposed in which the foreground target is simplified as a
rigid object. For relatively high resolution targets, the non-rigid models are applied.
An active contour-based algorithm has been introduced. The algorithm is based on
decomposing the tracking into three parts: estimate the affine transform parameters
between successive frames using particle filters; detect the contour deformation using
a probabilistic deformation map, and regulate the deformation by projecting the
updated model onto a trained shape subspace. The active appearance Markov chain
(AAMC) integrates a statistical model of shape, appearance and motion. In the
AAMC model, a Markov chain represents the switching of motion phases (poses),
and several pairwise active appearance model (P-AAM) components characterize the
shape, appearance and motion information for different motion phases. The second
part of the thesis covers video mensuration, in which we have proposed a height-measuring
algorithm with less human supervision, more flexibility and improved
robustness. From videos acquired by an uncalibrated stationary camera, we first
recover the vanishing line and the vertical point of the scene. We then apply a single
view mensuration algorithm to each of the frames to obtain height measurements.
Finally, we use the LMedS as the cost function and the Robbins-Monro stochastic
approximation (RMSA) technique to obtain the optimal estimate.
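The least median of squares (LMedS) cost mentioned above can be sketched for the simplest case: combining per-frame height measurements into one robust value. This sketch assumes a scalar estimate and replaces the Robbins-Monro stochastic approximation with a brute-force search over the measurements themselves, so it illustrates the cost function only and not the optimisation used in the thesis. The function name is illustrative.

```python
from statistics import median


def lmeds_estimate(measurements):
    """Return the measurement that minimises the median squared residual.

    Candidates are restricted to the measurements themselves (an
    assumption that keeps the sketch short); ties go to the first
    candidate encountered.
    """
    def cost(h):
        # LMedS cost: median of squared residuals about candidate h.
        return median((m - h) ** 2 for m in measurements)

    return min(measurements, key=cost)
```

Because the cost is a median and not a mean, up to roughly half of the frames can give grossly wrong heights (for example, from a mis-tracked head or foot point) without pulling the estimate away from the consistent majority.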
Map-Based Localization for Unmanned Aerial Vehicle Navigation
Unmanned Aerial Vehicles (UAVs) require precise pose estimation when navigating in indoor and GNSS-denied / GNSS-degraded outdoor environments. The possibility of crashing in these environments is high, as spaces are confined, with many moving obstacles. There are many solutions for localization in GNSS-denied environments, and many different technologies are used. Common solutions involve setting up or using existing infrastructure, such as beacons, Wi-Fi, or surveyed targets. These solutions were avoided because the cost should be proportional to the number of users, not the coverage area. Heavy and expensive sensors, for example a high-end IMU, were also avoided. Given these requirements, a camera-based localization solution was selected for the sensor pose estimation. Several camera-based localization approaches were investigated. Map-based localization methods were shown to be the most efficient because they close loops using a pre-existing map, thus the amount of data and the amount of time spent collecting data are reduced as there is no need to re-observe the same areas multiple times. This dissertation proposes a solution to address the task of fully localizing a monocular camera onboard a UAV with respect to a known environment (i.e., it is assumed that a 3D model of the environment is available) for the purpose of navigation for UAVs in structured environments.
Incremental map-based localization involves tracking a map through an image sequence. When the map is a 3D model, this task is referred to as model-based tracking. A by-product of the tracker is the relative 3D pose (position and orientation) between the camera and the object being tracked. State-of-the-art solutions advocate that tracking geometry is more robust than tracking image texture because edges are more invariant to changes in object appearance and lighting. However, model-based trackers have been limited to tracking small, simple objects in small environments. An assessment was performed in tracking larger, more complex building models in larger environments. A state-of-the-art model-based tracker called ViSP (Visual Servoing Platform) was applied in tracking outdoor and indoor buildings using a UAV's low-cost camera. The assessment revealed weaknesses at large scales. Specifically, ViSP failed when tracking was lost, and needed to be manually re-initialized. Failure occurred when there was a lack of model features in the camera's field of view, and because of rapid camera motion. Experiments revealed that ViSP achieved positional accuracies similar to single point positioning solutions obtained from single-frequency (L1) GPS observations, with standard deviations of around 10 metres. These errors were considered to be large, given that the geometric accuracy of the 3D model used in the experiments was 10 to 40 cm. The first contribution of this dissertation proposes to increase the performance of the localization system by combining ViSP with map-building incremental localization, also referred to as simultaneous localization and mapping (SLAM). Experimental results in both indoor and outdoor environments show that sub-metre positional accuracies were achieved, while reducing the number of tracking losses throughout the image sequence.
It is shown that by integrating model-based tracking with SLAM, not only does SLAM improve model tracking performance, but the model-based tracker alleviates the computational expense of SLAM's loop closing procedure to improve runtime performance. Experiments also revealed that ViSP was unable to handle occlusions when a complete 3D building model was used, resulting in large errors in its pose estimates. The second contribution of this dissertation is a novel map-based incremental localization algorithm that improves tracking performance and increases pose estimation accuracies relative to ViSP. The novelty of this algorithm is the implementation of an efficient matching process that identifies corresponding linear features from the UAV's RGB image data and a large, complex, and untextured 3D model. The proposed model-based tracker improved positional accuracies from 10 m (obtained with ViSP) to 46 cm in outdoor environments, and improved from an unattainable result using ViSP to 2 cm positional accuracies in large indoor environments.
The main disadvantage of any incremental algorithm is that it requires the camera pose of the first frame. Initialization is often a manual process. The third contribution of this dissertation is a map-based absolute localization algorithm that automatically estimates the camera pose when no prior pose information is available. The method benefits from vertical line matching to accomplish a registration procedure of the reference model views with a set of initial input images via geometric hashing. Results demonstrate that sub-metre positional accuracies were achieved and that a proposed enhancement of conventional geometric hashing produced more correct matches: 75% of the correct matches were identified, compared to 11%. Further, the number of incorrect matches was reduced by 80%.
UAV based distributed automatic target detection algorithm under realistic simulated environmental effects
Over the past several years, the military has grown increasingly reliant upon the use of unattended aerial vehicles (UAVs) for surveillance missions. There is an increasing trend towards fielding swarms of UAVs operating as large-scale sensor networks in the air [1]. Such systems tend to be used primarily for the purpose of acquiring sensory data with the goal of automatic detection, identification, and tracking of objects of interest. These trends have been paralleled by advances in distributed detection [2], image/signal processing and data fusion techniques [3]. Furthermore, swarmed UAV systems must operate under severe constraints on environmental conditions and sensor limitations. In this work, we investigate the effects of environmental conditions on target detection performance in a UAV network. We assume that each UAV is equipped with an optical camera, and use a realistic computer simulation to generate synthetic images. The automatic target detector is a cascade of classifiers based on Haar-like features. The detector's performance is evaluated using simulated images that closely mimic data acquired in a UAV network under realistic camera and environmental conditions. In order to improve automatic target detection (ATD) performance in a swarmed UAV system, we propose and design several fusion techniques at both the image and score level, and analyze both the case of a single observation and the case of multiple observations of the same target.
Bio-inspired log-polar based color image pattern analysis in multiple frequency channels
The main topic addressed in this thesis is the implementation of color image pattern recognition based on the lateral inhibition subtraction phenomenon combined with a complex log-polar mapping in multiple spatial frequency channels. It is shown that the individual red, green and blue channels have different recognition performances when put in the context of former work done by Dragan Vidacic. It is observed that the green channel performs better than the other two channels, with the blue channel having the poorest performance. Following the application of a contrast stretching function, the object recognition performance is improved in all channels. Multiple spatial frequency filters were designed to simulate the filtering channels that occur in the human visual system. Following these preprocessing steps, Dragan Vidacic's methodology is followed in order to determine the benefits that are obtained from the preprocessing steps being investigated. It is shown that performance gains are realized by using such preprocessing steps.
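The log-polar mapping referred to above can be illustrated with a minimal coordinate transform. The sketch maps a Cartesian point to (log radius, angle) about a chosen centre, which turns scaling about the centre into a shift along the log-radius axis and rotation into a shift along the angle axis. The function name and the choice of natural logarithm are assumptions; the thesis's complex log-polar mapping may include further details (for example, sampling and interpolation) not shown here.

```python
import math


def to_log_polar(x, y, cx=0.0, cy=0.0):
    """Map Cartesian (x, y) to (log r, theta) about the centre (cx, cy)."""
    dx, dy = x - cx, y - cy
    r = math.hypot(dx, dy)
    if r == 0:
        raise ValueError("log-polar mapping is undefined at the centre")
    return math.log(r), math.atan2(dy, dx)


# Doubling the distance from the centre shifts log r by log(2)
# and leaves the angle unchanged.
rho_a, theta_a = to_log_polar(3.0, 4.0)
rho_b, theta_b = to_log_polar(6.0, 8.0)
scale_shift = rho_b - rho_a
```

This shift property is what makes pattern matching in the log-polar domain tolerant of changes in object size and in-plane rotation.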