Development and evaluation of a multiscale keypoint detector based on complex wavelets
This thesis develops a multiscale keypoint detector and descriptor based on the Dual-Tree Complex Wavelet Transform (DTCWT). First, we develop a scale-space framework called the 4S-DTCWT that uses the dyadic decomposition of the DTCWT but achieves denser sampling in scale by interleaving several DTCWT trees, leading to reduced scale-related aliasing. This forms the foundation for the rest of our work. Then, we present a new DTCWT based keypoint detector (BTK), which exhibits improved spatial localisation owing to the use of a more selective cornerness measure and keypoint localisation in individual levels in the 4S-DTCWT. A number of scale refinement approaches are investigated.
The improved keypoint position and scale localisation directly leads to more robust image characterisation using DTCWT based visual descriptors. We also present some ways of speeding up both the descriptor and the matching computations. These changes make it possible to use the system in practical scenarios.
We develop a novel, fully automated framework for the evaluation of keypoint detectors and descriptors. This includes a new dataset containing 3978 calibrated images from 2 cameras of 39 different toy cars on a turntable. The dataset, calibration images, inter-camera calibration, rotational calibration and test scripts are publicly available. We establish ground truth correspondences using a three-image setup, with fixed angular separation between two of the three views, thus reducing the dependency on angular separation when compared to conventional epipolar line search.
Various keypoint detectors and descriptors were compared with DTCWT based methods using this framework. To the extent possible, we separated the evaluation of the keypoint detectors from that of the descriptors. The main conclusions were that DTCWT based methods can achieve a performance comparable, if not superior, to that of established methods. We also showed that, although repeatability of keypoint detections falls off reasonably steeply with change in viewing angle, conditioned on an associated keypoint being detected at a reasonably correct corresponding location, descriptor similarity is hardly affected by viewpoint variation.
Finally, we show how an evaluation that is based purely on prior knowledge of the geometry of the scene can be useful in eliminating the inaccuracies involved in appearance-based evaluations. This uses an enhanced epipolar constraint that exploits both positions and scales of keypoints to constrain the range of possible matches.
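As a rough illustration of the geometric idea in the abstract above, the following sketch checks whether a candidate keypoint match lies close to the epipolar line predicted by a fundamental matrix. It covers only the standard position-based epipolar constraint, not the thesis's enhanced version that also uses keypoint scale. The function names, the tolerance value and the example matrix (an idealised rectified stereo pair) are assumptions made for illustration.

```python
import math

def epipolar_line(F, point):
    """Return (a, b, c) of the line F @ [u, v, 1] in the second image."""
    u, v = point
    x = (u, v, 1.0)
    return tuple(sum(F[i][j] * x[j] for j in range(3)) for i in range(3))

def epipolar_distance(F, point1, point2):
    """Pixel distance from point2 to the epipolar line of point1."""
    a, b, c = epipolar_line(F, point1)
    u2, v2 = point2
    return abs(a * u2 + b * v2 + c) / math.hypot(a, b)

def is_plausible_match(F, point1, point2, tolerance=1.5):
    """Accept a match only if it lies within `tolerance` pixels of the line.

    The tolerance is a hypothetical value, not one taken from the thesis.
    """
    return epipolar_distance(F, point1, point2) <= tolerance

# Assumed example: fundamental matrix of an ideal rectified stereo pair,
# for which corresponding points share the same image row.
F_RECTIFIED = [
    [0.0, 0.0, 0.0],
    [0.0, 0.0, -1.0],
    [0.0, 1.0, 0.0],
]

same_row = epipolar_distance(F_RECTIFIED, (10.0, 20.0), (35.0, 20.0))
off_row = epipolar_distance(F_RECTIFIED, (10.0, 20.0), (35.0, 23.0))
```

In this idealised setup the distance reduces to the row difference between the two points, so a candidate three rows away is rejected at the assumed tolerance.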
An Efficient Reconfigurable Architecture for Fingerprint Recognition
Fingerprint identification is an efficient biometric technique for authenticating human beings in real-time Big Data Analytics. In this paper, we propose an efficient Finite State Machine (FSM) based reconfigurable architecture for fingerprint recognition. The fingerprint image is resized, and the Compound Linear Binary Pattern (CLBP) is applied to the fingerprint, followed by a histogram, to obtain histogram CLBP features. Discrete Wavelet Transform (DWT) Level 2 features are obtained by the same methodology. The novel CLBP matching score is computed using the histogram CLBP features of the test image and of the fingerprint images in the database. Similarly, the DWT matching score is computed using the DWT features of the test image and of the fingerprint images in the database. The CLBP and DWT matching scores are then fused with an arithmetic equation using an improvement factor. Performance parameters such as TSR (Total Success Rate), FAR (False Acceptance Rate), and FRR (False Rejection Rate) are computed using the fusion scores with a correlation matching technique for the FVC2004 DB3 database. The proposed fusion-based VLSI architecture is synthesized on a Virtex xc5vlx30T-3 FPGA board using a Finite State Machine, resulting in optimized parameters.
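The score fusion and error rates mentioned above can be sketched as follows. The abstract does not give the fusion equation, so the weighted sum and the `weight` parameter used here are assumptions standing in for the paper's "improvement factor". The threshold and the sample scores are likewise hypothetical. FAR and FRR follow their usual definitions; TSR is left out because the abstract does not define it precisely.

```python
def fuse_scores(clbp_score, dwt_score, weight=0.5):
    """Hypothetical fusion rule: a weighted sum of the two matching scores.

    `weight` is an assumed stand-in for the paper's improvement factor.
    """
    return weight * clbp_score + (1.0 - weight) * dwt_score

def far_frr(genuine_scores, impostor_scores, threshold):
    """Return (FAR, FRR) for similarity scores at a given threshold.

    FAR: fraction of impostor attempts whose score reaches the threshold.
    FRR: fraction of genuine attempts whose score falls below it.
    """
    false_accepts = sum(1 for s in impostor_scores if s >= threshold)
    false_rejects = sum(1 for s in genuine_scores if s < threshold)
    return (false_accepts / len(impostor_scores),
            false_rejects / len(genuine_scores))

# Made-up fused scores, for illustration only.
genuine = [0.9, 0.8, 0.4]
impostor = [0.1, 0.6, 0.3, 0.2]
far, frr = far_frr(genuine, impostor, threshold=0.5)
```

Raising the threshold lowers FAR and raises FRR, which is why the two rates are normally reported together.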
A Survey of Partition-Based Techniques for Copy-Move Forgery Detection
A copy-move forged image results from a specific type of image tampering in which a part of an image is copied and pasted onto one or more parts of the same image, generally to maliciously hide unwanted objects or regions, or to clone an object. Detecting such forgeries therefore mainly consists in devising ways of exposing identical or relatively similar areas in images. This survey attempts to cover existing partition-based copy-move forgery detection techniques.
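To make the idea of "exposing identical areas" concrete, here is a deliberately simplified sketch of block partitioning: the image is split into overlapping square blocks and blocks with identical pixel content are grouped. This exact-match scheme is an assumption chosen for brevity. The techniques covered by the survey typically compare robust block features so that they also tolerate compression, noise or rescaling. The function name and the toy image are hypothetical.

```python
from collections import defaultdict

def find_duplicate_blocks(image, block=2):
    """Group top-left positions of overlapping blocks with identical pixels.

    `image` is a list of equal-length rows of integer intensities.
    Returns a list of position groups, each with at least two entries.
    """
    rows, cols = len(image), len(image[0])
    seen = defaultdict(list)
    for r in range(rows - block + 1):
        for c in range(cols - block + 1):
            # The block's pixel content serves directly as the lookup key.
            key = tuple(tuple(image[r + i][c:c + block]) for i in range(block))
            seen[key].append((r, c))
    return [positions for positions in seen.values() if len(positions) > 1]

# Toy 4x6 image with unique values, then a 2x2 patch copied from the
# top-left corner to rows 2-3, columns 3-4 to simulate a forgery.
toy = [[r * 6 + c for c in range(6)] for r in range(4)]
for dr in range(2):
    for dc in range(2):
        toy[2 + dr][3 + dc] = toy[dr][dc]

duplicates = find_duplicate_blocks(toy, block=2)
```

On real images, flat regions such as sky would also produce identical blocks, which is one reason practical detectors add feature extraction and a minimum-offset check.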
Super Resolution of Wavelet-Encoded Images and Videos
In this dissertation, we address the multiframe super resolution reconstruction problem for wavelet-encoded images and videos. The goal of multiframe super resolution is to obtain one or more high resolution images by fusing a sequence of degraded or aliased low resolution images of the same scene. Since the low resolution images may be unaligned, a registration step is required before super resolution reconstruction. Therefore, we first explore in-band (i.e. wavelet-domain) image registration and then investigate super resolution. Our motivation for analyzing the image registration and super resolution problems in the wavelet domain is the growing trend towards wavelet-encoded imaging and wavelet encoding for image and video compression. Due to the drawbacks of the widely used discrete cosine transform in image and video compression, a considerable amount of literature is devoted to wavelet-based methods. However, since wavelets are shift-variant, existing methods cannot utilize wavelet subbands efficiently. To overcome this drawback, we establish and explore the direct relationship between the subbands under a translational shift, for image registration and super resolution. We then employ our in-band methodology in a motion-compensated video compression framework to demonstrate the effective usage of wavelet subbands. Super resolution can also be used as a post-processing step in video compression in order to decrease the size of the video files to be compressed, with downsampling added as a pre-processing step. Therefore, we present a video compression scheme that utilizes super resolution to reconstruct the high frequency information lost during downsampling. In addition, super resolution is a crucial post-processing step for satellite imagery, because it is hard to update imaging devices after a satellite is launched. Thus, we also demonstrate the usage of our methods in enhancing the resolution of pansharpened multispectral images.
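The shift-variance mentioned above can be seen in a small example. The sketch below applies a one-level Haar transform, chosen here only because it is the simplest wavelet and not because the dissertation uses it, to a step signal and to the same signal circularly shifted by one sample. The helper names and the test signal are assumptions made for illustration.

```python
import math

def haar_level1(signal):
    """One-level orthonormal Haar DWT of an even-length sequence."""
    s = 1.0 / math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal), 2)]
    return approx, detail

def circular_shift(signal, k):
    """Shift a list right by k samples with wrap-around."""
    k %= len(signal)
    return signal[-k:] + signal[:-k] if k else list(signal)

def energy(coeffs):
    """Sum of squared coefficients."""
    return sum(c * c for c in coeffs)

step = [0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0]
_, detail_original = haar_level1(step)
_, detail_shifted = haar_level1(circular_shift(step, 1))

# The step edges fall between coefficient pairs in the original signal,
# so its detail band is empty; after a one-sample shift they fall inside pairs.
energy_original = energy(detail_original)
energy_shifted = energy(detail_shifted)
```

A one-sample translation moves all of the edge energy into the detail band, so the subbands of two unaligned frames cannot be compared directly without accounting for the shift.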
The computational magic of the ventral stream: sketch of a theory (and why some deep architectures work).
This paper explores the theoretical consequences of a simple assumption: the computational goal of the feedforward path in the ventral stream (from V1, V2, V4 and to IT) is to discount image transformations, after learning them during development.
Directional edge and texture representations for image processing
An efficient representation for natural images is of fundamental importance in image processing and analysis. The commonly used separable transforms such as wavelets are not best suited for images due to their inability to exploit directional regularities such as edges and oriented textural patterns, while most of the recently proposed directional schemes cannot represent these two types of features in a unified transform. This thesis focuses on the development of directional representations for images which can capture both edges and textures in a multiresolution manner. The thesis first considers the problem of extracting linear features with the multiresolution Fourier transform (MFT). Based on a previous MFT-based linear feature model, the work extends the extraction method to the case where the image is corrupted by noise. The problem is tackled by the combination of a "Signal+Noise" frequency model, a refinement stage and a robust classification scheme. As a result, the MFT is able to perform linear feature analysis on noisy images on which previous methods failed. A new set of transforms called the multiscale polar cosine transforms (MPCT) is also proposed in order to represent textures. The MPCT can be regarded as a real-valued MFT with similar basis functions of oriented sinusoids. It is shown that the transform can represent textural patches more efficiently than the conventional Fourier basis. With a directional best cosine basis, the MPCT packet (MPCPT) is shown to be an efficient representation for edges and textures, despite its high computational burden. The problem of representing edges and textures in a fixed transform with less complexity is then considered. This is achieved by applying a Gaussian frequency filter, which matches the dispersion of the magnitude spectrum, to the local MFT coefficients. This is particularly effective in denoising natural images, due to its ability to preserve both types of feature.
Further improvements can be made by employing the information given by the linear feature extraction process in the filter's configuration. The denoising results compare favourably against other state-of-the-art directional representations.