587 research outputs found

    Accurate depth from defocus estimation with video-rate implementation

    The measurement of depth from images at video rate using "defocus" has been investigated. The method requires two differently focussed images acquired from a single viewpoint using a single camera. The relative blur between the images was used to determine the in-focus axial point of each pixel and hence its depth. The depth estimation algorithm of Watanabe and Nayar was employed to recover the depth estimates, but the broadband filters, referred to as Rational filters, were designed using a new procedure: the Two Step Polynomial Approach. The filters designed with the new procedure were largely insensitive to object texture and were shown to model the blur more precisely than the previous method. Experiments with real planar images demonstrated a maximum RMS depth error of 1.18% for the proposed filters, compared to 1.54% for the previous design. The software required five 2D convolutions to be processed in parallel, and these convolutions were implemented on an FPGA using a two-channel, five-stage pipelined architecture; however, the precision of the filter coefficients and the variables had to be limited within the processor. The number of multipliers required for each convolution was reduced from 49 to 10 (79.5% reduction) using a Triangular design procedure. Experimental results suggested that the pipelined processor provided depth estimates comparable in accuracy to the full-precision Matlab output, and generated depth maps of 400 x 400 pixels in 13.06 ms, which is faster than video rate. The defocused images (near- and far-focused) were optically registered for magnification using telecentric optics. A frequency-domain approach based on phase correlation was employed to measure the radial shifts due to magnification and to optimally position the external aperture. The telecentric optics ensured correct pixel-to-pixel registration between the defocused images and provided more accurate depth estimates.
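    The abstract does not give the details of its phase-correlation step, so the following is only a minimal sketch of the general technique, assuming NumPy and a single global integer translation between two equally sized images. The function name `phase_correlation_shift` and the synthetic test image are hypothetical; measuring radial shifts due to magnification, as the abstract describes, would presumably apply a comparable estimate to local regions of the image pair.

```python
import numpy as np

def phase_correlation_shift(ref, moved):
    """Estimate the integer (dy, dx) circular shift that maps ref onto moved."""
    f_ref = np.fft.fft2(ref)
    f_mov = np.fft.fft2(moved)
    # Cross-power spectrum: discard magnitude, keep only the phase difference.
    cross = f_mov * np.conj(f_ref)
    cross /= np.maximum(np.abs(cross), 1e-12)
    # The inverse transform of a pure phase ramp is a peak at the shift.
    corr = np.fft.ifft2(cross).real
    peak = np.unravel_index(np.argmax(corr), corr.shape)
    # Peaks past the midpoint correspond to negative shifts (wrap-around).
    shift = [p - n if p > n // 2 else p for p, n in zip(peak, corr.shape)]
    return int(shift[0]), int(shift[1])

# Synthetic check: shift a random image by a known amount and recover it.
rng = np.random.default_rng(0)
image = rng.random((64, 64))
shifted = np.roll(image, shift=(3, -5), axis=(0, 1))
estimated = phase_correlation_shift(image, shifted)
```

    Normalising the cross-power spectrum makes the estimate depend on phase rather than on image contrast, which is the usual reason for preferring phase correlation to plain cross-correlation when registering images.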

    Learning Test-time Data Augmentation for Image Retrieval with Reinforcement Learning

    Off-the-shelf convolutional neural network features achieve outstanding results in many image retrieval tasks. However, their invariance is pre-defined by the network architecture and training data. Existing image retrieval approaches require fine-tuning or modification of the pre-trained networks to adapt to the variations in the target data. In contrast, our method enhances the invariance of off-the-shelf features by aggregating features extracted from images augmented with learned test-time augmentations. The optimal ensemble of test-time augmentations is learned automatically through reinforcement learning. Our training is efficient in time and resources, and learns a diverse set of test-time augmentations. Experimental results on trademark retrieval (METU trademark dataset) and landmark retrieval (Oxford5k and Paris6k scene datasets) tasks show that the learned ensemble of transformations is effective and transferable. We also achieve state-of-the-art MAP@100 results on the METU trademark dataset.
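    The aggregation idea can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not say how features are aggregated, so averaging L2-normalised features is an assumption, and `aggregate_features`, `toy_extract`, and the three fixed augmentations are hypothetical stand-ins for a pre-trained network and the learned ensemble. The reinforcement-learning search over augmentations is not shown.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit Euclidean length (zero vectors are returned as-is)."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else list(vec)

def aggregate_features(image, augmentations, extract):
    """Average the normalised features of every augmented view of one image.

    Mean pooling is an assumed aggregation rule, chosen for illustration.
    """
    feats = [l2_normalize(extract(aug(image))) for aug in augmentations]
    dim = len(feats[0])
    mean = [sum(f[i] for f in feats) / len(feats) for i in range(dim)]
    return l2_normalize(mean)

def toy_extract(image):
    """Hypothetical feature extractor: one row sum per image row."""
    return [float(sum(row)) for row in image]

def identity(image):
    return image

def hflip(image):
    return [row[::-1] for row in image]

def vflip(image):
    return image[::-1]

image = [[1, 2], [3, 4]]
descriptor = aggregate_features(image, [identity, hflip, vflip], toy_extract)
```

    Because the descriptor is built only from the outputs of a fixed extractor, the underlying network needs no fine-tuning; only the choice of augmentations changes.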