209 research outputs found

    End-to-end security for video distribution

    Get PDF

    Recovering Missing Coefficients in DCT-Transformed Images

    Full text link
    A general method for recovering missing DCT coefficients in DCT-transformed images is presented in this work. We model the DCT coefficients recovery problem as an optimization problem and recover all missing DCT coefficients via linear programming. The visual quality of the recovered image gradually decreases as the number of missing DCT coefficients increases. For some images, the quality is surprisingly good even when more than 10 most significant DCT coefficients are missing. When only the DC coefficient is missing, the proposed algorithm outperforms existing methods according to experimental results conducted on 200 test images. The proposed recovery method can be used for cryptanalysis of DCT based selective encryption schemes and other applications.Comment: 4 pages, 4 figure

    Phase Retrieval From Binary Measurements

    Full text link
    We consider the problem of signal reconstruction from quadratic measurements that are encoded as +1 or -1 depending on whether they exceed a predetermined positive threshold or not. Binary measurements are fast to acquire and inexpensive in terms of hardware. We formulate the problem of signal reconstruction using a consistency criterion, wherein one seeks to find a signal that is in agreement with the measurements. To enforce consistency, we construct a convex cost using a one-sided quadratic penalty and minimize it using an iterative accelerated projected gradient-descent (APGD) technique. The PGD scheme reduces the cost function in each iteration, whereas incorporating momentum into PGD, notwithstanding the lack of such a descent property, exhibits faster convergence than PGD empirically. We refer to the resulting algorithm as binary phase retrieval (BPR). Considering additive white noise contamination prior to quantization, we also derive the Cramer-Rao Bound (CRB) for the binary encoding model. Experimental results demonstrate that the BPR algorithm yields a signal-to- reconstruction error ratio (SRER) of approximately 25 dB in the absence of noise. In the presence of noise prior to quantization, the SRER is within 2 to 3 dB of the CRB

    3-D Scene Reconstruction for Passive Ranging Using Depth from Defocus and Deep Learning

    Get PDF
    Indiana University-Purdue University Indianapolis (IUPUI)Depth estimation is increasingly becoming more important in computer vision. The requirement for autonomous systems to gauge their surroundings is of the utmost importance in order to avoid obstacles, preventing damage to itself and/or other systems or people. Depth measuring/estimation systems that use multiple cameras from multiple views can be expensive and extremely complex. And as these autonomous systems decrease in size and available power, the supporting sensors required to estimate depth must also shrink in size and power consumption. This research will concentrate on a single passive method known as Depth from Defocus (DfD), which uses an in-focus and out-of-focus image to infer the depth of objects in a scene. The major contribution of this research is the introduction of a new Deep Learning (DL) architecture to process the the in-focus and out-of-focus images to produce a depth map for the scene improving both speed and performance over a range of lighting conditions. Compared to the previous state-of-the-art multi-label graph cuts algorithms applied to the synthetically blurred dataset the DfD-Net produced a 34.30% improvement in the average Normalized Root Mean Square Error (NRMSE). Similarly the DfD-Net architecture produced a 76.69% improvement in the average Normalized Mean Absolute Error (NMAE). Only the Structural Similarity Index (SSIM) had a small average decrease of 2.68% when compared to the graph cuts algorithm. This slight reduction in the SSIM value is a result of the SSIM metric penalizing images that appear to be noisy. In some instances the DfD-Net output is mottled, which is interpreted as noise by the SSIM metric. This research introduces two methods of deep learning architecture optimization. The first method employs the use of a variant of the Particle Swarm Optimization (PSO) algorithm to improve the performance of the DfD-Net architecture. The PSO algorithm was able to find a combination of the number of convolutional filters, the size of the filters, the activation layers used, the use of a batch normalization layer between filters and the size of the input image used during training to produce a network architecture that resulted in an average NRMSE that was approximately 6.25% better than the baseline DfD-Net average NRMSE. This optimized architecture also resulted in an average NMAE that was 5.25% better than the baseline DfD-Net average NMAE. Only the SSIM metric did not see a gain in performance, dropping by 0.26% when compared to the baseline DfD-Net average SSIM value. The second method illustrates the use of a Self Organizing Map clustering method to reduce the number convolutional filters in the DfD-Net to reduce the overall run time of the architecture while still retaining the network performance exhibited prior to the reduction. This method produces a reduced DfD-Net architecture that has a run time decrease of between 14.91% and 44.85% depending on the hardware architecture that is running the network. The final reduced DfD-Net resulted in a network architecture that had an overall decrease in the average NRMSE value of approximately 3.4% when compared to the baseline, unaltered DfD-Net, mean NRMSE value. The NMAE and the SSIM results for the reduced architecture were 0.65% and 0.13% below the baseline results respectively. This illustrates that reducing the network architecture complexity does not necessarily reduce the reduction in performance. Finally, this research introduced a new, real world dataset that was captured using a camera and a voltage controlled microfluidic lens to capture the visual data and a 2-D scanning LIDAR to capture the ground truth data. The visual data consists of images captured at seven different exposure times and 17 discrete voltage steps per exposure time. The objects in this dataset were divided into four repeating scene patterns in which the same surfaces were used. These scenes were located between 1.5 and 2.5 meters from the camera and LIDAR. This was done so any of the deep learning algorithms tested would see the same texture at multiple depths and multiple blurs. The DfD-Net architecture was employed in two separate tests using the real world dataset. The first test was the synthetic blurring of the real world dataset and assessing the performance of the DfD-Net trained on the Middlebury dataset. The results of the real world dataset for the scenes that were between 1.5 and 2.2 meters from the camera the DfD-Net trained on the Middlebury dataset produced an average NRMSE, NMAE and SSIM value that exceeded the test results of the DfD-Net tested on the Middlebury test set. The second test conducted was the training and testing solely on the real world dataset. Analysis of the camera and lens behavior led to an optimal lens voltage step configuration of 141 and 129. Using this configuration, training the DfD-Net resulted in an average NRMSE, NMAE and SSIM of 0.0660, 0.0517 and 0.8028 with a standard deviation of 0.0173, 0.0186 and 0.0641 respectively
    • …
    corecore