
    Hypernetwork functional image representation

    Motivated by the way humans memorize images, we introduce a functional representation in which an image is represented by a neural network. For this purpose, we construct a hypernetwork that takes an image and returns the weights of a target network, which maps a point in the plane (representing the position of a pixel) to the corresponding color in the image. Since the obtained representation is continuous, one can easily inspect the image at various resolutions and perform arbitrary continuous operations on it. Moreover, by inspecting interpolations we show that such a representation has some properties characteristic of generative models. To evaluate the proposed mechanism experimentally, we apply it to the image super-resolution problem. Despite using a single model for various scaling factors, we obtained results comparable to existing super-resolution methods.
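The idea of a target network that maps pixel coordinates to color can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the tiny one-hidden-layer architecture, the names `target_network` and `render`, and the random weights, which stand in for what a trained hypernetwork would output for a given image. The point is only that one set of weights can be sampled at any resolution.

```python
import math
import random


def target_network(weights, x, y):
    """Tiny MLP mapping a point (x, y) in [0, 1]^2 to an intensity in (0, 1)."""
    w1, b1, w2, b2 = weights
    hidden = [math.tanh(wx * x + wy * y + b) for (wx, wy), b in zip(w1, b1)]
    out = sum(w * h for w, h in zip(w2, hidden)) + b2
    return 1.0 / (1.0 + math.exp(-out))


def render(weights, height, width):
    """Sample the continuous representation at the pixel centres of a grid."""
    return [
        [target_network(weights, (c + 0.5) / width, (r + 0.5) / height)
         for c in range(width)]
        for r in range(height)
    ]


# Stand-in for the hypernetwork output (hypothetical): random weights
# for a target network with 8 hidden units.
rng = random.Random(0)
HIDDEN = 8
weights = (
    [(rng.uniform(-3, 3), rng.uniform(-3, 3)) for _ in range(HIDDEN)],
    [rng.uniform(-1, 1) for _ in range(HIDDEN)],
    [rng.uniform(-2, 2) for _ in range(HIDDEN)],
    rng.uniform(-1, 1),
)

low_res = render(weights, 4, 4)      # the same "image" at 4x4
high_res = render(weights, 16, 16)   # and at 16x16, with no retraining
```

Because the representation is a function of continuous coordinates, changing the scaling factor only changes the sampling grid passed to `render`.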

    Development of Some Spatial-domain Preprocessing and Post-processing Algorithms for Better 2-D Up-scaling

    Image super-resolution has attracted great interest in recent years and is used extensively in applications such as video streaming, multimedia, internet technologies, consumer electronics, and the display and printing industries. Image super-resolution is the process of increasing the resolution of a given image without losing its integrity. Its most common application is to provide a better visual effect after resizing a digital image for display or printing. One way to improve image resolution is 2-D interpolation. An up-scaled image should retain all the image details with very little blurring for better visual quality. The literature offers many efficient 2-D interpolation schemes that preserve image details well in the up-scaled image, particularly in regions with edges and fine details. Nevertheless, these existing interpolation schemes still blur the up-scaled image because high frequency (HF) content is degraded during up-sampling. Hence, there is scope to develop efficient but simple spatial-domain pre-processing, post-processing and composite schemes that effectively restore the HF content of up-scaled images for various online and off-line applications. The efficient and widely used Lanczos-3 interpolation is taken as the baseline whose performance is improved by incorporating the proposed algorithms. The pre-processing algorithms developed in this thesis are summarized here. The term pre-processing refers to processing the low-resolution input image prior to image up-scaling.
The pre-processing algorithms proposed in this thesis are: the Laplacian of Laplacian based global pre-processing (LLGP) scheme; hybrid global pre-processing (HGP); iterative Laplacian of Laplacian based global pre-processing (ILLGP); unsharp masking based pre-processing (UMP); iterative unsharp masking (IUM); and the error based up-sampling (EU) scheme. LLGP, HGP and ILLGP are three spatial-domain pre-processing algorithms based on 4th, 6th and 8th order derivatives that alleviate non-uniform blurring in up-scaled images. These algorithms obtain the high frequency (HF) extracts of an image by employing higher order derivatives and perform precise sharpening on a low-resolution image to alleviate the blurring in its 2-D up-sampled counterpart. In the unsharp masking based pre-processing (UMP) scheme, a blurred version of the low-resolution image is subtracted from the original to extract the HF content. A weighted version of the HF extract is superimposed on the original image to produce a sharpened image prior to up-scaling, which counters blurring effectively. IUM uses many iterations to generate an unsharp mask that contains very high frequency (VHF) components. The VHF extract results from decomposing the signal into sub-bands using the concept of an analysis filter bank. Since the VHF components suffer the greatest degradation, restoring them yields much better restoration performance. EU is another pre-processing scheme in which the HF degradation due to image up-scaling is extracted and called the prediction error. The prediction error contains the lost high frequency components. When this error is superimposed on the low-resolution image prior to up-sampling, blurring in the up-scaled image is considerably reduced. The post-processing algorithms developed in this thesis are summarized in the following.
The term post-processing refers to processing the high-resolution up-scaled image. The post-processing algorithms proposed in this thesis are: local adaptive Laplacian (LAL); fuzzy weighted Laplacian (FWL); and Legendre functional link artificial neural network (LFLANN). LAL is a non-fuzzy, local scheme. Local regions of an up-scaled image with high variance are sharpened more than regions with moderate or low variance by employing a local adaptive Laplacian kernel. The weights of the LAL kernel are varied according to the normalized local variance so as to give more HF enhancement to high-variance regions than to low-variance ones, effectively countering the non-uniform blurring. Furthermore, the FWL post-processing scheme, with a higher degree of non-linearity, is proposed to further improve on the performance of LAL. FWL, being a fuzzy mapping scheme, is highly nonlinear and resolves the blurring problem more effectively than LAL, which employs a linear mapping. An LFLANN based post-processing scheme is also proposed to minimize the cost function so as to reduce the blurring in a 2-D up-scaled image. Legendre polynomials are used for functional expansion of the input pattern vector and provide a high degree of nonlinearity. Therefore, multiple layers can be replaced by a single-layer LFLANN architecture that reduces the cost function effectively for better restoration performance. The single-layer architecture reduces the computational complexity and hence is suitable for various real-time applications. The stand-alone pre-processing and post-processing schemes can be improved further by combining them into composite schemes. Here, two spatial-domain composite schemes, CS-I and CS-II, are proposed to tackle non-uniform blurring in an up-scaled image. CS-I is developed by combining the global iterative Laplacian (GIL) pre-processing scheme with the LAL post-processing scheme.
Another highly nonlinear composite scheme, CS-II, is proposed, which combines the ILLGP scheme with the fuzzy weighted Laplacian post-processing scheme for better performance than the stand-alone schemes. Finally, it is observed that the proposed algorithms ILLGP, IUM, FWL, LFLANN and CS-II are the better-performing algorithms in their respective categories for effectively reducing blurring in up-scaled images.
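The unsharp masking based pre-processing (UMP) step described above (blur the low-resolution image, subtract to get the HF extract, add a weighted HF extract back) can be sketched as follows. This is a minimal illustration, not the thesis implementation: the 3x3 box blur, the edge-replication padding, the default `weight=0.5` and the function names are all assumptions, and the subsequent Lanczos-3 up-scaling is not shown.

```python
def box_blur(image):
    """3x3 mean filter with edge replication (an assumed choice of blur)."""
    h, w = len(image), len(image[0])
    blurred = []
    for r in range(h):
        row = []
        for c in range(w):
            total = 0.0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr = min(max(r + dr, 0), h - 1)  # clamp to replicate edges
                    cc = min(max(c + dc, 0), w - 1)
                    total += image[rr][cc]
            row.append(total / 9.0)
        blurred.append(row)
    return blurred


def unsharp_preprocess(image, weight=0.5):
    """Return image + weight * (image - blurred), i.e. the HF-boosted image."""
    blurred = box_blur(image)
    return [
        [p + weight * (p - b) for p, b in zip(img_row, blur_row)]
        for img_row, blur_row in zip(image, blurred)
    ]


# A vertical step edge: the sharpened result undershoots on the dark side
# and overshoots on the bright side, which pre-compensates for the blur
# that up-scaling will introduce.
step_edge = [[0.0, 0.0, 1.0, 1.0] for _ in range(4)]
sharpened = unsharp_preprocess(step_edge, weight=0.5)
```

Values are left unclipped here so the overshoot is visible; a practical pipeline would typically clip to the valid intensity range before up-scaling.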

    Side-Information For Steganography Design And Detection

    Today, the most secure steganographic schemes for digital images embed secret messages while minimizing a distortion function that describes the local complexity of the content. Distortion functions are heuristically designed to predict the modeling error, or in other words, how difficult it would be to detect a single change to the original image in any given area. This dissertation investigates how both the design and detection of such content-adaptive schemes can be improved with the use of side-information. We distinguish two types of side-information, public and private. Public side-information is available to the sender and at least in part also to anybody else who can observe the communication. Content complexity is a typical example of public side-information. While it is commonly used for steganography, it can also be used for detection. In this work, we propose a modification to the rich-model style feature sets in both the spatial and JPEG domains to inform such feature sets of the content complexity. Private side-information is available only to the sender. The previous use of private side-information in steganography was very successful but limited to steganography in JPEG images. Also, the constructions were based on heuristics with little theoretical foundation. This work tries to remedy this deficiency by introducing a scheme that generalizes the previous approach to an arbitrary domain. We also put forward a theoretical investigation of how to incorporate side-information based on a model of images. Finally, we propose to use a novel type of side-information in the form of multiple exposures for JPEG steganography.

    Mapping Stream Programs into the Compressed Domain

    Due to the high data rates involved in audio, video, and signal processing applications, it is imperative to compress the data to decrease the amount of storage used. Unfortunately, this implies that any program operating on the data needs to be wrapped by a decompression and re-compression stage. Re-compression can incur significant computational overhead, while decompression swamps the application with the original volume of data. In this paper, we present a program transformation that greatly accelerates the processing of compressible data. Given a program that operates on uncompressed data, we output an equivalent program that operates directly on the compressed format. Our transformation applies to stream programs, a restricted but useful class of applications with regular communication and computation patterns. Our formulation is based on LZ77, a lossless compression algorithm that is utilized by ZIP and fully encapsulates common formats such as Apple Animation, Microsoft RLE, and Targa. We implemented a simple subset of our techniques in the StreamIt compiler, which emits executable plugins for two popular video editing tools: MEncoder and Blender. For common operations such as color adjustment and video compositing, mapping into the compressed domain offers a speedup roughly proportional to the overall compression ratio. For our benchmark suite of 12 videos in Apple Animation format, speedups range from 1.1x to 471x, with a median of 15x.
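A small sketch can show why a pointwise operation such as color adjustment can run directly on LZ77-style data. This is a simplified illustration, not the StreamIt transformation: the token format (`("lit", value)` and `("copy", distance, length)`), the function names and the `brighten` operation are assumptions. The key observation is that a back-reference copies values that have already been transformed, so only the literals need processing.

```python
def decode(tokens):
    """Expand LZ77-style tokens into a list of values."""
    out = []
    for token in tokens:
        if token[0] == "lit":
            out.append(token[1])
        else:  # ("copy", distance, length)
            _, distance, length = token
            for _ in range(length):
                # Copy one value at a time so overlapping copies work.
                out.append(out[-distance])
    return out


def map_compressed(tokens, f):
    """Apply a pointwise function f without decompressing.

    Only literals are transformed; back-references are passed through
    unchanged because they repeat already-transformed output.
    """
    return [("lit", f(t[1])) if t[0] == "lit" else t for t in tokens]


def brighten(v):
    """Hypothetical color adjustment: add 40, saturating at 255."""
    return min(v + 40, 255)


tokens = [("lit", 10), ("lit", 200), ("copy", 2, 4), ("lit", 250), ("copy", 1, 3)]
direct = [brighten(v) for v in decode(tokens)]           # decompress, then process
via_compressed = decode(map_compressed(tokens, brighten))  # process, then decompress
```

Here the stream decodes to ten values but `map_compressed` calls `brighten` only three times, which matches the abstract's observation that the speedup tracks the compression ratio. The shortcut holds only for functions that map each value independently; operations with state or neighborhoods need the more general treatment the paper describes.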

    Improving Human Face Recognition Using Deep Learning Based Image Registration And Multi-Classifier Approaches

    Face detection, registration, and recognition have become a fascinating field for researchers. The motivation behind the enormous interest in the topic is the need to improve the accuracy of many real-time applications. Countless methodologies have been proposed in past years. The complexity of the human face's visual appearance and the significant changes it undergoes under different effects make it challenging to design and implement a powerful computational system for object recognition in general and human face recognition in particular. Supervised learning often requires extensive training, which results in long execution times. Applying strong preprocessing approaches such as face registration is an essential step in achieving a high recognition accuracy rate. Although there exist approaches that perform both detection and recognition, we believe the absence of a complete end-to-end system capable of performing recognition from an arbitrary scene is in large part due to the difficulty of alignment. Often, face registration is ignored, with the assumption that the detector will perform a rough alignment, leading to suboptimal recognition performance. In this research, we present an enhanced approach to improve human face recognition using a back-propagation neural network (BPNN) and feature extraction based on the correlation between the training images. A key contribution of this paper is the generation of a new set, called the T-Dataset, from the original training data set, which is used to train the BPNN. We generated the T-Dataset using the correlation between the training images without using the common technique of image density. The correlated T-Dataset provides a high distinction layer between the training images, which helps the BPNN to converge faster and achieve better accuracy.
Data and feature reduction is essential in the face recognition process, and researchers have recently focused on modern neural networks. We therefore used classical Principal Component Analysis (PCA) and Local Binary Patterns (LBP) to show that there is potential for improvement even with traditional methods. We applied five distance measurement algorithms and then combined them to obtain the T-Dataset, which we fed into the BPNN. By using reduced image features, we achieved higher face recognition accuracy at lower computational cost than the current approach. We tested the proposed framework on two small data sets, the YALE and AT&T data sets, as the ground truth, and achieved high accuracy. Furthermore, we evaluated our method on one of the state-of-the-art benchmark data sets, Labeled Faces in the Wild (LFW), where it produced competitive face recognition performance. In addition, we present an enhanced framework to improve face registration using a deep learning model. We used deep architectures such as VGG16 and VGG19 to train our method. We trained our model to learn the transformation parameters (rotation, scaling, and shifting). By learning the transformation parameters, we are able to transform the image back to the frontal domain. We used the LFW dataset to evaluate our method and achieved high accuracy.

    Digital forensic techniques for the reverse engineering of image acquisition chains

    In recent years a number of new methods have been developed to detect image forgery. Most forensic techniques use footprints left on images to predict the history of the images. The images, however, may have gone through a series of processing and modification steps during their lifetime. It is therefore difficult to detect image tampering, as the footprints could be distorted or removed over a complex chain of operations. In this research we propose digital forensic techniques that allow us to reverse engineer and determine the history of images that have gone through chains of image acquisition and reproduction. This thesis presents two different approaches to address the problem. In the first part we propose a novel theoretical framework for the reverse engineering of signal acquisition chains. Based on a simplified chain model, we describe how signals pass through the chain at different stages using the theory of sampling signals with finite rate of innovation. Under particular conditions, our technique allows us to detect whether a given signal has been reacquired through the chain. It also makes it possible to predict important parameters of the chain using acquisition-reconstruction artefacts left on the signal. The second part of the thesis presents our new algorithm for image recapture detection based on edge blurriness. Two overcomplete dictionaries are trained using the K-SVD approach to learn distinctive blurring patterns from sets of single captured and recaptured images. An SVM classifier is then built using dictionary approximation errors and the mean edge spread width from the training images. The algorithm, which requires no user intervention, was tested on a database that included more than 2500 high quality recaptured images. Our results show that our method achieves a performance rate that exceeds 99% for recaptured images and 94% for single captured images.

    Different Facial Recognition Techniques in Transform Domains

    The human face is frequently used as the biometric signal presented to a machine for identification purposes. Several challenges are encountered while designing face identification systems. The challenges are either caused by the process of capturing the face image itself, or occur while processing the face poses. Since the face image contains more than just the face, the data dimensionality increases, which degrades the performance of the recognition system. Face Recognition (FR) has been a major signal processing topic of interest in the last few decades. The most common applications of FR include forensics, access authorization to facilities, or simply unlocking a smart phone. The three factors governing the performance of an FR system are the storage requirements, the computational complexity, and the recognition accuracy. The typical FR system consists of the following main modules in each of the Training and Testing phases: Preprocessing, Feature Extraction, and Classification. The ORL, YALE, FERET, FEI, Cropped AR, and Georgia Tech datasets are used to evaluate the performance of the proposed systems. The proposed systems are categorized into Single-Transform and Two-Transform systems. In the first category, the features are extracted from a single domain, that of the Two-Dimensional Discrete Cosine Transform (2D DCT). In the latter category, the Two-Dimensional Discrete Wavelet Transform (2D DWT) coefficients are combined with those of the 2D DCT to form one feature vector. The feature vectors are either used directly or further processed to obtain the persons' final models. Principal Component Analysis (PCA), Sparse Representation, and Vector Quantization (VQ) are employed as a second step in the Feature Extraction module. Additionally, a technique is proposed in which the feature vector is composed of appropriately selected 2D DCT and 2D DWT coefficients based on a residual minimization algorithm.
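Single-transform feature extraction with the 2D DCT can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the thesis pipeline: it uses an orthonormal DCT-II, keeps the top-left `k` x `k` (low-frequency) coefficients as the feature vector, and the function names are hypothetical. The abstract does not say which coefficients the proposed systems actually retain.

```python
import math


def dct_1d(values):
    """Orthonormal DCT-II of a 1-D sequence."""
    n = len(values)
    result = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        total = sum(v * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                    for i, v in enumerate(values))
        result.append(scale * total)
    return result


def dct_2d(block):
    """Separable 2-D DCT: transform the rows, then the columns."""
    rows = [dct_1d(row) for row in block]
    cols = [dct_1d(col) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]  # transpose back to [u][v]


def low_frequency_features(block, k):
    """Flatten the top-left k x k DCT coefficients into a feature vector."""
    coeffs = dct_2d(block)
    return [coeffs[u][v] for u in range(k) for v in range(k)]


# A flat 4x4 block has all of its energy in the DC coefficient.
flat_block = [[2.0] * 4 for _ in range(4)]
features = low_frequency_features(flat_block, 2)
```

Keeping only a small low-frequency corner is one common way to trade recognition accuracy against storage and computation, the three factors the abstract names.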

    An Evaluation of Popular Copy-Move Forgery Detection Approaches

    A copy-move forgery is created by copying and pasting content within the same image, and potentially post-processing it. In recent years, the detection of copy-move forgeries has become one of the most actively researched topics in blind image forensics. A considerable number of different algorithms have been proposed, focusing on different types of postprocessed copies. In this paper, we aim to answer which copy-move forgery detection algorithms and processing steps (e.g., matching, filtering, outlier detection, affine transformation estimation) perform best in various postprocessing scenarios. The focus of our analysis is to evaluate the performance of previously proposed feature sets. We achieve this by casting existing algorithms in a common pipeline. We examined the 15 most prominent feature sets and analyzed the detection performance on a per-image basis and on a per-pixel basis. We created a challenging real-world copy-move dataset and a software framework for systematic image manipulation. Experiments show that the keypoint-based features SIFT and SURF, as well as the block-based DCT, DWT, KPCA, PCA and Zernike features, perform very well. These feature sets exhibit the best robustness against various noise sources and downsampling, while reliably identifying the copied regions. Comment: Main paper: 14 pages, supplemental material: 12 pages, main paper appeared in IEEE Transactions on Information Forensics and Security
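The common pipeline (feature extraction, matching, filtering) can be illustrated with a deliberately simple block-based sketch. It is not one of the 15 evaluated feature sets: the "feature" here is the raw block content, matching is exact equality, and filtering keeps only shift vectors shared by several block pairs. The function name, block size and `min_pairs` threshold are assumptions, and exact matching would not survive the noise or resampling the paper tests against.

```python
import random
from collections import Counter, defaultdict


def detect_copy_move(image, block=4, min_pairs=4):
    """Return {shift: count} for shifts supported by at least min_pairs block pairs."""
    h, w = len(image), len(image[0])
    positions = defaultdict(list)
    # Feature extraction: the "feature" is simply the raw block content.
    for r in range(h - block + 1):
        for c in range(w - block + 1):
            key = tuple(image[r + i][c + j]
                        for i in range(block) for j in range(block))
            positions[key].append((r, c))
    # Matching: blocks with identical features; shift runs from the
    # earlier block to the later one in raster order.
    shifts = Counter()
    for locs in positions.values():
        for a in range(len(locs)):
            for b in range(a + 1, len(locs)):
                shifts[(locs[b][0] - locs[a][0], locs[b][1] - locs[a][1])] += 1
    # Filtering: keep only shift vectors shared by many block pairs.
    return {s: n for s, n in shifts.items() if n >= min_pairs}


# Synthetic test image: random texture with a 6x6 patch copied
# from (2, 2) to (9, 9), i.e. a shift of (7, 7).
rng = random.Random(0)
image = [[rng.randrange(256) for _ in range(16)] for _ in range(16)]
for i in range(6):
    for j in range(6):
        image[9 + i][9 + j] = image[2 + i][2 + j]

detected = detect_copy_move(image)
```

On this synthetic image the only surviving shift is `(7, 7)`, supported by the nine 4x4 blocks that fit inside the copied patch. Replacing the raw-pixel key with a robust descriptor (DCT, PCA, Zernike moments) and the exact match with nearest-neighbor search is what the evaluated methods vary.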

    Multi-patch aggregation models for resampling detection

    Images captured nowadays are of varying dimensions, with smartphones and DSLRs allowing users to choose from a list of available image resolutions. It is therefore imperative for forensic algorithms such as resampling detection to scale well to images of varying dimensions. However, in our experiments, we observed that many state-of-the-art forensic algorithms are sensitive to image size, and their performance quickly degenerates when they operate on images of diverse dimensions, despite re-training them using multiple image sizes. To handle this issue, we propose a novel pooling strategy called ITERATIVE POOLING. This pooling strategy can dynamically adjust input tensors in a discrete manner without much loss of information as in ROI Max-pooling. It can be used with any of the existing deep models, and for demonstration purposes we show its utility on ResNet-18 for the case of resampling detection, a fundamental operation for any sort of image manipulation. Compared to existing strategies and Max-pooling, it gives up to 7-8% improvement on public datasets. Comment: 6 pages; 6 tables; 4 figures