137 research outputs found

Weakly- and Self-Supervised Learning for Content-Aware Deep Image Retargeting

Author: Cho Donghyeon
Kweon In So
Oh Tae-Hyun
Park Jinsun
Tai Yu-Wing
Publication date: 09/08/2017

This paper proposes a weakly- and self-supervised deep convolutional neural network (WSSDCNN) for content-aware image retargeting. Our network takes a source image and a target aspect ratio, and then directly outputs a retargeted image. Retargeting is performed through a shift map, which is a pixel-wise mapping from the source to the target grid. Our method implicitly learns an attention map, which leads to a content-aware shift map for image retargeting. As a result, discriminative parts in an image are preserved, while background regions are adjusted seamlessly. In the training phase, pairs of an image and its image-level annotation are used to compute content and structure losses. We demonstrate the effectiveness of our proposed method for a retargeting application with insightful analyses. Comment: 10 pages, 11 figures. To appear in ICCV 2017, Spotlight Presentation
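The shift-map idea in the abstract above can be illustrated with a minimal sketch. This is not the paper's network: it assumes a one-dimensional, horizontal shift map applied as a backward warp (each target pixel x reads the source at x + shift[x]) with linear interpolation, and the names retarget_row and uniform_shift are hypothetical. A learned, content-aware shift map would replace the uniform one used here.

```python
def retarget_row(row, shift):
    """Resample one image row: target pixel x reads the source at x + shift[x]."""
    out = []
    last = len(row) - 1
    for x, s in enumerate(shift):
        pos = min(max(x + s, 0.0), float(last))  # clamp to the source row
        left = int(pos)
        right = min(left + 1, last)
        frac = pos - left
        # Linear interpolation between the two nearest source pixels.
        out.append((1.0 - frac) * row[left] + frac * row[right])
    return out


def uniform_shift(src_width, dst_width):
    """Shift map equivalent to plain linear scaling (a content-agnostic baseline)."""
    if dst_width == 1:
        return [0.0]
    scale = (src_width - 1) / (dst_width - 1)
    return [x * scale - x for x in range(dst_width)]


row = [0.0, 10.0, 20.0, 30.0, 40.0]
print(retarget_row(row, uniform_shift(len(row), 3)))  # [0.0, 20.0, 40.0]
```

A content-aware shift map would keep the shift nearly constant across important regions (so they are copied almost unchanged) and let it vary quickly across background, where the compression is less visible.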

arXiv.org e-Print Archive

Pohang University of Science and Technology (POSTECH)

STEFANN: Scene Text Editor using Font Adaptive Neural Network

Author: Bhattacharya Saumik
Ghosh Subhankar
Pal Umapada
Roy Prasun
Publication date: 25/04/2020

Textual information in a captured scene plays an important role in scene interpretation and decision making. Though there exist methods that can successfully detect and interpret complex text regions present in a scene, to the best of our knowledge, there is no significant prior work that aims to modify the textual information in an image. The ability to edit text directly on images has several advantages including error correction, text restoration and image reusability. In this paper, we propose a method to modify text in an image at character-level. We approach the problem in two stages. At first, the unobserved character (target) is generated from an observed character (source) being modified. We propose two different neural network architectures - (a) FANnet to achieve structural consistency with source font and (b) Colornet to preserve source color. Next, we replace the source character with the generated character maintaining both geometric and visual consistency with neighboring characters. Our method works as a unified platform for modifying text in images. We present the effectiveness of our method on COCO-Text and ICDAR datasets both qualitatively and quantitatively. Comment: Accepted in The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 202

arXiv.org e-Print Archive

Crossref

Digital image forensics via meta-learning and few-shot learning

Author: Shi Yuxi
Publication venue: Digital Commons @ NJIT
Publication date: 31/08/2022

Digital images are a substantial portion of the information conveyed by social media, the Internet, and television in our daily life. In recent years, digital images have become not only one of the public information carriers, but also a crucial piece of evidence. The widespread availability of low-cost, user-friendly, and potent image editing software and mobile phone applications facilitates altering images without professional expertise. Consequently, safeguarding the originality and integrity of digital images has become difficult. Forgers commonly use digital image manipulation to transmit misleading information. Digital image forensics investigates the irregular patterns that might result from image alteration. It is crucial to information security. Over the past several years, machine learning techniques have been effectively used to identify image forgeries. Convolutional Neural Networks (CNNs) are a frequent machine learning approach. A standard CNN model can distinguish between original and manipulated images. In this dissertation, two CNN models are introduced to recognize seam carving and Gaussian filtering. To train a conventional CNN model for a new, similar image forgery detection task, one must start from scratch. Additionally, many types of tampered image data are challenging to acquire or simulate. Meta-learning is an alternative learning paradigm in which a machine learning model gains experience across numerous related tasks and uses this experience to improve its future learning performance. Few-shot learning is a method for acquiring knowledge from few data. It can classify images with as few as one or two examples per class. Inspired by meta-learning and few-shot learning, this dissertation proposes a prototypical networks model capable of resolving a collection of related image forgery detection problems. Unlike traditional CNN models, the proposed prototypical networks model does not need to be trained from scratch for a new task. Additionally, it drastically decreases the quantity of training images.
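As a rough illustration of the prototypical-network classification step mentioned above, the sketch below averages a few support embeddings per class into a prototype and assigns a query to the nearest prototype by squared Euclidean distance. It is an assumption-laden toy, not the dissertation's model: the two-dimensional embeddings are made up, the embedding network that would normally produce them is omitted, and the function names are hypothetical.

```python
def mean_vector(vectors):
    """Element-wise mean of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def squared_distance(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def build_prototypes(support):
    """Map each class label to the mean of its support embeddings."""
    return {label: mean_vector(vecs) for label, vecs in support.items()}


def classify(query, prototypes):
    """Return the label whose prototype is closest to the query embedding."""
    return min(prototypes, key=lambda label: squared_distance(query, prototypes[label]))


# Hypothetical embeddings: two support examples per class (a 2-shot task).
support = {
    "original": [[0.1, 0.2], [0.3, 0.0]],
    "seam_carved": [[0.9, 1.0], [1.1, 0.8]],
}
prototypes = build_prototypes(support)
print(classify([0.2, 0.1], prototypes))  # original
```

Because a new forgery type only needs a new prototype computed from a handful of examples, no retraining from scratch is required for the classification step itself, which matches the motivation given in the abstract.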

Digital Commons @ New Jersey Institute of Technology (NJIT)

Forensic research on detecting seam carving in digital images

Author: Ye Jingyu
Publication venue: Digital Commons @ NJIT
Publication date: 01/04/2017

Digital images have been playing an important role in our daily life for the last several decades. Naturally, image editing technologies have developed tremendously due to the increasing demands. As a result, digital images can now be easily manipulated on a personal computer or even a cellphone for many purposes, so that the authenticity of digital images has become an important issue. In this dissertation research, four machine learning based forensic methods are presented to detect one of the popular image editing techniques, called ‘seam carving’. To reveal seam carving applied to uncompressed images from the perspective of energy distribution change, an energy based statistical model is proposed as the first work in this dissertation. Features measuring the global energy of images, the remaining optimal seams, and the noise level are extracted from four local derivative pattern (LDP) domains instead of from the original pixel domain to heighten the energy change caused by seam carving. A support vector machine (SVM) based classifier is employed to determine whether an image has been seam carved or not. In the second work, an advanced feature model is presented for seam carving detection by investigating the statistical variation among neighboring pixels. Comprising three types of statistical features, i.e., LDP features, Markov features, and SPAM features, the powerful feature model significantly improved the state-of-the-art accuracy in detecting low carving rate seam carving. After feature selection using SVM based recursive feature elimination (SVM-RFE), the overall performance is further improved with a small number of features selected from the proposed model. Combining the two above-mentioned works, a hybrid feature model is then proposed as the third work to further boost the accuracy in detecting seam carving at low carving rates. The proposed model, which consists of two sets of features that capture energy change and neighboring relationship variation respectively, achieves remarkable performance in revealing seam carving, especially low carving rate seam carving, in digital images. Besides these three hand-crafted feature models, a deep convolutional neural network is designed for seam carving detection. It is the first work that successfully utilizes deep learning technology to solve this forensic problem. The experiments demonstrate much improved performance in cases where the amount of seam carving is small. Although these four pieces of work move seam carving detection ahead substantially, future research with more advanced statistical models or deep neural networks along this line is expected.
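For readers unfamiliar with the manipulation being detected, the following is a minimal sketch of seam carving itself, not of the forensic detectors described above. It assumes a precomputed energy map (hypothetical integer values here; in practice typically a gradient magnitude), finds the minimum-energy vertical seam by dynamic programming, and removes it. The function names are illustrative.

```python
def find_vertical_seam(energy):
    """Return, for each row, the column index of the minimum-energy vertical seam."""
    height, width = len(energy), len(energy[0])
    # cost[y][x] = lowest total energy of any seam ending at pixel (y, x).
    cost = [list(energy[0])]
    for y in range(1, height):
        prev = cost[-1]
        row = []
        for x in range(width):
            lo, hi = max(x - 1, 0), min(x + 1, width - 1)
            row.append(energy[y][x] + min(prev[lo:hi + 1]))
        cost.append(row)
    # Backtrack from the cheapest pixel in the bottom row.
    x = min(range(width), key=lambda c: cost[-1][c])
    seam = [x]
    for y in range(height - 1, 0, -1):
        lo, hi = max(x - 1, 0), min(x + 1, width - 1)
        x = min(range(lo, hi + 1), key=lambda c: cost[y - 1][c])
        seam.append(x)
    seam.reverse()
    return seam


def remove_seam(image, seam):
    """Delete one pixel per row along the seam, narrowing the image by one column."""
    return [row[:x] + row[x + 1:] for row, x in zip(image, seam)]


energy = [
    [5, 1, 9],
    [7, 2, 8],
    [1, 6, 9],
]
seam = find_vertical_seam(energy)
print(seam)                       # [1, 1, 0]
print(remove_seam(energy, seam))  # [[5, 9], [7, 8], [6, 9]]
```

Removing low-energy seams changes the energy distribution and the relationships between neighboring pixels, which is the kind of trace the energy-based and neighboring-pixel features in the abstract are designed to capture.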

Digital Commons @ New Jersey Institute of Technology (NJIT)

Supervised Deep Learning for Content-Aware Image Retargeting with Fourier Convolutions

Author: Givkashi MohammadHossein
Karimi Nader
Naderi MohammadReza
Samavi Shadrokh
Shirani Shahram
Publication date: 12/06/2023

Image retargeting aims to alter the size of the image with attention to the contents. One of the main obstacles to training deep learning models for image retargeting is the need for a vast labeled dataset. Labeled datasets are unavailable for training deep learning models in the image retargeting tasks. As a result, we present a new supervised approach for training deep learning models. We use the original images as ground truth and create inputs for the model by resizing and cropping the original images. A second challenge is generating different image sizes in inference time. However, regular convolutional neural networks cannot generate images of different sizes than the input image. To address this issue, we introduced a new method for supervised learning. In our approach, a mask is generated to show the desired size and location of the object. Then the mask and the input image are fed to the network. Comparing image retargeting methods and our proposed method demonstrates the model's ability to produce high-quality retargeted images. Afterward, we compute the image quality assessment score for each output image based on different techniques and illustrate the effectiveness of our approach. Comment: 18 pages, 5 figures

arXiv.org e-Print Archive

Learning Visual Importance for Graphic Designs and Data Visualizations

Author: Alsheikh Sami
Bylinskii Zoya
Durand Fredo
Hertzmann Aaron
Kim Nam Wook
Madan Spandan
O'Donovan Peter
Pfister Hanspeter
Russell Bryan
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 08/08/2017

Knowing where people look and click on visual designs can provide clues about how the designs are perceived, and where the most important or relevant content lies. The most important content of a visual design can be used for effective summarization or to facilitate retrieval from a database. We present automated models that predict the relative importance of different elements in data visualizations and graphic designs. Our models are neural networks trained on human clicks and importance annotations on hundreds of designs. We collected a new dataset of crowdsourced importance, and analyzed the predictions of our models with respect to ground truth importance and human eye movements. We demonstrate how such predictions of importance can be used for automatic design retargeting and thumbnailing. User studies with hundreds of MTurk participants validate that, with limited post-processing, our importance-driven applications are on par with, or outperform, current state-of-the-art methods, including natural image saliency. We also provide a demonstration of how our importance predictions can be built into interactive design tools to offer immediate feedback during the design process.

arXiv.org e-Print Archive

Crossref