Texture Synthesis for Mobile Data Communications
A digital camera mounted on a mobile phone is utilized as a data input device to obtain embedded data by analyzing the pattern of an image code such as a 2D bar code. This article proposes a new type of image coding method using texture image synthesis. A regularly arranged dotted pattern is first painted with colors picked out from a texture sample, so that its features correspond to the embedded data. Our texture synthesis technique then camouflages the dotted pattern using the same texture sample while preserving quality comparable to that of existing synthesis techniques. The textured code gives the conventional bar code an aesthetic appeal and is used for tagging data onto real texture objects, which can form a basis for ubiquitous mobile data communications. This technical approach has the potential to open new application fields for example-based, computer-generated texture images.
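The encoding step described above can be sketched as follows. This is a minimal illustration, not the paper's method: the two-color palette, grid size, and nearest-color decoding rule are all assumptions, and the texture-synthesis camouflage step is omitted entirely.

```python
import numpy as np

def encode_bits_to_dot_grid(bits, palette, grid=(8, 8)):
    """Paint a regular dot grid with colors chosen by the data bits.

    Each dot encodes one bit by picking one of two colors assumed to
    be sampled from a texture (hypothetical palette, not the paper's).
    """
    rows, cols = grid
    assert len(bits) <= rows * cols
    img = np.zeros((rows, cols, 3), dtype=np.uint8)
    for i, b in enumerate(bits):
        r, c = divmod(i, cols)
        img[r, c] = palette[b]
    return img

def decode_dot_grid(img, palette):
    """Recover bits by nearest-palette-color matching per dot."""
    flat = img.reshape(-1, 3).astype(int)
    d0 = np.abs(flat - palette[0]).sum(axis=1)
    d1 = np.abs(flat - palette[1]).sum(axis=1)
    return (d1 < d0).astype(int).tolist()
```

A round trip (encode, then decode) recovers the embedded bits as long as the two palette colors stay distinguishable after any camouflaging distortion.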
CopyRNeRF: Protecting the CopyRight of Neural Radiance Fields
Neural Radiance Fields (NeRF) have the potential to be a major representation
of media. Since training a NeRF has never been an easy task, the protection of
its model copyright should be a priority. In this paper, by analyzing the pros
and cons of possible copyright protection solutions, we propose to protect the
copyright of NeRF models by replacing the original color representation in NeRF
with a watermarked color representation. Then, a distortion-resistant rendering
scheme is designed to guarantee robust message extraction in 2D renderings of
NeRF. Our proposed method can directly protect the copyright of NeRF models
while maintaining high rendering quality and bit accuracy compared with
alternative solutions.
Comment: 11 pages, 6 figures, accepted by ICCV 2023 (non-camera-ready version)
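As a loose analogy only (not the paper's learned watermarked color representation), a fixed spread-spectrum perturbation of rendered colors illustrates how a message can ride on color values and survive averaging over many samples. The carriers, strength, and correlation-based extraction rule below are all hypothetical.

```python
import numpy as np

def watermark_colors(colors, bits, strength=0.02, seed=1):
    """Toy spread-spectrum watermark on rendered RGB samples.

    Each message bit modulates a pseudo-random +/-1 carrier that is
    added, very faintly, to every color sample.
    """
    n = colors.shape[0]
    rng = np.random.default_rng(seed)
    carriers = rng.choice([-1.0, 1.0], size=(len(bits), n))
    signal = np.zeros(n)
    for b, c in zip(bits, carriers):
        signal += (1.0 if b else -1.0) * c
    wm = np.clip(colors + strength * signal[:, None] / len(bits), 0.0, 1.0)
    return wm, carriers

def extract_bits(wm_colors, colors, carriers):
    """Correlate the color residual with each carrier to read the bits."""
    residual = (wm_colors - colors).mean(axis=1)
    return [int(c @ residual > 0) for c in carriers]
```

The point of the analogy is robustness: because each bit is spread over many samples, extraction tolerates per-sample noise, which is the property the paper's distortion-resistant rendering scheme is designed to guarantee for 2D renderings.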
Human-Centric Deep Generative Models: The Blessing and The Curse
Over the past years, deep neural networks have achieved significant progress in a wide range of real-world applications. In particular, my research focuses on deep generative models, a neural network solution that proves effective in visual (re)creation. But is generative modeling a niche topic that should be researched on its own? My answer is emphatically no. In this thesis, I present the two sides of deep generative models: their blessing and their curse to human beings. Regarding what deep generative models can do for us, I demonstrate improvements in the performance and steerability of visual (re)creation. Regarding what we can do for deep generative models, my answer is to mitigate the security concerns of DeepFakes and to improve the minority inclusion of deep generative models.
For the performance of deep generative models, I explore applying attention modules and dual contrastive loss to generative adversarial networks (GANs), which pushes photorealistic image generation to a new state of the art. For the steerability, I introduce Texture Mixer, a simple yet effective approach to achieve steerable texture synthesis and blending. For the security, my research spans a series of GAN fingerprinting solutions that enable the detection and attribution of GAN-generated image misuse. For the inclusion, I investigate the biased misbehavior of generative models and present my solution for enhancing the minority inclusion of GAN models over underrepresented image attributes. All in all, I propose to project actionable insights onto the applications of deep generative models, and finally contribute to human-generator interaction.
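As one concrete ingredient, a dual contrastive loss for the discriminator can be sketched in a simplified, framework-free form: each real logit is treated as a positive contrasted against all fake logits, and, dually, each negated fake logit against all negated real logits. The exact weighting and batching in the actual work may differ.

```python
import numpy as np

def dual_contrastive_d_loss(real_logits, fake_logits):
    """Simplified dual contrastive discriminator loss (illustrative)."""
    def softmax_ce(pos, negs):
        # -log( e^pos / (e^pos + sum_j e^negs_j) ), computed stably
        z = np.concatenate(([pos], negs))
        m = z.max()
        return -(pos - m) + np.log(np.exp(z - m).sum())

    # direction 1: each real sample against the batch of fakes
    loss_real = np.mean([softmax_ce(r, fake_logits) for r in real_logits])
    # direction 2 (the "dual"): each fake against the batch of reals,
    # with all logits negated so fakes play the positive role
    loss_fake = np.mean([softmax_ce(-f, -real_logits) for f in fake_logits])
    return loss_real + loss_fake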
Data Hiding with Deep Learning: A Survey Unifying Digital Watermarking and Steganography
Data hiding is the process of embedding information into a noise-tolerant
signal such as a piece of audio, video, or image. Digital watermarking is a
form of data hiding where identifying data is robustly embedded so that it can
resist tampering and be used to identify the original owners of the media.
Steganography, another form of data hiding, embeds data for the purpose of
secure and secret communication. This survey summarises recent developments in
deep learning techniques for data hiding for the purposes of watermarking and
steganography, categorising them based on model architectures and noise
injection methods. The objective functions, evaluation metrics, and datasets
used for training these data hiding models are comprehensively summarised.
Finally, we propose and discuss possible future directions for research into
deep data hiding techniques.
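For orientation, the classical hand-crafted baseline that the surveyed encoder-decoder models generalize is least-significant-bit (LSB) embedding. The sketch below is that fixed baseline, not one of the learned models from the survey.

```python
import numpy as np

def embed_lsb(cover, bits):
    """Write each message bit into the least significant bit of a pixel.

    Deep data-hiding models replace this fixed rule with a learned
    encoder network, and insert a noise layer between encoder and
    decoder so the embedding survives distortion.
    """
    flat = cover.flatten()  # flatten() returns a copy
    flat[: len(bits)] = (flat[: len(bits)] & 0xFE) | np.asarray(bits, dtype=np.uint8)
    return flat.reshape(cover.shape)

def extract_lsb(stego, n_bits):
    """The matching fixed 'decoder': read back the low bit of each pixel."""
    return (stego.flatten()[:n_bits] & 1).tolist()
```

The embedding changes each touched pixel by at most one intensity level, which is why LSB is imperceptible, and also why it is fragile: any noise that flips low bits destroys the message, motivating the noise-injection training the survey categorizes.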
VGFlow: Visibility guided Flow Network for Human Reposing
The task of human reposing involves generating a realistic image of a person
standing in an arbitrary conceivable pose. There are multiple difficulties in
generating perceptually accurate images, and existing methods suffer from
limitations in preserving texture, maintaining pattern coherence, respecting
cloth boundaries, handling occlusions, manipulating skin generation, etc. These
difficulties are further exacerbated by the fact that the possible space of
pose orientation for humans is large and variable, the nature of clothing items
is highly non-rigid, and the diversity in body shape differs largely among the
population. To alleviate these difficulties and synthesize perceptually
accurate images, we propose VGFlow. Our model uses a visibility-guided flow
module to disentangle the flow into visible and invisible parts of the target
for simultaneous texture preservation and style manipulation. Furthermore, to
tackle distinct body shapes and avoid network artifacts, we also incorporate a
self-supervised patch-wise "realness" loss to improve the output. VGFlow
achieves state-of-the-art results as observed qualitatively and quantitatively
on different image quality metrics (SSIM, LPIPS, FID).
Comment: 9 pages, 18 figures, computer vision
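The core mechanic of flow-based reposing, backward warping of source texture plus a visibility mask that routes invisible target regions to generated content, can be sketched with nearest-neighbor warping. VGFlow's flow and visibility are predicted by networks; the integer flow convention and the `fill` stand-in for generated style below are illustrative assumptions.

```python
import numpy as np

def warp_with_visibility(src, flow, visibility, fill):
    """Backward-warp `src` by an integer flow field, keeping warped
    pixels only where `visibility` marks the target as visible in the
    source; invisible regions fall back to `fill` (a stand-in for the
    generated, style-manipulated content).

    flow[..., 0] is the row offset, flow[..., 1] the column offset.
    """
    h, w = src.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w]
    sy = np.clip(ys + flow[..., 0], 0, h - 1)
    sx = np.clip(xs + flow[..., 1], 0, w - 1)
    warped = src[sy, sx]
    return np.where(visibility[..., None], warped, fill)
```

Disentangling the two branches is what lets the visible branch preserve texture exactly while the invisible branch is free to synthesize plausible content.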
Visual Privacy Protection Methods: A Survey
Recent advances in computer vision technologies have made possible the development of intelligent monitoring systems for video surveillance and ambient-assisted living. By using this technology, these systems are able to automatically interpret visual data from the environment and perform tasks that would have been unthinkable years ago. These achievements represent a radical improvement, but they also pose a new threat to individuals' privacy. The new capabilities of such systems give them the ability to collect and index a huge amount of private information about each individual. Next-generation systems have to solve this issue in order to obtain users' acceptance. Therefore, there is a need for mechanisms or tools to protect and preserve people's privacy. This paper seeks to clarify how privacy can be protected in imagery data; as its main contribution, it provides a comprehensive classification of the protection methods for visual privacy as well as an up-to-date review of them. A survey of the existing privacy-aware intelligent monitoring systems and a discussion of important aspects of visual privacy are also provided.
This work has been partially supported by the Spanish Ministry of Science and Innovation under project “Sistema de visión para la monitorización de la actividad de la vida diaria en el hogar” (TIN2010-20510-C04-02) and by the European Commission under project “caring4U - A study on people activity in private spaces: towards a multisensor network that meets privacy requirements” (PIEF-GA-2010-274649). José Ramón Padilla López and Alexandros Andre Chaaraoui acknowledge financial support from the Conselleria d'Educació, Formació i Ocupació of the Generalitat Valenciana (fellowships ACIF/2012/064 and ACIF/2011/160, respectively).
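Among the filter-based redaction methods such a classification typically covers, pixelation is the simplest to state: replace each block of a sensitive region (e.g. a detected face) by its average. The region box and block size below are illustrative choices.

```python
import numpy as np

def pixelate_region(img, box, block=8):
    """Pixelation: a filter-based visual privacy method that redacts a
    region by replacing each block x block tile with its mean value."""
    y0, y1, x0, x1 = box
    out = img.copy()
    region = out[y0:y1, x0:x1].astype(float)
    h, w = region.shape[:2]
    for by in range(0, h, block):
        for bx in range(0, w, block):
            tile = region[by:by + block, bx:bx + block]
            tile[...] = tile.mean(axis=(0, 1))  # broadcast mean over the tile
    out[y0:y1, x0:x1] = region.astype(img.dtype)
    return out
```

The block size controls the privacy/utility trade-off the paper discusses: larger blocks destroy more identity information but also more scene context.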
TimbreTron: A WaveNet(CycleGAN(CQT(Audio))) Pipeline for Musical Timbre Transfer
In this work, we address the problem of musical timbre transfer, where the
goal is to manipulate the timbre of a sound sample from one instrument to match
another instrument while preserving other musical content, such as pitch,
rhythm, and loudness. In principle, one could apply image-based style transfer
techniques to a time-frequency representation of an audio signal, but this
depends on having a representation that allows independent manipulation of
timbre as well as high-quality waveform generation. We introduce TimbreTron, a
method for musical timbre transfer which applies "image" domain style transfer
to a time-frequency representation of the audio signal, and then produces a
high-quality waveform using a conditional WaveNet synthesizer. We show that the
Constant Q Transform (CQT) representation is particularly well-suited to
convolutional architectures due to its approximate pitch equivariance. Based on
human perceptual evaluations, we confirmed that TimbreTron recognizably
transferred the timbre while otherwise preserving the musical content, for both
monophonic and polyphonic samples.
Comment: 17 pages, published as a conference paper at ICLR 2019
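The approximate pitch equivariance of the CQT is easy to verify numerically: its bins are geometrically spaced in frequency, so a k-semitone pitch shift corresponds (for ideal tones) to a k-bin translation along the frequency axis, which a convolution can track. The `fmin` (C1 ≈ 32.70 Hz) and bin counts below are conventional choices, not taken from the paper.

```python
import numpy as np

def cqt_center_freqs(fmin=32.70, n_bins=48, bins_per_octave=12):
    """Geometrically spaced CQT bin center frequencies."""
    return fmin * 2.0 ** (np.arange(n_bins) / bins_per_octave)

freqs = cqt_center_freqs()
# shifting pitch by 7 semitones multiplies frequency by 2**(7/12),
# which on this geometric grid is exactly a 7-bin translation
assert np.allclose(freqs[7:], freqs[:-7] * 2 ** (7 / 12))
```

This is the property a linearly spaced spectrogram lacks: there, a pitch shift stretches the frequency axis instead of translating it, so convolutional filters cannot reuse the same pattern across pitches.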
Learning Iterative Neural Optimizers for Image Steganography
Image steganography is the process of concealing secret information in images
through imperceptible changes. Recent work has formulated this task as a
classic constrained optimization problem. In this paper, we argue that image
steganography is inherently performed on the (elusive) manifold of natural
images, and propose an iterative neural network trained to perform the
optimization steps. In contrast to classical optimization methods like L-BFGS
or projected gradient descent, we train the neural network to also stay close
to the manifold of natural images throughout the optimization. We show that our
learned neural optimization is faster and more reliable than classical
optimization approaches. In comparison to previous state-of-the-art
encoder-decoder-based steganography methods, it reduces the recovery error rate
by multiple orders of magnitude and achieves zero error up to 3 bits per pixel
(bpp) without the need for error-correcting codes.
Comment: International Conference on Learning Representations (ICLR) 2023
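The classical constrained-optimization formulation the paper starts from can be sketched with projected gradient descent: perturb the cover signal inside a small L-infinity ball until a fixed decoder reads out the message. The linear "decoder" W, hinge margin, and step sizes below are toy illustrations; the paper's setting uses images, a neural decoder, and a learned iterative optimizer in place of PGD.

```python
import numpy as np

def pgd_embed(x, bits, W, steps=50, lr=0.1, eps=0.5):
    """Find delta with ||delta||_inf <= eps so that sign(W @ (x + delta))
    encodes `bits`, via projected (sub)gradient descent on a hinge loss."""
    target = 2 * np.asarray(bits, dtype=float) - 1  # bits -> +/-1
    delta = np.zeros_like(x)
    for _ in range(steps):
        logits = W @ (x + delta)
        # hinge: penalize any logit not past margin 1 in the target direction
        grad = W.T @ (-target * (target * logits < 1))
        delta = np.clip(delta - lr * grad, -eps, eps)  # project to the ball
    return x + delta

def decode(stego, W):
    """Read the message back as the signs of the decoder outputs."""
    return (W @ stego > 0).astype(int).tolist()
```

The paper's observation is that steps like these ignore where natural images live; its learned optimizer takes the same kind of iterative steps while staying near the natural-image manifold, which is what makes recovery far more reliable.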