3D Face Reconstruction by Learning from Synthetic Data
Fast and robust three-dimensional reconstruction of facial geometric
structure from a single image is a challenging task with numerous applications.
Here, we introduce a learning-based approach for reconstructing a
three-dimensional face from a single image. Recent face recovery methods rely
on accurate localization of key characteristic points. In contrast, the
proposed approach is based on a Convolutional-Neural-Network (CNN) which
extracts the face geometry directly from its image. Although such deep
architectures outperform other models in complex computer vision problems,
training them properly requires a large dataset of annotated examples. For
three-dimensional faces, no large-scale annotated datasets currently exist, and
acquiring such data is a tedious task. As an alternative, we
propose to generate random, yet nearly photo-realistic, facial images for which
the geometric form is known. The suggested model successfully recovers facial
shapes from real images, even for faces with extreme expressions and under
various lighting conditions. Comment: The first two authors contributed equally to this work.
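The synthetic-data idea above can be sketched in a few lines: because each image is rendered from known geometry, ground-truth supervision comes for free. In this sketch the random linear "renderer", the dimensions, and the least-squares regressor standing in for the paper's CNN are all illustrative assumptions, not the authors' actual pipeline.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, geo_dim, img_dim = 2000, 10, 64

# 1. Sample random facial geometry codes; the ground truth is known exactly.
geometry = rng.normal(size=(n_samples, geo_dim))

# 2. "Render" images from the geometry: a fixed synthetic image-formation
#    model plus noise standing in for lighting and appearance variation.
render = rng.normal(size=(geo_dim, img_dim))
images = geometry @ render + 0.05 * rng.normal(size=(n_samples, img_dim))

# 3. Train a regressor from image -> geometry on the synthetic pairs
#    (the paper trains a CNN; least squares keeps the sketch dependency-free).
W, *_ = np.linalg.lstsq(images, geometry, rcond=None)

# 4. Evaluate geometry recovery on held-out synthetic data.
test_geo = rng.normal(size=(200, geo_dim))
test_img = test_geo @ render + 0.05 * rng.normal(size=(200, img_dim))
err = np.mean((test_img @ W - test_geo) ** 2)
print(f"held-out geometry MSE: {err:.4f}")
```

The point of the sketch is the supervision structure, not the model class: any regressor trained this way only ever sees synthetic pairs, which is why the abstract emphasizes that the renders must be nearly photo-realistic for the model to transfer to real images.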
A Neural Space-Time Representation for Text-to-Image Personalization
A key aspect of text-to-image personalization methods is the manner in which
the target concept is represented within the generative process. This choice
greatly affects the visual fidelity, downstream editability, and disk space
needed to store the learned concept. In this paper, we explore a new
text-conditioning space that is dependent on both the denoising process
timestep (time) and the denoising U-Net layers (space) and showcase its
compelling properties. A single concept in the space-time representation is
composed of hundreds of vectors, one for each combination of time and space,
making this space challenging to optimize directly. Instead, we propose to
implicitly represent a concept in this space by optimizing a small neural
mapper that receives the current time and space parameters and outputs the
matching token embedding. In doing so, the entire personalized concept is
represented by the parameters of the learned mapper, resulting in a compact,
yet expressive, representation. Similarly to other personalization methods, the
output of our neural mapper resides in the input space of the text encoder. We
observe that one can significantly improve the convergence and visual fidelity
of the concept by introducing a textual bypass, where our neural mapper
additionally outputs a residual that is added to the output of the text
encoder. Finally, we show how one can impose an importance-based ordering over
our implicit representation, providing users control over the reconstruction
and editability of the learned concept using a single trained model. We
demonstrate the effectiveness of our approach over a range of concepts and
prompts, showing our method's ability to generate high-quality and controllable
compositions without fine-tuning any parameters of the generative model itself. Comment: Project page available at
https://neuraltextualinversion.github.io/NeTI
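The mapper idea above can be sketched concretely: instead of storing one embedding per (timestep, layer) pair, a tiny network maps the pair to an embedding, so the whole concept is just the mapper's parameters. The sizes, the Fourier encoding, and the bypass head below are illustrative assumptions, not the paper's exact configuration.

```python
import numpy as np

rng = np.random.default_rng(0)
embed_dim, hidden = 768, 128      # token-embedding and hidden sizes (assumed)
n_timesteps, n_layers = 50, 16    # denoising steps x U-Net layers (assumed)

def encode(t, l):
    """Fourier-feature encoding of the (time, space) coordinate pair."""
    x = np.array([t / n_timesteps, l / n_layers])
    freqs = 2.0 ** np.arange(4)
    return np.concatenate([np.sin(np.outer(freqs, x)).ravel(),
                           np.cos(np.outer(freqs, x)).ravel()])  # shape (16,)

# Tiny two-layer mapper with two heads: a token embedding (fed to the text
# encoder's input space) and a textual-bypass residual (added to its output).
W1 = rng.normal(scale=0.1, size=(16, hidden))
W2 = rng.normal(scale=0.1, size=(hidden, embed_dim))
W_bypass = rng.normal(scale=0.1, size=(hidden, embed_dim))

def mapper(t, l):
    h = np.tanh(encode(t, l) @ W1)
    return h @ W2, h @ W_bypass   # (token embedding, bypass residual)

token, bypass = mapper(t=10, l=3)

# Compactness: the mapper's parameters replace the full space-time table of
# hundreds of per-(time, layer) vectors that it implicitly represents.
mapper_params = W1.size + W2.size + W_bypass.size
table_params = n_timesteps * n_layers * embed_dim
print(mapper_params, table_params)
```

In this toy configuration the mapper needs roughly a third of the parameters of the explicit table; the real saving depends on the architecture chosen, but the structural argument is the same.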
NeRN -- Learning Neural Representations for Neural Networks
Neural Representations have recently been shown to effectively reconstruct a
wide range of signals from 3D meshes and shapes to images and videos. We show
that, when adapted correctly, neural representations can be used to directly
represent the weights of a pre-trained convolutional neural network, resulting
in a Neural Representation for Neural Networks (NeRN). Inspired by coordinate
inputs of previous neural representation methods, we assign a coordinate to
each convolutional kernel in our network based on its position in the
architecture, and optimize a predictor network to map coordinates to their
corresponding weights. Similarly to the spatial smoothness of visual scenes, we
show that incorporating a smoothness constraint over the original network's
weights aids NeRN towards a better reconstruction. In addition, since slight
perturbations in pre-trained model weights can result in a considerable
accuracy loss, we employ techniques from the field of knowledge distillation to
stabilize the learning process. We demonstrate the effectiveness of NeRN in
reconstructing widely used architectures on CIFAR-10, CIFAR-100, and ImageNet.
Finally, we present two applications using NeRN, demonstrating the capabilities
of the learned representations.
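The coordinate-to-weight idea can be sketched as follows: each kernel gets a (layer, filter, channel) coordinate, and a predictor is fit to map coordinates to kernel weights. The smoothly varying target weights, the Fourier encoding, and the linear predictor standing in for NeRN's trained network are illustrative assumptions for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)
n_layers, n_filters, n_channels, k = 4, 8, 8, 9  # 3x3 kernels, sizes assumed

# Assign each kernel a coordinate from its position in the architecture.
coords = np.array([(l, f, c)
                   for l in range(n_layers)
                   for f in range(n_filters)
                   for c in range(n_channels)], dtype=float)
coords /= coords.max(axis=0)  # normalize each axis to [0, 1]

# Synthetic target weights that vary smoothly with the coordinates --
# smoothness over the original weights is what the paper's constraint
# encourages, and it is what makes the weights predictable at all.
weights = np.stack([np.sin(2 * coords[:, 0] + p) + np.cos(4 * coords[:, 1] + p)
                    for p in np.linspace(0, np.pi, k)], axis=1)

# Fourier-feature encoding of the kernel coordinates.
freqs = 2.0 ** np.arange(5)
feats = np.concatenate(
    [np.sin(coords[:, :, None] * freqs).reshape(len(coords), -1),
     np.cos(coords[:, :, None] * freqs).reshape(len(coords), -1)], axis=1)

# Fit the predictor coordinates -> weights (least squares stands in for
# NeRN's optimized predictor network) and measure reconstruction error.
P, *_ = np.linalg.lstsq(feats, weights, rcond=None)
err = np.mean((feats @ P - weights) ** 2)
print(f"kernel reconstruction MSE: {err:.2e}")
```

The sketch reconstructs smooth weights essentially exactly; for real pre-trained networks the weights are far less smooth, which is why the paper adds the smoothness constraint and distillation-style stabilization rather than fitting the raw weights directly.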
End-to-end Interpretable Learning of Non-blind Image Deblurring
Non-blind image deblurring is typically formulated as a linear least-squares
problem regularized by natural priors on the corresponding sharp picture's
gradients, which can be solved, for example, using a half-quadratic splitting
method with Richardson fixed-point iterations for its least-squares updates and
a proximal operator for the auxiliary variable updates. We propose to
precondition the Richardson solver using approximate inverse filters of the
(known) blur and natural image prior kernels. Using convolutions instead of a
generic linear preconditioner allows extremely efficient parameter sharing
across the image, and leads to significant gains in accuracy and/or speed
compared to classical FFT and conjugate-gradient methods. More importantly, the
proposed architecture is easily adapted to learning both the preconditioner and
the proximal operator using CNN embeddings. This yields a simple and efficient
algorithm for non-blind image deblurring which is fully interpretable, can be
learned end to end, and whose accuracy matches or exceeds the state of the art,
quite significantly, in the non-uniform case. Comment: Accepted at ECCV 2020 (poster).
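A minimal 1-D sketch of the preconditioned least-squares update: the half-quadratic step solves (KᵀK + λI)x = Kᵀy, and the Richardson iteration x ← x + P(b − Ax) converges much faster when P approximates A⁻¹. For brevity the image-prior term is folded into a simple λI (the paper keeps the prior kernels), and the approximate inverse filter is built by truncating a Neumann series, which is itself a convolution filter; kernel, sizes, and λ are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 64
kernel = np.array([1.0, 4.0, 6.0, 4.0, 1.0]) / 16.0  # known blur kernel

# Sharp signal, circularly blurred and lightly corrupted with noise.
x_true = np.cumsum(rng.normal(size=n))
x_true -= x_true.mean()
K = np.fft.fft(kernel, n)
y = np.real(np.fft.ifft(K * np.fft.fft(x_true))) + 0.001 * rng.normal(size=n)

lam = 0.1                             # prior weight for this splitting step
A_hat = np.abs(K) ** 2 + lam          # spectrum of A = K^T K + lam*I
b = np.real(np.fft.ifft(np.conj(K) * np.fft.fft(y)))  # b = K^T y
x_star = np.real(np.fft.ifft(np.fft.fft(b) / A_hat))  # exact LS solution

# Approximate inverse filter of A via a truncated Neumann series
# P = I + (I - A) + ... + (I - A)^5; a polynomial in a convolution
# is again a convolution, so P can be applied as a filter.
P_hat = np.zeros(n)
term = np.ones(n)
for _ in range(6):
    P_hat += term
    term *= 1.0 - A_hat

def richardson(P_hat, iters):
    """Richardson fixed-point iteration x <- x + P(b - Ax)."""
    x = np.zeros(n)
    for _ in range(iters):
        r = b - np.real(np.fft.ifft(A_hat * np.fft.fft(x)))
        x = x + np.real(np.fft.ifft(P_hat * np.fft.fft(r)))
    return x

err_pre = np.linalg.norm(richardson(P_hat, 10) - x_star)
err_plain = np.linalg.norm(richardson(np.ones(n), 10) - x_star)
print(f"plain: {err_plain:.3e}  preconditioned: {err_pre:.3e}")
```

With the preconditioner the iteration matrix spectrum shrinks from 1 − Â to (1 − Â)⁶ per step, so the same iteration budget yields a far more accurate solve; the paper goes further by learning the preconditioner and the proximal operator with CNN embeddings.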
Cell-Based Sensor System Using L6 Cells for Broad Band Continuous Pollutant Monitoring in Aquatic Environments
Pollution of drinking water sources is a continually growing problem in global environmental protection. Novel techniques for real-time monitoring of water quality, capable of detecting unanticipated toxic and bioactive substances, are urgently needed. In this study, the applicability of a cell-based sensor system using selected eukaryotic cell lines for the detection of aquatic pollutants is shown. The readout parameters were the cells' acidification (metabolism), oxygen consumption (respiration) and impedance (morphology). A variety of potentially cytotoxic substance classes (heavy metals, pharmaceuticals, neurotoxins, waste water) was tested with monolayers of L6 cells (rat myoblasts). The cytotoxicity or cellular effects induced by inorganic ions (Ni2+ and Cu2+) can be detected with the metabolic parameters acidification and respiration down to 0.5 mg/L, whereas the detection limits for other substances such as nicotine and acetaminophen are considerably higher, at roughly 0.1 mg/L and 100 mg/L, respectively. In a close-to-application test, a real waste water sample produced detectable signals, indicating the presence of cytotoxic substances. The results support the paradigm shift from single-substance detection to the monitoring of overall toxicity.