21 research outputs found

    Estimation of Scribble Placement for Painting Colorization

    Image colorization has been a topic of interest since the mid-1970s, and several algorithms have been proposed that, given a grayscale image and color scribbles (hints), produce a colorized image. Recently, this approach has been introduced in the field of art conservation and cultural heritage, where B&W photographs of paintings at previous stages have been colorized. However, the questions of how many scribbles are necessary at a minimum and where they should be placed in an image remain unexplored. Here we address this limitation using an iterative algorithm that provides insights into the relationship between locally and globally important scribbles. Given a color image, we randomly select scribbles and attempt to color the grayscale version of the original. We define a scribble contribution measure based on the reconstruction error. We demonstrate our approach using a widely used colorization algorithm and images from a Picasso painting and the peppers test image. We show that areas isolated by thick brushstrokes or areas with high textural variation are locally important but contribute very little to the overall representation accuracy. We also find that, for the Picasso painting, on average 10% scribble coverage is enough and that flat areas can be represented by a few scribbles. The proposed method can be used verbatim to test any colorization algorithm.
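To make the idea of a scribble contribution measure based on reconstruction error concrete, here is a minimal sketch on a 1-D "image". It is not the authors' algorithm: the leave-one-out scoring, the toy nearest-scribble colorizer, and all names (`nearest_scribble_colorize`, `scribble_contributions`) are assumptions chosen for illustration.

```python
from typing import Callable, Dict, List

Scribbles = Dict[int, float]  # pixel index -> known color value (hypothetical format)


def nearest_scribble_colorize(n_pixels: int, scribbles: Scribbles) -> List[float]:
    """Toy stand-in colorizer: each pixel copies the color of its nearest scribble."""
    if not scribbles:
        return [0.0] * n_pixels
    positions = sorted(scribbles)
    return [scribbles[min(positions, key=lambda p: abs(p - i))] for i in range(n_pixels)]


def mse(a: List[float], b: List[float]) -> float:
    """Mean squared reconstruction error between two equally long signals."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def scribble_contributions(
    original: List[float],
    scribbles: Scribbles,
    colorize: Callable[[int, Scribbles], List[float]] = nearest_scribble_colorize,
) -> Dict[int, float]:
    """Assumed measure: error increase when one scribble is left out."""
    n = len(original)
    base_error = mse(original, colorize(n, scribbles))
    contributions = {}
    for pos in scribbles:
        reduced = {p: c for p, c in scribbles.items() if p != pos}
        contributions[pos] = mse(original, colorize(n, reduced)) - base_error
    return contributions


# Two flat regions: the left one carries two scribbles, the right one a single scribble.
original = [0.0] * 4 + [1.0] * 4
scribbles = {1: 0.0, 2: 0.0, 5: 1.0}
contributions = scribble_contributions(original, scribbles)
```

In this toy case the two scribbles in the left flat region are redundant with each other, while the single scribble in the right region carries the whole reconstruction. Any real colorization algorithm could be passed in through the `colorize` parameter.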

    Exploiting temporal stability and low-rank structure for motion capture data refinement

    Inspired by the development of matrix completion theory and algorithms, a low-rank-based motion capture (mocap) data refinement method has been developed and has achieved encouraging results. However, considering only the low-rank property of the motion data does not guarantee a stable outcome. To solve this problem, we propose to exploit the temporal stability of human motion and convert the mocap data refinement problem into a robust matrix completion problem, in which the low-rank structure and temporal stability of the mocap data, as well as the effect of noise, are all considered. An efficient optimization method derived from the augmented Lagrange multiplier algorithm is presented to solve the proposed model. In addition, a trust data detection method is introduced to increase the degree of automation when processing the entire data set and to boost performance. Extensive experiments and comparisons with other methods demonstrate the effectiveness of our approach in both predicting missing data and de-noising. © 2014 Elsevier Inc. All rights reserved.
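One generic way to write a robust matrix completion problem that combines a low-rank term, a sparse noise term, and a temporal smoothness term is sketched below. This is an assumed illustrative form, not the model from the paper: the symbols $M$ (observed mocap matrix, one frame per column), $\Omega$ (set of trusted entries), $X$ (refined data), $E$ (noise), $D$ (a frame-to-frame difference operator), and the weights $\lambda, \gamma$ are all introduced here for illustration.

```latex
\min_{X,\,E}\; \|X\|_{*} \;+\; \lambda \,\|E\|_{1} \;+\; \frac{\gamma}{2}\,\|X D\|_{F}^{2}
\quad \text{s.t.} \quad P_{\Omega}(X + E) = P_{\Omega}(M)
```

Here the nuclear norm $\|X\|_{*}$ encourages low rank, $\|E\|_{1}$ absorbs sparse noise, and $\|X D\|_{F}^{2}$ penalizes large differences between consecutive frames. Problems of this shape are commonly solved with augmented Lagrange multiplier schemes.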

    RecolorNeRF: Layer Decomposed Radiance Fields for Efficient Color Editing of 3D Scenes

    Radiance fields have gradually become a main representation of media. Although their appearance editing has been studied, how to achieve view-consistent recoloring efficiently is still underexplored. We present RecolorNeRF, a novel user-friendly color editing approach for neural radiance fields. Our key idea is to decompose the scene into a set of pure-colored layers, forming a palette. Color manipulation can then be conducted by directly altering the color components of the palette. To support efficient palette-based editing, the color of each layer needs to be as representative as possible. The problem is ultimately formulated as an optimization problem in which the layers and their blending weights are jointly optimized with the NeRF itself. Extensive experiments show that our jointly optimized layer decomposition can be used with multiple backbones and produces photo-realistic recolored novel-view renderings. We demonstrate that RecolorNeRF outperforms baseline methods both quantitatively and qualitatively for color editing, even in complex real-world scenes. Comment: To appear in ACM Multimedia 2023. Project website is accessible at https://sites.google.com/view/recolorner
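The palette idea can be illustrated with a small sketch: if a pixel's color is a weighted blend of pure-colored layers, recoloring amounts to swapping a palette entry while keeping the weights. The convex-combination blending and the names `blend`, `palette`, and `weights` are assumptions for illustration; the blending actually used in RecolorNeRF may differ.

```python
from typing import List, Sequence, Tuple

Color = Tuple[float, float, float]  # RGB in [0, 1]


def blend(weights: Sequence[float], palette: List[Color]) -> Color:
    """Blend palette colors with per-layer weights (assumed to sum to 1)."""
    return tuple(
        sum(w * color[channel] for w, color in zip(weights, palette))
        for channel in range(3)
    )


# One pixel that is 25% layer 0 (red) and 75% layer 1 (blue).
weights = (0.25, 0.75)
palette = [(1.0, 0.0, 0.0), (0.0, 0.0, 1.0)]
original_pixel = blend(weights, palette)

# Recolor by editing only the palette: layer 0 becomes green, weights are untouched.
edited_palette = [(0.0, 1.0, 0.0), palette[1]]
recolored_pixel = blend(weights, edited_palette)
```

Because the weights stay fixed, a single palette edit propagates consistently to every pixel (and, in a 3D setting, every view) that uses that layer.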

    Data-driven approaches for interactive appearance editing

    This thesis proposes several techniques for interactive editing of digital content and fast rendering of virtual 3D scenes. Editing digital content, such as images or 3D scenes, is difficult and requires artistic talent and technical expertise. To alleviate these difficulties, we exploit data-driven approaches that use easily accessible Internet data (e.g., images, videos, materials) to develop new tools for digital content manipulation. Our proposed techniques allow casual users to achieve high-quality editing by interactively exploring the manipulations without needing to understand the underlying physical models of appearance. First, the thesis presents a fast algorithm for realistic image synthesis of virtual 3D scenes. This serves as the core framework for a new method that allows artists to fine-tune the appearance of a rendered 3D scene. Here, artists directly paint the final appearance and the system automatically solves for the material parameters that best match the desired look. Along this line, an example-based material assignment approach is proposed, in which the 3D models of a virtual scene can be "materialized" simply by providing a guidance source (image or video). Next, the thesis proposes shape and color subspaces of an object that are learned from a collection of exemplar images. These subspaces can be used to constrain image manipulations to valid shapes and colors, or to provide suggestions for manipulations. Finally, data-driven color manifolds that contain the colors of a specific context are proposed. Such color manifolds can be used to improve color picking performance, color stylization, compression or white balancing.

    Exploring information retrieval using image sparse representations: from circuit designs and acquisition processes to specific reconstruction algorithms

    New advances in the field of image sensors (especially in CMOS technology) call into question the conventional methods used to acquire images. Compressive Sensing (CS) plays a major role in this, especially in relieving the analog-to-digital converters (ADCs), which are generally the bottleneck of this type of sensor. In addition, CS eliminates the traditional compression stages performed by dedicated embedded digital signal processors (DSPs). The interest is twofold: CS makes it possible both to consistently reduce the amount of data to be converted and to remove digital processing performed off the sensor chip. For the moment, the main route of exploration and the intended applications of CS in image sensors aim at reducing the power consumption of these components (i.e. ADC & DSP represent 99% of the total power consumption). More broadly, the CS paradigm makes it possible to question, or at least to extend, the Nyquist-Shannon sampling theory. This thesis presents developments in the field of image sensors demonstrating that it is possible to consider alternative applications linked to CS. Indeed, advances are presented in the fields of hyperspectral imaging, super-resolution, high dynamic range, high speed and non-uniform sampling. In particular, three research axes have been explored in depth, aiming to design suitable architectures and acquisition processes, together with their associated reconstruction techniques, that take advantage of sparse image representations. How can the on-chip implementation of Compressed Sensing relax sensor constraints and improve the acquisition characteristics (speed, dynamic range, power consumption)? How can CS be combined with simple analysis to provide useful image features for high-level applications (adding semantic information) and improve the reconstructed image quality at a given compression ratio? Finally, how can CS overcome physical limitations (i.e. spectral sensitivity and pixel pitch) of imaging systems without a major impact on either the sensing strategy or the optical elements involved? A CMOS image sensor was developed and manufactured during this Ph.D. to validate concepts such as High Dynamic Range CS. A new design approach was employed, resulting in innovative solutions for pixel addressing and conversion that perform specific acquisition in a compressed mode. In addition, the principle of adaptive CS combined with non-uniform sampling has been developed, and possible implementations of this type of acquisition are proposed. Finally, preliminary work is presented on the use of Liquid Crystal Devices to enable hyperspectral imaging combined with spatial super-resolution. The conclusion of this study can be summarized as follows: CS must now be considered a toolbox for more easily defining compromises between the different characteristics of the sensors: integration time, converter speed, dynamic range, resolution and digital processing resources. However, although CS relaxes some hardware constraints at the sensor level, the collected data may be difficult to interpret and process at the decoder side, requiring massive computational resources compared to so-called conventional techniques. The application field is wide, implying that, for a targeted application, the constraints on both the sensor (encoder) and the decoder need to be accurately characterized.
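The core trade-off of CS, fewer measurements at the encoder in exchange for a reconstruction step at the decoder, can be shown with a deliberately tiny sketch. It assumes a 1-sparse signal, a random Gaussian sensing matrix, and a single matching step for recovery; none of this is taken from the thesis, and the names `measure` and `recover_one_sparse` are hypothetical.

```python
import random
from typing import List, Tuple

Matrix = List[List[float]]


def measure(phi: Matrix, x: List[float]) -> List[float]:
    """Encoder: compressed measurements y = phi @ x (fewer rows than samples)."""
    return [sum(p * v for p, v in zip(row, x)) for row in phi]


def recover_one_sparse(phi: Matrix, y: List[float]) -> Tuple[int, float]:
    """Decoder: pick the column best correlated with y, then fit its amplitude."""
    n = len(phi[0])
    best_index, best_score, best_amplitude = -1, -1.0, 0.0
    for j in range(n):
        column = [row[j] for row in phi]
        dot = sum(c * v for c, v in zip(column, y))
        norm_sq = sum(c * c for c in column)
        score = abs(dot) / norm_sq ** 0.5  # normalized correlation with y
        if score > best_score:
            best_index, best_score, best_amplitude = j, score, dot / norm_sq
    return best_index, best_amplitude


rng = random.Random(0)  # fixed seed so the sketch is repeatable
n_samples, n_measurements = 8, 3
phi = [[rng.gauss(0.0, 1.0) for _ in range(n_samples)] for _ in range(n_measurements)]

x = [0.0] * n_samples
x[4] = 2.5  # the only non-zero sample
y = measure(phi, x)  # 3 numbers instead of 8
index, amplitude = recover_one_sparse(phi, y)
```

Signals with more than one non-zero coefficient need iterative or optimization-based decoders, which is where the decoder-side computational cost mentioned in the abstract comes from.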

    Foundations, Inference, and Deconvolution in Image Restoration

    Image restoration is a critical preprocessing step in computer vision, producing images with reduced noise, blur, and pixel defects. This enables precise higher-level reasoning about the scene content in later stages of the vision pipeline (e.g., object segmentation, detection, recognition, and tracking). Restoration techniques are used extensively in a broad range of applications in industry, medicine, astronomy, biology, and photography. The recovery of high-grade results requires models of the image degradation process, giving rise to a class of often heavily underconstrained inverse problems. A further challenge specific to blur removal is noise amplification, which may cause strong distortion through ringing artifacts. This dissertation presents new insights and problem-solving procedures for three areas of image restoration, namely (1) model foundations, (2) Bayesian inference for high-order Markov random fields (MRFs), and (3) blind image deblurring (deconvolution). As basic research on model foundations, we contribute to reconciling the perceived differences between probabilistic MRFs on the one hand and deterministic variational models on the other. To do so, we restrict the variational functional to locally supported finite elements (FE) and integrate over the domain. This yields a sum of terms depending locally on FE basis coefficients, and by identifying the latter with pixels, the terms resolve to MRF potential functions. In contrast with previous literature, we place special emphasis on the robust regularizers commonly used in contemporary computer vision. Moreover, we draw samples from the derived models to further demonstrate the probabilistic connection. Another focal issue is a class of high-order Field of Experts MRFs, which are learned generatively from natural image data and yield the best quantitative results under Bayesian estimation. This involves minimizing an integral expression, which has no closed-form solution in general. However, the MRF class under study has Gaussian mixture potentials, permitting expansion by indicator variables as a technical measure. As an approximate inference method, we study Gibbs sampling in the context of non-blind deblurring and obtain excellent results, albeit at the cost of high computational effort. In response, we turn to the mean field algorithm and show that it scales quadratically in the clique size for a standard restoration setting with a linear degradation model. An empirical study of mean field over several restoration scenarios confirms advantageous properties with regard to both image quality and computational runtime. This dissertation further examines the problem of blind deconvolution, beginning with localized blur from fast-moving objects in the scene or from camera defocus. Forgoing dedicated hardware or user labels, we rely only on the image as input and introduce a latent variable model to explain the non-uniform blur. The inference procedure estimates freely varying kernels, and we demonstrate its generality through extensive experiments. We further present a discriminative method for blind removal of camera shake. In particular, we interleave discriminative non-blind deconvolution steps with kernel estimation and leverage the error cancellation effects of the Regression Tree Field model to attain a deblurring process with tightly linked sequential stages.
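The noise amplification problem mentioned for blur removal can be seen at a single frequency. The sketch below compares naive inverse filtering with the classical Wiener-style regularized filter; this is a textbook illustration swapped in for clarity, not one of the dissertation's methods, and the values of `H`, `X`, `noise`, and `nsr` are arbitrary assumptions.

```python
def inverse_filter(Y: complex, H: complex) -> complex:
    """Naive deconvolution at one frequency: divide the observation by the blur response."""
    return Y / H


def wiener_filter(Y: complex, H: complex, nsr: float) -> complex:
    """Regularized deconvolution; nsr is an assumed noise-to-signal power ratio."""
    return Y * H.conjugate() / (abs(H) ** 2 + nsr)


# A frequency that the blur almost wipes out (|H| is tiny), plus a little noise.
X = complex(1.0, 0.0)    # true spectrum value
H = complex(0.01, 0.0)   # blur response at this frequency
noise = complex(0.05, 0.0)
Y = H * X + noise        # observed, blurred and noisy

inverse_error = abs(inverse_filter(Y, H) - X)    # noise is divided by 0.01
wiener_error = abs(wiener_filter(Y, H, nsr=0.01) - X)
```

Dividing by a small `H` multiplies the noise by a large factor, whereas the regularized filter damps such frequencies instead of boosting them. Prior-based approaches such as the MRF models discussed in the abstract address the same instability with much richer image models.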