45 research outputs found

    Applying a Color Palette with Local Control using Diffusion Models

    We demonstrate two novel editing procedures in the context of fantasy card art. Palette transfer applies a specified reference palette to a given card. For fantasy art, the desired change in palette can be very large, leading to huge changes in the "look" of the art. We demonstrate that a pipeline of vector quantization, matching, and "vector dequantization" (using a diffusion model) produces successful extreme palette transfers. Segment control allows an artist to move one or more image segments and, optionally, to specify the desired color of the result. Combining these two types of edit yields valuable workflows, including: move a segment, then recolor; recolor, then force some segments to take a prescribed color. We demonstrate our methods on the challenging Yu-Gi-Oh card art dataset. Comment: 11 pages, 8 figures
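
The first two stages of the pipeline described above (vector quantization, then matching) can be sketched roughly as follows. This is a minimal illustration, not the authors' method: the abstract does not say how palettes are matched, so the luminance-rank rule here is an assumption, the function names are hypothetical, and the diffusion-based "vector dequantization" stage is omitted entirely.

```python
import numpy as np

def quantize(pixels, k, iters=10, seed=0):
    """Plain k-means vector quantization of an (N, 3) array of RGB pixels."""
    rng = np.random.default_rng(seed)
    centers = pixels[rng.choice(len(pixels), size=k, replace=False)].astype(float)
    for _ in range(iters):
        # Assign each pixel to its nearest palette entry.
        dists = np.linalg.norm(pixels[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each palette entry to the mean of the pixels assigned to it.
        for j in range(k):
            if np.any(labels == j):
                centers[j] = pixels[labels == j].mean(axis=0)
    # Final assignment against the updated palette.
    dists = np.linalg.norm(pixels[:, None, :] - centers[None, :, :], axis=2)
    return centers, dists.argmin(axis=1)

def luminance(colors):
    """Approximate perceived brightness of RGB colors (Rec. 601 weights)."""
    return colors @ np.array([0.299, 0.587, 0.114])

def transfer_palette(pixels, reference_palette, iters=10):
    """Quantize the pixels, then swap in reference colors of equal brightness rank."""
    k = len(reference_palette)
    centers, labels = quantize(pixels, k, iters)
    # Assumed matching rule: the i-th darkest source color maps to the
    # i-th darkest reference color.
    source_rank = np.argsort(np.argsort(luminance(centers)))
    reference_sorted = reference_palette[np.argsort(luminance(reference_palette))]
    return reference_sorted[source_rank][labels]
```

The result is a flat, posterized recoloring; in the pipeline described in the abstract, a diffusion model would then restore detail from this quantized image.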

    Coarse-grained Multiresolution Structures for Mobile Exploration of Gigantic Surface Models

    We discuss our experience in creating scalable systems for distributing and rendering gigantic 3D surfaces in web environments and on common handheld devices. Our methods are based on compressed, streamable, coarse-grained multiresolution structures. By combining CPU and GPU compression technology with our multiresolution data representation, we are able to incrementally transfer, locally store, and render extremely detailed 3D mesh models with unprecedented performance on WebGL-enabled browsers, as well as on hardware-constrained mobile devices.

    Graph Spectral Image Processing

    The recent advent of graph signal processing (GSP) has spurred intensive studies of signals that live naturally on irregular data kernels described by graphs (e.g., social networks, wireless sensor networks). Though a digital image contains pixels that reside on a regularly sampled 2D grid, if one can design an appropriate underlying graph connecting pixels with weights that reflect the image structure, then one can interpret the image (or image patch) as a signal on a graph, and apply GSP tools for processing and analysis of the signal in the graph spectral domain. In this article, we overview recent graph spectral techniques in GSP specifically for image/video processing. The topics covered include image compression, image restoration, image filtering, and image segmentation.
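
The idea of treating an image patch as a signal on a graph can be illustrated with a small sketch. It assumes a 4-connected pixel grid and Gaussian edge weights on intensity differences, which is one common choice but not necessarily the construction used in the article; the function names and the `sigma` parameter are hypothetical.

```python
import numpy as np

def grid_graph_laplacian(patch, sigma=0.1):
    """Combinatorial Laplacian L = D - W of a 4-connected grid over a 2D patch."""
    h, w = patch.shape
    n = h * w
    W = np.zeros((n, n))
    for r in range(h):
        for c in range(w):
            i = r * w + c
            # Connect each pixel to its right and bottom neighbors.
            for dr, dc in ((0, 1), (1, 0)):
                rr, cc = r + dr, c + dc
                if rr < h and cc < w:
                    j = rr * w + cc
                    # Similar intensities give a weight near 1, edges near 0.
                    diff = patch[r, c] - patch[rr, cc]
                    W[i, j] = W[j, i] = np.exp(-(diff ** 2) / sigma ** 2)
    return np.diag(W.sum(axis=1)) - W

def gft(patch, sigma=0.1):
    """Graph Fourier transform: project the patch onto Laplacian eigenvectors."""
    L = grid_graph_laplacian(patch, sigma)
    # eigh returns eigenvalues (graph frequencies) in ascending order.
    eigvals, U = np.linalg.eigh(L)
    coeffs = U.T @ patch.reshape(-1)
    return eigvals, U, coeffs
```

Because the eigenvectors are orthonormal, `U @ coeffs` recovers the flattened patch; zeroing the coefficients at the largest eigenvalues before inverting gives a simple graph low-pass filter that smooths within regions while respecting strong edges.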

    Graphically Structured Diffusion Models

    We introduce a framework for automatically defining and learning deep generative models with problem-specific structure. We tackle problem domains that are more traditionally solved by algorithms such as sorting, constraint satisfaction for Sudoku, and matrix factorization. Concretely, we train diffusion models with an architecture tailored to the problem specification. This problem specification should contain a graphical model describing relationships between variables, and often benefits from explicit representation of subcomputations. Permutation invariances can also be exploited. Across a diverse set of experiments we improve the scaling relationship between problem dimension and our model's performance, in terms of both training time and final accuracy. Our code can be found at https://github.com/plai-group/gsdm

    Accurate and reliable probabilistic modeling with high-dimensional data

    Machine learning studies algorithms for learning from data. Probabilistic modeling and reasoning define a principled framework for machine learning, where probability theory is used to represent and manipulate knowledge. In this thesis we focus on two fundamental tasks in probabilistic machine learning: probabilistic prediction and density estimation. We study reliability of probabilistic predictive models, propose flexible models for density estimation, and propose a novel training regime for densities with low-dimensional structure. Neural networks demonstrate state-of-the-art performance in many different prediction tasks. At the same time, modern neural networks trained by maximum likelihood have poorly calibrated predictive uncertainties and suffer from adversarial examples. We hypothesize that careful probabilistic treatment of neural networks would make them better calibrated and more robust. However, Bayesian neural networks have to rely on uninformative priors and crude approximations, which makes it difficult to test this hypothesis. In this thesis we take a step back and study adversarial robustness of a simple, linear model, demonstrating that it no longer suffers from calibration errors on adversarial points when the approximate inference method is accurate and the prior is chosen carefully. Classic density estimation methods do not scale to complex, high-dimensional data like natural images. Normalizing flows model the target density as an invertible transformation of a simple base density, and demonstrate good results in high-dimensional density estimation tasks. State-of-the-art normalizing flow architectures rely on parametrizations of univariate invertible functions. Simple additive/affine parametrizations are often used, stacking many layers to express complex transformations. In this thesis we propose novel parametrizations based on cubic and rational-quadratic splines. 
    The proposed flows demonstrate improved parameter efficiency and advance the state of the art on several density estimation benchmarks. The manifold hypothesis says that the data are likely to lie on a lower-dimensional manifold. This assumption is built into many machine learning models, but using it with density models like normalizing flows is difficult: the standard likelihood-based training objective becomes ill-defined. Injective normalizing flows can be implemented, but their training objective is no longer tractable, requiring approximations or heuristic alternatives. In this thesis we propose a novel training objective that uses nested dropout to align the latent space of a normalizing flow, allowing us to extract a sequence of manifold densities from the trained model. Our experiments demonstrate that the manifolds fit by the method match the data well.
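
The change-of-variables rule behind normalizing flows can be shown with the simple elementwise affine parametrization that the abstract mentions as the common baseline. This is only a sketch under that assumption; it does not implement the cubic or rational-quadratic spline flows proposed in the thesis, and the function names are hypothetical.

```python
import numpy as np

def standard_normal_logpdf(z):
    """Log-density of a standard normal, evaluated elementwise."""
    return -0.5 * (z ** 2 + np.log(2 * np.pi))

def affine_flow_logpdf(x, shift, log_scale):
    """Log-density of x under the flow x = exp(log_scale) * z + shift, z ~ N(0, I)."""
    # Invert the transformation to recover the base variable.
    z = (x - shift) * np.exp(-log_scale)
    # Change of variables: log p_x(x) = log p_z(z) + log|dz/dx|,
    # and dz/dx = exp(-log_scale) for each dimension.
    return np.sum(standard_normal_logpdf(z) - log_scale, axis=-1)
```

For this one-layer flow the result coincides with the log-density of a diagonal Gaussian with mean `shift` and standard deviation `exp(log_scale)`, which makes it easy to check; richer univariate transformations such as splines replace the affine map while keeping the same log-determinant bookkeeping.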

    An Introduction to Neural Data Compression

    Neural compression is the application of neural networks and other machine learning methods to data compression. Recent advances in statistical machine learning have opened up new possibilities for data compression, allowing compression algorithms to be learned end-to-end from data using powerful generative models such as normalizing flows, variational autoencoders, diffusion probabilistic models, and generative adversarial networks. The present article aims to introduce this field of research to a broader machine learning audience by reviewing the necessary background in information theory (e.g., entropy coding, rate-distortion theory) and computer vision (e.g., image quality assessment, perceptual metrics), and providing a curated guide through the essential ideas and methods in the literature thus far.
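
As a small illustration of the information-theoretic background mentioned above, the sketch below computes the empirical Shannon entropy of a symbol sequence. Under the assumption that symbols are independent draws from their empirical distribution, this is the lower bound, in bits per symbol, on the average code length an entropy coder can reach. The function name is hypothetical and the example is not taken from the article.

```python
import math
from collections import Counter

def empirical_entropy_bits(symbols):
    """Shannon entropy, in bits per symbol, of the empirical symbol distribution."""
    counts = Counter(symbols)
    total = len(symbols)
    # H = -sum_s p(s) * log2 p(s), summed over the symbols that actually occur.
    return -sum((c / total) * math.log2(c / total) for c in counts.values())
```

For instance, `empirical_entropy_bits("aabb")` evaluates to 1.0 bit per symbol, while a sequence made of a single repeated symbol has zero entropy. Learned compression methods aim to fit a better probability model than these raw frequencies so that the achievable code length shrinks.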