Applying a Color Palette with Local Control using Diffusion Models
We demonstrate two novel editing procedures in the context of fantasy card
art. Palette transfer applies a specified reference palette to a given card.
For fantasy art, the desired change in palette can be very large, leading to
huge changes in the "look" of the art. We demonstrate that a pipeline of vector
quantization, matching, and "vector dequantization" (using a diffusion model)
produces successful extreme palette transfers. Segment control allows an artist
to move one or more image segments, and to optionally specify the desired color
of the result. The combination of these two types of edit yields valuable
workflows, including: move a segment, then recolor; recolor, then force some
segments to take a prescribed color. We demonstrate our methods on the
challenging Yu-Gi-Oh card art dataset.
Comment: 11 pages, 8 figures
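The quantize-then-match part of the palette transfer pipeline described above can be illustrated with a minimal sketch. It is not the authors' implementation: the nearest-color quantizer, the luminance-rank matching rule, and the function names (`quantize`, `match_palettes`, `transfer`) are all assumptions chosen for illustration, both palettes are assumed to have the same number of entries, and the diffusion-based "vector dequantization" step is omitted entirely.

```python
from math import dist


def luminance(rgb):
    """Approximate perceived brightness of an (R, G, B) triple (Rec. 601 weights)."""
    r, g, b = rgb
    return 0.299 * r + 0.587 * g + 0.114 * b


def quantize(pixels, palette):
    """Vector quantization: map each pixel to the index of its nearest palette color."""
    return [min(range(len(palette)), key=lambda k: dist(p, palette[k]))
            for p in pixels]


def match_palettes(source, reference):
    """Pair source and reference entries by luminance rank.

    This matching rule is a hypothetical stand-in; the abstract does not
    say how matching is done. Assumes len(source) == len(reference).
    """
    src_order = sorted(range(len(source)), key=lambda k: luminance(source[k]))
    ref_order = sorted(range(len(reference)), key=lambda k: luminance(reference[k]))
    return dict(zip(src_order, ref_order))


def transfer(pixels, source, reference):
    """Quantize to the source palette, then swap in the matched reference colors."""
    mapping = match_palettes(source, reference)
    return [reference[mapping[k]] for k in quantize(pixels, source)]


# Toy data: three pixels, a two-color source palette, a two-color reference palette.
pixels = [(250, 10, 10), (12, 8, 240), (240, 30, 20)]
source = [(255, 0, 0), (0, 0, 255)]
reference = [(20, 60, 20), (230, 230, 120)]
recolored = transfer(pixels, source, reference)
```

The output of such a step is a flat, quantized image; in the pipeline described in the abstract, a diffusion model would then restore detail.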
Coarse-grained Multiresolution Structures for Mobile Exploration of Gigantic Surface Models
We discuss our experience in creating scalable systems for distributing
and rendering gigantic 3D surfaces on web environments and
common handheld devices. Our methods are based on compressed
streamable coarse-grained multiresolution structures. By combining
CPU and GPU compression technology with our multiresolution
data representation, we are able to incrementally transfer, locally
store and render with unprecedented performance extremely
detailed 3D mesh models on WebGL-enabled browsers, as well as
on hardware-constrained mobile devices.
Graph Spectral Image Processing
The recent advent of graph signal processing (GSP) has spurred intensive studies
of signals that live naturally on irregular data kernels described by graphs
(e.g., social networks, wireless sensor networks). Though a digital image
contains pixels that reside on a regularly sampled 2D grid, if one can design
an appropriate underlying graph connecting pixels with weights that reflect the
image structure, then one can interpret the image (or image patch) as a signal
on a graph, and apply GSP tools for processing and analysis of the signal in
the graph spectral domain. In this article, we overview recent graph spectral
techniques in GSP specifically for image and video processing. The topics covered
include image compression, image restoration, image filtering and image
segmentation.
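To make the idea of "a graph connecting pixels with weights that reflect the image structure" concrete, here is a minimal sketch under stated assumptions: a 4-connected grid graph, Gaussian weights on intensity differences, and a hypothetical `sigma` parameter. None of these choices is taken from the article. The sketch evaluates the graph Laplacian quadratic form x^T L x (with L = D - W), which equals the weighted sum of squared differences across edges and serves as a smoothness measure for a signal on the graph.

```python
from math import exp


def grid_edges(patch, sigma=10.0):
    """Build a 4-connected grid graph over a 2D patch.

    Each edge weight is exp(-(intensity difference)^2 / sigma^2), so pixels
    on opposite sides of a strong image edge are only weakly connected.
    """
    rows, cols = len(patch), len(patch[0])
    edges = []
    for r in range(rows):
        for c in range(cols):
            for dr, dc in ((0, 1), (1, 0)):  # right and down neighbors
                rr, cc = r + dr, c + dc
                if rr < rows and cc < cols:
                    diff = patch[r][c] - patch[rr][cc]
                    edges.append(((r, c), (rr, cc), exp(-(diff ** 2) / sigma ** 2)))
    return edges


def laplacian_quadratic_form(signal, edges):
    """Return x^T L x = sum over edges of w_ij * (x_i - x_j)^2."""
    return sum(w * (signal[i[0]][i[1]] - signal[j[0]][j[1]]) ** 2
               for i, j, w in edges)


# A patch with a sharp vertical edge between the second and third columns.
patch = [[10, 10, 200],
         [10, 10, 200]]
edges = grid_edges(patch)

# The same patch with one perturbed pixel inside the flat region.
noisy = [[10, 30, 200],
         [10, 10, 200]]

smooth_cost = laplacian_quadratic_form(patch, edges)
noisy_cost = laplacian_quadratic_form(noisy, edges)
```

Because the weights across the sharp edge are close to zero, the clean patch has a near-zero cost on its own graph, while the perturbed pixel is penalized by its strongly connected neighbors.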
Graphically Structured Diffusion Models
We introduce a framework for automatically defining and learning deep
generative models with problem-specific structure. We tackle problem domains
that are more traditionally solved by algorithms such as sorting, constraint
satisfaction for Sudoku, and matrix factorization. Concretely, we train
diffusion models with an architecture tailored to the problem specification.
This problem specification should contain a graphical model describing
relationships between variables, and often benefits from explicit
representation of subcomputations. Permutation invariances can also be
exploited. Across a diverse set of experiments we improve the scaling
relationship between problem dimension and our model's performance, in terms of
both training time and final accuracy. Our code can be found at
https://github.com/plai-group/gsdm
Accurate and reliable probabilistic modeling with high-dimensional data
Machine learning studies algorithms for learning from data. Probabilistic modeling and reasoning define a principled framework for machine learning, where probability theory is used to represent and manipulate knowledge. In this thesis we focus on two fundamental tasks in probabilistic machine learning: probabilistic prediction and density estimation. We study the reliability of probabilistic predictive models, propose flexible models for density estimation, and propose a novel training regime for densities with low-dimensional structure.
Neural networks demonstrate state-of-the-art performance in many different prediction tasks. At the same time, modern neural networks trained by maximum likelihood have poorly calibrated predictive uncertainties and suffer from adversarial examples. We hypothesize that careful probabilistic treatment of neural networks would make them better calibrated and more robust. However, Bayesian neural networks have to rely on uninformative priors and crude approximations, which makes it difficult to test this hypothesis. In this thesis we take a step back and study the adversarial robustness of a simple, linear model, demonstrating that it no longer suffers from calibration errors on adversarial points when the approximate inference method is accurate and the prior is chosen carefully.
Classic density estimation methods do not scale to complex, high-dimensional data like natural images. Normalizing flows model the target density as an invertible transformation of a simple base density, and demonstrate good results in high-dimensional density estimation tasks. State-of-the-art normalizing flow architectures rely on parametrizations of univariate invertible functions. Simple additive/affine parametrizations are often used, stacking many layers to express complex transformations. In this thesis we propose novel parametrizations based on cubic and rational-quadratic splines. The proposed flows demonstrate improved parameter efficiency and advance the state of the art on several density estimation benchmarks.
The manifold hypothesis says that the data are likely to lie on a lower-dimensional manifold. This assumption is built into many machine learning models, but using it with density models like normalizing flows is difficult: the standard likelihood-based training objective becomes ill-defined. Injective normalizing flows can be implemented, but their training objective is no longer tractable, requiring approximations or heuristic alternatives. In this thesis we propose a novel training objective that uses nested dropout to align the latent space of a normalizing flow, allowing us to extract a sequence of manifold densities from the trained model. Our experiments demonstrate that the manifolds fit by the method match the data well.
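The statement that a normalizing flow models the target density "as an invertible transformation of a simple base density" rests on the change-of-variables formula: log p(x) = log p_base(z) - sum of log|scale| over the layers, where z is obtained by inverting the layers. The sketch below illustrates this for the simplest case mentioned in the abstract, a stack of univariate affine layers over a standard normal base. The function names and the `(scale, shift)` layer encoding are assumptions for illustration; the thesis's spline-based parametrizations are not shown.

```python
from math import log, pi


def standard_normal_logpdf(z):
    """Log-density of the base distribution N(0, 1)."""
    return -0.5 * (z * z + log(2 * pi))


def affine_flow_logpdf(x, layers):
    """Log-density of x under a stack of 1-D affine layers.

    `layers` is a list of (scale, shift) pairs applied in order to a base
    sample z, each computing y = scale * y + shift. To evaluate the density
    we invert the layers from last to first and accumulate the
    log-determinant of the inverse map.
    """
    log_det = 0.0
    for scale, shift in reversed(layers):
        x = (x - shift) / scale      # invert this layer
        log_det -= log(abs(scale))   # d(inverse)/dx = 1 / scale
    return standard_normal_logpdf(x) + log_det


# Two stacked layers: x = 0.5 * (2 * z + 1) - 3 = z - 2.5,
# so x is distributed as N(-2.5, 1).
layers = [(2.0, 1.0), (0.5, -3.0)]
log_density_at_mode = affine_flow_logpdf(-2.5, layers)
```

Stacked affine layers compose into a single affine map, which is why more expressive univariate transformations such as splines are of interest.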
An Introduction to Neural Data Compression
Neural compression is the application of neural networks and other machine
learning methods to data compression. Recent advances in statistical machine
learning have opened up new possibilities for data compression, allowing
compression algorithms to be learned end-to-end from data using powerful
generative models such as normalizing flows, variational autoencoders,
diffusion probabilistic models, and generative adversarial networks. The
present article aims to introduce this field of research to a broader machine
learning audience by reviewing the necessary background in information theory
(e.g., entropy coding, rate-distortion theory) and computer vision (e.g., image
quality assessment, perceptual metrics), and providing a curated guide through
the essential ideas and methods in the literature thus far.
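As a small illustration of the information-theoretic background mentioned above, the following sketch computes the Shannon entropy of a discrete source and the ideal per-symbol code lengths, -log2 p, that an entropy coder aims to approach. The example distribution and function names are assumptions chosen for illustration and do not come from the article.

```python
from math import log2


def entropy_bits(probs):
    """Shannon entropy in bits: the lower bound on expected code length per symbol."""
    return -sum(p * log2(p) for p in probs if p > 0)


def ideal_code_lengths(probs):
    """Ideal code length in bits for each symbol, -log2(p). Assumes every p > 0."""
    return [-log2(p) for p in probs]


# A three-symbol source with dyadic probabilities.
probs = [0.5, 0.25, 0.25]
lengths = ideal_code_lengths(probs)
expected_length = sum(p * n for p, n in zip(probs, lengths))
source_entropy = entropy_bits(probs)
```

For this dyadic distribution the prefix code {0, 10, 11} has exactly these lengths, so its expected length equals the entropy; for general distributions an entropy coder can only approach the bound.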