9 research outputs found

    Building generative models over discrete structures: from graphical models to deep learning

    Thesis: Ph.D., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2019. Cataloged from the PDF version of the thesis. Page 173 blank. Includes bibliographical references (pages 159-172).

    The goal of this thesis is to investigate generative models over discrete structures, such as binary grids, alignments, or arbitrary graphs. We focus on developing models that are easy to sample from, and we approach the task from two broad perspectives: defining models via structured potential functions, and via neural-network-based decoders.

    In the first case, we investigate Perturbation Models, a family of implicit distributions where samples emerge through optimization of randomized potential functions. Designed explicitly for efficient sampling, Perturbation Models are strong candidates for building generative models over structures, and the leading open questions concern understanding the properties of the induced models and developing practical learning algorithms. We present theoretical results showing that, in contrast to the more established Gibbs models, low-order potential functions, after undergoing randomization and maximization, lead to high-order dependencies in the induced distributions. Furthermore, while conditioning in Gibbs distributions is straightforward, conditioning in Perturbation Models typically is not; we theoretically characterize the cases in which the straightforward approach produces correct results. Finally, we introduce a new learning algorithm for Perturbation Models based on inverse combinatorial optimization, and we illustrate empirically both the induced dependencies and the inverse-optimization approach on learning tasks inspired by computer vision problems.

    In the second case, we sequentialize the structures, converting structure generation into a sequence of discrete decisions, which enables the use of sequential models. We explore maximum-likelihood training with step-wise supervision and continuous relaxations of the intermediate decisions. With respect to intermediate discrete representations, the main directions are gradient estimators and continuous relaxations, and we discuss these solutions in the context of unsupervised scene understanding with generative models. In particular, we ask whether a continuous relaxation of the counting problem also discovers objects in an unsupervised fashion (given the increased training stability that continuous relaxations provide), and we propose an approach based on Adaptive Computation Time (ACT) that achieves the desired result.

    Finally, we investigate the task of iterative graph generation. We propose a variational lower bound on the maximum-likelihood objective, where the approximate posterior distribution renormalizes the prior distribution over the local predictions that are plausible for the target graph. For instance, the local predictions may be binary values indicating the presence or absence of the edge indexed by the given time step, for a canonical edge indexing chosen a priori. The plausibility of each local prediction is assessed by solving a combinatorial optimization problem, and we discuss relevant approaches, including an induced-subgraph-isomorphism-based algorithm for the generic graph generation case and a polynomial-time algorithm for the special case of graph generation arising from graph clustering tasks. In this thesis, we focus on the generic case, and we investigate the approximate posterior's relevance on synthetic graph datasets.

    by Georgiana Andreea Gane. Ph.D., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science.
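    To make the perturb-and-MAP construction behind Perturbation Models concrete, here is a minimal sketch, assuming purely unary potentials over a few discrete variables; the function name and toy numbers are illustrative assumptions, not code from the thesis. Gumbel noise randomizes the potentials, and a sample is read off by maximization.

```python
import numpy as np

def sample_perturbation_model(unary_potentials, rng):
    """Draw one sample from a Perturbation Model with unary potentials.

    unary_potentials: array of shape (num_variables, num_states).
    Each variable's state is chosen by maximizing its Gumbel-perturbed
    potential. With purely unary potentials this reduces to the
    Gumbel-Max trick applied independently per variable; the thesis
    studies the general case, where a combinatorial (MAP) solver
    replaces the per-variable arg max.
    """
    gumbel_noise = rng.gumbel(size=unary_potentials.shape)
    return np.argmax(unary_potentials + gumbel_noise, axis=1)

rng = np.random.default_rng(0)
# Toy example: three binary variables with illustrative potentials.
theta = np.array([[0.5, 1.5],
                  [2.0, 0.1],
                  [0.0, 0.0]])
print(sample_perturbation_model(theta, rng))  # e.g. [1 0 1]
```

    With unary potentials the per-variable arg max is exactly the Gumbel-Max trick; the high-order dependencies the thesis analyzes arise once the maximization couples variables, e.g. through pairwise potentials handled by a MAP solver.

    The renormalized approximate posterior used for iterative graph generation can likewise be sketched in a few lines, assuming the set of plausible local predictions at a step has already been computed by the combinatorial subroutine (names and numbers below are hypothetical):

```python
import numpy as np

def renormalized_posterior(prior_probs, plausible_mask):
    """Approximate posterior q obtained by renormalizing the prior over
    the local predictions deemed plausible for the target graph."""
    q = prior_probs * plausible_mask
    return q / q.sum()

# Step t: prior over {edge absent, edge present}; only "present" is
# plausible for the target graph at this step.
print(renormalized_posterior(np.array([0.7, 0.3]), np.array([0.0, 1.0])))
# -> [0. 1.]
```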

    Direct optimization through arg max for discrete variational auto-encoder

    Reparameterization of variational auto-encoders with continuous random variables is an effective method for reducing the variance of their gradient estimates. In the discrete case, one can perform reparameterization using the Gumbel-Max trick, but the resulting objective relies on an arg max operation and is non-differentiable. In contrast to previous works, which resort to softmax-based relaxations, we propose to optimize the objective directly by applying the direct loss minimization approach. Our proposal extends naturally to structured discrete latent variable models when evaluating the arg max operation is tractable. We demonstrate empirically the effectiveness of the direct loss minimization technique in variational auto-encoders with both unstructured and structured discrete latent variables.
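    A rough sketch of the idea, assuming a single categorical latent variable (the function names and the finite-difference form are illustrative, not the paper's reference implementation): the Gumbel-Max trick reparameterizes sampling as an arg max over perturbed scores, and a direct-loss-minimization-style gradient compares that arg max with a loss-augmented arg max computed under the same noise.

```python
import numpy as np

def gumbel_max_sample(theta, rng):
    """Reparameterized categorical sample: z = argmax(theta + Gumbel noise)."""
    return int(np.argmax(theta + rng.gumbel(size=theta.shape)))

def direct_grad(theta, loss_per_state, eps, rng):
    """One-sample direct-loss-minimization estimate of d E[loss(z)] / d theta.

    Shares the same Gumbel noise between the arg max and the
    loss-augmented arg max; the finite difference of their one-hot
    indicators approximates the gradient as eps -> 0.
    """
    gamma = rng.gumbel(size=theta.shape)
    z = int(np.argmax(theta + gamma))
    z_aug = int(np.argmax(theta + gamma + eps * loss_per_state))
    grad = np.zeros_like(theta)
    grad[z_aug] += 1.0 / eps
    grad[z] -= 1.0 / eps
    return grad

rng = np.random.default_rng(0)
theta = np.array([0.2, 1.0, -0.5])   # unnormalized scores for 3 states
loss = np.array([3.0, 1.0, 0.0])     # toy per-state loss
est = np.mean([direct_grad(theta, loss, 1.0, rng) for _ in range(20_000)],
              axis=0)
print(est)  # positive for high-loss states: descending on theta lowers E[loss]
```

    In the VAE setting, theta would stand in for the encoder's scores and loss_per_state for the decoder's per-configuration loss terms; for structured latent variables, the per-state arg max is replaced by a combinatorial solver, which is the tractable-arg-max regime the abstract describes.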