8,941 research outputs found

    A Neural Attention Model for Abstractive Sentence Summarization

    Summarization based on text extraction is inherently limited, but generation-style abstractive methods have proven challenging to build. In this work, we propose a fully data-driven approach to abstractive sentence summarization. Our method utilizes a local attention-based model that generates each word of the summary conditioned on the input sentence. While the model is structurally simple, it can easily be trained end-to-end and scales to a large amount of training data. The model shows significant performance gains on the DUC-2004 shared task compared with several strong baselines. Comment: Proceedings of EMNLP 2015
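
    As a rough illustration of the approach described above, the sketch below conditions each summary word on an attention-weighted view of the input sentence. It is a minimal sketch assuming PyTorch, not the authors' implementation; the class name, embedding sizes, and the mean-pooled summary context are illustrative assumptions.

        # Minimal sketch (not the paper's code): attention over input words,
        # conditioned on the previous summary words, scoring the next word.
        import torch
        import torch.nn as nn
        import torch.nn.functional as F

        class AttentionSummarizer(nn.Module):
            def __init__(self, vocab_size, emb_dim=200):
                super().__init__()
                self.src_emb = nn.Embedding(vocab_size, emb_dim)  # input-sentence embeddings
                self.tgt_emb = nn.Embedding(vocab_size, emb_dim)  # summary-context embeddings
                self.attn = nn.Linear(emb_dim, emb_dim)           # scores context against input words
                self.out = nn.Linear(2 * emb_dim, vocab_size)     # next-word scores

            def forward(self, src_ids, ctx_ids):
                # src_ids: (batch, src_len) input sentence; ctx_ids: (batch, ctx_len) previous summary words
                src = self.src_emb(src_ids)                                       # (batch, src_len, emb)
                ctx = self.tgt_emb(ctx_ids).mean(dim=1)                           # (batch, emb)
                scores = torch.bmm(src, self.attn(ctx).unsqueeze(2)).squeeze(2)   # (batch, src_len)
                weights = F.softmax(scores, dim=1)                                # attention over input words
                enc = torch.bmm(weights.unsqueeze(1), src).squeeze(1)             # attention-weighted input
                return F.log_softmax(self.out(torch.cat([enc, ctx], dim=1)), dim=1)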

    FreezeOut: Accelerate Training by Progressively Freezing Layers

    The early layers of a deep neural net have the fewest parameters, but take up the most computation. In this extended abstract, we propose to only train the hidden layers for a set portion of the training run, freezing them out one-by-one and excluding them from the backward pass. Through experiments on CIFAR, we empirically demonstrate that FreezeOut yields savings of up to 20% wall-clock time during training with a 3% loss in accuracy for DenseNets, a 20% speedup without loss of accuracy for ResNets, and no improvement for VGG networks. Our code is publicly available at https://github.com/ajbrock/FreezeOut. Comment: Extended Abstract
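
    The sketch below illustrates the freezing mechanism in isolation: each layer is assigned a cutoff iteration, after which its parameters are excluded from the backward pass. This is a minimal sketch assuming PyTorch modules, not the released FreezeOut code; the linearly spaced cutoffs and the t_0 parameter are illustrative assumptions, and the paper additionally anneals per-layer learning rates rather than applying only a hard freeze.

        # Minimal sketch: progressively freeze layers, front to back, during training.
        def freezeout_cutoffs(layers, total_iters, t_0=0.5):
            """Assign each layer the iteration after which it stops training;
            earlier layers get earlier cutoffs, spaced linearly up to total_iters."""
            n = len(layers)
            return [int(total_iters * (t_0 + (1.0 - t_0) * i / max(n - 1, 1))) for i in range(n)]

        def apply_freezeout(layers, cutoffs, iteration):
            """Freeze every layer whose cutoff has passed, excluding it from the backward pass."""
            for layer, cutoff in zip(layers, cutoffs):
                if iteration >= cutoff:
                    for p in layer.parameters():
                        p.requires_grad = False  # no more gradient computation for this layer
                    layer.eval()                 # e.g. fix batch-norm statistics

    Calling apply_freezeout at the start of each training iteration reproduces the one-by-one freezing behaviour; frozen layers still run in the forward pass.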

    Discrete holomorphicity and quantized affine algebras

    We consider non-local currents in the context of quantized affine algebras, following the construction introduced by Bernard and Felder. In the case of U_q(A_1^{(1)}) and U_q(A_2^{(2)}), these currents can be identified with configurations in the six-vertex and Izergin--Korepin nineteen-vertex models. Mapping these to their corresponding Temperley--Lieb loop models, we directly identify non-local currents with discretely holomorphic loop observables. In particular, we show that the bulk discrete holomorphicity relation and its recently derived boundary analogue are equivalent to conservation laws for non-local currents.

    Generative and Discriminative Voxel Modeling with Convolutional Neural Networks

    When working with three-dimensional data, choice of representation is key. We explore voxel-based models, and present evidence for the viability of voxellated representations in applications including shape modeling and object classification. Our key contributions are methods for training voxel-based variational autoencoders, a user interface for exploring the latent space learned by the autoencoder, and a deep convolutional neural network architecture for object classification. We address challenges unique to voxel-based representations, and empirically evaluate our models on the ModelNet benchmark, where we demonstrate a 51.5% relative improvement in the state of the art for object classification. Comment: 9 pages, 5 figures, 2 tables
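
    For the object-classification half of the above, a bare-bones 3D-convolutional classifier over occupancy grids looks roughly like the sketch below. This is a minimal sketch assuming PyTorch and a 32x32x32 binary voxel input; it is not the paper's architecture, and the layer widths and ELU activations are illustrative assumptions.

        # Minimal sketch: classify binary voxel grids with 3D convolutions.
        import torch
        import torch.nn as nn

        class VoxelClassifier(nn.Module):
            def __init__(self, num_classes=10):
                super().__init__()
                self.features = nn.Sequential(
                    nn.Conv3d(1, 32, kernel_size=3, stride=2, padding=1),   # 32^3 -> 16^3
                    nn.ELU(),
                    nn.Conv3d(32, 64, kernel_size=3, stride=2, padding=1),  # 16^3 -> 8^3
                    nn.ELU(),
                    nn.Conv3d(64, 128, kernel_size=3, stride=2, padding=1), # 8^3 -> 4^3
                    nn.ELU(),
                )
                self.classifier = nn.Linear(128 * 4 * 4 * 4, num_classes)

            def forward(self, voxels):
                # voxels: (batch, 1, 32, 32, 32) occupancy grid
                return self.classifier(self.features(voxels).flatten(1))

        # e.g. logits = VoxelClassifier(num_classes=10)(torch.rand(4, 1, 32, 32, 32))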

    SMASH: One-Shot Model Architecture Search through HyperNetworks

    Designing architectures for deep neural networks requires expert knowledge and substantial computation time. We propose a technique to accelerate architecture selection by learning an auxiliary HyperNet that generates the weights of a main model conditioned on that model's architecture. By comparing the relative validation performance of networks with HyperNet-generated weights, we can effectively search over a wide range of architectures at the cost of a single training run. To facilitate this search, we develop a flexible mechanism based on memory read-writes that allows us to define a wide range of network connectivity patterns, with ResNet, DenseNet, and FractalNet blocks as special cases. We validate our method (SMASH) on CIFAR-10 and CIFAR-100, STL-10, ModelNet10, and Imagenet32x32, achieving competitive performance with similarly-sized hand-designed networks. Our code is available at https://github.com/ajbrock/SMASH
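
    The sketch below shows the core one-shot loop implied above: a HyperNet maps an architecture encoding to weights for the main model, and candidate architectures are ranked by validation performance using those generated weights instead of being trained individually. This is a minimal sketch assuming PyTorch, not the released SMASH code; the flat encoding, the HyperNet shape, and the evaluate callback are illustrative assumptions.

        # Minimal sketch: rank candidate architectures with HyperNet-generated weights.
        import torch
        import torch.nn as nn

        class HyperNet(nn.Module):
            """Maps an architecture encoding to a flat weight vector for the main model."""
            def __init__(self, encoding_dim, num_weights):
                super().__init__()
                self.generator = nn.Sequential(
                    nn.Linear(encoding_dim, 256), nn.ReLU(), nn.Linear(256, num_weights))

            def forward(self, arch_encoding):
                return self.generator(arch_encoding)

        def rank_architectures(hypernet, candidates, evaluate):
            """candidates: list of (encoding, architecture) pairs; evaluate(arch, weights)
            returns validation accuracy of the architecture run with generated weights."""
            best = None
            with torch.no_grad():
                for encoding, arch in candidates:
                    weights = hypernet(encoding)        # generated once, no per-candidate training
                    score = evaluate(arch, weights)
                    if best is None or score > best[0]:
                        best = (score, arch)
            return best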

    Unwritten


    Creation Myths


    Exact and Scaling Form of the Bipartite Fidelity of the Infinite XXZ Chain

    We find an exact expression for the bipartite fidelity f = |<vac|vac>'|^2, where |vac> is the vacuum eigenstate of an infinite-size antiferromagnetic XXZ chain and |vac>' is the vacuum eigenstate of an infinite-size XXZ chain which is split in two. We consider the quantity -ln(f), which has been put forward as a measure of quantum entanglement, and show that the large-correlation-length xi behaviour is consistent with a general conjecture -ln(f) ~ (c/8) ln(xi), where c is the central charge of the UV conformal field theory (with c = 1 for the XXZ chain). This behaviour is a natural extension of the existing conformal field theory prediction of -ln(f) ~ (c/8) ln(L) for a length L bipartite system with 0 << L << xi. Comment: 6 pages
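
    Written out in LaTeX, the quantities above are (restating the abstract's formulas, with no new content):

        f \;=\; \bigl|\langle \mathrm{vac}\,|\,\mathrm{vac}\rangle'\bigr|^{2},
        \qquad
        -\ln f \;\sim\; \frac{c}{8}\,\ln \xi \quad (\xi \gg 1),
        \qquad
        -\ln f \;\sim\; \frac{c}{8}\,\ln L \quad (0 \ll L \ll \xi),

    with c = 1 for the XXZ chain.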

    The Singing
