8,924 research outputs found

    A Neural Attention Model for Abstractive Sentence Summarization

    Full text link
    Summarization based on text extraction is inherently limited, but generation-style abstractive methods have proven challenging to build. In this work, we propose a fully data-driven approach to abstractive sentence summarization. Our method utilizes a local attention-based model that generates each word of the summary conditioned on the input sentence. While the model is structurally simple, it can easily be trained end-to-end and scales to a large amount of training data. The model shows significant performance gains on the DUC-2004 shared task compared with several strong baselines. Comment: Proceedings of EMNLP 2015
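
    As an illustration only (not the authors' code), a minimal PyTorch sketch of the mechanism the abstract describes: each summary word is scored from an attention-weighted encoding of the input sentence, conditioned on a window of previously generated summary words. All class names, dimensions, and the context-window size below are assumptions.

    # Hedged sketch of one decoding step of a local attention-based summarizer.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class AttnSummarizerStep(nn.Module):
        def __init__(self, vocab_size, emb_dim=128, ctx_window=5):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, emb_dim)
            self.ctx_proj = nn.Linear(ctx_window * emb_dim, emb_dim)
            self.out = nn.Linear(2 * emb_dim, vocab_size)

        def forward(self, src_ids, prev_ids):
            # src_ids: (B, S) input sentence; prev_ids: (B, C) previous summary words
            src = self.embed(src_ids)                              # (B, S, E)
            ctx = self.ctx_proj(self.embed(prev_ids).flatten(1))   # (B, E) summary context
            # attention over input words, conditioned on the summary context
            scores = torch.bmm(src, ctx.unsqueeze(2)).squeeze(2)   # (B, S)
            attn = F.softmax(scores, dim=1)
            enc = torch.bmm(attn.unsqueeze(1), src).squeeze(1)     # (B, E) attended input encoding
            # distribution over the next summary word
            return F.log_softmax(self.out(torch.cat([enc, ctx], dim=1)), dim=1)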

    Exact and Scaling Form of the Bipartite Fidelity of the Infinite XXZ Chain

    Full text link
    We find an exact expression for the bipartite fidelity f = |<vac|vac>'|^2, where |vac> is the vacuum eigenstate of an infinite-size antiferromagnetic XXZ chain and |vac>' is the vacuum eigenstate of an infinite-size XXZ chain which is split in two. We consider the quantity -ln(f), which has been put forward as a measure of quantum entanglement, and show that the large-correlation-length (xi) behaviour is consistent with a general conjecture -ln(f) ~ (c/8) ln(xi), where c is the central charge of the UV conformal field theory (with c=1 for the XXZ chain). This behaviour is a natural extension of the existing conformal field theory prediction of -ln(f) ~ (c/8) ln(L) for a length-L bipartite system with 0 << L << xi. Comment: 6 pages
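
    Writing the overlap of the two vacua explicitly, the formulas quoted above can be transcribed in LaTeX as (a transcription of the abstract, not a new result):

        f \;=\; \bigl|\langle \mathrm{vac} \,|\, \mathrm{vac}' \rangle\bigr|^{2},
        \qquad
        -\ln f \;\sim\; \frac{c}{8}\,\ln \xi \quad (\xi \gg 1),
        \qquad
        -\ln f \;\sim\; \frac{c}{8}\,\ln L \quad (0 \ll L \ll \xi),
        \qquad c = 1 \ \text{for the XXZ chain.}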

    FreezeOut: Accelerate Training by Progressively Freezing Layers

    Full text link
    The early layers of a deep neural net have the fewest parameters, but take up the most computation. In this extended abstract, we propose to only train the hidden layers for a set portion of the training run, freezing them out one-by-one and excluding them from the backward pass. Through experiments on CIFAR, we empirically demonstrate that FreezeOut yields savings of up to 20% wall-clock time during training with 3% loss in accuracy for DenseNets, a 20% speedup without loss of accuracy for ResNets, and no improvement for VGG networks. Our code is publicly available at https://github.com/ajbrock/FreezeOut. Comment: Extended Abstract
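
    A minimal sketch of the progressive-freezing idea, assuming PyTorch; the linear freezing schedule and helper names are illustrative assumptions, not the released FreezeOut implementation.

    # Hedged sketch: freeze blocks one-by-one so they drop out of the backward pass.
    import torch.nn as nn

    def freeze_layer(layer: nn.Module):
        """Stop training a block: no gradients, fixed batch-norm statistics."""
        for p in layer.parameters():
            p.requires_grad = False
        layer.eval()

    def train_with_freezeout(layers, total_iters, train_step):
        """layers: list of nn.Module blocks, earliest first; train_step(it) does one update."""
        # Assumed linear schedule: the i-th of N blocks freezes after fraction (i+1)/N
        # of the run, so the final block is never frozen during training.
        n = len(layers)
        freeze_at = [int(total_iters * (i + 1) / n) for i in range(n)]
        frozen = [False] * n
        for it in range(total_iters):
            for i, layer in enumerate(layers):
                if not frozen[i] and it >= freeze_at[i]:
                    freeze_layer(layer)   # excluded from the backward pass from here on
                    frozen[i] = True
            train_step(it)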

    Discrete holomorphicity and quantized affine algebras

    Full text link
    We consider non-local currents in the context of quantized affine algebras, following the construction introduced by Bernard and Felder. In the case of U_q(A_1^{(1)}) and U_q(A_2^{(2)}), these currents can be identified with configurations in the six-vertex and Izergin--Korepin nineteen-vertex models. Mapping these to their corresponding Temperley--Lieb loop models, we directly identify non-local currents with discretely holomorphic loop observables. In particular, we show that the bulk discrete holomorphicity relation and its recently derived boundary analogue are equivalent to conservation laws for non-local currents.
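
    As general background (not a formula taken from this paper), a lattice observable F is usually called discretely holomorphic when its discrete contour integral around every elementary face p of the lattice vanishes:

        \sum_{(z_j,\, z_{j+1}) \in \partial p} F\!\Bigl(\tfrac{z_j + z_{j+1}}{2}\Bigr)\,(z_{j+1} - z_j) \;=\; 0
        \qquad \text{for every elementary face } p.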

    Generative and Discriminative Voxel Modeling with Convolutional Neural Networks

    Get PDF
    When working with three-dimensional data, choice of representation is key. We explore voxel-based models, and present evidence for the viability of voxellated representations in applications including shape modeling and object classification. Our key contributions are methods for training voxel-based variational autoencoders, a user interface for exploring the latent space learned by the autoencoder, and a deep convolutional neural network architecture for object classification. We address challenges unique to voxel-based representations, and empirically evaluate our models on the ModelNet benchmark, where we demonstrate a 51.5% relative improvement in the state of the art for object classification. Comment: 9 pages, 5 figures, 2 tables
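
    A hedged sketch, assuming PyTorch, of a voxel variational autoencoder in the spirit described above: 3D convolutions over a binary occupancy grid, a Gaussian latent space, and a transposed-convolution decoder. The 32^3 resolution, channel counts, and layer choices are assumptions, not the paper's architecture.

    # Illustrative voxel VAE: 3D conv encoder -> latent Gaussian -> 3D transposed-conv decoder.
    import torch
    import torch.nn as nn

    class VoxelVAE(nn.Module):
        def __init__(self, latent_dim=64):
            super().__init__()
            self.enc = nn.Sequential(
                nn.Conv3d(1, 16, 4, stride=2, padding=1), nn.ELU(),   # 32^3 -> 16^3
                nn.Conv3d(16, 32, 4, stride=2, padding=1), nn.ELU(),  # 16^3 -> 8^3
                nn.Flatten())
            self.mu = nn.Linear(32 * 8 ** 3, latent_dim)
            self.logvar = nn.Linear(32 * 8 ** 3, latent_dim)
            self.dec_fc = nn.Linear(latent_dim, 32 * 8 ** 3)
            self.dec = nn.Sequential(
                nn.ConvTranspose3d(32, 16, 4, stride=2, padding=1), nn.ELU(),  # 8^3 -> 16^3
                nn.ConvTranspose3d(16, 1, 4, stride=2, padding=1))             # 16^3 -> 32^3 logits

        def forward(self, x):                       # x: (B, 1, 32, 32, 32) occupancy grid
            h = self.enc(x)
            mu, logvar = self.mu(h), self.logvar(h)
            z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()   # reparameterization trick
            logits = self.dec(self.dec_fc(z).view(-1, 32, 8, 8, 8))
            return logits, mu, logvar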

    SMASH: One-Shot Model Architecture Search through HyperNetworks

    Full text link
    Designing architectures for deep neural networks requires expert knowledge and substantial computation time. We propose a technique to accelerate architecture selection by learning an auxiliary HyperNet that generates the weights of a main model conditioned on that model's architecture. By comparing the relative validation performance of networks with HyperNet-generated weights, we can effectively search over a wide range of architectures at the cost of a single training run. To facilitate this search, we develop a flexible mechanism based on memory read-writes that allows us to define a wide range of network connectivity patterns, with ResNet, DenseNet, and FractalNet blocks as special cases. We validate our method (SMASH) on CIFAR-10 and CIFAR-100, STL-10, ModelNet10, and Imagenet32x32, achieving competitive performance with similarly-sized hand-designed networks. Our code is available at https://github.com/ajbrock/SMASH
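
    A much-simplified, hedged sketch of the one-shot search loop described above (not the code at the linked repository): a HyperNet maps an architecture encoding to candidate weights, and candidate architectures are ranked by validation loss using those generated weights. The encoding, the HyperNet shape, and the build_model/val_loss callables are placeholders, not the paper's memory read-write mechanism.

    # Simplified SMASH-style architecture ranking with HyperNet-generated weights.
    import torch
    import torch.nn as nn

    class HyperNet(nn.Module):
        """Maps an architecture encoding vector to a flat weight vector."""
        def __init__(self, enc_dim, n_weights):
            super().__init__()
            self.net = nn.Sequential(nn.Linear(enc_dim, 256), nn.ReLU(),
                                     nn.Linear(256, n_weights))

        def forward(self, arch_encoding):
            return self.net(arch_encoding)

    def rank_architectures(hypernet, candidates, build_model, val_loss):
        """candidates: list of architecture encodings; build_model(enc, w) -> nn.Module."""
        scored = []
        with torch.no_grad():                        # evaluation only, no per-candidate training
            for enc in candidates:
                weights = hypernet(enc)              # weights generated, not trained from scratch
                model = build_model(enc, weights)
                scored.append((float(val_loss(model)), enc))
        return sorted(scored, key=lambda t: t[0])    # lowest validation loss first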

    Unwritten

    Get PDF

    Creation Myths

    Get PDF

    The Singing

    Get PDF