    A Neural Attention Model for Abstractive Sentence Summarization

    Summarization based on text extraction is inherently limited, but generation-style abstractive methods have proven challenging to build. In this work, we propose a fully data-driven approach to abstractive sentence summarization. Our method utilizes a local attention-based model that generates each word of the summary conditioned on the input sentence. While the model is structurally simple, it can easily be trained end-to-end and scales to a large amount of training data. The model shows significant performance gains on the DUC-2004 shared task compared with several strong baselines. Comment: Proceedings of EMNLP 2015
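
    For readers unfamiliar with the setup, the sketch below illustrates the general idea of a local attention-based conditional word predictor of the kind the abstract describes. It is not the authors' ABS model: the class name, layer sizes, and the mean-pooled context window are illustrative assumptions.

```python
# Minimal sketch of an attention-based conditional word predictor for
# summarization. NOT the authors' exact model; sizes are illustrative.
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentionSummarizer(nn.Module):
    def __init__(self, vocab_size, emb_dim=128, hidden_dim=256):
        super().__init__()
        self.src_emb = nn.Embedding(vocab_size, emb_dim)   # input-sentence embeddings
        self.tgt_emb = nn.Embedding(vocab_size, emb_dim)   # summary-so-far embeddings
        self.context_proj = nn.Linear(emb_dim, hidden_dim)
        self.state_proj = nn.Linear(emb_dim, hidden_dim)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, src_ids, prev_ids):
        # src_ids: (batch, src_len); prev_ids: (batch, ctx_len) previously generated words
        src = self.src_emb(src_ids)                            # (B, S, E)
        ctx = self.tgt_emb(prev_ids).mean(dim=1)               # (B, E) fixed-window context
        # Attention: score each source word against the decoder context.
        scores = torch.bmm(src, ctx.unsqueeze(2)).squeeze(2)   # (B, S)
        alpha = F.softmax(scores, dim=1)                       # attention weights
        attended = torch.bmm(alpha.unsqueeze(1), src).squeeze(1)  # (B, E) weighted source
        h = torch.tanh(self.context_proj(attended) + self.state_proj(ctx))
        return F.log_softmax(self.out(h), dim=1)               # next-word distribution
```

    At each decoding step such a model is called with the source sentence and the words generated so far, taking the argmax (or running a beam search) over the returned distribution.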

    Exact and Scaling Form of the Bipartite Fidelity of the Infinite XXZ Chain

    We find an exact expression for the bipartite fidelity f = |<vac|vac>'|^2, where |vac> is the vacuum eigenstate of an infinite-size antiferromagnetic XXZ chain and |vac>' is the vacuum eigenstate of an infinite-size XXZ chain which is split in two. We consider the quantity -ln(f), which has been put forward as a measure of quantum entanglement, and show that its behaviour at large correlation length xi is consistent with a general conjecture -ln(f) ~ (c/8) ln(xi), where c is the central charge of the UV conformal field theory (c = 1 for the XXZ chain). This behaviour is a natural extension of the existing conformal field theory prediction -ln(f) ~ (c/8) ln(L) for a bipartite system of length L with 0 << L << xi. Comment: 6 pages
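
    Restating the quantities above in standard notation (nothing beyond what the abstract already says):

```latex
% Bipartite fidelity of the split chain and its conjectured scaling.
\[
  f \;=\; \bigl|\langle \mathrm{vac}\,|\,\mathrm{vac}\rangle'\bigr|^{2},
  \qquad
  -\ln f \;\sim\; \frac{c}{8}\,\ln \xi \quad (\xi \gg 1),
\]
\[
  -\ln f \;\sim\; \frac{c}{8}\,\ln L
  \qquad \text{for a bipartite system of length } L,\; 0 \ll L \ll \xi,
\]
% with central charge c = 1 for the XXZ chain.
```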

    FreezeOut: Accelerate Training by Progressively Freezing Layers

    The early layers of a deep neural net have the fewest parameters, but take up the most computation. In this extended abstract, we propose to train the hidden layers only for a set portion of the training run, freezing them out one-by-one and excluding them from the backward pass. Through experiments on CIFAR, we empirically demonstrate that FreezeOut yields savings of up to 20% wall-clock training time at a 3% loss in accuracy for DenseNets, a 20% speedup without loss of accuracy for ResNets, and no improvement for VGG networks. Our code is publicly available at https://github.com/ajbrock/FreezeOut. Comment: Extended Abstract
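
    As a rough illustration of the freezing idea, here is a minimal training-loop sketch assuming a PyTorch-style model split into an ordered list of blocks. It is not the reference implementation (the paper anneals per-layer learning rates rather than hard-freezing); it only shows how a frozen prefix of layers can be excluded from the backward pass.

```python
# Minimal sketch of progressive layer freezing in the spirit of FreezeOut.
# Not the reference implementation; the schedule and optimizer settings
# below are illustrative assumptions.
import torch

def freeze_iters(n_blocks, total_iters, first_frac=0.5):
    """Iteration at which each block freezes: the first block freezes at
    first_frac * total_iters, later blocks progressively later, and the
    final block trains for the whole run."""
    ts = [first_frac + (1.0 - first_frac) * i / max(n_blocks - 1, 1)
          for i in range(n_blocks)]
    ts[-1] = 1.0  # keep the final block training until the end
    return [int(t * total_iters) for t in ts]

def train(model_blocks, loss_fn, data_loader, total_iters):
    opt = torch.optim.SGD([p for b in model_blocks for p in b.parameters()],
                          lr=0.1, momentum=0.9)
    schedule = freeze_iters(len(model_blocks), total_iters)
    for it, (x, y) in enumerate(data_loader):
        if it >= total_iters:
            break
        n_frozen = sum(it >= t for t in schedule)  # frozen blocks form a prefix
        out = x
        with torch.no_grad():                      # frozen prefix: no graph, no backward cost
            for block in model_blocks[:n_frozen]:
                out = block(out)
        for block in model_blocks[n_frozen:]:      # blocks still training normally
            out = block(out)
        loss = loss_fn(out, y)
        opt.zero_grad(set_to_none=True)
        loss.backward()
        opt.step()
```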

    Discrete holomorphicity and quantized affine algebras

    We consider non-local currents in the context of quantized affine algebras, following the construction introduced by Bernard and Felder. In the cases of U_q(A_1^{(1)}) and U_q(A_2^{(2)}), these currents can be identified with configurations in the six-vertex and Izergin--Korepin nineteen-vertex models. Mapping these to their corresponding Temperley--Lieb loop models, we directly identify non-local currents with discretely holomorphic loop observables. In particular, we show that the bulk discrete holomorphicity relation and its recently derived boundary analogue are equivalent to conservation laws for non-local currents.
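
    For context, a bulk discrete holomorphicity relation is conventionally a lattice Cauchy--Riemann condition of the following generic form (the paper's specific observables and its boundary analogue are model-dependent and not reproduced here):

```latex
% Generic bulk discrete holomorphicity condition: the discrete contour
% integral of a lattice observable F around every elementary face vanishes.
\[
  \sum_{j=1}^{k} F(z_j)\,\bigl(z_{j+1} - z_j\bigr) \;=\; 0,
  \qquad z_{k+1} \equiv z_1,
\]
% where z_1, ..., z_k are the midpoints of the edges bounding the face.
```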

    Generative and Discriminative Voxel Modeling with Convolutional Neural Networks

    When working with three-dimensional data, the choice of representation is key. We explore voxel-based models and present evidence for the viability of voxellated representations in applications including shape modeling and object classification. Our key contributions are methods for training voxel-based variational autoencoders, a user interface for exploring the latent space learned by the autoencoder, and a deep convolutional neural network architecture for object classification. We address challenges unique to voxel-based representations and empirically evaluate our models on the ModelNet benchmark, where we demonstrate a 51.5% relative improvement in the state of the art for object classification. Comment: 9 pages, 5 figures, 2 tables
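
    The sketch below shows what a voxel-based VAE encoder of the kind described might look like in PyTorch. It is illustrative only: the 32x32x32 occupancy-grid input, the layer sizes, and the class name are assumptions, not the paper's exact architecture.

```python
# Illustrative voxel-grid VAE encoder (not the paper's exact model).
import torch
import torch.nn as nn

class VoxelVAEEncoder(nn.Module):
    def __init__(self, latent_dim=64):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv3d(1, 16, kernel_size=3, stride=2, padding=1),   # 32^3 -> 16^3
            nn.ELU(),
            nn.Conv3d(16, 32, kernel_size=3, stride=2, padding=1),  # 16^3 -> 8^3
            nn.ELU(),
            nn.Conv3d(32, 64, kernel_size=3, stride=2, padding=1),  # 8^3 -> 4^3
            nn.ELU(),
        )
        self.fc_mu = nn.Linear(64 * 4 * 4 * 4, latent_dim)
        self.fc_logvar = nn.Linear(64 * 4 * 4 * 4, latent_dim)

    def forward(self, voxels):
        # voxels: (batch, 1, 32, 32, 32) binary occupancy grid
        h = self.conv(voxels).flatten(start_dim=1)
        mu, logvar = self.fc_mu(h), self.fc_logvar(h)
        # Reparameterization trick: sample a latent code while keeping gradients.
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)
        return z, mu, logvar
```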

    SMASH: One-Shot Model Architecture Search through HyperNetworks

    Designing architectures for deep neural networks requires expert knowledge and substantial computation time. We propose a technique to accelerate architecture selection by learning an auxiliary HyperNet that generates the weights of a main model conditioned on that model's architecture. By comparing the relative validation performance of networks with HyperNet-generated weights, we can effectively search over a wide range of architectures at the cost of a single training run. To facilitate this search, we develop a flexible mechanism based on memory read-writes that allows us to define a wide range of network connectivity patterns, with ResNet, DenseNet, and FractalNet blocks as special cases. We validate our method (SMASH) on CIFAR-10, CIFAR-100, STL-10, ModelNet10, and ImageNet32x32, achieving competitive performance with similarly sized hand-designed networks. Our code is available at https://github.com/ajbrock/SMASH
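
    The search procedure in the abstract can be summarized by the schematic loop below. All callables (sample_architecture, train_step, val_loss, and the hypernet's generate method) are hypothetical stand-ins for illustration; the real SMASH encodes architectures as tensors and uses a more elaborate HyperNet that is not reproduced here.

```python
# Schematic one-shot architecture search loop in the spirit of SMASH.
# Every callable passed in is a hypothetical stand-in, not the paper's API.
def smash_search(hypernet, sample_architecture, train_step, val_loss,
                 n_candidates=100, train_iters=10000):
    # Phase 1: train the HyperNet so that, for a randomly sampled architecture,
    # the weights it generates give low training loss on the main task.
    for _ in range(train_iters):
        arch = sample_architecture()          # random connectivity pattern
        weights = hypernet.generate(arch)     # weights conditioned on the architecture
        train_step(arch, weights)             # one optimization step through the HyperNet

    # Phase 2: rank many candidate architectures by validation loss using
    # HyperNet-generated weights, at roughly the cost of one training run.
    candidates = [sample_architecture() for _ in range(n_candidates)]
    best = min(candidates, key=lambda a: val_loss(a, hypernet.generate(a)))

    # Phase 3 (not shown here): retrain the selected architecture from scratch
    # with normally learned weights.
    return best
```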

    Unwritten

    Creation Myths

    The Singing
