941 research outputs found

    LIG-CRIStAL System for the WMT17 Automatic Post-Editing Task

    This paper presents the LIG-CRIStAL submission to the shared Automatic Post-Editing task of WMT 2017. We propose two neural post-editing models: a monosource model with a task-specific attention mechanism, which performs particularly well in a low-resource scenario; and a chained architecture which makes use of the source sentence to provide extra context. This latter architecture slightly improves our results when more training data is available. We present and discuss our results on the two datasets (en-de and de-en) that are made available for the task.
    Comment: keywords: neural post-editing, attention model
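    As a purely illustrative reading of the chained architecture (not the authors' code; all layer sizes and names are assumptions), the sketch below shows a decoder that attends over encodings of both the machine-translated sentence and the source sentence, so the source provides the extra context mentioned in the abstract.

        # Hypothetical sketch of a chained neural post-editing model: the decoder
        # attends over the MT output and the source sentence while generating the
        # post-edited sentence (teacher forcing).
        import torch
        import torch.nn as nn

        class ChainedPostEditor(nn.Module):
            def __init__(self, vocab_size, emb_dim=128, hid_dim=256):
                super().__init__()
                self.embed = nn.Embedding(vocab_size, emb_dim)
                self.enc_mt = nn.GRU(emb_dim, hid_dim, batch_first=True, bidirectional=True)
                self.enc_src = nn.GRU(emb_dim, hid_dim, batch_first=True, bidirectional=True)
                self.dec = nn.GRU(emb_dim, hid_dim, batch_first=True)
                self.attn_mt = nn.MultiheadAttention(hid_dim, 1, kdim=2 * hid_dim,
                                                     vdim=2 * hid_dim, batch_first=True)
                self.attn_src = nn.MultiheadAttention(hid_dim, 1, kdim=2 * hid_dim,
                                                      vdim=2 * hid_dim, batch_first=True)
                self.out = nn.Linear(3 * hid_dim, vocab_size)

            def forward(self, mt_ids, src_ids, prev_pe_ids):
                mt_states, _ = self.enc_mt(self.embed(mt_ids))      # (B, Tmt, 2H)
                src_states, _ = self.enc_src(self.embed(src_ids))   # (B, Tsrc, 2H)
                dec_states, _ = self.dec(self.embed(prev_pe_ids))   # (B, Tpe, H)
                ctx_mt, _ = self.attn_mt(dec_states, mt_states, mt_states)
                ctx_src, _ = self.attn_src(dec_states, src_states, src_states)
                return self.out(torch.cat([dec_states, ctx_mt, ctx_src], dim=-1))

        # Forward pass on toy batches of token ids.
        model = ChainedPostEditor(vocab_size=32000)
        logits = model(torch.randint(0, 32000, (2, 20)),
                       torch.randint(0, 32000, (2, 22)),
                       torch.randint(0, 32000, (2, 18)))
        print(logits.shape)  # torch.Size([2, 18, 32000])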

    Duality symmetries and effective dynamics in disordered hopping models

    We identify a duality transformation in one-dimensional hopping models that relates propagators in general disordered potentials linked by an up-down inversion of the energy landscape. This significantly generalises previous results on a duality between trap and barrier models. We use the resulting insights into the symmetries of these models to develop a real-space renormalisation scheme that can be implemented computationally and allows rather accurate prediction of propagation in these models. We also discuss the relation of this renormalisation scheme to earlier analytical treatments.
    Comment: 29 pages, 7 figures. Final version; some extra context and references added
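    For orientation only (textbook background, not the paper's derivation): a one-dimensional hopping model is a master equation on a lattice, and the trap and barrier models connected by the earlier duality differ in how the disordered energies enter the rates. Flipping the landscape upside down, E -> -E, turns deep traps into high barriers, which is the kind of correspondence the abstract says is generalised to arbitrary disordered potentials.

        \begin{align}
          \dot p_i(t) &= \sum_{j = i \pm 1} \bigl[\, w_{j \to i}\, p_j(t) - w_{i \to j}\, p_i(t) \,\bigr], \\
          \text{trap model: } w_{i \to i \pm 1} &\propto e^{-\beta E_i},
          \qquad
          \text{barrier model: } w_{i \to i \pm 1} = w_{i \pm 1 \to i} \propto e^{-\beta B_{i, i \pm 1}} .
        \end{align}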

    Attentive Convolution: Equipping CNNs with RNN-style Attention Mechanisms

    In NLP, convolutional neural networks (CNNs) have benefited less than recurrent neural networks (RNNs) from attention mechanisms. We hypothesize that this is because attention in CNNs has mainly been implemented as attentive pooling (i.e., applied to pooling) rather than as attentive convolution (i.e., integrated into convolution). Convolution is the differentiator of CNNs in that it can powerfully model the higher-level representation of a word by taking into account its local fixed-size context in the input text t^x. In this work, we propose an attentive convolution network, ATTCONV. It extends the context scope of the convolution operation, deriving higher-level features for a word not only from its local context, but also from information extracted from nonlocal context by the attention mechanism commonly used in RNNs. This nonlocal context can come (i) from parts of the input text t^x that are distant or (ii) from extra (i.e., external) contexts t^y. Experiments on sentence modeling with zero-context (sentiment analysis), single-context (textual entailment) and multiple-context (claim verification) demonstrate the effectiveness of ATTCONV in sentence representation learning with the incorporation of context. In particular, attentive convolution outperforms attentive pooling and is a strong competitor to popular attentive RNNs.
    Comment: Camera-ready for TACL. 16 pages
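    A minimal sketch of the attentive-convolution idea (not the released ATTCONV code; the dot-product scoring, scaling and tanh nonlinearity are assumptions): each position's convolution input is its own embedding concatenated with an attention-weighted summary of a context text, so nonlocal information enters the convolution itself rather than only a pooling step.

        # Illustrative attentive convolution: attend from the focus text tx over a
        # context text ty, then convolve over [embedding ; attended context].
        import torch
        import torch.nn as nn
        import torch.nn.functional as F

        class AttentiveConv(nn.Module):
            def __init__(self, dim=300, window=3):
                super().__init__()
                self.conv = nn.Conv1d(2 * dim, dim, kernel_size=window, padding=window // 2)

            def forward(self, tx, ty):
                # tx: (B, Tx, D) focus text; ty: (B, Ty, D) context (may be tx itself)
                scores = torch.bmm(tx, ty.transpose(1, 2)) / tx.size(-1) ** 0.5
                attended = torch.bmm(F.softmax(scores, dim=-1), ty)   # (B, Tx, D)
                features = torch.cat([tx, attended], dim=-1)          # (B, Tx, 2D)
                return torch.tanh(self.conv(features.transpose(1, 2))).transpose(1, 2)

        # Self-contextual case: the sentence attends over itself.
        x = torch.randn(4, 20, 300)
        print(AttentiveConv()(x, x).shape)  # torch.Size([4, 20, 300])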

    Consumption context and personalization


    Focusing on the Big Picture: Insights into a Systems Approach to Deep Learning for Satellite Imagery

    Deep learning tasks are often complicated and require a variety of components working together efficiently to perform well. Due to the often large scale of these tasks, there is a need to iterate quickly in order to try a variety of methods and to find and fix bugs. While participating in IARPA's Functional Map of the World challenge, we identified challenges along the entire deep learning pipeline and found various solutions to them. In this paper, we present the performance, engineering, and deep learning considerations involved in processing and modeling the data, as well as the underlying infrastructure considerations that support large-scale deep learning tasks. We also discuss insights and observations with regard to satellite imagery and deep learning for image classification.
    Comment: Accepted to IEEE Big Data 201

    The Guppy Effect as Interference

    People use conjunctions and disjunctions of concepts in ways that violate the rules of classical logic, such as the law of compositionality. Specifically, they overextend conjunctions of concepts, a phenomenon referred to as the Guppy Effect. We build on previous efforts to develop a quantum model that explains the Guppy Effect in terms of interference. Using a well-studied data set with 16 exemplars that exhibit the Guppy Effect, we develop a 17-dimensional complex Hilbert space H that models the data and demonstrates the relationship between overextension and interference. We view the interference effect not as a logical fallacy on the conjunction, but as a signal that a new concept has emerged out of the two constituent concepts.
    Comment: 10 pages
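    For readers unfamiliar with the quantum-cognition formalism, the generic form of such interference models (the paper's specific 17-dimensional construction aside) represents the two concepts by orthogonal unit vectors \(\psi_A, \psi_B\), measures membership of an exemplar with a projector \(M\), and models the conjunction by the superposition state, so that

        \begin{equation}
          \mu(A \text{ and } B)
          = \Bigl\langle \tfrac{\psi_A + \psi_B}{\sqrt{2}} \Bigm| M \Bigm| \tfrac{\psi_A + \psi_B}{\sqrt{2}} \Bigr\rangle
          = \frac{\mu(A) + \mu(B)}{2} + \operatorname{Re}\,\langle \psi_A | M | \psi_B \rangle .
        \end{equation}

    Overextension of the conjunction then corresponds to a large positive interference term \(\operatorname{Re}\,\langle \psi_A | M | \psi_B \rangle\).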

    Retrieval-Augmented Classification with Decoupled Representation

    Retrieval-augmented methods have shown promising results on various classification tasks. However, existing methods focus on retrieving extra context to enrich the input, which is noise-sensitive and non-expandable. In this paper, following this line, we propose a k-nearest-neighbor (kNN) based method for retrieval-augmented classification, which interpolates the predicted label distribution with the label distributions of retrieved instances. Different from the standard kNN process, we propose a decoupling mechanism, as we find that a shared representation for classification and retrieval hurts performance and leads to training instability. We evaluate our method on a wide range of classification datasets. Experimental results demonstrate the effectiveness and robustness of our proposed method. We also conduct extra experiments to analyze the contributions of different components in our model. Code: https://github.com/xnliang98/knn-cls-w-decoupling
    Comment: preprint
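    A hedged sketch of the interpolation step described above (the distance-to-weight mapping, hyper-parameters, and the explicit two-representation split are illustrative choices, not the paper's exact design): the classifier's label distribution is mixed with a distribution built from the labels of neighbours retrieved with a separate representation.

        # kNN-augmented classification: interpolate the classifier's label
        # distribution with one derived from retrieved neighbours' labels.
        # Retrieval uses its own ("decoupled") representation, passed in separately.
        import numpy as np

        def knn_interpolated_probs(cls_logits, retr_query, retr_keys, key_labels,
                                   n_classes, k=8, lam=0.5, temp=1.0):
            p_cls = np.exp(cls_logits - cls_logits.max())
            p_cls /= p_cls.sum()                                    # classifier distribution

            dists = np.linalg.norm(retr_keys - retr_query, axis=1)  # retrieval space
            nn_idx = np.argsort(dists)[:k]                          # k nearest neighbours

            w = np.exp(-dists[nn_idx] / temp)                       # distance -> weight
            p_knn = np.zeros(n_classes)
            for idx, weight in zip(nn_idx, w):
                p_knn[key_labels[idx]] += weight
            p_knn /= p_knn.sum()                                    # neighbour label distribution

            return lam * p_cls + (1 - lam) * p_knn                  # interpolated prediction

        rng = np.random.default_rng(0)
        probs = knn_interpolated_probs(
            cls_logits=rng.normal(size=5),
            retr_query=rng.normal(size=64),
            retr_keys=rng.normal(size=(100, 64)),
            key_labels=rng.integers(0, 5, size=100),
            n_classes=5,
        )
        print(round(probs.sum(), 6))  # 1.0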