8 research outputs found

    Selective Attention for Context-aware Neural Machine Translation

    Despite the progress made in sentence-level NMT, current systems still fall short of achieving fluent, good-quality translation for a full document. Recent works in context-aware NMT consider only a few previous sentences as context and may not scale to entire documents. To this end, we propose a novel and scalable top-down approach to hierarchical attention for context-aware NMT, which uses sparse attention to selectively focus on relevant sentences in the document context and then attends to key words in those sentences. We also propose single-level attention approaches based on sentence-level or word-level information in the context. The document-level context representation produced by these attention modules is integrated into the encoder or decoder of the Transformer model, depending on whether we use monolingual or bilingual context. Our experiments and evaluation on English-German datasets in different document MT settings show that our selective attention approach not only significantly outperforms context-agnostic baselines but also surpasses context-aware baselines in most cases.
    Comment: Accepted at NAACL-HLT 201

    Leveraging Unannotated Texts for Scientific Relation Extraction


    Semantic Feature Extraction Using Multi-Sense Embeddings and Lexical Chains

    The relationships between words in a sentence often tell us more about the underlying semantic content of a document than the individual words themselves. In the last few years, natural language understanding has seen increasing effort in developing techniques that produce non-trivial features, especially after robust word embedding models became prominent by proving able to capture and represent semantic relationships from massive amounts of data. These dense vector representations do raise the baseline in natural language processing, but they still fall short in dealing with intrinsic linguistic issues such as polysemy and homonymy. Systems that have natural language at their core can be affected by a weak semantic representation of human language, resulting in inaccurate outcomes based on poor decisions. In this area, word sense disambiguation and lexical chains have been explored as alternatives to alleviate several problems in linguistics, such as semantic representation, definitions, differentiation, polysemy, and homonymy. However, little effort has gone into combining recent advances in token embeddings (e.g., words, documents) with word sense disambiguation and lexical chains. To help build a bridge between these areas, this work proposes, as its main contributions, a collection of algorithms for extracting semantic features from large corpora: MSSA, MSSA-D, MSSA-NR, FLLC II, and FXLC II. The MSSA techniques focus on disambiguating and annotating each word by its specific sense, considering the semantic effects of its context. The lexical chain algorithms derive the semantic relations between consecutive words in a document in a dynamic and pre-defined manner. These techniques aim to uncover the implicit semantic links between words using their lexical structure, incorporating multi-sense embeddings, word sense disambiguation, lexical chains, and lexical databases. The contributions are validated on a few natural language problems, in which our techniques outperform state-of-the-art systems. All the proposed algorithms can be used separately as independent components or combined in a single system to improve the semantic representation of words, sentences, and documents. They can also be applied recurrently to refine their results further.
    Ph.D. College of Engineering & Computer Science, University of Michigan-Dearborn. https://deepblue.lib.umich.edu/bitstream/2027.42/149647/1/Terry Ruas Final Dissertation.pdf Description of Terry Ruas Final Dissertation.pdf : Dissertatio

    Bibliographie annuelle: recherche suisse sur le plurilinguisme 2017

    The Annual Bibliography of Swiss Research on Multilingualism contains a selection of scholarly publications in linguistics, sociology, pedagogy, and other fields related to multilingualism. This issue contains bibliographic information on publications from the year 2017. It includes journal articles, book chapters, monographs, edited volumes, and online documents by researchers at Swiss institutions, as well as publications that international researchers have contributed to certain Swiss specialist journals. The bibliography catalogues publications in Switzerland's national languages and in English. (The original record gives this abstract in French, English, German, and Italian.)