Character n-gram Embeddings to Improve RNN Language Models
This paper proposes a novel Recurrent Neural Network (RNN) language model
that takes advantage of character information. We focus on character n-grams
based on research in the field of word embedding construction (Wieting et al.,
2016). Our proposed method constructs word embeddings from character n-gram
embeddings and combines them with ordinary word embeddings. We demonstrate that
the proposed method achieves the best perplexities on the language modeling
datasets: Penn Treebank, WikiText-2, and WikiText-103. Moreover, we conduct
experiments on application tasks: machine translation and headline generation.
The experimental results indicate that our proposed method also positively
affects these tasks.
Comment: AAAI 2019 paper
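As a rough illustration of the idea (not the paper's exact architecture), the sketch below builds a word representation by summing the embeddings of the word's character n-grams and adding the result to an ordinary word embedding. The n-gram range, the additive combination, and all names here are assumptions for illustration only.

```python
import numpy as np

EMB_DIM = 300  # assumed embedding dimensionality

def char_ngrams(word, n_min=2, n_max=3):
    """Enumerate character n-grams of a word, with ^/$ as boundary markers."""
    padded = f"^{word}$"
    return [padded[i:i + n]
            for n in range(n_min, n_max + 1)
            for i in range(len(padded) - n + 1)]

def compose_embedding(word, word_emb, ngram_emb):
    """Combine an ordinary word embedding with the sum of its character
    n-gram embeddings (simple addition here; the paper's combination
    may differ, e.g. concatenation or gating)."""
    vec = word_emb.get(word, np.zeros(EMB_DIM)).copy()
    for g in char_ngrams(word):
        vec += ngram_emb.get(g, np.zeros(EMB_DIM))
    return vec
```

For example, `compose_embedding("cat", word_emb, ngram_emb)` sums the embeddings of "^c", "ca", "at", "t$", "^ca", "cat", and "at$" with the word vector for "cat"; a word unseen at training time still receives a representation from its n-grams, which is the main benefit of character-level information.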
The Mechanism of Additive Composition
Additive composition (Foltz et al., 1998; Landauer and Dumais, 1997; Mitchell
and Lapata, 2010) is a widely used method for computing meanings of phrases,
which takes the average of vector representations of the constituent words. In
this article, we prove an upper bound for the bias of additive composition,
which is the first theoretical analysis on compositional frameworks from a
machine learning point of view. The bound is written in terms of collocation
strength; we prove that the more exclusively two successive words tend to occur
together, the more accurately their additive composition can be guaranteed to
approximate the natural phrase vector. Our proof relies on properties of
natural language data that are empirically verified, and can be theoretically
derived from an assumption that the data is generated from a Hierarchical
Pitman-Yor Process. The theory endorses additive composition as a reasonable
operation for calculating meanings of phrases, and suggests ways to improve
additive compositionality, including: transforming entries of distributional
word vectors by a function that meets a specific condition, constructing a
novel type of vector representations to make additive composition sensitive to
word order, and utilizing singular value decomposition to train word vectors.Comment: More explanations on theory and additional experiments added.
Accepted by Machine Learning Journa
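For reference, the operation the paper analyzes is itself simple to state in code. The sketch below (all names assumed for illustration) computes a phrase vector as the average of its constituent word vectors:

```python
import numpy as np

def additive_composition(phrase, word_vecs):
    """Approximate a phrase vector by averaging its words' vectors.
    `word_vecs` maps words to same-dimensional numpy arrays."""
    vecs = [word_vecs[w] for w in phrase.split() if w in word_vecs]
    if not vecs:
        raise KeyError(f"no known words in phrase: {phrase!r}")
    return np.mean(vecs, axis=0)
```

Per the bound, this average approximates the phrase's own distributional vector more closely the more exclusively the constituent words co-occur, so a strong collocation such as "ice cream" is composed more faithfully than a loose pairing.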