Learning Character-level Compositionality with Visual Features
Previous work has modeled the compositionality of words by creating
character-level models of meaning, reducing problems of sparsity for rare
words. However, in many writing systems compositionality has an effect even at
the character level: the meaning of a character is derived from the sum of its
parts. In this paper, we model this effect by creating embeddings for
characters based on their visual characteristics, creating an image for the
character and running it through a convolutional neural network to produce a
visual character embedding. Experiments on a text classification task
demonstrate that such a model allows for better processing of instances with rare
characters in languages such as Chinese, Japanese, and Korean. Additionally,
qualitative analyses demonstrate that our proposed model learns to focus on the
parts of characters that carry semantic content, resulting in embeddings that
are coherent in visual space.
Comment: Accepted to ACL 201
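The pipeline this abstract describes (character image, then convolution, then embedding vector) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the paper's implementation: the 5x5 bitmaps in `GLYPHS` are hand-drawn stand-ins for rendered character images, the filters from `make_kernels` are random and untrained, and the helper names (`to_image`, `conv2d_valid`, `embed`) are hypothetical. Only the Python standard library is used.

```python
import random

# Hypothetical 5x5 bitmaps standing in for rendered character images.
# A real pipeline would rasterize each character with a font instead.
GLYPHS = {
    "十": ["..#..",
           "..#..",
           "#####",
           "..#..",
           "..#.."],
    "土": ["..#..",
           ".###.",
           "..#..",
           "..#..",
           "#####"],
}


def to_image(rows):
    """Convert '#'/'.' rows into a 2D list of 1.0/0.0 pixel values."""
    return [[1.0 if ch == "#" else 0.0 for ch in row] for row in rows]


def conv2d_valid(image, kernel):
    """Slide a square kernel over the image without padding
    (cross-correlation, as in most deep learning libraries)."""
    k = len(kernel)
    height, width = len(image), len(image[0])
    out = []
    for i in range(height - k + 1):
        row = []
        for j in range(width - k + 1):
            row.append(sum(image[i + a][j + b] * kernel[a][b]
                           for a in range(k) for b in range(k)))
        out.append(row)
    return out


def make_kernels(n, size=3, seed=0):
    """Create n random (untrained) square filters; assumed for illustration."""
    rng = random.Random(seed)
    return [[[rng.uniform(-1.0, 1.0) for _ in range(size)]
             for _ in range(size)]
            for _ in range(n)]


def embed(char, kernels):
    """One embedding dimension per filter: convolve, apply ReLU,
    then take the global maximum of the feature map."""
    image = to_image(GLYPHS[char])
    embedding = []
    for kernel in kernels:
        feature_map = conv2d_valid(image, kernel)
        embedding.append(max(max(0.0, v) for row in feature_map for v in row))
    return embedding


kernels = make_kernels(4)
for ch in GLYPHS:
    print(ch, [round(x, 3) for x in embed(ch, kernels)])
```

Because the two glyphs share strokes, some filters respond to both, which is the intuition behind visually derived embeddings helping with rare characters. In a trained model the filters would be learned jointly with the downstream classifier rather than sampled at random.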
Moses-based official baseline for NEWS 2016
Transliteration is the phonetic translation between two different languages. Many works approach transliteration using machine translation methods. This paper describes the official baseline system for the NEWS 2016 workshop shared task. The baseline is a standard phrase-based machine translation system built with Moses. Its results fall between the best and worst results from last year’s workshop, providing a useful starting point for this year's participants.
Postprint (published version