Words can be represented by composing the representations of subword units
such as word segments, characters, and/or character n-grams. While such
representations are effective and may capture the morphological regularities of
words, they have not been systematically compared, and it is not understood how
they interact with different morphological typologies. On a language modeling
task, we present experiments that systematically vary (1) the basic unit of
representation, (2) the composition of these representations, and (3) the
morphological typology of the language modeled. Our results extend previous
findings that character representations are effective across typologies, and we
find that a previously unstudied combination of character trigram
representations composed with bi-LSTMs outperforms most others. But we also
find room for improvement: none of the character-level models match the
predictive accuracy of a model with access to true morphological analyses, even
when learned from an order of magnitude more data.

Comment: Accepted at ACL 201