Character-level features are currently used in different neural network-based
natural language processing algorithms. However, little is known about the
character-level patterns those models learn. Moreover, models are often
compared only quantitatively while a qualitative analysis is missing. In this
paper, we investigate which character-level patterns neural networks learn and
if those patterns coincide with manually-defined word segmentations and
annotations. To that end, we extend the contextual decomposition technique
(Murdoch et al. 2018) to convolutional neural networks which allows us to
compare convolutional neural networks and bidirectional long short-term memory
networks. We evaluate and compare these models for the task of morphological
tagging on three morphologically different languages and show that these models
implicitly discover understandable linguistic rules. Our implementation can be
found at https://github.com/FredericGodin/ContextualDecomposition-NLP .Comment: Accepted at EMNLP 201