Adapting Sequence to Sequence models for Text Normalization in Social Media
Social media offer an abundant source of valuable raw data, however informal
writing can quickly become a bottleneck for many natural language processing
(NLP) tasks. Off-the-shelf tools are usually trained on formal text and cannot
explicitly handle noise found in short online posts. Moreover, the variety of
frequently occurring linguistic variations presents several challenges, even
for humans who might not be able to comprehend the meaning of such posts,
especially when they contain slang and abbreviations. Text Normalization aims
to transform online user-generated text to a canonical form. Current text
normalization systems rely on string or phonetic similarity and classification
models that work in a local fashion. We argue that processing contextual
information is crucial for this task and introduce a social media text
normalization hybrid word-character attention-based encoder-decoder model that
can serve as a pre-processing step for NLP applications to adapt to noisy text
in social media. Our character-based component is trained on synthetic
adversarial examples that are designed to capture errors commonly found in
online user-generated text. Experiments show that our model surpasses neural
architectures designed for text normalization and achieves comparable
performance with state-of-the-art related work.
Comment: Accepted at the 13th International AAAI Conference on Web and Social Media (ICWSM 2019).
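The abstract mentions training the character-based component on synthetic examples that mimic noise in online text. As a minimal sketch of what such noise generation could look like, the perturbations below (vowel dropping, expressive lengthening, adjacent-character swaps) are illustrative assumptions, not the paper's actual procedure:

```python
import random

# Illustrative perturbations commonly seen in social media text. The exact
# noise model of the paper is not given in the abstract, so these rules are
# assumptions for demonstration only.
def drop_vowels(word):
    # abbreviation-style noise: "tomorrow" -> "tmrrw"
    return word[0] + "".join(c for c in word[1:] if c not in "aeiou")

def repeat_last(word):
    # expressive lengthening: "so" -> "sooo"
    return word + word[-1] * 2

def swap_adjacent(word, rng):
    # keyboard-style transposition: "the" -> "teh"
    if len(word) < 3:
        return word
    i = rng.randrange(len(word) - 1)
    return word[:i] + word[i + 1] + word[i] + word[i + 2:]

def make_synthetic_pair(clean_word, rng=None):
    """Return a (noisy, clean) training pair for a character-level normalizer."""
    rng = rng or random.Random(0)
    noise_fns = [drop_vowels, repeat_last, lambda w: swap_adjacent(w, rng)]
    noisy = rng.choice(noise_fns)(clean_word)
    return noisy, clean_word
```

Pairs produced this way can supervise a character-level encoder-decoder to map the noisy form back to its canonical spelling.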
Improving character-based decoding using target-side morphological information for neural machine translation
Recently, neural machine translation (NMT) has emerged as a powerful alternative to conventional statistical approaches. However, its performance drops considerably in the presence of morphologically rich languages (MRLs). Neural engines usually fail to tackle the large vocabulary and high out-of-vocabulary (OOV) word rate of MRLs. Therefore, it is not suitable to exploit existing word-based models to translate this set of languages. In this paper, we propose an extension to the state-of-the-art model of Chung et al. (2016), which works at the character level and boosts the decoder with target-side morphological information. In our architecture, an additional morphology table is plugged into the model. Each time the decoder samples from a target vocabulary, the table sends auxiliary signals from the most relevant affixes in order to enrich the decoder's current state and constrain it to provide better predictions. We evaluated our model on translation from English into German, Russian, and Turkish as three MRLs and observed significant improvements
- …
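The mechanism the abstract describes, selecting the most relevant affixes from a morphology table and using them to enrich the decoder state, resembles an attention step over affix embeddings. The sketch below is a rough NumPy illustration under that assumption; the function name, projections, and dimensions are hypothetical, not the paper's exact architecture:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def augment_decoder_state(state, affix_table, W_q, W_o):
    """Sketch: attend over affix embeddings and mix the summary into the state.

    state:       (d,)   current decoder hidden state
    affix_table: (n, d) one embedding per target-side affix
    W_q, W_o:    (d, d) learned projections (random here, for illustration)
    """
    query = W_q @ state              # project the state into a query
    scores = affix_table @ query     # relevance score for each affix
    weights = softmax(scores)        # attention distribution over the table
    summary = weights @ affix_table  # weighted affix summary, shape (d,)
    return state + W_o @ summary     # enriched decoder state

# Toy usage with random parameters.
rng = np.random.default_rng(0)
d, n = 8, 5
state = rng.standard_normal(d)
table = rng.standard_normal((n, d))
new_state = augment_decoder_state(state, table,
                                  rng.standard_normal((d, d)),
                                  rng.standard_normal((d, d)))
```

The enriched state would then feed the decoder's output layer, biasing predictions toward morphologically plausible continuations.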