Dual Long Short-Term Memory Networks for Sub-Character Representation Learning
Characters have commonly been regarded as the minimal processing unit in
Natural Language Processing (NLP). But many non-Latin languages have
hieroglyphic writing systems, involving a big alphabet with thousands or
millions of characters. Each character is composed of even smaller parts, which
previous work has often ignored. In this paper, we propose a novel
architecture employing two stacked Long Short-Term Memory Networks (LSTMs) to
learn sub-character level representations and capture a deeper level of semantic
meaning. To build a concrete study and substantiate the efficiency of our
neural architecture, we take Chinese Word Segmentation as a case study.
Chinese is a typical example of such languages: every
character contains several components called radicals. Our networks employ a
shared radical-level embedding to solve both Simplified and Traditional Chinese
Word Segmentation without extra Traditional-to-Simplified Chinese conversion.
This highly end-to-end design significantly simplifies word segmentation
compared to previous work. Radical-level embeddings can also
capture deeper semantic meaning below the character level and improve
learning performance. By tying radical and character embeddings together,
the parameter count is reduced while semantic knowledge is shared and
transferred between the two levels, substantially boosting performance. On 3 out of 4
Bakeoff 2005 datasets, our method surpassed state-of-the-art results by up to
0.4%. Our results are reproducible; source code and corpora are available on
GitHub.
Comment: Accepted & forthcoming at ITNG-201
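As a rough illustration of the shared radical-level input described in the abstract above, the sketch below maps a Traditional and a Simplified character onto one radical vocabulary. The decomposition table, the function names, and the integer ids are all assumptions made for this example; the abstract does not say which decomposition resource the authors use, and the LSTMs that would consume these ids are not shown.

```python
# Hand-written, hypothetical decomposition table (illustration only).
# A real system would load decompositions from a dictionary resource.
DECOMPOSITION = {
    "語": ["言", "五", "口"],  # Traditional form
    "语": ["讠", "五", "口"],  # Simplified form
    "好": ["女", "子"],
}


def build_radical_vocab(table):
    """Assign one integer id per distinct radical, shared by all characters."""
    vocab = {}
    for parts in table.values():
        for part in parts:
            vocab.setdefault(part, len(vocab))
    return vocab


def to_radical_ids(text, table, vocab):
    """Turn each character of `text` into its list of radical ids."""
    return [[vocab[part] for part in table[ch]] for ch in text]


RADICAL_VOCAB = build_radical_vocab(DECOMPOSITION)
trad_ids = to_radical_ids("語", DECOMPOSITION, RADICAL_VOCAB)[0]
simp_ids = to_radical_ids("语", DECOMPOSITION, RADICAL_VOCAB)[0]

# Radicals that both script variants have in common share the same ids,
# and therefore would share the same embedding rows.
shared_ids = set(trad_ids) & set(simp_ids)
```

Because both variants index into the same vocabulary, the components they have in common would receive the same embedding, which is one plausible reading of how a single model can handle both scripts without a conversion step.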
SuperChat: Dialogue Generation by Transfer Learning from Vision to Language using Two-dimensional Word Embedding and Pretrained ImageNet CNN Models
The recent Super Characters method, which uses two-dimensional word
embedding, achieved state-of-the-art results in text classification tasks,
showcasing the promise of this new approach. This paper borrows the idea of
the Super Characters method and two-dimensional embedding, and proposes a method for
generating conversational responses for open-domain dialogues. The experimental
results on a public dataset show that the proposed SuperChat method generates
high-quality responses. An interactive demo is ready to show at the workshop.
Comment: 5 pages, 2 figures, 1 table. Accepted by CVPR2019 Language and Vision
Worksho
Text Processing Like Humans Do: Visually Attacking and Shielding NLP Systems
Visual modifications to text are often used to obfuscate offensive comments
in social media (e.g., "!d10t") or as a writing style ("1337" in "leet speak"),
among other scenarios. We consider this a new type of adversarial attack in
NLP, a setting to which humans are very robust, as our experiments with both
simple and more difficult visual input perturbations demonstrate. We then
investigate the impact of visual adversarial attacks on current NLP systems on
character-, word-, and sentence-level tasks, showing that both neural and
non-neural models are, in contrast to humans, extremely sensitive to such
attacks, suffering performance decreases of up to 82%. We then explore three
shielding methods (visual character embeddings, adversarial training, and
rule-based recovery) that substantially improve the robustness of the models.
However, the shielding methods still fall short of the performance achieved in
non-attack scenarios, which demonstrates the difficulty of dealing with visual
attacks.
Comment: Accepted as long paper at NAACL-2019; fixed one ungrammatical
sentenc
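To make the attack and the rule-based shielding idea above concrete, here is a minimal sketch of a leet-speak style perturbation and its inverse. The substitution table, the function names, and the perturbation probability `p` are assumptions for illustration; the abstract does not describe the paper's actual perturbation or recovery rules.

```python
import random

# Hypothetical substitution table: visually similar digit for each letter.
LEET = {"a": "4", "e": "3", "i": "1", "o": "0", "s": "5", "t": "7"}
# Inverse table used by the naive rule-based recovery.
RECOVER = {digit: letter for letter, digit in LEET.items()}


def perturb(text, p, rng):
    """Replace each substitutable character with probability p."""
    return "".join(
        LEET[ch] if ch in LEET and rng.random() < p else ch
        for ch in text
    )


def recover(text):
    """Map every known look-alike digit back to its letter."""
    return "".join(RECOVER.get(ch, ch) for ch in text)


rng = random.Random(0)
attacked = perturb("idiot", 1.0, rng)   # every eligible letter is replaced
restored = recover(attacked)
```

Note the limitation of this naive recovery: it also rewrites genuine digits (a real "10" would become "io"), which hints at why rule-based shielding alone may not restore non-attack performance.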
One-Shot Neural Cross-Lingual Transfer for Paradigm Completion
We present a novel cross-lingual transfer method for paradigm completion, the
task of mapping a lemma to its inflected forms, using a neural encoder-decoder
model, the state of the art for the monolingual task. We use labeled data from
a high-resource language to increase performance on a low-resource language. In
experiments on 21 language pairs from four different language families, we
obtain up to 58% higher accuracy than without transfer and show that even
zero-shot and one-shot learning are possible. We further find that the degree
of language relatedness strongly influences the ability to transfer
morphological knowledge.
Comment: Accepted at ACL 201