20 research outputs found
Representation Learning for Natural Language Processing
This open access book provides an overview of the recent advances in representation learning theory, algorithms and applications for natural language processing (NLP). It is divided into three parts. Part I presents the representation learning techniques for multiple language entries, including words, phrases, sentences and documents. Part II then introduces the representation techniques for those objects that are closely related to NLP, including entity-based world knowledge, sememe-based linguistic knowledge, networks, and cross-modal entries. Lastly, Part III provides open resource tools for representation learning techniques, and discusses the remaining challenges and future research directions. The theories and algorithms of representation learning presented can also benefit other related domains such as machine learning, social network analysis, semantic Web, information retrieval, data mining and computational biology. This book is intended for advanced undergraduate and graduate students, post-doctoral fellows, researchers, lecturers, and industrial engineers, as well as anyone interested in representation learning and natural language processing.
SememeASR: Boosting Performance of End-to-End Speech Recognition against Domain and Long-Tailed Data Shift with Sememe Semantic Knowledge
Recently, excellent progress has been made in speech recognition. However, purely data-driven approaches still struggle with domain mismatch and long-tailed data. Considering that knowledge-driven approaches can help data-driven approaches alleviate these flaws, we introduce sememe-based semantic knowledge into speech recognition (SememeASR). A sememe, by linguistic definition, is the minimum semantic unit of a language and can effectively represent the implicit semantic information behind each word. Our experiments show that introducing sememe information improves the effectiveness of speech recognition. In addition, further experiments show that sememe knowledge improves the model's recognition of long-tailed data and enhances the model's domain generalization
ability.
Comment: Accepted by INTERSPEECH 2023
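The abstract does not specify how the sememe knowledge is injected into the ASR model. As one illustration of the general idea, here is a minimal sketch of composing word representations from shared sememe embeddings; the toy lexicon, dimensions, and averaging scheme are illustrative assumptions, not the actual SememeASR architecture. Because long-tailed words share sememes with frequent words, they can still receive informative representations.

```python
# Minimal sketch: composing word representations from sememe embeddings.
# The lexicon and the averaging scheme are illustrative assumptions,
# not the actual SememeASR architecture.
import numpy as np

# Hypothetical HowNet-style sememe lexicon mapping words to sememes.
SEMEME_LEXICON = {
    "doctor": ["human", "occupation", "medical"],
    "nurse": ["human", "occupation", "medical"],
    "scalpel": ["tool", "medical"],  # a rare, long-tailed word
}

rng = np.random.default_rng(0)
DIM = 8
# One embedding per sememe; words sharing a sememe share its vector.
sememe_vecs = {s: rng.normal(size=DIM)
               for ss in SEMEME_LEXICON.values() for s in ss}

def word_representation(word: str) -> np.ndarray:
    """Average the embeddings of a word's sememes. Rare words still get
    informative vectors because their sememes also occur in common words."""
    return np.mean([sememe_vecs[s] for s in SEMEME_LEXICON[word]], axis=0)

# "scalpel" inherits tool/medical semantics even if it was rarely
# observed in the ASR training data.
print(word_representation("scalpel"))
```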
Word-level Textual Adversarial Attacking as Combinatorial Optimization
Adversarial attacks are carried out to reveal the vulnerability of deep
neural networks. Textual adversarial attacking is challenging because text is
discrete and a small perturbation can bring significant change to the original
input. Word-level attacking, which can be regarded as a combinatorial
optimization problem, is a well-studied class of textual attack methods.
However, existing word-level attack models are far from perfect, largely because they employ unsuitable search-space reduction methods and inefficient optimization algorithms. In this paper, we propose a novel attack model that
incorporates the sememe-based word substitution method and particle swarm
optimization-based search algorithm to solve the two problems separately. We
conduct exhaustive experiments to evaluate our attack model by attacking BiLSTM
and BERT on three benchmark datasets. Experimental results demonstrate that our
model consistently achieves much higher attack success rates and crafts higher-quality adversarial examples than baseline methods. Further experiments also show that our model has higher transferability and brings greater robustness enhancement to victim models through adversarial training. All the
code and data of this paper can be obtained from
https://github.com/thunlp/SememePSO-Attack.
Comment: Accepted at ACL 2020 as a long paper (a typo is corrected compared with the official conference camera-ready version). 16 pages, 3 figures.
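The abstract names the two components (sememe-based substitution and a PSO-based search) without detailing them. Below is a toy sketch of discrete particle swarm optimization over per-position substitution candidates. The candidate lists and the attack_score stub are placeholders (a real attack would query the victim BiLSTM/BERT), and the update rule is a simplification of the paper's algorithm.

```python
# Toy sketch: word-level attacking as discrete particle swarm optimization.
# Candidates would come from a sememe lexicon (words sharing all sememes);
# here they are hard-coded, and the victim-model score is a stub.
import random

random.seed(0)

sentence = ["the", "movie", "was", "great"]
# Hypothetical sememe-based substitution candidates per position.
candidates = [["the"], ["movie", "film"], ["was"], ["great", "fine", "nice"]]

def attack_score(words):
    """Placeholder for the victim model: higher = closer to fooling it.
    A real attack would return the probability of the target class."""
    return sum(w != o for w, o in zip(words, sentence)) * random.random()

def decode(p):
    return [candidates[i][j] for i, j in enumerate(p)]

def pso_attack(n_particles=8, iters=20, p_pbest=0.5, p_gbest=0.3):
    # Each particle is one candidate sentence, encoded as indices.
    particles = [[random.randrange(len(c)) for c in candidates]
                 for _ in range(n_particles)]
    pbest = [p[:] for p in particles]
    pbest_score = [attack_score(decode(p)) for p in particles]
    g = max(range(n_particles), key=lambda k: pbest_score[k])
    gbest, gbest_score = pbest[g][:], pbest_score[g]
    for _ in range(iters):
        for k, p in enumerate(particles):
            for i in range(len(p)):
                # Simplified "velocity": probabilistically copy the
                # personal or global best choice at each position.
                r = random.random()
                if r < p_gbest:
                    p[i] = gbest[i]
                elif r < p_gbest + p_pbest:
                    p[i] = pbest[k][i]
                else:
                    p[i] = random.randrange(len(candidates[i]))
            s = attack_score(decode(p))
            if s > pbest_score[k]:
                pbest[k], pbest_score[k] = p[:], s
                if s > gbest_score:
                    gbest, gbest_score = p[:], s
    return decode(gbest), gbest_score

print(pso_attack())
```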
Knowledge-Augmented Language Model and its Application to Unsupervised Named-Entity Recognition
Traditional language models are unable to efficiently model entity names
observed in text. All but the most popular named entities appear infrequently
in text, providing insufficient context. Recent efforts have recognized that
context can be generalized between entity names that share the same type (e.g.,
"person" or "location") and have equipped language models with access
to an external knowledge base (KB). Our Knowledge-Augmented Language Model
(KALM) continues this line of work by augmenting a traditional model with a KB.
Unlike previous methods, however, we train with an end-to-end predictive
objective optimizing the perplexity of text. We do not require any additional
information such as named entity tags. In addition to improving language
modeling performance, KALM learns to recognize named entities in an entirely
unsupervised way by using entity type information latent in the model. On a
Named Entity Recognition (NER) task, KALM achieves performance comparable with
state-of-the-art supervised models. Our work demonstrates that named entities
(and possibly other types of world knowledge) can be modeled successfully using
predictive learning and training on large corpora of text without any
additional information.
Comment: NAACL 2019; updated to cite Zhou et al. (2018) EMNLP as a piece of related work.
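The abstract describes KALM as exploiting entity type information latent in the model; the sketch below shows the underlying idea of marginalizing a latent entity type when predicting the next word. The toy vocabulary, type set, KB mask, and random projections are illustrative assumptions; in the real model these components are learned end-to-end from the perplexity objective, and the posterior over types doubles as an unsupervised NER signal.

```python
# Minimal sketch of a latent entity-type mixture for next-word prediction.
# Shapes, the toy KB, and random weights are illustrative, not KALM's.
import numpy as np

rng = np.random.default_rng(1)

VOCAB = ["the", "visited", "paris", "obama", "said"]
TYPES = ["general", "person", "location"]
# Toy KB: which vocabulary items can realize each type.
TYPE_MASK = np.array([
    [1, 1, 0, 0, 1],   # general words
    [0, 0, 0, 1, 0],   # person names
    [0, 0, 1, 0, 0],   # location names
], dtype=float)

DIM = 16
W_type = rng.normal(size=(len(TYPES), DIM))              # type scorer
W_word = rng.normal(size=(len(TYPES), len(VOCAB), DIM))  # per-type word scorer

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def next_word_dist(h):
    """P(w | h) = sum_t P(t | h) * P(w | t, h), marginalizing the latent
    type t. After training, P(t | h) can be read off as an NER signal."""
    p_type = softmax(W_type @ h)                     # P(t | h), shape (types,)
    logits = W_word @ h                              # shape (types, vocab)
    logits = np.where(TYPE_MASK > 0, logits, -1e9)   # restrict to KB words
    p_word_given_type = softmax(logits, axis=-1)     # P(w | t, h)
    return p_type @ p_word_given_type                # marginalize over types

h = rng.normal(size=DIM)  # stand-in for an LSTM/Transformer hidden state
print(dict(zip(VOCAB, next_word_dist(h).round(3))))
```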