Attention Is All You Need for Chinese Word Segmentation
Taking the greedy decoding algorithm as given, this work focuses on further strengthening the model itself for Chinese word segmentation (CWS), which results in an even faster and more accurate CWS model. Our model consists of an attention-only stacked encoder and a decoder light enough for greedy segmentation, plus two highway connections for smoother training. The encoder is composed of a newly proposed Transformer variant, the Gaussian-masked Directional (GD) Transformer, and a biaffine attention scorer. With this effective encoder design, our model needs only unigram features for scoring. The model is evaluated on the SIGHAN Bakeoff benchmark datasets. The experimental results show that, with the highest segmentation speed, the proposed model achieves new state-of-the-art or comparable performance against strong baselines under the strict closed-test setting.
Comment: 11 pages, to appear in EMNLP 2020 as a long paper
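For intuition, here is a minimal sketch of what a Gaussian-masked directional attention head could look like: standard scaled dot-product attention with a Gaussian locality bias over token distance and a one-sided mask. The bias form and all names are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def gaussian_masked_attention(q, k, v, sigma=2.0, direction="forward"):
    """Single-head attention with a Gaussian locality bias and a directional
    mask (a sketch of the GD-Transformer idea, assuming a log-space additive
    bias; not the paper's exact formulation)."""
    n, d = q.shape
    logits = q @ k.T / np.sqrt(d)                      # scaled dot-product scores
    pos = np.arange(n)
    dist = pos[None, :] - pos[:, None]                 # signed distance j - i
    gauss = np.exp(-dist.astype(float) ** 2 / (2 * sigma ** 2))
    logits += np.log(gauss + 1e-9)                     # Gaussian distance penalty
    keep = dist >= 0 if direction == "forward" else dist <= 0
    logits = np.where(keep, logits, -1e9)              # one-sided (directional) mask
    w = np.exp(logits - logits.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)                 # softmax over key positions
    return w @ v

q = k = v = np.random.randn(5, 8)
out = gaussian_masked_attention(q, k, v)               # shape (5, 8)
```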
Head-Driven Phrase Structure Grammar Parsing on Penn Treebank
Head-driven phrase structure grammar (HPSG) enjoys a uniform formalism representing rich contextual syntactic and even semantic meanings. This paper makes the first attempt to formulate a simplified HPSG by integrating constituent and dependency formal representations into head-driven phrase structure. Two parsing algorithms are then proposed, one for each of two converted tree representations: division span and joint span. As HPSG encodes both constituent and dependency structure information, the proposed HPSG parsers may be regarded as a sort of joint decoder for both types of structures and are thus evaluated in terms of extracted or converted constituent and dependency parse trees. Our parser achieves new state-of-the-art performance for both parsing tasks on the Penn Treebank (PTB) and the Chinese Penn Treebank, verifying the effectiveness of jointly learning constituent and dependency structures. Specifically, we report 96.33 F1 for constituent parsing and 97.20% UAS for dependency parsing on PTB.
Comment: Accepted by ACL 2019
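As a concrete reading of the "joint span" representation, the sketch below shows one plausible data structure in which a constituent span also records its lexical head, so that dependency arcs can be read off the same tree. Field names and the arc-extraction rule are assumptions for illustration, not the paper's definitions.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class JointSpan:
    """A constituent span that also records its lexical head, so one tree
    carries both constituent and dependency information."""
    left: int                                   # span start (word index, inclusive)
    right: int                                  # span end (exclusive)
    label: str                                  # constituent label, e.g. "NP"
    head: int                                   # head word index within [left, right)
    children: List["JointSpan"] = field(default_factory=list)

    def dependency_arcs(self):
        """Each child's head attaches to its parent's head unless they coincide."""
        arcs = []
        for c in self.children:
            if c.head != self.head:
                arcs.append((self.head, c.head))  # (governor, dependent)
            arcs.extend(c.dependency_arcs())
        return arcs

np_ = JointSpan(0, 2, "NP", 1)
vp_ = JointSpan(2, 4, "VP", 2)
s = JointSpan(0, 4, "S", 2, [np_, vp_])
print(s.dependency_arcs())                      # [(2, 1)]
```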
Exploring Lexical, Syntactic, and Semantic Features for Chinese Textual Entailment in NTCIR RITE Evaluation Tasks
We computed linguistic information at the lexical, syntactic, and semantic levels for the Recognizing Inference in Text (RITE) tasks for both traditional and simplified Chinese in NTCIR-9 and NTCIR-10. Techniques for syntactic parsing, named-entity recognition, and near-synonym recognition were employed, and features such as counts of common words, statement lengths, negation words, and antonyms were considered to judge the entailment relationship between two statements, while we explored both heuristics-based functions and machine-learning approaches. The reported systems showed robustness by simultaneously achieving second place in the binary-classification subtasks for both simplified and traditional Chinese in NTCIR-10 RITE-2. We conducted further experiments on the test data of NTCIR-9 RITE, with good results. We also extended our work to search for better configurations of our classifiers and investigated the contributions of individual features. This extended work showed interesting results and should encourage further discussion.
Comment: 20 pages, 1 figure, 26 tables. Journal article in Soft Computing (Springer). Soft Computing, online. Springer, Germany, 201
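The surface features named in the abstract (common-word counts, statement lengths, negation words) translate directly into a feature extractor of the following shape; the negation list and exact feature set here are illustrative assumptions, not the paper's.

```python
# Toy negation list; the paper's resources are richer and also cover antonyms.
NEGATION_WORDS = {"不", "沒", "沒有", "非", "無"}

def entailment_features(premise_tokens, hypothesis_tokens):
    """Surface features for judging whether the premise entails the hypothesis."""
    p, h = set(premise_tokens), set(hypothesis_tokens)
    common = p & h
    return {
        "common_word_count": len(common),
        "common_word_ratio": len(common) / max(len(h), 1),
        "length_diff": len(premise_tokens) - len(hypothesis_tokens),
        "premise_negations": sum(t in NEGATION_WORDS for t in premise_tokens),
        "hypothesis_negations": sum(t in NEGATION_WORDS for t in hypothesis_tokens),
    }
```

Features like these can feed either the heuristic scoring functions or the machine-learning classifiers mentioned above.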
A Sequential Neural Encoder with Latent Structured Description for Modeling Sentences
In this paper, we propose a sequential neural encoder with latent structured
description (SNELSD) for modeling sentences. This model introduces latent
chunk-level representations into conventional sequential neural encoders, i.e.,
recurrent neural networks (RNNs) with long short-term memory (LSTM) units, to
consider the compositionality of languages in semantic modeling. An SNELSD
model has a hierarchical structure that includes a detection layer and a
description layer. The detection layer predicts the boundaries of latent word
chunks in an input sentence and derives a chunk-level vector for each word. The
description layer utilizes modified LSTM units to process these chunk-level
vectors in a recurrent manner and produces sequential encoding outputs. These
output vectors are further concatenated with word vectors or the outputs of a
chain LSTM encoder to obtain the final sentence representation. All the model
parameters are learned in an end-to-end manner without a dependency on
additional text chunking or syntax parsing. A natural language inference (NLI)
task and a sentiment analysis (SA) task are adopted to evaluate the performance
of our proposed model. The experimental results demonstrate the effectiveness
of the proposed SNELSD model in exploring task-dependent chunking patterns during the semantic modeling of sentences. Furthermore, the proposed method achieves better performance than conventional chain LSTMs and tree-structured LSTMs on both tasks.
Comment: Accepted by IEEE Transactions on Audio, Speech, and Language Processing
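To make the detection/description split concrete, here is a deliberately simplified sketch of the detection idea: a soft chunk-boundary probability per word gates a running chunk summary. The real description layer uses modified LSTM units; this running average, and all names here, are assumptions for illustration only.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def chunk_level_vectors(word_vecs, w_boundary):
    """For each word, predict a soft boundary probability and softly reset a
    running chunk summary at predicted chunk starts (a toy stand-in for
    SNELSD's modified-LSTM description layer)."""
    chunk_vecs, summary = [], np.zeros_like(word_vecs[0])
    for x in word_vecs:
        b = sigmoid(w_boundary @ x)                       # P(word starts a new chunk)
        summary = b * x + (1 - b) * 0.5 * (summary + x)   # soft reset vs. merge
        chunk_vecs.append(summary.copy())
    return np.stack(chunk_vecs)

words = np.random.randn(6, 4)                             # 6 words, dim 4
chunks = chunk_level_vectors(words, np.random.randn(4))   # shape (6, 4)
```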
Human Translation Vs Machine Translation: the Practitioner Phenomenology
The paper aimed at exploring the current phenomenon of human translation versus machine translation. Human translation (HT), by definition, is translation performed by a human translator rather than a machine. It is the oldest form of translation, relying on pure human intelligence to convert one way of saying things into another. Translation is necessary for the spread of information, knowledge, and ideas, and it is absolutely necessary for effective and empathetic communication between different cultures; translation, therefore, is critical for social harmony and peace. Only a human translator can render subtle nuances, because a machine translator performs direct word-for-word translation; machines have not yet advanced to the level of rendering these nuances accurately. There are different translation techniques, diverse theories about translation, and eight different types of translation services, including technical translation, judicial translation, and certified translation.
AI-Powered Text Generation for Harmonious Human-Machine Interaction: Current State and Future Directions
In the last two decades, the landscape of text generation has undergone tremendous changes and is being reshaped by the success of deep learning. New technologies for text generation, ranging from template-based methods to neural network-based methods, have emerged. Meanwhile, the research objectives have also shifted from generating smooth and coherent sentences to infusing personalized traits that enrich the diversity of newly generated content. With the rapid development of text generation solutions, a comprehensive survey is urgently needed to summarize the achievements and track the state of the art. In this survey paper, we present a general systematic framework, illustrate the widely used models, and summarize the classic applications of text generation.
Comment: Accepted by IEEE UIC 201
GlyphCRM: Bidirectional Encoder Representation for Chinese Character with its Glyph
Previous works indicate that the glyphs of Chinese characters contain rich semantic information and have the potential to enhance the representation of Chinese characters. The typical way to utilize glyph features is to incorporate them into the character embedding space. Inspired by previous methods, we propose a Chinese pre-trained representation model named GlyphCRM, which abandons ID-based character embeddings and is instead based solely on sequential character images. We render each character into a binary grayscale image and design two-channel position feature maps for it. Formally, we first design a two-layer residual convolutional neural network, named HanGlyph, to generate the initial glyph representation of Chinese characters, and subsequently adopt multiple bidirectional Transformer encoder blocks as the superstructure to capture context-sensitive information. Meanwhile, we feed the glyph features extracted from each layer of the HanGlyph module into the underlying Transformer blocks via skip connections to fully exploit the glyph features of Chinese characters. As the HanGlyph module can obtain a sufficient glyph representation of any Chinese character, the long-standing out-of-vocabulary problem is effectively solved. Extensive experimental results indicate that GlyphCRM substantially outperforms the previous BERT-based state-of-the-art model on 9 fine-tuning tasks, and that it transfers and generalizes well to specialized fields and low-resource tasks. We hope this work will spark further research beyond the realms of the well-established representation of Chinese texts.
Comment: 11 pages, 7 figures
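The rendering step the abstract describes is easy to reproduce in spirit; the sketch below rasterizes one character to a binarized grayscale array with Pillow. The font path, image size, and threshold are assumptions (any CJK-capable font will do), and the two-channel position maps are omitted.

```python
import numpy as np
from PIL import Image, ImageDraw, ImageFont

def render_glyph(char, size=24, font_path="NotoSansCJK-Regular.ttc"):
    """Rasterize a single Chinese character to a binary grayscale array."""
    img = Image.new("L", (size, size), color=0)        # black canvas
    draw = ImageDraw.Draw(img)
    font = ImageFont.truetype(font_path, size - 4)     # assumed CJK font on disk
    draw.text((2, 0), char, fill=255, font=font)       # white glyph
    arr = np.asarray(img, dtype=np.float32) / 255.0
    return (arr > 0.5).astype(np.float32)              # binarize

glyph = render_glyph("汉")                             # (24, 24) array of 0s and 1s
```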
Natural Language Inference over Interaction Space
The Natural Language Inference (NLI) task requires an agent to determine the logical relationship between a natural language premise and a natural language hypothesis. We introduce the Interactive Inference Network (IIN), a novel class of neural network architectures that achieves a high-level understanding of the sentence pair by hierarchically extracting semantic features from the interaction space. We show that an interaction tensor (attention weights) contains semantic information sufficient to solve natural language inference, and that a denser interaction tensor contains richer semantic information. One instance of this architecture, the Densely Interactive Inference Network (DIIN), demonstrates state-of-the-art performance on large-scale NLI corpora and a large-scale NLI-like corpus. Notably, DIIN achieves a greater than 20% error reduction on the challenging Multi-Genre NLI (MultiNLI) dataset with respect to the strongest published system.
Comment: 15 pages, 2 figures. Under review as an ICLR proceeding; published at the Sixth International Conference on Learning Representations, ICLR 2018
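The interaction tensor at the heart of this line of work is simple to write down: every premise word meets every hypothesis word under some interaction function. The sketch below uses an elementwise product, one common choice; shapes and names are illustrative rather than DIIN's exact configuration.

```python
import numpy as np

def interaction_tensor(premise_vecs, hypothesis_vecs):
    """Dense word-by-word interaction for a sentence pair:
    (p, d) x (h, d) -> (p, h, d), with elementwise product as the
    interaction function."""
    return premise_vecs[:, None, :] * hypothesis_vecs[None, :, :]

p = np.random.randn(7, 4)        # 7 premise words, dim 4
h = np.random.randn(5, 4)        # 5 hypothesis words
t = interaction_tensor(p, h)
assert t.shape == (7, 5, 4)      # one d-dim interaction per word pair
```

A convolutional feature extractor can then treat this tensor like an image, which is how such architectures hierarchically extract semantic features from the interaction space.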
Corpus analysis without prior linguistic knowledge - unsupervised mining of phrases and subphrase structure
When looking at the structure of natural language, "phrases" and "words" are central notions. We consider the problem of identifying such "meaningful subparts" of language of any length, and their underlying composition principles, in a completely corpus-based and language-independent way, without using any kind of prior linguistic knowledge. Unsupervised methods for identifying "phrases", mining subphrase structure, and finding words in a fully automated way are described. This can be considered a step towards automatically computing a "general dictionary and grammar of the corpus". We hope that in the long run variants of our approach will turn out to be useful for other kinds of sequence data as well, such as speech, genome sequences, or music annotation. Even though we are not primarily interested in immediate applications, results obtained for a variety of languages show that our methods are interesting for many practical tasks in text mining, terminology extraction and lexicography, search engine technology, and related fields.
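In the same corpus-only spirit (though not the specific algorithm of the paper above), a minimal phrase miner can score adjacent word pairs by how much more often they co-occur than chance; the thresholds and the PMI-style score here are assumptions for illustration.

```python
from collections import Counter

def mine_bigram_phrases(tokens, min_count=5, threshold=2.0):
    """Keep frequent adjacent pairs that co-occur far more often than chance
    (a pointwise-mutual-information-style ratio)."""
    n = len(tokens)
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    phrases = []
    for (a, b), c in bigrams.items():
        if c < min_count:
            continue
        score = c * n / (unigrams[a] * unigrams[b])   # ~ P(a,b) / (P(a) * P(b))
        if score > threshold:
            phrases.append(((a, b), score))
    return sorted(phrases, key=lambda x: -x[1])
```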
Towards a Robust Deep Neural Network in Texts: A Survey
Deep neural networks (DNNs) have achieved remarkable success in various tasks (e.g., image classification, speech recognition, and natural language processing). However, research has shown that DNN models are vulnerable to adversarial examples, which cause incorrect predictions by adding imperceptible perturbations to normal inputs. Adversarial examples in the image domain have been well investigated, but research in the text domain is still insufficient, and there is no comprehensive survey of the field. In this paper, we aim to present a comprehensive understanding of adversarial attacks and the corresponding mitigation strategies in texts. Specifically, we first give a taxonomy of adversarial attacks and defenses in texts from the perspective of different natural language processing (NLP) tasks, and then introduce how to build a robust DNN model via testing and verification. Finally, we discuss the existing challenges of adversarial attacks and defenses in texts and present future research directions in this emerging field.
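For a flavor of the attacks such a survey covers, the toy sketch below swaps one word at a time for a synonym and returns the first substitution that flips a classifier's prediction; `predict` and `synonyms` are assumed user-supplied, and real attacks add semantic and fluency constraints to keep the perturbation imperceptible.

```python
def synonym_swap_attack(tokens, predict, synonyms):
    """Single-substitution attack: return a perturbed token list that flips
    the model's prediction, or None if no one-word swap succeeds.
    `predict` maps a token list to a label; `synonyms` maps word -> list."""
    original = predict(tokens)
    adv = list(tokens)
    for i, w in enumerate(tokens):
        for s in synonyms.get(w, []):
            adv[i] = s
            if predict(adv) != original:   # prediction flipped: attack found
                return adv
        adv[i] = w                         # revert and try the next position
    return None
```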