Learning Neural Templates for Text Generation
While neural, encoder-decoder models have had significant empirical success
in text generation, there remain several unaddressed problems with this style
of generation. Encoder-decoder models are largely (a) uninterpretable, and (b)
difficult to control in terms of their phrasing or content. This work proposes
a neural generation system using a hidden semi-markov model (HSMM) decoder,
which learns latent, discrete templates jointly with learning to generate. We
show that this model learns useful templates, and that these templates make
generation both more interpretable and controllable. Furthermore, we show that
this approach scales to real data sets and achieves strong performance nearing
that of encoder-decoder text generation models.
Comment: EMNLP 2018; purity calculations update
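The HSMM decoder is the core of the approach. As a rough illustration, the sketch below runs semi-Markov Viterbi decoding over toy scores: each latent state emits an entire segment, so the recovered segmentation doubles as a discrete template. The dimensions and random scores are stand-ins, not the paper's learned neural potentials.

```python
# A minimal sketch (not the authors' code) of hidden semi-Markov Viterbi
# decoding: each latent state emits a whole segment, so the best segmentation
# acts as a discrete template. Scores are random stand-ins for neural potentials.
import numpy as np

def hsmm_viterbi(T, K, L, seg_score, trans):
    """T: sequence length, K: states, L: max segment length.
    seg_score[k, s, e]: score of state k emitting tokens s..e-1.
    trans[k, k2]: transition score from state k to k2."""
    NEG = -1e9
    best = np.full((T + 1, K), NEG)   # best[t, k]: best score ending at t in state k
    back = {}                          # backpointers: (t, k) -> (prev_t, prev_k)
    best[0, :] = 0.0
    for t in range(1, T + 1):
        for k in range(K):
            for l in range(1, min(L, t) + 1):
                s = t - l
                for kp in range(K):
                    prev = best[s, kp] + (0.0 if s == 0 else trans[kp, k])
                    cand = prev + seg_score[k, s, t]
                    if cand > best[t, k]:
                        best[t, k] = cand
                        back[(t, k)] = (s, kp)
    # Trace back the best segmentation: a list of (state, start, end) segments.
    k = int(np.argmax(best[T]))
    t, segments = T, []
    while t > 0:
        s, kp = back[(t, k)]
        segments.append((k, s, t))
        t, k = s, kp
    return list(reversed(segments))

rng = np.random.default_rng(0)
T, K, L = 8, 3, 4
print(hsmm_viterbi(T, K, L, rng.normal(size=(K, T + 1, T + 1)), rng.normal(size=(K, K))))
```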
Turkish PoS Tagging by Reducing Sparsity with Morpheme Tags in Small Datasets
Sparsity is one of the major problems in natural language processing. The
problem becomes even more severe in agglutinative languages, which are highly
inflected. We deal with sparsity in Turkish by adopting
morphological features for part-of-speech tagging. We learn inflectional and
derivational morpheme tags in Turkish by using conditional random fields (CRF)
and we employ the morpheme tags in part-of-speech (PoS) tagging by using hidden
Markov models (HMMs) to mitigate sparsity. Results show that using morpheme
tags in PoS tagging helps alleviate the sparsity in emission probabilities. Our
model outperforms other hidden Markov model based PoS tagging models for small
training datasets in Turkish. We obtain an accuracy of 94.1% in morpheme
tagging and 89.2% in PoS tagging on a 5K training dataset.
Comment: 13 pages, accepted and presented at the 17th International Conference on Intelligent Text Processing and Computational Linguistics (CICLing)
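As a rough illustration of how morpheme tags can relieve emission sparsity, the sketch below (an assumption about the mechanism, not the paper's model) backs off from word emission counts to morpheme-tag emission counts when a word is unseen for a given PoS tag; the morpheme tags would come from the CRF tagger.

```python
# A minimal sketch of the backoff idea: when a word is unseen for a PoS tag,
# fall back to the emission probability of its (CRF-predicted) morpheme tag,
# which is far less sparse. Example words and tags are hypothetical.
from collections import Counter, defaultdict

class BackoffEmission:
    def __init__(self):
        self.word_counts = defaultdict(Counter)      # pos -> word counts
        self.morph_counts = defaultdict(Counter)     # pos -> morpheme-tag counts

    def add(self, pos, word, morph_tag):
        self.word_counts[pos][word] += 1
        self.morph_counts[pos][morph_tag] += 1

    def prob(self, pos, word, morph_tag):
        wc = self.word_counts[pos]
        if word in wc:
            return wc[word] / sum(wc.values())
        mc = self.morph_counts[pos]                  # back off to morpheme tag
        return mc.get(morph_tag, 0) / max(sum(mc.values()), 1)

em = BackoffEmission()
em.add("Noun", "evler", "A3pl")
em.add("Noun", "kitaplar", "A3pl")
print(em.prob("Noun", "arabalar", "A3pl"))   # unseen word, nonzero via backoff
```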
Joint Energy-based Detection and Classification of Multilingual Text Lines
This paper proposes a new hierarchical MDL-based model for a joint detection
and classification of multilingual text lines in images taken by hand-held
cameras. The majority of related text detection methods assume alphabet-based
writing in a single language, e.g. in Latin. They use simple clustering
heuristics specific to such texts: proximity between letters within one line,
larger distance between separate lines, etc. We are interested in a
significantly more ambiguous problem where images combine alphabet and
logographic characters from multiple languages and typographic rules vary a lot
(e.g. English, Korean, and Chinese). Complexity of detecting and classifying
text lines in multiple languages calls for a more principled approach based on
information-theoretic principles. Our new MDL model includes data costs
combining geometric errors with classification likelihoods and a hierarchical
sparsity term based on label costs. This energy model can be efficiently
minimized by fusion moves. We demonstrate robustness of the proposed algorithm
on a large new database of multilingual text images collected in the public
transit system of Seoul.
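The energy described combines a data term (geometric fit plus classification likelihood) with a label cost that keeps the set of active text-line models sparse. The toy sketch below only evaluates such an energy; fusion-move minimization is not shown, and all names and constants are illustrative assumptions rather than the paper's values.

```python
# A minimal sketch of an MDL-style labeling energy of the kind described:
# data cost (point-to-line distance minus classification log-likelihood)
# plus a constant label cost for every text-line model actually used.
import math

def energy(assignment, points, lines, class_ll, label_cost=5.0):
    """assignment[i]: line-model index for point i, or None for outlier.
    lines[j]: (a, b, c) for line a*x + b*y + c = 0.
    class_ll[j]: log-likelihood of the script class assigned to line j."""
    data = 0.0
    used = set()
    for i, j in enumerate(assignment):
        if j is None:
            data += 10.0                      # fixed outlier penalty (assumption)
            continue
        a, b, c = lines[j]
        x, y = points[i]
        geom = abs(a * x + b * y + c) / math.hypot(a, b)   # point-to-line distance
        data += geom - class_ll[j]
        used.add(j)
    return data + label_cost * len(used)      # sparsity: pay per active label

pts = [(0, 0), (1, 0.1), (2, -0.1), (5, 5)]
print(energy([0, 0, 0, None], pts, [(0, 1, 0)], class_ll=[-0.2]))
```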
Deep Cascade Multi-task Learning for Slot Filling in Online Shopping Assistant
Slot filling is a critical task in natural language understanding (NLU) for
dialog systems. State-of-the-art approaches treat it as a sequence labeling
problem and adopt such models as BiLSTM-CRF. While these models work relatively
well on standard benchmark datasets, they face challenges in the context of
E-commerce where the slot labels are more informative and carry richer
expressions. In this work, inspired by the unique structure of the E-commerce
knowledge base, we propose a novel multi-task model with cascade and residual
connections, which jointly learns segment tagging, named entity tagging and
slot filling. Experiments show the effectiveness of the proposed cascade and
residual structures. Our model has a 14.6% advantage in F1 score over the
strong baseline methods on a new Chinese E-commerce shopping assistant dataset,
while achieving competitive accuracies on a standard dataset. Furthermore, an
online test deployed on this dominant E-commerce platform shows a 130%
improvement in the accuracy of understanding user utterances. Our model has
already gone into production on the E-commerce platform.
Comment: AAAI 201
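A rough PyTorch sketch of the cascade-and-residual idea follows. It is an assumption about the architecture, not the released model: a shared BiLSTM encoder whose segment-tagging logits feed the named-entity head, whose logits in turn feed the slot-filling head through residual additions.

```python
# A minimal sketch (not the production model) of cascade multi-task tagging:
# shared BiLSTM encoder; segment tagging -> NER -> slot filling, each stage
# adding a projection of the previous stage's logits back into the features.
import torch
import torch.nn as nn

class CascadeTagger(nn.Module):
    def __init__(self, vocab, dim, hidden, n_seg, n_ner, n_slot):
        super().__init__()
        self.emb = nn.Embedding(vocab, dim)
        self.enc = nn.LSTM(dim, hidden, batch_first=True, bidirectional=True)
        feat = 2 * hidden
        self.seg_head = nn.Linear(feat, n_seg)
        self.seg_proj = nn.Linear(n_seg, feat)     # feed segment logits forward
        self.ner_head = nn.Linear(feat, n_ner)
        self.ner_proj = nn.Linear(n_ner, feat)
        self.slot_head = nn.Linear(feat, n_slot)

    def forward(self, tokens):
        h, _ = self.enc(self.emb(tokens))          # (B, T, 2*hidden)
        seg = self.seg_head(h)
        h_ner = h + self.seg_proj(seg)             # residual cascade
        ner = self.ner_head(h_ner)
        h_slot = h_ner + self.ner_proj(ner)
        return seg, ner, self.slot_head(h_slot)

model = CascadeTagger(vocab=1000, dim=64, hidden=128, n_seg=3, n_ner=9, n_slot=20)
seg, ner, slot = model(torch.randint(0, 1000, (2, 12)))
print(seg.shape, ner.shape, slot.shape)
```

Training such a model would simply sum a per-token cross-entropy loss over the three heads.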
Character Feature Engineering for Japanese Word Segmentation
On word segmentation problems, machine learning architecture engineering
often draws attention. The problem representation itself, however, has remained
almost static as either word lattice ranking or character sequence tagging, for
at least two decades. The latter often shows stronger predictive power than the
former on the out-of-vocabulary (OOV) issue. When the issue escalates to rapid
adaptation, which is a common scenario for industrial applications, active
learning of partial annotations or re-training with additional lexical resources
is usually applied, however, from a somewhat word-based perspective. Not only is
it difficult for end-users to comply with linguistically consistent word boundary
decisions, but the risk and cost of permanently forking models with estimated
weights are also seldom affordable. To overcome the obstacle, this work provides
an alternative, which uses linguistic intuition about character compositions,
such that a sophisticated feature set and its derived scheme can enable dynamic
lexicon expansion with the model remaining intact. Experiment results suggest
that the proposed solution, with or without external lexemes, performs
competitively in terms of F1 score and OOV recall across various datasets.
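One way to realize "dynamic lexicon expansion with the model remaining intact" is to compute character-composition and lexicon-match features at prediction time, so that adding lexemes changes feature values rather than learned weights. The sketch below is an illustrative assumption about such a feature set, not the paper's exact scheme.

```python
# A minimal sketch of segmentation features built from character composition
# (script type) and from lexicon lookups at prediction time, so new lexemes can
# be added without retraining. Feature names and the lexicon are illustrative.
import unicodedata

def char_type(ch):
    name = unicodedata.name(ch, "")
    if "HIRAGANA" in name:
        return "HIRA"
    if "KATAKANA" in name:
        return "KATA"
    if "CJK UNIFIED" in name:
        return "KANJI"
    return "OTHER"

def features(sent, i, lexicon):
    f = {
        "type": char_type(sent[i]),
        "type_prev": char_type(sent[i - 1]) if i > 0 else "BOS",
        "type_change": i > 0 and char_type(sent[i]) != char_type(sent[i - 1]),
    }
    # Lexicon features: does any dictionary word start or end at position i?
    for n in (2, 3, 4):
        f[f"lex_starts_{n}"] = sent[i:i + n] in lexicon
        f[f"lex_ends_{n}"] = sent[max(0, i - n + 1):i + 1] in lexicon
    return f

lex = {"東京", "大学"}          # expanding this set changes features, not weights
print(features("東京大学に行く", 2, lex))
```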
Incorporating Dictionaries into Deep Neural Networks for the Chinese Clinical Named Entity Recognition
Clinical Named Entity Recognition (CNER) aims to identify and classify
clinical terms such as diseases, symptoms, treatments, exams, and body parts in
electronic health records, which is a fundamental and crucial task for clinical
and translational research. In recent years, deep neural networks have achieved
significant success in named entity recognition and many other Natural Language
Processing (NLP) tasks. Most of these algorithms are trained end to end, and
can automatically learn features from large scale labeled datasets. However,
these data-driven methods typically lack the capability of processing rare or
unseen entities. Previous statistical methods and feature engineering practice
have demonstrated that human knowledge can provide valuable information for
handling rare and unseen cases. In this paper, we address the problem by
incorporating dictionaries into deep neural networks for the Chinese CNER task.
Two different architectures that extend the Bi-directional Long Short-Term
Memory (Bi-LSTM) neural network and five different feature representation
schemes are proposed to handle the task. Computational results on the CCKS-2017
Task 2 benchmark dataset show that the proposed method achieves highly
competitive performance compared with state-of-the-art deep learning methods.
Comment: 21 pages, 6 figures
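A common way to inject a dictionary into a Bi-LSTM tagger is to attach per-character match indicators to the character embeddings. The sketch below shows one plausible BMES-style variant; it is an assumption for illustration, not the paper's exact feature representation scheme.

```python
# A minimal sketch of dictionary features for character-level tagging:
# for each character, a 4-dim indicator [B, M, E, S] saying whether it begins,
# is inside, ends, or singly forms a dictionary entry. Entries are hypothetical.
def bmes_dict_features(sent, dictionary, max_len=6):
    feats = [[0, 0, 0, 0] for _ in sent]
    for i in range(len(sent)):
        for n in range(1, max_len + 1):
            if i + n <= len(sent) and sent[i:i + n] in dictionary:
                if n == 1:
                    feats[i][3] = 1                      # S: single-char entry
                else:
                    feats[i][0] = 1                      # B: entry begins here
                    feats[i + n - 1][2] = 1              # E: entry ends here
                    for j in range(i + 1, i + n - 1):
                        feats[j][1] = 1                  # M: inside an entry
    return feats

clinical_dict = {"糖尿病", "血糖"}        # hypothetical clinical dictionary
sent = "患者有糖尿病史"
for ch, f in zip(sent, bmes_dict_features(sent, clinical_dict)):
    print(ch, f)   # these vectors would be concatenated with char embeddings
```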
Medical Knowledge Embedding Based on Recursive Neural Network for Multi-Disease Diagnosis
The representation of knowledge based on first-order logic captures the
richness of natural language and supports multiple probabilistic inference
models. Although symbolic representation enables quantitative reasoning with
statistical probability, it is difficult to utilize with machine learning
models as they perform numerical operations. In contrast, knowledge embedding
(i.e., high-dimensional and continuous vectors) is a feasible approach to
complex reasoning that can not only retain the semantic information of
knowledge but also establish quantifiable relationships among knowledge items. In this
paper, we propose recursive neural knowledge network (RNKN), which combines
medical knowledge based on first-order logic with recursive neural network for
multi-disease diagnosis. After RNKN is efficiently trained from manually
annotated Chinese Electronic Medical Records (CEMRs), diagnosis-oriented
knowledge embeddings and weight matrices are learned. Experimental results
verify that the diagnostic accuracy of RNKN is superior to that of some
classical machine learning models and Markov logic network (MLN). The results
also demonstrate that the more explicit the evidence extracted from CEMRs is,
the better the performance achieved. RNKN gradually exhibits the
interpretability of its knowledge embeddings as the number of training epochs
increases.
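The recursive composition at the heart of RNKN can be illustrated roughly as follows: fact embeddings are combined bottom-up along a first-order-logic structure by a shared weight matrix. The sketch is a simplified assumption; the dimensions, symbols, and example rule are illustrative, not drawn from the annotated CEMRs.

```python
# A minimal NumPy sketch of recursive composition over a logic structure:
# leaf embeddings are combined pairwise by a shared weight matrix, bottom-up.
import numpy as np

rng = np.random.default_rng(0)
DIM = 16
W = rng.normal(scale=0.1, size=(DIM, 2 * DIM))   # shared composition weights
b = np.zeros(DIM)
entity_emb = {s: rng.normal(scale=0.1, size=DIM)
              for s in ["Fever", "Cough", "Pneumonia"]}

def compose(node):
    """node: a symbol string (leaf) or ('AND'/'IMPLIES', left, right)."""
    if isinstance(node, str):
        return entity_emb[node]
    _, left, right = node
    children = np.concatenate([compose(left), compose(right)])
    return np.tanh(W @ children + b)             # recursive composition step

# Hypothetical rule: (Fever AND Cough) IMPLIES Pneumonia
vec = compose(("IMPLIES", ("AND", "Fever", "Cough"), "Pneumonia"))
print(vec.shape)   # a diagnosis classifier would be trained on top of vec
```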
Recurrent Neural Network Method in Arabic Words Recognition System
The recognition of unconstrained handwriting continues to be a difficult task
for computers despite active research for several decades. This is because
handwritten text offers great challenges such as character and word
segmentation, character recognition, variation between handwriting styles,
different character sizes and the absence of font constraints, as well as
varying background clarity. This paper primarily discusses online handwriting
recognition methods for Arabic words, which are widely used across the Middle
East and North Africa. Because Arabic words are written with connected
characters, segmenting an Arabic word is very difficult. We introduce a
recurrent neural network for online handwritten Arabic word recognition. The key
innovation is a recently introduced recurrent neural network objective function
known as connectionist temporal classification. The system consists of an
advanced recurrent neural network with an output layer designed for sequence
labeling, partially combined with a probabilistic language model. Experimental
results show that unconstrained Arabic words achieve recognition rates of about
79%, which is significantly higher than the roughly 70% of a previously
developed hidden Markov model based recognition system.
Comment: 6 Pages, 5 Figures, Vol. 3, Issue 11, pages 43-4
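The connectionist temporal classification (CTC) objective mentioned above can be attached to a recurrent network as in the PyTorch sketch below. The feature dimensions, alphabet size, and random inputs are assumptions for illustration, not the paper's setup.

```python
# A minimal PyTorch sketch of training with the CTC objective: a BiLSTM over
# pen-trajectory features emits per-frame character probabilities, and CTC
# aligns them with the unsegmented target word.
import torch
import torch.nn as nn

n_feats, hidden, n_chars = 10, 64, 30            # 30 = character set incl. blank
rnn = nn.LSTM(n_feats, hidden, batch_first=True, bidirectional=True)
proj = nn.Linear(2 * hidden, n_chars)
ctc = nn.CTCLoss(blank=0, zero_infinity=True)

x = torch.randn(4, 120, n_feats)                 # 4 trajectories, 120 frames each
h, _ = rnn(x)
log_probs = proj(h).log_softmax(-1).transpose(0, 1)   # CTC expects (T, B, C)
targets = torch.randint(1, n_chars, (4, 7))      # unaligned character labels
input_lens = torch.full((4,), 120, dtype=torch.long)
target_lens = torch.full((4,), 7, dtype=torch.long)
loss = ctc(log_probs, targets, input_lens, target_lens)
loss.backward()                                  # trainable end to end
print(float(loss))
```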
The NLP Engine: A Universal Turing Machine for NLP
It is commonly accepted that machine translation is a more complex task than
part of speech tagging. But how much more complex? In this paper we make an
attempt to develop a general framework and methodology for computing the
informational and/or processing complexity of NLP applications and tasks. We
define a universal framework akin to a Turing Machine that attempts to fit
(most) NLP tasks into one paradigm. We calculate the complexities of various
NLP tasks using measures of Shannon Entropy, and compare `simple' ones such as
part of speech tagging to `complex' ones such as machine translation. This
paper provides a first, though far from perfect, attempt to quantify NLP tasks
under a uniform paradigm. We point out current deficiencies and suggest some
avenues for fruitful research.
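One concrete ingredient such a framework needs is an entropy estimate of each task's output space. The toy sketch below contrasts a PoS tag distribution with a large target-word vocabulary; the counts are made up for illustration, not taken from the paper's corpora.

```python
# A minimal sketch of Shannon entropy over a task's output distribution,
# as a proxy for the informational complexity of producing that output.
import math
from collections import Counter

def shannon_entropy(counts):
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values() if c)

pos_tags = Counter({"NOUN": 300, "VERB": 200, "DET": 150, "ADJ": 100, "OTHER": 250})
target_words = Counter({f"w{i}": 1 for i in range(5000)})   # near-uniform vocabulary

print("PoS output entropy (bits): %.2f" % shannon_entropy(pos_tags))
print("MT output entropy  (bits): %.2f" % shannon_entropy(target_words))
```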
Adversarial Generation of Training Examples: Applications to Moving Vehicle License Plate Recognition
Generative Adversarial Networks (GAN) have attracted much research attention
recently, leading to impressive results for natural image generation. However,
to date little success has been observed in using GAN-generated images to improve
classification tasks. Here we attempt to explore, in the context of car license
plate recognition, whether it is possible to generate synthetic training data
using GAN to improve recognition accuracy. With a carefully-designed pipeline,
we show that the answer is affirmative. First, a large-scale image set is
generated using the generator of GAN, without manual annotation. Then, these
images are fed to a deep convolutional neural network (DCNN) followed by a
bidirectional recurrent neural network (BRNN) with long short-term memory
(LSTM), which performs the feature learning and sequence labelling. Finally,
the pre-trained model is fine-tuned on real images. Our experimental results on
a few datasets demonstrate the effectiveness of using GAN images: an improvement
of 7.5% over a strong baseline when only moderate-sized real data are available.
We show that the proposed framework achieves competitive recognition accuracy on
challenging test datasets. We also leverage depthwise separable convolutions to
construct a lightweight convolutional RNN, which is about half the size and
about 2x faster on CPU. Combining this framework and the proposed pipeline, we
make progress in performing accurate recognition on mobile and embedded devices.
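The depthwise separable convolution used for the lightweight recognizer factors a standard convolution into a per-channel (grouped) convolution followed by a 1x1 pointwise convolution. The PyTorch sketch below compares parameter counts; the channel sizes and input shape are illustrative, not the paper's configuration.

```python
# A minimal sketch of a depthwise separable convolution: depthwise (grouped)
# 3x3 conv followed by a 1x1 pointwise conv, giving roughly an 8x parameter
# reduction at these channel sizes while keeping the same output shape.
import torch
import torch.nn as nn

def separable_conv(in_ch, out_ch, k=3):
    return nn.Sequential(
        nn.Conv2d(in_ch, in_ch, k, padding=k // 2, groups=in_ch),  # depthwise
        nn.Conv2d(in_ch, out_ch, 1),                               # pointwise
    )

standard = nn.Conv2d(64, 128, 3, padding=1)
light = separable_conv(64, 128)
params = lambda m: sum(p.numel() for p in m.parameters())
print(params(standard), params(light))          # separable uses far fewer weights

x = torch.randn(1, 64, 32, 100)                 # e.g. a license-plate feature map
print(standard(x).shape, light(x).shape)        # identical output shapes
```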