Learning the Curriculum with Bayesian Optimization for Task-Specific Word Representation Learning
We use Bayesian optimization to learn curricula for word representation
learning, optimizing performance on downstream tasks that depend on the learned
representations as features. The curricula are modeled by a linear ranking
function which is the scalar product of a learned weight vector and an
engineered feature vector that characterizes the different aspects of the
complexity of each instance in the training corpus. We show that learning the
curriculum improves performance on a variety of downstream tasks over random
orders and in comparison to the natural corpus order.
Comment: In proceedings of ACL 2016, 10 pages
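The ranking idea can be sketched minimally: each instance is scored by the scalar product of a weight vector and its complexity features, and the corpus is sorted by that score. The feature names and weights below are illustrative, not taken from the paper.

```python
# Hypothetical sketch of curriculum ranking by a linear scoring function.

def rank_curriculum(instances, weights):
    """Order instances by the scalar product of a learned weight
    vector and each instance's complexity feature vector."""
    def score(features):
        return sum(w * f for w, f in zip(weights, features))
    return sorted(instances, key=lambda inst: score(inst["features"]))

# Toy corpus: features could encode e.g. sentence length, rare-word rate.
corpus = [
    {"id": "a", "features": [0.9, 0.2]},
    {"id": "b", "features": [0.1, 0.1]},
    {"id": "c", "features": [0.5, 0.8]},
]
order = [inst["id"] for inst in rank_curriculum(corpus, weights=[1.0, 0.5])]
```

Bayesian optimization would then search over the weight vector, using downstream task performance as the objective.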
Bilingual Terminology Extraction Using Multi-level Termhood
Purpose: Terminology is the set of technical words or expressions used in
specific contexts; it denotes the core concepts of a formal discipline and is
widely applied in machine translation, information retrieval, information
extraction and text categorization. Bilingual terminology extraction plays an
important role in bilingual dictionary compilation, bilingual ontology
construction, machine translation and cross-language information retrieval.
This paper addresses the issues of monolingual terminology extraction and
bilingual term alignment based on multi-level termhood.
Design/methodology/approach: A method based on multi-level termhood is
proposed. The new method computes the termhood of each terminology candidate,
as well as of the sentence that contains it, by comparing corpora. Since
terminologies and general words usually have different distributions in a
corpus, termhood can also be used to constrain and enhance term alignment when
aligning bilingual terms on a parallel corpus. In this paper, bilingual term
alignment based on termhood constraints is presented.
Findings: Experimental results show that multi-level termhood achieves better
performance than existing methods for terminology extraction. When termhood is
used as a constraining factor, the performance of bilingual term alignment is
also improved.
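A single-level sketch of corpus-comparison termhood follows; the relative-frequency ratio used here is a common illustrative choice, and the paper's multi-level combination of term-level and sentence-level scores is not reproduced.

```python
from collections import Counter

def termhood(candidate, domain_tokens, background_tokens):
    """Corpus-comparison termhood: how much more frequent a candidate
    is in a domain corpus than in a general background corpus.
    Single-level sketch only, not the paper's multi-level measure."""
    dom = Counter(domain_tokens)
    bg = Counter(background_tokens)
    p_dom = dom[candidate] / max(len(domain_tokens), 1)
    p_bg = bg[candidate] / max(len(background_tokens), 1)
    return p_dom / (p_bg + 1e-9)  # epsilon avoids division by zero

domain = "ontology alignment maps ontology concepts across ontology versions".split()
background = "the cat sat on the mat and the dog sat too".split()
```

A genuine term like "ontology" scores far above 1, while a word absent from the domain corpus scores 0.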
Offline Arabic Handwriting Recognition Using Artificial Neural Network
The ambition of a character recognition system is to transform a text
document typed on paper into a digital format that can be manipulated by
word-processing software. Unlike other languages, Arabic has unique features
that other languages lack; its script is also shared, with adaptations, by
seven or eight other languages, such as Urdu, Jawi and Persian. Arabic has
twenty-eight letters, each of which can be linked in three different ways or
written separately, depending on the case. The difficulty of Arabic
handwriting recognition is that the accuracy of character recognition affects
the accuracy of word recognition; in addition, there are two or three extra
forms for each character. The suggested solution, using an artificial neural
network, can overcome these difficulties of Arabic handwriting recognition.
Comment: Submitted to Journal of Computer Science and Engineering, see
http://sites.google.com/site/jcseuk/volume-1-issue-1-may-201
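One source of the difficulty described above is that each letter's glyph depends on its position in the word. The sketch below is illustrative only, not part of the paper's system; it labels each letter with its contextual form, ignoring hamza variants and ligatures.

```python
# Illustrative only: Arabic letters take a different glyph depending on
# position (isolated, initial, medial, final), which is one reason Arabic
# recognition is harder than for Latin script.

NON_JOINING_AFTER = set("ادذرزو")  # letters that never connect to the next one

def positional_forms(word):
    """Label each letter of an Arabic word with its contextual form."""
    forms = []
    joined_before = False  # does the previous letter connect forward?
    for i, ch in enumerate(word):
        joins_after = ch not in NON_JOINING_AFTER and i < len(word) - 1
        if joined_before and joins_after:
            forms.append("medial")
        elif joined_before:
            forms.append("final")
        elif joins_after:
            forms.append("initial")
        else:
            forms.append("isolated")
        joined_before = joins_after
    return forms
```

For example, the three letters of "كتب" appear in initial, medial and final form, while "دار" is written as three disconnected glyphs.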
A Study of Sindhi Related and Arabic Script Adapted languages Recognition
A large number of publications are available on Optical Character
Recognition (OCR). Significant research, as well as many articles, exists for
the Latin, Chinese and Japanese scripts, and Arabic is also a mature script
from the OCR perspective. However, the adapted languages that share the
Arabic script or its extended characters still lack OCRs. In this paper we
survey the efforts of researchers on Arabic and its related and adapted
languages. The survey is organized into sections: the introduction is
followed by the properties of the Sindhi language; OCR process techniques and
methods used by various researchers are then presented; and the last section
is dedicated to future work and the conclusion.
Comment: 11 pages, 8 Figures, Sindh Univ. Res. Jour. (Sci. Ser.
Enhancing Chinese Intent Classification by Dynamically Integrating Character Features into Word Embeddings with Ensemble Techniques
Intent classification has been widely researched on English data with deep
learning approaches that are based on neural networks and word embeddings. The
challenge for Chinese intent classification stems from the fact that, unlike
English, where most words are built from 26 alphabet letters, Chinese is
logographic: a Chinese character is a more basic semantic unit that can be
informative, and its meaning does not vary much across contexts. Chinese
word embeddings alone can be inadequate for representing words, and
pre-trained embeddings can suffer from not aligning well with the task at
hand. To account for this inadequacy and leverage Chinese character
information, we propose a low-effort and generic way to dynamically integrate
character-embedding-based feature maps with word-embedding-based inputs; the
resulting word-character embeddings are stacked with a contextual information
extraction module to further incorporate context information for predictions.
On top of the proposed model, we employ an ensemble method to combine single
models and obtain the final result. The approach is data-independent, without
relying on external sources like pre-trained word embeddings. The proposed
model outperforms baseline models and existing methods.
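The integration step can be sketched under strong simplifications: a word's representation is its word vector concatenated with a feature derived from its characters. A plain character-vector average stands in for the paper's convolutional feature maps, and the toy vectors are invented for illustration.

```python
# Simplified sketch: concatenate a word vector with a character-derived
# feature (here an average; the paper uses CNN feature maps instead).

def word_char_embedding(word, word_vecs, char_vecs):
    wv = word_vecs[word]
    # Average the character vectors dimension by dimension.
    char_part = [sum(col) / len(word) for col in zip(*(char_vecs[c] for c in word))]
    return wv + char_part  # list concatenation of the two views

word_vecs = {"电脑": [0.2, 0.8]}
char_vecs = {"电": [1.0, 0.0], "脑": [0.0, 1.0]}
vec = word_char_embedding("电脑", word_vecs, char_vecs)
```

The resulting vector carries both the word-level and character-level views, which a downstream contextual module can then consume.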
Adaptive Scaling for Sparse Detection in Information Extraction
This paper focuses on detection tasks in information extraction, where
positive instances are sparsely distributed and models are usually evaluated
using F-measure on positive classes. These characteristics often result in
deficient performance of neural network based detection models. In this paper,
we propose adaptive scaling, an algorithm which can handle the positive
sparsity problem and directly optimize over F-measure via dynamic
cost-sensitive learning. To this end, we borrow the idea of marginal utility
from economics and propose a theoretical framework for instance importance
measuring without introducing any additional hyper-parameters. Experiments show
that our algorithm leads to a more effective and stable training of neural
network based detection models.
Comment: Accepted to ACL201
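The flavour of the approach can be illustrated with a hypothetical cost-sensitive weight; this is not the paper's marginal-utility derivation, only a sketch of the idea that negatives are down-weighted while recall on the sparse positive class is low.

```python
# Hypothetical cost-sensitive reweighting in the spirit of adaptive scaling,
# not the paper's exact formula.

def negative_weight(true_pos, false_neg):
    """Weight applied to negative-instance loss: low recall on the
    positive class shrinks it, so sparse positives are not swamped.
    A constant 1.0 would recover plain accuracy-style training."""
    recall = true_pos / max(true_pos + false_neg, 1)
    return recall  # value in [0, 1]
```

The weight would be recomputed from running statistics during training, making the cost-sensitivity dynamic rather than a fixed hyper-parameter.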
Named Entity Recognition with stack residual LSTM and trainable bias decoding
Recurrent Neural Network models are the state-of-the-art for Named Entity
Recognition (NER). We present two innovations to improve the performance of
these models. The first innovation is the introduction of residual connections
between the layers of the stacked Recurrent Neural Network model to address
the degradation
problem of deep neural networks. The second innovation is a bias decoding
mechanism that allows the trained system to adapt to non-differentiable and
externally computed objectives, such as the entity-based F-measure. Our work
improves the state-of-the-art results for both Spanish and English languages on
the standard train/development/test split of the CoNLL 2003 Shared Task NER
dataset.
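The first innovation can be sketched abstractly: with residual connections, each stacked layer outputs layer(x) + x, so deep stacks do not degrade. Placeholder functions stand in for the LSTM layers below.

```python
# Sketch of residual connections between stacked layers (not an LSTM):
# each layer's output is added elementwise to its input.

def stack_residual(layers, x):
    for layer in layers:
        x = [a + b for a, b in zip(layer(x), x)]
    return x

double = lambda v: [2 * a for a in v]       # stand-in for a trained layer
out = stack_residual([double, double], [1.0, 2.0])
```

Because the identity path is always present, gradients can flow through the stack even when individual layers contribute little.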
Static and Dynamic Feature Selection in Morphosyntactic Analyzers
We study the use of greedy feature selection methods for morphosyntactic
tagging under a number of different conditions. We compare a static ordering of
features to a dynamic ordering based on mutual information statistics, and we
apply the techniques to standalone taggers as well as joint systems for tagging
and parsing. Experiments on five languages show that feature selection can
result in more compact models as well as higher accuracy under all conditions,
but also that a dynamic ordering works better than a static ordering and that
joint systems benefit more than standalone taggers. We also show that the same
techniques can be used to select which morphosyntactic categories to predict in
order to maximize syntactic accuracy in a joint system. Our final results
represent a substantial improvement of the state of the art for several
languages, while at the same time reducing both the number of features and the
running time by up to 80% in some cases.
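Greedy forward selection of the kind studied here can be sketched as follows, with a toy scoring function standing in for tagger accuracy; the feature names are invented for illustration.

```python
# Hedged sketch of greedy forward feature selection: repeatedly add the
# feature whose inclusion most improves the score, stop when nothing helps.

def greedy_select(features, score):
    selected, best = [], score([])
    while True:
        gains = [(score(selected + [f]), f) for f in features if f not in selected]
        if not gains:
            break
        top_score, top_f = max(gains)
        if top_score <= best:       # no candidate improves the score
            break
        selected.append(top_f)
        best = top_score
    return selected

useful = {"suffix", "capitalized"}
# Reward useful features, penalize model size (stand-in for accuracy).
toy_score = lambda s: len(set(s) & useful) - 0.1 * len(s)
chosen = greedy_select(["suffix", "capitalized", "noise"], toy_score)
```

A dynamic ordering, as in the paper, would additionally reorder the candidate features by mutual information statistics before this loop.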
Comparing Convolutional Neural Networks to Traditional Models for Slot Filling
We address relation classification in the context of slot filling, the task
of finding and evaluating fillers like "Steve Jobs" for the slot X in "X
founded Apple". We propose a convolutional neural network which splits the
input sentence into three parts according to the relation arguments and compare
it to state-of-the-art and traditional approaches of relation classification.
Finally, we combine different methods and show that the combination is better
than individual approaches. We also analyze the effect of genre differences on
performance.
Comment: NAACL 201
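The input splitting can be sketched as follows; tokenisation is naive and the example arguments are single tokens, unlike the multi-word "Steve Jobs" above.

```python
# Sketch of splitting a sentence into three parts around the two relation
# arguments, each of which the paper's CNN then convolves separately.

def split_by_arguments(tokens, arg1, arg2):
    i, j = tokens.index(arg1), tokens.index(arg2)
    i, j = min(i, j), max(i, j)
    # Left context up to arg1, span between the arguments, right context.
    return tokens[: i + 1], tokens[i : j + 1], tokens[j:]

left, middle, right = split_by_arguments(
    "Steve founded Apple in 1976".split(), "Steve", "Apple"
)
```

Including the argument tokens in each part keeps the relation anchors visible to every convolution.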
Structure Regularized Bidirectional Recurrent Convolutional Neural Network for Relation Classification
Relation classification is an important semantic processing task in the field
of natural language processing (NLP). In this paper, we present a novel
model, the Structure Regularized Bidirectional Recurrent Convolutional Neural
Network (SR-BRCNN), to classify the relation between two entities in a
sentence, together with a new Chinese Sanwen dataset for named entity
recognition and relation classification. Some state-of-the-art systems
concentrate on modeling the shortest dependency path (SDP) between two
entities, leveraging convolutional or recurrent neural networks. We further
explore how to make full use of the dependency relation information in the
SDP and how to improve the model through structure regularization. We propose
a structure regularized model that learns relation representations along the
SDP extracted from the forest formed by the structure regularized dependency
tree, which reduces the complexity of the whole model and helps improve the
score by 10.3. Experimental results show that our method outperforms
state-of-the-art approaches on the Chinese Sanwen task and performs as well
on the SemEval-2010 Task 8 dataset. (The Chinese Sanwen corpus developed and
used in this paper will be released in the future.)
Comment: arXiv admin note: text overlap with arXiv:1411.6243 by other authors
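One preprocessing step the abstract relies on, extracting the shortest dependency path between two entities, can be sketched as a breadth-first search over the dependency edges. The toy tree below is invented for illustration, not produced by the paper's parser.

```python
from collections import deque

def shortest_dependency_path(edges, source, target):
    """BFS over an undirected view of the dependency edges, returning the
    token sequence on the shortest path between the two entities."""
    graph = {}
    for head, dep in edges:
        graph.setdefault(head, []).append(dep)
        graph.setdefault(dep, []).append(head)
    queue, seen = deque([[source]]), {source}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for nxt in graph.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None

# Toy tree for "The company acquired the startup".
edges = [("acquired", "company"), ("acquired", "startup"),
         ("company", "The"), ("startup", "the")]
sdp = shortest_dependency_path(edges, "company", "startup")
```

Models like SR-BRCNN then learn relation representations along this path rather than over the full sentence.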