Learning the Curriculum with Bayesian Optimization for Task-Specific Word Representation Learning

Author: Dyer Chris
Faruqui Manaal
Ling Wang
MacWhinney Brian
Tsvetkov Yulia
Publication date: 21/06/2016

We use Bayesian optimization to learn curricula for word representation learning, optimizing performance on downstream tasks that depend on the learned representations as features. The curricula are modeled by a linear ranking function which is the scalar product of a learned weight vector and an engineered feature vector that characterizes the different aspects of the complexity of each instance in the training corpus. We show that learning the curriculum improves performance on a variety of downstream tasks over random orders and in comparison to the natural corpus order. Comment: In proceedings of ACL 2016, 10 page

arXiv.org e-Print Archive

Bilingual Terminology Extraction Using Multi-level Termhood

Author: Wu Dan
Zhang Chengzhi
Publication venue: 'Emerald'
Publication date: 18/02/2013

Purpose: Terminology is the set of technical words or expressions used in specific contexts; it denotes the core concepts of a formal discipline and is applied in fields such as machine translation, information retrieval, information extraction and text categorization. Bilingual terminology extraction plays an important role in bilingual dictionary compilation, bilingual ontology construction, machine translation and cross-language information retrieval. This paper addresses monolingual terminology extraction and bilingual term alignment based on multi-level termhood. Design/methodology/approach: A method based on multi-level termhood is proposed. The new method computes the termhood of each terminology candidate, as well as of the sentence that contains it, by corpus comparison. Since terminology and general words usually have different distributions in a corpus, termhood can also be used to constrain and enhance term alignment when aligning bilingual terms on a parallel corpus. In this paper, bilingual term alignment based on termhood constraints is presented. Findings: Experimental results show that multi-level termhood performs better than existing methods for terminology extraction. If termhood is used as a constraint factor, the performance of bilingual term alignment can be improved

arXiv.org e-Print Archive

Offline Arabic Handwriting Recognition Using Artificial Neural Network

Author: Alanazi Hamdan. O.
Alnaqeib Rami
Jalab Hamid. A.
Zaidan A. A
Zaidan B. B
Publication date: 14/06/2010

The aim of a character recognition system is to transform a text document typed on paper into a digital format that can be manipulated by word processor software. Arabic has unique features that other languages do not have, and seven or eight languages, such as Urdu, Jawi and Persian, are written in scripts derived from it. Arabic has twenty-eight letters, each of which can be linked in three different ways or written separately, depending on the case. The difficulty of Arabic handwriting recognition is that the accuracy of character recognition affects the accuracy of word recognition; in addition, each character has two or three forms. The suggested solution, based on an artificial neural network, can solve this problem and overcome the difficulty of Arabic handwriting recognition. Comment: Submitted to Journal of Computer Science and Engineering, see http://sites.google.com/site/jcseuk/volume-1-issue-1-may-201

arXiv.org e-Print Archive

A Study of Sindhi Related and Arabic Script Adapted languages Recognition

Author: Bhatti Zeeshan
Hakro Dil Nawaz
Moja G. N.
Talib A. Z.
Publication date: 13/12/2014

A large number of publications are available on Optical Character Recognition (OCR). Significant research and many articles exist for the Latin, Chinese and Japanese scripts. Arabic script is also mature from an OCR perspective. The adapted languages that share the Arabic script or its extended characters, however, still lack OCR systems. In this paper we present the efforts of researchers on Arabic and its related and adapted languages. The survey is organized in sections: the introduction is followed by the properties of the Sindhi language, and then the OCR process, techniques and methods used by various researchers are presented. The last section is dedicated to future work and the conclusion. Comment: 11 pages, 8 Figures, Sindh Univ. Res. Jour. (Sci. Ser.

arXiv.org e-Print Archive

Enhancing Chinese Intent Classification by Dynamically Integrating Character Features into Word Embeddings with Ensemble Techniques

Author: Costello Charles
Jankowski Charles
Lin Ruixi
Publication date: 22/05/2018

Intent classification has been widely researched on English data with deep learning approaches that are based on neural networks and word embeddings. The challenge for Chinese intent classification stems from the fact that, unlike English where most words are made up of 26 phonologic alphabet letters, Chinese is logographic, where a Chinese character is a more basic semantic unit that can be informative and its meaning does not vary too much in contexts. Chinese word embeddings alone can be inadequate for representing words, and pre-trained embeddings can suffer from not aligning well with the task at hand. To account for the inadequacy and leverage Chinese character information, we propose a low-effort and generic way to dynamically integrate character embedding based feature maps with word embedding based inputs, whose resulting word-character embeddings are stacked with a contextual information extraction module to further incorporate context information for predictions. On top of the proposed model, we employ an ensemble method to combine single models and obtain the final result. The approach is data-independent without relying on external sources like pre-trained word embeddings. The proposed model outperforms baseline models and existing methods

arXiv.org e-Print Archive

Adaptive Scaling for Sparse Detection in Information Extraction

Author: Han Xianpei
Lin Hongyu
Lu Yaojie
Sun Le
Publication date: 28/05/2018

This paper focuses on detection tasks in information extraction, where positive instances are sparsely distributed and models are usually evaluated using F-measure on positive classes. These characteristics often result in deficient performance of neural network based detection models. In this paper, we propose adaptive scaling, an algorithm which can handle the positive sparsity problem and directly optimize over F-measure via dynamic cost-sensitive learning. To this end, we borrow the idea of marginal utility from economics and propose a theoretical framework for instance importance measuring without introducing any additional hyper-parameters. Experiments show that our algorithm leads to a more effective and stable training of neural network based detection models. Comment: Accepted to ACL201

arXiv.org e-Print Archive

Named Entity Recognition with stack residual LSTM and trainable bias decoding

Author: MacKinlay Andrew
Tran Quan
Yepes Antonio Jimeno
Publication date: 11/07/2017

Recurrent Neural Network models are the state-of-the-art for Named Entity Recognition (NER). We present two innovations to improve the performance of these models. The first innovation is the introduction of residual connections between the Stacked Recurrent Neural Network model to address the degradation problem of deep neural networks. The second innovation is a bias decoding mechanism that allows the trained system to adapt to non-differentiable and externally computed objectives, such as the entity-based F-measure. Our work improves the state-of-the-art results for both Spanish and English languages on the standard train/development/test split of the CoNLL 2003 Shared Task NER dataset

arXiv.org e-Print Archive

Static and Dynamic Feature Selection in Morphosyntactic Analyzers

Author: Ballesteros Miguel
Bohnet Bernd
McDonald Ryan
Nivre Joakim
Publication date: 21/03/2016

We study the use of greedy feature selection methods for morphosyntactic tagging under a number of different conditions. We compare a static ordering of features to a dynamic ordering based on mutual information statistics, and we apply the techniques to standalone taggers as well as joint systems for tagging and parsing. Experiments on five languages show that feature selection can result in more compact models as well as higher accuracy under all conditions, but also that a dynamic ordering works better than a static ordering and that joint systems benefit more than standalone taggers. We also show that the same techniques can be used to select which morphosyntactic categories to predict in order to maximize syntactic accuracy in a joint system. Our final results represent a substantial improvement of the state of the art for several languages, while at the same time reducing both the number of features and the running time by up to 80% in some cases

arXiv.org e-Print Archive

Comparing Convolutional Neural Networks to Traditional Models for Slot Filling

Author: Adel Heike
Roth Benjamin
Schütze Hinrich
Publication date: 04/04/2016

We address relation classification in the context of slot filling, the task of finding and evaluating fillers like "Steve Jobs" for the slot X in "X founded Apple". We propose a convolutional neural network which splits the input sentence into three parts according to the relation arguments and compare it to state-of-the-art and traditional approaches of relation classification. Finally, we combine different methods and show that the combination is better than individual approaches. We also analyze the effect of genre differences on performance. Comment: NAACL 201

arXiv.org e-Print Archive

Structure Regularized Bidirectional Recurrent Convolutional Neural Network for Relation Classification

Author: Wen Ji
Publication date: 06/11/2017

Relation classification is an important semantic processing task in the field of natural language processing (NLP). In this paper, we present a novel model, Structure Regularized Bidirectional Recurrent Convolutional Neural Network (SR-BRCNN), to classify the relation of two entities in a sentence, and a new dataset of Chinese Sanwen for named entity recognition and relation classification. Some state-of-the-art systems concentrate on modeling the shortest dependency path (SDP) between two entities, leveraging convolutional or recurrent neural networks. We further explore how to make full use of the dependency relation information in the SDP and how to improve the model by the method of structure regularization. We propose a structure regularized model to learn relation representations along the SDP extracted from the forest formed by the structure regularized dependency tree, which helps reduce the complexity of the whole model and improve the $F_{1}$ score by 10.3. Experimental results show that our method outperforms the state-of-the-art approaches on the Chinese Sanwen task and performs as well on the SemEval-2010 Task 8 dataset. (The Chinese Sanwen corpus this paper developed and used will be released in the future.) Comment: arXiv admin note: text overlap with arXiv:1411.6243 by other author

arXiv.org e-Print Archive