Joint morphological-lexical language modeling for processing morphologically rich languages with application to dialectal Arabic
Language modeling for an inflected language
such as Arabic poses new challenges for speech recognition and
machine translation due to its rich morphology. Rich morphology
results in large increases in out-of-vocabulary (OOV) rate and
poor language model parameter estimation in the absence of large
quantities of data. In this study, we present a joint
morphological-lexical language model (JMLLM) that takes
advantage of Arabic morphology. JMLLM combines
morphological segments with the underlying lexical items and
additional available information sources regarding morphological
segments and lexical items in a single joint model.
Joint representation and modeling of morphological and lexical
items reduces the OOV rate and provides smooth probability
estimates while keeping the predictive power of whole words.
Speech recognition and machine translation experiments in
dialectal Arabic show improvements over word- and morpheme-based
trigram language models. We also show that as the tightness of
integration between the different information sources increases, the
performance of both speech recognition and machine translation improves.
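As a rough illustration of why the morphological decomposition matters (a minimal sketch with a made-up segmenter and toy data, not the JMLLM model itself), compare the OOV rate of a word-level vocabulary against a morpheme-level vocabulary built from the same training text:

```python
# Minimal sketch: OOV rate with word-level vs. morpheme-level vocabularies.
# The toy "segment" function is a stand-in for a real Arabic morphological
# analyzer; it is NOT the segmentation used by JMLLM.

def segment(word):
    # Hypothetical prefix/stem split (e.g. "wa+kitab" -> ["wa+", "kitab"]).
    if "+" in word:
        prefix, stem = word.split("+", 1)
        return [prefix + "+", stem]
    return [word]

train = ["wa+kitab", "kitab", "al+bayt", "bayt"]
test = ["wa+bayt", "al+kitab", "madrasa"]  # unseen surface forms

word_vocab = set(train)
morph_vocab = {m for w in train for m in segment(w)}

word_oov = sum(w not in word_vocab for w in test) / len(test)
morph_oov = sum(
    any(m not in morph_vocab for m in segment(w)) for w in test
) / len(test)

print(f"word-level OOV rate:     {word_oov:.2f}")   # 1.00 - every test word is unseen
print(f"morpheme-level OOV rate: {morph_oov:.2f}")  # 0.33 - only "madrasa" is unseen
```

In this toy data every held-out surface form is out of vocabulary at the word level, while most of its morphemes have already been seen in training; that vocabulary reduction is the effect the joint model exploits.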
Learning Language from a Large (Unannotated) Corpus
A novel approach to the fully automated, unsupervised extraction of
dependency grammars and associated syntax-to-semantic-relationship mappings
from large text corpora is described. The suggested approach builds on the
authors' prior work with the Link Grammar, RelEx and OpenCog systems, as well
as on a number of prior papers and approaches from the statistical language
learning literature. If successful, this approach would enable the mining of
all the information needed to power a natural language comprehension and
generation system, directly from a large, unannotated corpus.
Comment: 29 pages, 5 figures, research proposal
Empirical Risk Minimization for Probabilistic Grammars: Sample Complexity and Hardness of Learning
Probabilistic grammars are generative statistical models that are useful for compositional and sequential structures. They are used ubiquitously in computational linguistics. We present a framework, reminiscent of structural risk minimization, for empirical risk minimization of probabilistic grammars using the log-loss. We derive sample complexity bounds in this framework that apply both to the supervised setting and the unsupervised setting. By making assumptions about the underlying distribution that are appropriate for natural language scenarios, we are able to derive distribution-dependent sample complexity bounds for probabilistic grammars. We also give simple algorithms for carrying out empirical risk minimization using this framework in both the supervised and unsupervised settings. In the unsupervised case, we show that the problem of minimizing empirical risk is NP-hard. We therefore suggest an approximate algorithm, similar to expectation-maximization, to minimize the empirical risk.

Learning from data is central to contemporary computational linguistics. It is common in such learning to estimate a model in a parametric family using the maximum likelihood principle. This principle applies in the supervised case (i.e., using annotated data).
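In generic notation (a sketch for orientation, not necessarily the paper's exact formulation), empirical risk minimization with the log-loss chooses

\[
\hat{\theta}_{\mathrm{sup}} = \arg\min_{\theta}\ \frac{1}{n}\sum_{i=1}^{n} -\log p_{\theta}(x_i, z_i)
\qquad\text{and}\qquad
\hat{\theta}_{\mathrm{unsup}} = \arg\min_{\theta}\ \frac{1}{n}\sum_{i=1}^{n} -\log \sum_{z \in \mathcal{Z}(x_i)} p_{\theta}(x_i, z),
\]

where $x_i$ is the $i$-th training sentence, $z_i$ its observed derivation in the supervised setting, and $\mathcal{Z}(x_i)$ the set of derivations of $x_i$ under the grammar; the second, marginalized objective is the one whose minimization the paper shows to be NP-hard.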
A Tutorial on the Expectation-Maximization Algorithm Including Maximum-Likelihood Estimation and EM Training of Probabilistic Context-Free Grammars
The paper gives a brief review of the expectation-maximization algorithm
(Dempster et al., 1977) in the comprehensible framework of discrete mathematics. In
Section 2, two prominent estimation methods, relative-frequency estimation
and maximum-likelihood estimation, are presented. Section 3 is dedicated to
the expectation-maximization algorithm and a simpler variant, the generalized
expectation-maximization algorithm. In Section 4, two loaded dice are rolled. A
more interesting example is presented in Section 5: The estimation of
probabilistic context-free grammars.
Comment: Presented at the 15th European Summer School in Logic, Language and Information (ESSLLI 2003). Example 5 extended (and partially corrected).
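For concreteness, the following is a minimal EM sketch for a two-component mixture of loaded dice, where the identity of the die behind each roll is hidden; it follows the standard mixture-model recipe and is not necessarily the exact construction used in Section 4 of the tutorial.

```python
# Minimal EM sketch: mixture of two loaded six-sided dice.
# Each observed roll comes from one hidden die; EM estimates the mixing
# weights and each die's face probabilities. Illustrative only.
import numpy as np

rng = np.random.default_rng(0)
true_faces = np.array([[0.5, 0.1, 0.1, 0.1, 0.1, 0.1],   # die A loaded toward face 1
                       [0.1, 0.1, 0.1, 0.1, 0.1, 0.5]])  # die B loaded toward face 6
true_mix = np.array([0.6, 0.4])
which = rng.choice(2, size=2000, p=true_mix)
rolls = np.array([rng.choice(6, p=true_faces[k]) for k in which])

# Random initialization of the parameters.
mix = np.array([0.5, 0.5])
faces = rng.dirichlet(np.ones(6), size=2)

for _ in range(100):
    # E-step: posterior responsibility of each die for each roll.
    lik = faces[:, rolls] * mix[:, None]          # shape (2, n)
    resp = lik / lik.sum(axis=0, keepdims=True)
    # M-step: re-estimate mixing weights and face probabilities.
    mix = resp.sum(axis=1) / resp.shape[1]
    for f in range(6):
        faces[:, f] = resp[:, rolls == f].sum(axis=1)
    faces /= faces.sum(axis=1, keepdims=True)

print("estimated mixing weights:", np.round(mix, 2))
print("estimated face probabilities:\n", np.round(faces, 2))
```

Each iteration provably does not decrease the data log-likelihood, which is the property the tutorial derives for (generalized) EM.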
Attentive Tensor Product Learning
This paper proposes a new architecture - Attentive Tensor Product Learning
(ATPL) - to represent grammatical structures in deep learning models. ATPL
bridges the gap between deep learning and explicit symbolic structure by
exploiting Tensor Product Representations (TPR), a structured
neural-symbolic model developed in
cognitive science, aiming to integrate deep learning with explicit language
structures and rules. The key ideas of ATPL are: 1) unsupervised learning of
role-unbinding vectors of words via a TPR-based deep neural network; 2) employing
attention modules to compute TPR; and 3) integration of TPR with typical deep
learning architectures including Long Short-Term Memory (LSTM) and Feedforward
Neural Network (FFNN). The novelty of our approach lies in its ability to
extract the grammatical structure of a sentence by using role-unbinding
vectors, which are obtained in an unsupervised manner. This ATPL approach is
applied to 1) image captioning, 2) part of speech (POS) tagging, and 3)
constituency parsing of a sentence. Experimental results demonstrate the
effectiveness of the proposed approach.
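To make the role-unbinding idea concrete, here is a minimal NumPy sketch of the underlying Tensor Product Representation operations (binding by outer product, unbinding by multiplication with an orthonormal role vector); the random vectors below are stand-ins for ATPL's learned fillers and roles, which the paper instead obtains with attention modules and a TPR-based network.

```python
# Minimal sketch of Tensor Product Representation binding/unbinding.
# A structure is encoded as T = sum_i filler_i (outer) role_i; with
# (near-)orthonormal role vectors, filler_i is recovered as T @ role_i.
# Random vectors are illustrative stand-ins, not ATPL's learned ones.
import numpy as np

rng = np.random.default_rng(0)
d_f, d_r, n = 8, 4, 3                      # filler dim, role dim, #bindings

fillers = rng.standard_normal((n, d_f))    # e.g. word embeddings
roles = np.linalg.qr(rng.standard_normal((d_r, d_r)))[0][:n]  # orthonormal roles

# Bind each filler to its role and superpose the bindings.
T = sum(np.outer(f, r) for f, r in zip(fillers, roles))  # shape (d_f, d_r)

# Unbind: recover the filler bound to role 1.
recovered = T @ roles[1]
print(np.allclose(recovered, fillers[1]))  # True (exact with orthonormal roles)
```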