487 research outputs found
GPstruct: Bayesian structured prediction using Gaussian processes
We introduce a conceptually novel structured prediction model, GPstruct, which is kernelized, non-parametric and Bayesian, by design. We motivate the model with respect to existing approaches, among others, conditional random fields (CRFs), maximum margin Markov networks (M ^3 N), and structured support vector machines (SVMstruct), which embody only a subset of its properties. We present an inference procedure based on Markov Chain Monte Carlo. The framework can be instantiated for a wide range of structured objects such as linear chains, trees, grids, and other general graphs. As a proof of concept, the model is benchmarked on several natural language processing tasks and a video gesture segmentation task involving a linear chain structure. We show prediction accuracies for GPstruct which are comparable to or exceeding those of CRFs and SVMstruct
Bayesian Structured Prediction Using Gaussian Processes
We introduce a conceptually novel structured prediction model, GPstruct,
which is kernelized, non-parametric and Bayesian, by design. We motivate the
model with respect to existing approaches, among others, conditional random
fields (CRFs), maximum margin Markov networks (M3N), and structured support
vector machines (SVMstruct), which embody only a subset of its properties. We
present an inference procedure based on Markov Chain Monte Carlo. The framework
can be instantiated for a wide range of structured objects such as linear
chains, trees, grids, and other general graphs. As a proof of concept, the
model is benchmarked on several natural language processing tasks and a video
gesture segmentation task involving a linear chain structure. We show
prediction accuracies for GPstruct which are comparable to or exceeding those
of CRFs and SVMstruct.This is the accepted manuscript version. The final version is available from IEEE at http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=6942234
Gaussian Process Pseudo-Likelihood Models for Sequence Labeling
Several machine learning problems arising in natural language processing can
be modeled as a sequence labeling problem. We provide Gaussian process models
based on pseudo-likelihood approximation to perform sequence labeling. Gaussian
processes (GPs) provide a Bayesian approach to learning in a kernel based
framework. The pseudo-likelihood model enables one to capture long range
dependencies among the output components of the sequence without becoming
computationally intractable. We use an efficient variational Gaussian
approximation method to perform inference in the proposed model. We also
provide an iterative algorithm which can effectively make use of the
information from the neighboring labels to perform prediction. The ability to
capture long range dependencies makes the proposed approach useful for a wide
range of sequence labeling problems. Numerical experiments on some sequence
labeling data sets demonstrate the usefulness of the proposed approach.Comment: 18 pages, 5 figure
Word alignment and smoothing methods in statistical machine translation: Noise, prior knowledge and overfitting
This thesis discusses how to incorporate linguistic knowledge into an SMT system. Although one important category of linguistic knowledge is that obtained by a constituent / dependency parser, a POS / super tagger, and a morphological analyser, linguistic knowledge here includes larger domains than this: Multi-Word Expressions, Out-Of-Vocabulary words, paraphrases, lexical semantics (or non-literal translations), named-entities, coreferences, and transliterations. The first discussion is about word alignment where we propose a MWE-sensitive word aligner. The second discussion is about the smoothing methods for a language model and a translation model where we propose a hierarchical Pitman-Yor process-based smoothing method. The common grounds for these discussion are the examination of three exceptional cases from real-world data: the presence
of noise, the availability of prior knowledge, and the problem of underfitting. Notable characteristics of this design are the careful usage of (Bayesian) priors in order that it can capture both frequent and linguistically important phenomena. This can be considered to provide one example to solve the problems of statistical models which often aim to learn from frequent examples only, and often overlook less frequent but linguistically important phenomena
Climbing the tower of babel: Unsupervised multilingual learning
For centuries, scholars have explored the deep
links among human languages. In this paper,
we present a class of probabilistic models
that use these links as a form of naturally
occurring supervision. These models allow
us to substantially improve performance for
core text processing tasks, such as morphological
segmentation, part-of-speech tagging,
and syntactic parsing. Besides these traditional
NLP tasks, we also present a multilingual
model for the computational decipherment
of lost languages
How tight is your language? A semantic typology based on Mutual Information
Languages differ in the degree of semantic flexibility of their syntactic roles. For example, Eng- lish and Indonesian are considered more flexible with regard to the semantics of subjects, whereas German and Japanese are less flexible. In Hawkins’ classification, more flexible lan- guages are said to have a loose fit, and less flexible ones are those that have a tight fit. This classification has been based on manual inspection of example sentences. The present paper proposes a new, quantitative approach to deriving the measures of looseness and tightness from corpora. We use corpora of online news from the Leipzig Corpora Collection in thirty typolog- ically and genealogically diverse languages and parse them syntactically with the help of the Universal Dependencies annotation software. Next, we compute Mutual Information scores for each language using the matrices of lexical lemmas and four syntactic dependencies (intransi- tive subjects, transitive subject, objects and obliques). The new approach allows us not only to reproduce the results of previous investigations, but also to extend the typology to new lan- guages. We also demonstrate that verb-final languages tend to have a tighter relationship be- tween lexemes and syntactic roles, which helps language users to recognize thematic roles early during comprehension
Japanese/English Cross-Language Information Retrieval: Exploration of Query Translation and Transliteration
Cross-language information retrieval (CLIR), where queries and documents are
in different languages, has of late become one of the major topics within the
information retrieval community. This paper proposes a Japanese/English CLIR
system, where we combine a query translation and retrieval modules. We
currently target the retrieval of technical documents, and therefore the
performance of our system is highly dependent on the quality of the translation
of technical terms. However, the technical term translation is still
problematic in that technical terms are often compound words, and thus new
terms are progressively created by combining existing base words. In addition,
Japanese often represents loanwords based on its special phonogram.
Consequently, existing dictionaries find it difficult to achieve sufficient
coverage. To counter the first problem, we produce a Japanese/English
dictionary for base words, and translate compound words on a word-by-word
basis. We also use a probabilistic method to resolve translation ambiguity. For
the second problem, we use a transliteration method, which corresponds words
unlisted in the base word dictionary to their phonetic equivalents in the
target language. We evaluate our system using a test collection for CLIR, and
show that both the compound word translation and transliteration methods
improve the system performance
Recommended from our members
The role of HG in the analysis of temporal iteration and interaural correlation
- …