89,330 research outputs found
Discourse Structure in Machine Translation Evaluation
In this article, we explore the potential of using sentence-level discourse
structure for machine translation evaluation. We first design discourse-aware
similarity measures, which use all-subtree kernels to compare discourse parse
trees in accordance with the Rhetorical Structure Theory (RST). Then, we show
that a simple linear combination with these measures can help improve various
existing machine translation evaluation metrics regarding correlation with
human judgments both at the segment- and at the system-level. This suggests
that discourse information is complementary to the information used by many of
the existing evaluation metrics, and thus it could be taken into account when
developing richer evaluation metrics, such as the WMT-14 winning combined
metric DiscoTKparty. We also provide a detailed analysis of the relevance of
various discourse elements and relations from the RST parse trees for machine
translation evaluation. In particular we show that: (i) all aspects of the RST
tree are relevant, (ii) nuclearity is more useful than relation type, and (iii)
the similarity of the translation RST tree to the reference tree is positively
correlated with translation quality.Comment: machine translation, machine translation evaluation, discourse
analysis. Computational Linguistics, 201
Learning Discriminative Stein Kernel for SPD Matrices and Its Applications
Stein kernel has recently shown promising performance on classifying images
represented by symmetric positive definite (SPD) matrices. It evaluates the
similarity between two SPD matrices through their eigenvalues. In this paper,
we argue that directly using the original eigenvalues may be problematic
because: i) Eigenvalue estimation becomes biased when the number of samples is
inadequate, which may lead to unreliable kernel evaluation; ii) More
importantly, eigenvalues only reflect the property of an individual SPD matrix.
They are not necessarily optimal for computing Stein kernel when the goal is to
discriminate different sets of SPD matrices. To address the two issues in one
shot, we propose a discriminative Stein kernel, in which an extra parameter
vector is defined to adjust the eigenvalues of the input SPD matrices. The
optimal parameter values are sought by optimizing a proxy of classification
performance. To show the generality of the proposed method, three different
kernel learning criteria that are commonly used in the literature are employed
respectively as a proxy. A comprehensive experimental study is conducted on a
variety of image classification tasks to compare our proposed discriminative
Stein kernel with the original Stein kernel and other commonly used methods for
evaluating the similarity between SPD matrices. The experimental results
demonstrate that, the discriminative Stein kernel can attain greater
discrimination and better align with classification tasks by altering the
eigenvalues. This makes it produce higher classification performance than the
original Stein kernel and other commonly used methods.Comment: 13 page
Higher-Order Momentum Distributions and Locally Affine LDDMM Registration
To achieve sparse parametrizations that allows intuitive analysis, we aim to
represent deformation with a basis containing interpretable elements, and we
wish to use elements that have the description capacity to represent the
deformation compactly. To accomplish this, we introduce in this paper
higher-order momentum distributions in the LDDMM registration framework. While
the zeroth order moments previously used in LDDMM only describe local
displacement, the first-order momenta that are proposed here represent a basis
that allows local description of affine transformations and subsequent compact
description of non-translational movement in a globally non-rigid deformation.
The resulting representation contains directly interpretable information from
both mathematical and modeling perspectives. We develop the mathematical
construction of the registration framework with higher-order momenta, we show
the implications for sparse image registration and deformation description, and
we provide examples of how the parametrization enables registration with a very
low number of parameters. The capacity and interpretability of the
parametrization using higher-order momenta lead to natural modeling of
articulated movement, and the method promises to be useful for quantifying
ventricle expansion and progressing atrophy during Alzheimer's disease
On the Effect of Semantically Enriched Context Models on Software Modularization
Many of the existing approaches for program comprehension rely on the
linguistic information found in source code, such as identifier names and
comments. Semantic clustering is one such technique for modularization of the
system that relies on the informal semantics of the program, encoded in the
vocabulary used in the source code. Treating the source code as a collection of
tokens loses the semantic information embedded within the identifiers. We try
to overcome this problem by introducing context models for source code
identifiers to obtain a semantic kernel, which can be used for both deriving
the topics that run through the system as well as their clustering. In the
first model, we abstract an identifier to its type representation and build on
this notion of context to construct contextual vector representation of the
source code. The second notion of context is defined based on the flow of data
between identifiers to represent a module as a dependency graph where the nodes
correspond to identifiers and the edges represent the data dependencies between
pairs of identifiers. We have applied our approach to 10 medium-sized open
source Java projects, and show that by introducing contexts for identifiers,
the quality of the modularization of the software systems is improved. Both of
the context models give results that are superior to the plain vector
representation of documents. In some cases, the authoritativeness of
decompositions is improved by 67%. Furthermore, a more detailed evaluation of
our approach on JEdit, an open source editor, demonstrates that inferred topics
through performing topic analysis on the contextual representations are more
meaningful compared to the plain representation of the documents. The proposed
approach in introducing a context model for source code identifiers paves the
way for building tools that support developers in program comprehension tasks
such as application and domain concept location, software modularization and
topic analysis
Large-Margin Determinantal Point Processes
Determinantal point processes (DPPs) offer a powerful approach to modeling
diversity in many applications where the goal is to select a diverse subset. We
study the problem of learning the parameters (the kernel matrix) of a DPP from
labeled training data. We make two contributions. First, we show how to
reparameterize a DPP's kernel matrix with multiple kernel functions, thus
enhancing modeling flexibility. Second, we propose a novel parameter estimation
technique based on the principle of large margin separation. In contrast to the
state-of-the-art method of maximum likelihood estimation, our large-margin loss
function explicitly models errors in selecting the target subsets, and it can
be customized to trade off different types of errors (precision vs. recall).
Extensive empirical studies validate our contributions, including applications
on challenging document and video summarization, where flexibility in modeling
the kernel matrix and balancing different errors is indispensable.Comment: 15 page
Time series kernel similarities for predicting Paroxysmal Atrial Fibrillation from ECGs
We tackle the problem of classifying Electrocardiography (ECG) signals with
the aim of predicting the onset of Paroxysmal Atrial Fibrillation (PAF). Atrial
fibrillation is the most common type of arrhythmia, but in many cases PAF
episodes are asymptomatic. Therefore, in order to help diagnosing PAF, it is
important to design procedures for detecting and, more importantly, predicting
PAF episodes. We propose a method for predicting PAF events whose first step
consists of a feature extraction procedure that represents each ECG as a
multi-variate time series. Successively, we design a classification framework
based on kernel similarities for multi-variate time series, capable of handling
missing data. We consider different approaches to perform classification in the
original space of the multi-variate time series and in an embedding space,
defined by the kernel similarity measure. We achieve a classification accuracy
comparable with state of the art methods, with the additional advantage of
detecting the PAF onset up to 15 minutes in advance
- …