Integrating Weakly Supervised Word Sense Disambiguation into Neural Machine Translation
This paper demonstrates that word sense disambiguation (WSD) can improve
neural machine translation (NMT) by widening the source context considered when
modeling the senses of potentially ambiguous words. We first introduce three
adaptive clustering algorithms for WSD, based on k-means, Chinese restaurant
processes, and random walks, which are then applied to large word contexts
represented in a low-rank space and evaluated on SemEval shared-task data. We
then learn word vectors jointly with sense vectors defined by our best WSD
method, within a state-of-the-art NMT system. We show that the concatenation of
these vectors, and the use of a sense selection mechanism based on the weighted
average of sense vectors, outperforms several baselines including sense-aware
ones. This is demonstrated by translation on five language pairs. The
improvements are above one BLEU point over strong NMT baselines, +4% accuracy
over all ambiguous nouns and verbs, or +20% when scored manually over several
challenging words.
Comment: To appear in TACL
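To make the sense selection mechanism concrete, here is a minimal NumPy sketch of one plausible reading of it: attention-style weights over a word's sense vectors, scored against an averaged context vector, with the weighted average concatenated to the word vector. The function names, dimensions, and dot-product scoring are illustrative assumptions, not the paper's exact formulation.

import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def sense_aware_embedding(word_vec, sense_vecs, context_vec):
    """Weighted average of sense vectors, concatenated with the word vector.

    word_vec:    (d,) base word embedding
    sense_vecs:  (k, d) one vector per induced sense of the word
    context_vec: (d,) e.g. the average of surrounding word embeddings
    """
    # Attention-style weights: how well each sense matches the context.
    scores = sense_vecs @ context_vec        # (k,)
    weights = softmax(scores)                # (k,), sums to 1
    sense_avg = weights @ sense_vecs         # (d,) weighted average of senses
    # Concatenation of word and sense vectors, as fed to the NMT encoder.
    return np.concatenate([word_vec, sense_avg])   # (2d,)

# Toy usage: a word with 3 induced senses in a 4-dimensional space.
rng = np.random.default_rng(0)
emb = sense_aware_embedding(rng.normal(size=4),
                            rng.normal(size=(3, 4)),
                            rng.normal(size=4))
print(emb.shape)   # (8,)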
Guaranteed Non-Orthogonal Tensor Decomposition via Alternating Rank-1 Updates
In this paper, we provide local and global convergence guarantees for
recovering CP (Candecomp/Parafac) tensor decompositions. The main step of the
proposed algorithm is a simple alternating rank-$1$ update, which is the
alternating version of the tensor power iteration adapted for asymmetric
tensors. Local convergence guarantees are established for third-order tensors
of rank $k$ in $d$ dimensions, when $k = o(d^{3/2})$ and the tensor
components are incoherent. Thus, we can recover overcomplete tensor
decompositions. We also strengthen the results to global convergence guarantees
under a stricter rank condition $k \le \beta d$ (for an arbitrary constant
$\beta$) through a simple initialization procedure where the algorithm is
initialized by the top singular vectors of random tensor slices. Furthermore,
approximate local convergence guarantees for $p$-th order tensors are also
provided under the rank condition $k = o(d^{p/2})$. The guarantees also
include a tight perturbation analysis given a noisy tensor.
Comment: We have added an additional sub-algorithm to remove the (approximate)
residual error left after the tensor power iteration.
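As a concrete illustration of the alternating rank-$1$ update, the following NumPy sketch runs the asymmetric tensor power iteration for a single component of a third-order tensor. It deliberately omits the paper's initialization by singular vectors of random slices, the recovery of multiple components, and the residual-removal sub-algorithm; the function name and the fixed iteration count are assumptions.

import numpy as np

def alternating_rank1_update(T, n_iter=100, seed=0):
    """Recover one rank-1 component of a 3rd-order tensor T (d1 x d2 x d3).

    Each step fixes two factors and updates the third via a tensor
    contraction followed by normalization -- the asymmetric analogue
    of the tensor power iteration.
    """
    rng = np.random.default_rng(seed)
    u = rng.normal(size=T.shape[0]); u /= np.linalg.norm(u)
    v = rng.normal(size=T.shape[1]); v /= np.linalg.norm(v)
    w = rng.normal(size=T.shape[2]); w /= np.linalg.norm(w)
    for _ in range(n_iter):
        u = np.einsum('ijk,j,k->i', T, v, w); u /= np.linalg.norm(u)
        v = np.einsum('ijk,i,k->j', T, u, w); v /= np.linalg.norm(v)
        w = np.einsum('ijk,i,j->k', T, u, v); w /= np.linalg.norm(w)
    lam = np.einsum('ijk,i,j,k->', T, u, v, w)   # recovered component weight
    return lam, u, v, w

# Toy check: recover a planted rank-1 tensor with weight 2.0.
a = np.ones(5) / np.sqrt(5)
b = np.ones(6) / np.sqrt(6)
c = np.ones(7) / np.sqrt(7)
T = 2.0 * np.einsum('i,j,k->ijk', a, b, c)
lam, u, v, w = alternating_rank1_update(T)
print(round(lam, 3))   # ~2.0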
Clustering and Latent Semantic Indexing Aspects of the Nonnegative Matrix Factorization
This paper provides theoretical support for the clustering aspect of
nonnegative matrix factorization (NMF). By utilizing the Karush-Kuhn-Tucker
optimality conditions, we show that the NMF objective is equivalent to a graph
clustering objective, so the clustering aspect of NMF has a solid
justification. Unlike previous approaches, which usually discard the
nonnegativity constraints, our approach guarantees that the stationary point
used in deriving the equivalence lies in the feasible region of the
nonnegative orthant. Additionally, since the clustering capability of a matrix
decomposition technique can sometimes imply a latent semantic indexing (LSI)
aspect, we also evaluate the LSI aspect of NMF by showing its capability
in solving the synonymy and polysemy problems on synthetic datasets. A more
extensive evaluation is then conducted by comparing the LSI performance of NMF
with that of the singular value decomposition (SVD), the standard LSI method,
on standard datasets.
Comment: 28 pages, 5 figures
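For reference, here is a minimal NumPy sketch of NMF with the classic Lee-Seung multiplicative updates, whose iterates stay in the nonnegative orthant and whose fixed points satisfy the KKT conditions of the constrained problem. Reading cluster labels off the columns of H is one common convention, assumed here for illustration rather than taken from the paper.

import numpy as np

def nmf_multiplicative(A, r, n_iter=500, eps=1e-9, seed=0):
    """Factor a nonnegative matrix A (m x n) as A ~= W H with W, H >= 0.

    Multiplicative updates for the Frobenius-norm objective; every
    iterate remains elementwise nonnegative by construction.
    """
    rng = np.random.default_rng(seed)
    m, n = A.shape
    W = rng.random((m, r)) + eps
    H = rng.random((r, n)) + eps
    for _ in range(n_iter):
        H *= (W.T @ A) / (W.T @ W @ H + eps)   # update H with W fixed
        W *= (A @ H.T) / (W @ H @ H.T + eps)   # update W with H fixed
    return W, H

# Toy usage: treat columns of A as documents and cluster them by the
# dominant entry in the corresponding column of H.
A = np.abs(np.random.default_rng(1).random((20, 8)))
W, H = nmf_multiplicative(A, r=3)
print(H.argmax(axis=0))   # one cluster label per document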