Language Modeling with Power Low Rank Ensembles
We present power low rank ensembles (PLRE), a flexible framework for n-gram
language modeling where ensembles of low rank matrices and tensors are used to
obtain smoothed probability estimates of words in context. Our method can be
understood as a generalization of n-gram modeling to non-integer n, and
includes standard techniques such as absolute discounting and Kneser-Ney
smoothing as special cases. PLRE training is efficient and our approach
outperforms state-of-the-art modified Kneser-Ney baselines in terms of
perplexity on large corpora, as well as in BLEU score on a downstream machine
translation task.
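The abstract names absolute discounting as a special case of the PLRE framework. As a point of reference for that baseline (this is not the PLRE method itself), a minimal absolute-discounted bigram model with a unigram back-off can be sketched as follows; the toy corpus, the discount value `d = 0.75`, and all function names are illustrative assumptions:

```python
from collections import Counter

def absolute_discount_bigram(tokens, d=0.75):
    """Bigram probabilities with absolute discounting, backing off
    to unigram maximum-likelihood estimates."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    total = len(tokens)

    def prob(w_prev, w):
        context_count = unigrams[w_prev]
        if context_count == 0:
            return unigrams[w] / total  # unseen context: plain unigram
        # Subtract a fixed discount d from each observed bigram count,
        # then redistribute the freed mass via the unigram distribution.
        discounted = max(bigrams[(w_prev, w)] - d, 0) / context_count
        n_types = sum(1 for b in bigrams if b[0] == w_prev)
        backoff_weight = d * n_types / context_count
        return discounted + backoff_weight * unigrams[w] / total

    return prob

tokens = "the cat sat on the mat the cat ran".split()
p = absolute_discount_bigram(tokens)
vocab = set(tokens)
```

By construction the discounted mass and the back-off mass balance, so the conditional distribution for any observed context sums to one over the vocabulary.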
Statistical Machine Translation Features with Multitask Tensor Networks
We present a three-pronged approach to improving Statistical Machine
Translation (SMT), building on recent success in the application of neural
networks to SMT. First, we propose new features based on neural networks to
model various non-local translation phenomena. Second, we augment the
architecture of the neural network with tensor layers that capture important
higher-order interaction among the network units. Third, we apply multitask
learning to estimate the neural network parameters jointly. Each of our
proposed methods results in significant improvements that are complementary.
The overall improvement is +2.7 and +1.8 BLEU points for Arabic-English and
Chinese-English translation over a state-of-the-art system that already
includes neural network features.
Comment: 11 pages (9 content + 2 references), 2 figures, accepted to ACL 2015
as a long paper.
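The abstract does not specify the exact form of its tensor layers; a common formulation (for example, a neural-tensor-style layer) combines a per-unit bilinear interaction u^T T_k v with a standard linear term on the concatenated inputs. A minimal NumPy sketch, with all shapes, names, and the tanh nonlinearity assumed for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

def tensor_layer(u, v, T, W, b):
    """One tensor layer: each output unit k combines a bilinear
    (higher-order) interaction u^T T[k] v with a linear term on [u; v]."""
    bilinear = np.einsum("i,kij,j->k", u, T, v)   # second-order interactions
    linear = W @ np.concatenate([u, v]) + b        # ordinary feed-forward part
    return np.tanh(bilinear + linear)

d, k = 4, 3                          # input width and number of output units
u, v = rng.standard_normal(d), rng.standard_normal(d)
T = rng.standard_normal((k, d, d))   # one d x d interaction matrix per unit
W = rng.standard_normal((k, 2 * d))
b = np.zeros(k)
out = tensor_layer(u, v, T, W, b)
```

The bilinear term is what lets each output unit model multiplicative interactions between pairs of network units, which a purely linear layer cannot express.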
Bayesian learning of joint distributions of objects
There is increasing interest in broad application areas in defining flexible
joint models for data having a variety of measurement scales, while also
allowing data of complex types, such as functions, images and documents. We
consider a general framework for nonparametric Bayes joint modeling through
mixture models that incorporate dependence across data types through a joint
mixing measure. The mixing measure is assigned a novel infinite tensor
factorization (ITF) prior that allows flexible dependence in cluster allocation
across data types. The ITF prior is formulated as a tensor product of
stick-breaking processes. Focusing on a convenient special case corresponding
to a Parafac factorization, we provide basic theory justifying the flexibility
of the proposed prior and resulting asymptotic properties. Focusing on ITF
mixtures of product kernels, we develop a new Gibbs sampling algorithm for
routine implementation relying on slice sampling. The methods are compared with
alternative joint mixture models based on Dirichlet processes and related
approaches through simulations and real data applications.
Comment: Appearing in Proceedings of the 16th International Conference on
Artificial Intelligence and Statistics (AISTATS) 2013, Scottsdale, AZ, USA.
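The ITF prior is formulated as a tensor product of stick-breaking processes. As a sketch of the building block, the following generates truncated stick-breaking weights and forms the outer product of two such weight vectors; this independence (outer-product) form is only the simplest instance, since the actual ITF prior allows dependence in cluster allocation across data types. The truncation level and concentration values are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)

def stick_breaking(alpha, n_atoms):
    """Truncated stick-breaking weights: beta_h ~ Beta(1, alpha) and
    pi_h = beta_h * prod_{l < h} (1 - beta_l)."""
    betas = rng.beta(1.0, alpha, size=n_atoms)
    remaining = np.concatenate([[1.0], np.cumprod(1.0 - betas)[:-1]])
    return betas * remaining

# Outer product of two stick-breaking weight vectors: a joint mixing
# measure over cluster pairs (h1, h2), the simplest tensor-product case.
pi1 = stick_breaking(1.0, 20)
pi2 = stick_breaking(1.0, 20)
joint = np.outer(pi1, pi2)
```

Each weight vector sums to at most one (the remainder is the truncated tail), and the joint weights factorize over the two data types in this simplified sketch.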
Enabling High-Dimensional Hierarchical Uncertainty Quantification by ANOVA and Tensor-Train Decomposition
Hierarchical uncertainty quantification can reduce the computational cost of
stochastic circuit simulation by employing spectral methods at different
levels. This paper presents an efficient framework for hierarchically
simulating challenging stochastic circuits/systems that include high-dimensional
subsystems. Due to the high parameter dimensionality, it is challenging to both
extract surrogate models at the low level of the design hierarchy and to handle
them in the high-level simulation. In this paper, we develop an ANOVA-based
stochastic circuit/MEMS simulator that efficiently extracts the surrogate
models at the low level. In order to avoid the curse of
dimensionality, we employ tensor-train decomposition at the high level to
construct the basis functions and Gauss quadrature points. As a demonstration,
we verify our algorithm on a stochastic oscillator with four MEMS capacitors
and 184 random parameters. This challenging example is simulated efficiently by
our simulator at the cost of only 10 minutes in MATLAB on a regular personal
computer.
Comment: 14 pages (IEEE double column), 11 figures, accepted by IEEE Trans.
CAD of Integrated Circuits and Systems.
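The high-level step above builds basis functions and Gauss quadrature points. As a one-dimensional illustration of Gaussian quadrature for a stochastic parameter (the tensor-train machinery that keeps this tractable in 184 dimensions is not shown), the moment of a function of a standard normal variable can be computed from Gauss-Hermite nodes; the function name and point count are assumptions:

```python
import numpy as np

def gauss_hermite_mean(f, n_points=8):
    """Estimate E[f(X)] for X ~ N(0, 1) via Gauss-Hermite quadrature,
    with a change of variables from the physicists' weight exp(-x^2)."""
    nodes, weights = np.polynomial.hermite.hermgauss(n_points)
    return np.sum(weights * f(np.sqrt(2.0) * nodes)) / np.sqrt(np.pi)

# The rule with n points is exact for polynomials up to degree 2n - 1,
# so E[X^2] = 1 is recovered exactly (up to floating-point rounding).
m2 = gauss_hermite_mean(lambda x: x ** 2)
```

A naive tensor grid of such rules needs n^d points in d dimensions, which is exactly the blow-up the paper's tensor-train construction is designed to avoid.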