Multi-resolution Tensor Learning for Large-Scale Spatial Data
High-dimensional tensor models are notoriously computationally expensive to
train. We present a meta-learning algorithm, MMT, that can significantly speed
up the process for spatial tensor models. MMT leverages the property that
spatial data can be viewed at multiple resolutions, which are related by
coarsening and fine-graining from one resolution to another. Using this
property, MMT learns a tensor model by starting from a coarse resolution and
iteratively increasing the model complexity. To avoid "over-training" on
coarse-resolution models, we investigate an information-theoretic fine-graining
criterion to decide when to transition to higher-resolution models. We
provide both theoretical and empirical evidence for the advantages of this
approach. When applied to two real-world large-scale spatial datasets for
basketball player and animal behavior modeling, our approach demonstrates three key
benefits: 1) it efficiently captures higher-order interactions (i.e., tensor
latent factors), 2) it is orders of magnitude faster than fixed resolution
learning and scales to very fine-grained spatial resolutions, and 3) it
reliably yields accurate and interpretable models.
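The coarsening and fine-graining relation between resolutions can be illustrated with a small sketch. The abstract does not specify the operators MMT uses, so the block-averaging `coarsen` and block-copying `fine_grain` below are assumptions chosen for simplicity, and the 4 x 4 grid is made-up data. The sketch only shows how values on a coarse grid could initialise a finer one.

```python
def coarsen(grid, factor=2):
    """Average non-overlapping factor x factor blocks of a 2-D grid (assumed operator)."""
    rows, cols = len(grid) // factor, len(grid[0]) // factor
    return [[sum(grid[r * factor + i][c * factor + j]
                 for i in range(factor) for j in range(factor)) / factor ** 2
             for c in range(cols)]
            for r in range(rows)]


def fine_grain(grid, factor=2):
    """Copy every coarse cell into a factor x factor block of fine cells (assumed operator)."""
    return [[grid[r // factor][c // factor]
             for c in range(len(grid[0]) * factor)]
            for r in range(len(grid) * factor)]


# Hypothetical 4 x 4 spatial grid of values.
fine = [[1.0, 3.0, 5.0, 7.0],
        [1.0, 3.0, 5.0, 7.0],
        [2.0, 2.0, 4.0, 4.0],
        [2.0, 2.0, 4.0, 4.0]]

coarse = coarsen(fine)         # 2 x 2 grid of block averages
restored = fine_grain(coarse)  # 4 x 4 grid, constant within each block
```

With these two operators, coarsening a fine-grained grid returns the original coarse grid, so moving to a finer resolution loses nothing that was learned at the coarse one.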
Stochastic Yield Analysis of Rare Failure Events in High-Dimensional Variation Space
As the semiconductor industry has kept shrinking feature sizes to the nanometer scale, circuit reliability has become an area of growing concern due to the uncertainty introduced by process variations. For highly replicated standard cells, the failure event for each individual component must be extremely rare in order to maintain a sufficiently high yield rate. Existing yield analysis approaches work well in low dimensions but are less effective either when there are a large number of circuit parameters or when the failure samples are distributed in multiple regions. In this thesis, four novel high-sigma analysis approaches are proposed. First, we propose an adaptive importance sampling (AIS) algorithm. AIS adjusts the sampling region over several iterations, whereas existing methods fix a static sampling distribution in advance. At each iteration, AIS generates samples from the current proposal distribution. Next, AIS carefully assigns a weight to each sample based on its tilted occurrence probability between the failure region and the current failure-region distribution. Then we design two adaptive frameworks, based on resampling and population Metropolis-Hastings (MH), to iteratively search for failure regions. Second, we develop an Adaptive Clustering and Sampling (ACS) method to estimate the failure rate of high-dimensional circuit cases with multiple failure regions. The basic idea of the algorithm is to cluster failure samples and build a global sampling distribution at each iteration. Specifically, in the clustering step, we propose a multi-cone clustering method, which partitions the parametric space and clusters failure samples. A global sampling distribution is then constructed from a set of weighted Gaussian distributions. Next, we calculate an importance weight for each sample based on the discrepancy between the sampling distribution and the target distribution. The failure probability is updated at the end of each iteration. This clustering and sampling procedure proceeds iteratively until all the failure regions are covered.
Moreover, two meta-model-based approaches are proposed for high-sigma analysis. The Low-Rank Tensor Approximation (LRTA) formulates the meta-model in tensor space by representing a multi-way tensor as a finite sum of rank-one tensors. The polynomial degree of our LRTA model grows linearly with circuit dimension, which makes it especially promising for high-dimensional circuit problems. We then solve our LRTA model efficiently with a robust greedy algorithm and calibrate it iteratively with an adaptive sampling method. The meta-model-based importance sampling (MIS) method uses a Gaussian process meta-model to construct a quasi-optimal importance sampling distribution and performs Markov chain Monte Carlo (MCMC) simulation to generate new samples from the proposed distribution. By updating our global importance sampling estimator in an iterative framework, MIS achieves better efficiency and higher accuracy than traditional importance sampling methods. Experimental results validate that the proposed approaches are three orders of magnitude faster than Monte Carlo and more accurate than both academic solutions, such as importance sampling and classification-based methods, and industrial solutions, such as the mixture IS used by Intel.
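To make the idea of weighting samples by the discrepancy between the sampling and target distributions concrete, here is a minimal sketch of plain, static importance sampling for a rare event. It is not the AIS, ACS, LRTA, or MIS method from the thesis. The one-dimensional "failure" condition `x > threshold`, the standard normal target, and the mean-shifted proposal are all assumptions made for illustration.

```python
import math
import random


def failure_prob_is(threshold, n_samples, seed=0):
    """Estimate P(X > threshold) for X ~ N(0, 1) by mean-shifted importance sampling."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        # Draw from the proposal N(threshold, 1), centred on the failure boundary.
        x = rng.gauss(threshold, 1.0)
        if x > threshold:
            # Importance weight = target pdf / proposal pdf
            #   = exp(-x^2 / 2) / exp(-(x - t)^2 / 2) = exp(-t * x + t^2 / 2).
            total += math.exp(-threshold * x + 0.5 * threshold ** 2)
    return total / n_samples


def failure_prob_exact(threshold):
    """Exact Gaussian tail probability, used only to check the estimate."""
    return 0.5 * math.erfc(threshold / math.sqrt(2.0))


estimate = failure_prob_is(4.0, 20000)
exact = failure_prob_exact(4.0)
```

A naive Monte Carlo run with the same 20,000 draws would usually see no failures at all at this threshold, which is why the proposal is moved toward the failure region and the samples are reweighted.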
Fast and Guaranteed Tensor Decomposition via Sketching
Tensor CANDECOMP/PARAFAC (CP) decomposition has wide applications in
statistical learning of latent variable models and in data mining. In this
paper, we propose fast and randomized tensor CP decomposition algorithms based
on sketching. We build on the idea of count sketches, but introduce many novel
ideas which are unique to tensors. We develop novel methods for randomized
computation of tensor contractions via FFTs, without explicitly forming the
tensors. Such tensor contractions are encountered in decomposition methods such
as tensor power iterations and alternating least squares. We also design novel
colliding hashes for symmetric tensors to further save time in computing the
sketches. We then combine these sketching ideas with existing whitening and
tensor power iterative techniques to obtain the fastest algorithm on both
sparse and dense tensors. The quality of approximation under our method does
not depend on properties such as sparsity, uniformity of elements, etc. We
apply the method for topic modeling and obtain competitive results.
Comment: 29 pages. Appeared in Proceedings of Advances in Neural Information
Processing Systems (NIPS), held at Montreal, Canada in 201
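One standard ingredient behind sketching a tensor without forming it can be shown on a rank-one term. If the bucket hash of entry (i, j) is the sum of per-mode hashes modulo the sketch length, and its sign is the product of per-mode signs, then the count sketch of u ⊗ v equals the circular convolution of the count sketches of u and v. The sketch below checks that identity on a two-mode example. It is an illustration under these assumptions, not the paper's algorithm: all function names are hypothetical, and the convolution is computed directly where an FFT would normally be used.

```python
import random


def make_hash(dim, n_buckets, rng):
    """Random bucket indices and random signs for one tensor mode."""
    buckets = [rng.randrange(n_buckets) for _ in range(dim)]
    signs = [rng.choice((-1.0, 1.0)) for _ in range(dim)]
    return buckets, signs


def count_sketch(vec, buckets, signs, n_buckets):
    """Count sketch of a vector: signed entries accumulated into hashed buckets."""
    out = [0.0] * n_buckets
    for value, b, s in zip(vec, buckets, signs):
        out[b] += s * value
    return out


def circular_convolve(a, b):
    """Direct O(n^2) circular convolution; an FFT would give O(n log n)."""
    n = len(a)
    out = [0.0] * n
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[(i + j) % n] += ai * bj
    return out


def sketch_outer_explicit(u, v, hash_u, hash_v, n_buckets):
    """Reference: form every entry u[i] * v[j] and sketch it with the combined hash."""
    (bu, su), (bv, sv) = hash_u, hash_v
    out = [0.0] * n_buckets
    for i, ui in enumerate(u):
        for j, vj in enumerate(v):
            out[(bu[i] + bv[j]) % n_buckets] += su[i] * sv[j] * ui * vj
    return out


rng = random.Random(0)
n_buckets = 8
u = [rng.uniform(-1, 1) for _ in range(5)]
v = [rng.uniform(-1, 1) for _ in range(6)]
hash_u = make_hash(len(u), n_buckets, rng)
hash_v = make_hash(len(v), n_buckets, rng)

# Sketch each factor separately, then combine; u (x) v is never materialised.
fast = circular_convolve(
    count_sketch(u, *hash_u, n_buckets),
    count_sketch(v, *hash_v, n_buckets),
)
slow = sketch_outer_explicit(u, v, hash_u, hash_v, n_buckets)
max_diff = max(abs(f - s) for f, s in zip(fast, slow))
```

The two results agree up to floating-point rounding, so the cost of sketching depends on the factor lengths and the sketch length rather than on the full number of tensor entries.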
Multi-Layer Potfit: An Accurate Potential Representation for Efficient High-Dimensional Quantum Dynamics
The multi-layer multi-configuration time-dependent Hartree method (ML-MCTDH)
is a highly efficient scheme for studying the dynamics of high-dimensional
quantum systems. Its use is greatly facilitated if the Hamiltonian of the
system possesses a particular structure through which the multi-dimensional
matrix elements can be computed efficiently. In the field of quantum molecular
dynamics, the effective interaction between the atoms is often described by
potential energy surfaces (PES), and it is necessary to fit such PES into the
desired structure. For high-dimensional systems, the current approaches for
this fitting process either lead to fits that are too large to be practical, or
their accuracy is difficult to predict and control.
This article introduces multi-layer Potfit (MLPF), a novel fitting scheme
that results in a PES representation in the hierarchical tensor (HT) format.
The scheme is based on the hierarchical singular value decomposition, which can
yield a near-optimal fit and give strict bounds for the obtained accuracy.
Here, a recursive scheme for using the HT-format PES within ML-MCTDH is
derived, and theoretical estimates as well as a computational example show that
the use of MLPF can reduce the numerical effort for ML-MCTDH by orders of
magnitude, compared to the traditionally used Potfit representation of the PES.
Moreover, it is shown that MLPF is especially beneficial for high-accuracy PES
representations, and it turns out that MLPF leads to computational savings
already for comparatively small systems with just four modes.
Comment: Copyright (2014) American Institute of Physics. This article may be
downloaded for personal use only. Any other use requires prior permission of
the author and the American Institute of Physics.
Transferable atomic multipole machine learning models for small organic molecules
Accurate representation of the molecular electrostatic potential, which is
often expanded in distributed multipole moments, is crucial for an efficient
evaluation of intermolecular interactions. Here we introduce a machine learning
model for multipole coefficients of atom types H, C, O, N, S, F, and Cl in any
molecular conformation. The model is trained on quantum chemical results for
atoms in varying chemical environments drawn from thousands of organic
molecules. Multipoles in systems with neutral, cationic, and anionic molecular
charge states are treated with individual models. The models' predictive
accuracy and applicability are illustrated by evaluating intermolecular
interaction energies of nearly 1,000 dimers and the cohesive energy of the
benzene crystal.
Comment: 11 pages, 6 figures
Online and Differentially-Private Tensor Decomposition
In this paper, we resolve many of the key algorithmic questions regarding
robustness, memory efficiency, and differential privacy of tensor
decomposition. We propose simple variants of the tensor power method which
enjoy these strong properties. We present the first guarantees for an online
tensor power method with a linear memory requirement. Moreover, we present
a noise-calibrated tensor power method with efficient privacy guarantees. At
the heart of all these guarantees lies a careful perturbation analysis derived
in this paper, which improves significantly upon the existing results.
Comment: 19 pages, 9 figures. To appear at the 30th Annual Conference on
Advances in Neural Information Processing Systems (NIPS 2016), to be held at
Barcelona, Spain. Fix small typos in proofs of Lemmas C.5 and C.
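For readers unfamiliar with the tensor power method that these variants build on, the basic noise-free iteration can be sketched as follows. This is the textbook update u <- T(I, u, u) / ||T(I, u, u)|| on a small symmetric third-order tensor. It does not implement the online or differentially private variants from the paper, and the toy tensor, starting vector, and function names are assumptions made for illustration.

```python
import math


def rank_one_sum(weights, vectors):
    """Build the dense symmetric tensor T = sum_k w_k * v_k (x) v_k (x) v_k."""
    d = len(vectors[0])
    T = [[[0.0] * d for _ in range(d)] for _ in range(d)]
    for w, v in zip(weights, vectors):
        for i in range(d):
            for j in range(d):
                for k in range(d):
                    T[i][j][k] += w * v[i] * v[j] * v[k]
    return T


def contract(T, u):
    """Return the vector T(I, u, u), i.e. sum over j, k of T[i][j][k] * u[j] * u[k]."""
    d = len(u)
    return [sum(T[i][j][k] * u[j] * u[k] for j in range(d) for k in range(d))
            for i in range(d)]


def normalize(x):
    """Scale a vector to unit Euclidean norm."""
    norm = math.sqrt(sum(c * c for c in x))
    return [c / norm for c in x]


def tensor_power_iteration(T, u0, n_iters=20):
    """Repeat u <- T(I, u, u) / ||T(I, u, u)|| and return (eigenvalue, eigenvector)."""
    u = normalize(u0)
    for _ in range(n_iters):
        u = normalize(contract(T, u))
    # Eigenvalue estimate T(u, u, u).
    lam = sum(ti * ui for ti, ui in zip(contract(T, u), u))
    return lam, u


# Toy tensor with two orthonormal components and weights 2 and 1 (hypothetical data).
components = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
T = rank_one_sum([2.0, 1.0], components)
lam, u = tensor_power_iteration(T, [0.6, 0.5, 0.3])
```

From this starting point the iteration converges to the first component with eigenvalue 2. Which component is recovered depends on the starting vector, which is why the method is usually run from several random starts and combined with deflation.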