Markov Chain Monte Carlo Based on Deterministic Transformations
In this article we propose a novel MCMC method based on deterministic
transformations T: X x D --> X where X is the state-space and D is some set
which may or may not be a subset of X. We refer to our new methodology as
Transformation-based Markov chain Monte Carlo (TMCMC). One of the remarkable
advantages of our proposal is that even if the underlying target distribution
is very high-dimensional, deterministic transformation of a one-dimensional
random variable is sufficient to generate an appropriate Markov chain that is
guaranteed to converge to the high-dimensional target distribution. Apart from
clearly leading to massive computational savings, this idea of
deterministically transforming a single random variable very generally leads to
excellent acceptance rates, even though all the random variables associated
with the high-dimensional target distribution are updated in a single block.
Since it is well-known that joint updating of many random variables using
Metropolis-Hastings (MH) algorithm generally leads to poor acceptance rates,
TMCMC, in this regard, seems to provide a significant advance. We validate our
proposal theoretically, establishing the convergence properties. Furthermore,
we show that TMCMC can be very effectively adopted for simulating from doubly
intractable distributions.
TMCMC is compared with MH using the well-known Challenger data, demonstrating
the effectiveness of the former in the case of highly correlated variables.
Moreover, we apply our methodology to a challenging posterior simulation
problem associated with the geostatistical model of Diggle et al. (1998),
updating 160 unknown parameters jointly, using a deterministic transformation
of a one-dimensional random variable. Remarkable computational savings as well
as good convergence properties and acceptance rates are the results.
Comment: 28 pages, 3 figures; longer abstract inside article
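The additive special case of such a deterministic transformation is easy to sketch: a single scalar draw drives a block update of every coordinate, and since the additive move has unit Jacobian, the acceptance ratio reduces to the usual target-density ratio. The sketch below is a minimal toy implementation under those assumptions (half-normal scalar draw, equiprobable signs); the function name, toy target, and tuning are illustrative, not the authors' code.

```python
import numpy as np

def tmcmc_additive(log_target, x0, n_iter=5000, scale=1.0, seed=0):
    """Additive transformation move: propose x' = x + b * eps, with ONE scalar
    eps ~ |N(0,1)| and independent random signs b_i = +/-1 per coordinate.
    The Jacobian of this additive move is 1, so the acceptance probability
    is min(1, pi(x') / pi(x))."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    chain = np.empty((n_iter, x.size))
    accepted = 0
    for t in range(n_iter):
        eps = abs(rng.normal()) * scale           # one 1-D random variable
        b = rng.choice([-1.0, 1.0], size=x.size)  # random signs for the block
        prop = x + b * eps                        # all coordinates move at once
        if np.log(rng.uniform()) < log_target(prop) - log_target(x):
            x = prop
            accepted += 1
        chain[t] = x
    return chain, accepted / n_iter

# Toy target: 10-dimensional standard normal, updated in a single block.
log_std_normal = lambda x: -0.5 * np.dot(x, x)
chain, acc_rate = tmcmc_additive(log_std_normal, np.zeros(10), scale=0.5)
```

Note that only one uniform random scalar (for the sign vector, one Rademacher draw per coordinate) and one half-normal scalar are needed per iteration, regardless of the target's dimension, which is the source of the computational savings the abstract describes.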
Laplacian Mixture Modeling for Network Analysis and Unsupervised Learning on Graphs
Laplacian mixture models identify overlapping regions of influence in
unlabeled graph and network data in a scalable and computationally efficient
way, yielding useful low-dimensional representations. By combining Laplacian
eigenspace and finite mixture modeling methods, they provide probabilistic or
fuzzy dimensionality reductions or domain decompositions for a variety of input
data types, including mixture distributions, feature vectors, and graphs or
networks. Provable optimal recovery using the algorithm is analytically shown
for a nontrivial class of cluster graphs. Heuristic approximations for scalable
high-performance implementations are described and empirically tested.
Connections to PageRank and community detection in network analysis demonstrate
the wide applicability of this approach. The origins of fuzzy spectral methods,
beginning with generalized heat or diffusion equations in physics, are reviewed
and summarized. Comparisons to other dimensionality reduction and clustering
methods for challenging unsupervised machine learning problems are also
discussed.
Comment: 13 figures, 35 references
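A toy version of the combination the abstract describes (a Laplacian eigenspace embedding followed by a fuzzy mixture assignment) can be sketched as follows. This is an assumed minimal pipeline, not the authors' algorithm; the example graph, the number of clusters, and the softmax temperature are all invented for illustration.

```python
import numpy as np

def laplacian_soft_clusters(A, k, temp=0.1, n_iter=50, seed=0):
    """Embed nodes via the k smallest-eigenvalue eigenvectors of the graph
    Laplacian, then compute fuzzy cluster memberships by soft k-means
    (softmax responsibilities) in that eigenspace."""
    rng = np.random.default_rng(seed)
    L = np.diag(A.sum(axis=1)) - A        # unnormalized graph Laplacian
    _, vecs = np.linalg.eigh(L)           # eigenvalues in ascending order
    X = vecs[:, :k]                       # spectral embedding of the nodes
    centers = X[rng.choice(len(X), k, replace=False)]
    for _ in range(n_iter):
        d2 = ((X[:, None, :] - centers[None]) ** 2).sum(-1)
        R = np.exp(-d2 / temp)
        R /= R.sum(axis=1, keepdims=True)  # fuzzy memberships, rows sum to 1
        centers = (R.T @ X) / R.sum(axis=0)[:, None]
    return R

# Two triangles joined by a single bridge edge: nodes 0-2 and 3-5 form
# the two overlapping regions of influence.
A = np.zeros((6, 6))
for i, j in [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]:
    A[i, j] = A[j, i] = 1.0
R = laplacian_soft_clusters(A, k=2)
```

The soft responsibilities in `R` play the role of the probabilistic (fuzzy) memberships: each row is a distribution over regions rather than a hard label, so bridge nodes can belong partly to both.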
The Benefit of Multitask Representation Learning
We discuss a general method to learn data representations from multiple
tasks. We provide a justification for this method in both settings of multitask
learning and learning-to-learn. The method is illustrated in detail in the
special case of linear feature learning. Conditions on the theoretical
advantage offered by multitask representation learning over independent task
learning are established. In particular, focusing on the important example of
half-space learning, we derive the regime in which multitask representation
learning is beneficial over independent task learning, as a function of the
sample size, the number of tasks and the intrinsic data dimensionality. Other
potential applications of our results include multitask feature learning in
reproducing kernel Hilbert spaces and multilayer, deep networks.
Comment: To appear in Journal of Machine Learning Research (JMLR). 31 pages
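The linear feature learning setting can be illustrated with a small synthetic experiment: per-task weight vectors drawn from a shared low-dimensional subspace, which an SVD of independently fitted per-task estimates then recovers. All dimensions, counts, and the noise level below are assumptions for illustration, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
d, r, T, n = 20, 2, 30, 100        # ambient dim, shared dim, tasks, samples/task
B = np.linalg.qr(rng.normal(size=(d, r)))[0]   # true shared feature subspace

# Fit each task independently by least squares.
W_hat = np.empty((d, T))
for t in range(T):
    w_t = B @ rng.normal(size=r)               # task weights lie in span(B)
    X = rng.normal(size=(n, d))
    y = X @ w_t + 0.1 * rng.normal(size=n)
    W_hat[:, t], *_ = np.linalg.lstsq(X, y, rcond=None)

# The shared representation emerges from the stacked estimates: the top-r
# left singular vectors of W_hat estimate the common subspace.
U, s, _ = np.linalg.svd(W_hat, full_matrices=False)
B_hat = U[:, :r]

# Cosines of the principal angles between true and estimated subspaces
# (all close to 1 means accurate recovery).
overlap = np.linalg.svd(B.T @ B_hat, compute_uv=False)
```

The intuition matches the abstract: each individual task only sees `n` samples in `d` dimensions, but pooling `T` tasks concentrates the signal in the shared `r`-dimensional subspace, so representation learning beats independent task learning once `T` is large relative to the noise.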
Characteristic Kernels and Infinitely Divisible Distributions
We connect shift-invariant characteristic kernels to infinitely divisible
distributions on $\mathbb{R}^d$. Characteristic kernels play an important
role in machine learning applications, since their kernel means distinguish
any two probability measures. The contribution of this paper is two-fold.
First, we show, using the L\'evy-Khintchine formula, that any shift-invariant
kernel given by a bounded, continuous and symmetric probability density
function (pdf) of an infinitely divisible distribution on $\mathbb{R}^d$ is
characteristic. We also present some closure properties of such characteristic
kernels under addition, pointwise product, and convolution. Second, in
developing various kernel mean algorithms, it is fundamental to compute the
following values: (i) kernel mean values $m_P(x)$, $x \in \mathcal{X}$, and
(ii) kernel mean RKHS inner products $\langle m_P, m_Q \rangle$, for
probability measures $P$, $Q$. If $P$, $Q$, and the kernel are Gaussians,
then computations (i) and (ii) result in Gaussian pdfs, which are tractable.
We generalize this Gaussian combination to more general cases in the class of
infinitely divisible distributions. We then introduce a {\it conjugate}
kernel and a {\it convolution trick}, so that the above (i) and (ii) have the
same pdf form, with tractable computation expected at least in some cases. As
specific instances, we explore $\alpha$-stable distributions and a rich class
of generalized hyperbolic distributions, which includes the Laplace, Cauchy
and Student-t distributions.
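The Gaussian special case admits a simple numerical check: for a Gaussian kernel and a one-dimensional Gaussian measure P, the kernel mean is itself a Gaussian-shaped function with a closed form. The symbols gamma (kernel width), mu and sigma (the parameters of P) are standard notation chosen here for illustration, not taken from the paper.

```python
import numpy as np

def kernel_mean_closed_form(x, mu, sigma, gamma):
    """Kernel mean of P = N(mu, sigma^2) under k(x, y) = exp(-(x-y)^2 / (2 gamma^2)):
       m_P(x) = E_{Y~P}[k(x, Y)]
              = gamma / sqrt(gamma^2 + sigma^2)
                * exp(-(x - mu)^2 / (2 (gamma^2 + sigma^2))),
    i.e. a rescaled Gaussian pdf, hence tractable."""
    s2 = gamma**2 + sigma**2
    return gamma / np.sqrt(s2) * np.exp(-(x - mu)**2 / (2 * s2))

# Monte Carlo verification of the closed form.
rng = np.random.default_rng(0)
mu, sigma, gamma, x = 0.5, 1.0, 0.7, 1.2
Y = rng.normal(mu, sigma, size=200_000)
mc = np.exp(-(x - Y)**2 / (2 * gamma**2)).mean()
exact = kernel_mean_closed_form(x, mu, sigma, gamma)
```

The closed form follows from rewriting the Gaussian kernel as a scaled Gaussian density in Y and using the Gaussian convolution identity; generalizing this kind of tractability beyond the Gaussian case is exactly what the paper's conjugate-kernel and convolution-trick machinery is for.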
Data Mining and Machine Learning in Astronomy
We review the current state of data mining and machine learning in astronomy.
'Data Mining' can have a somewhat mixed connotation from the point of view of a
researcher in this field. If used correctly, it can be a powerful approach,
holding the potential to fully exploit the exponentially increasing amount of
available data, promising great scientific advance. However, if misused, it can
be little more than the black-box application of complex computing algorithms
that may give little physical insight, and provide questionable results. Here,
we give an overview of the entire data mining process, from data collection
through to the interpretation of results. We cover common machine learning
algorithms, such as artificial neural networks and support vector machines,
applications from a broad range of astronomy, emphasizing those where data
mining techniques directly resulted in improved science, and important current
and future directions, including probability density functions, parallel
algorithms, petascale computing, and the time domain. We conclude that, so long
as one carefully selects an appropriate algorithm, and is guided by the
astronomical problem at hand, data mining can be very much a powerful tool,
and not the questionable black box.
Comment: Published in IJMPD. 61 pages, uses ws-ijmpd.cls. Several extra
figures, some minor additions to the text