A Bayesian Multivariate Functional Dynamic Linear Model
We present a Bayesian approach for modeling multivariate, dependent
functional data. To account for the three dominant structural features in the
data--functional, time dependent, and multivariate components--we extend
hierarchical dynamic linear models for multivariate time series to the
functional data setting. We also develop Bayesian spline theory in a more
general constrained optimization framework. The proposed methods identify a
time-invariant functional basis for the functional observations, which is
smooth and interpretable, and can be made common across multivariate
observations for additional information sharing. The Bayesian framework permits
joint estimation of the model parameters, provides exact inference (up to MCMC
error) on specific parameters, and allows generalized dependence structures.
Sampling from the posterior distribution is accomplished with an efficient
Gibbs sampling algorithm. We illustrate the proposed framework with two
applications: (1) multi-economy yield curve data from the recent global
recession, and (2) local field potential brain signals in rats, for which we
develop a multivariate functional time series approach for multivariate
time-frequency analysis. Supplementary materials, including R code and the
multi-economy yield curve data, are available online.
Experimental String Field Theory
We develop efficient algorithms for level-truncation computations in open
bosonic string field theory. We determine the classical action in the universal
subspace to level (18,54) and apply this knowledge to numerical evaluations of
the tachyon condensate string field. We obtain two main sets of results. First,
we directly compute the solutions up to level L=18 by extremizing the
level-truncated action. Second, we obtain predictions for the solutions for L >
18 from an extrapolation to higher levels of the functional form of the tachyon
effective action. We find that the energy of the stable vacuum overshoots -1
(in units of the brane tension) at L=14, reaches a minimum E_min = -1.00063 at
L ~ 28 and approaches with spectacular accuracy the predicted answer of -1 as L
-> infinity. Our data are entirely consistent with the recent perturbative
analysis of Taylor and strongly support the idea that level-truncation is a
convergent approximation scheme. We also check systematically that our
numerical solution, which obeys the Siegel gauge condition, actually satisfies
the full gauge-invariant equations of motion. Finally, we investigate the
presence of analytic patterns in the coefficients of the tachyon string field,
which we are able to reliably estimate in the L -> infinity limit.
Comment: 37 pages, 6 figures
Accelerating Nearest Neighbor Search on Manycore Systems
We develop methods for accelerating metric similarity search that are
effective on modern hardware. Our algorithms factor into easily parallelizable
components, making them simple to deploy and efficient on multicore CPUs and
GPUs. Despite the simple structure of our algorithms, their search performance
is provably sublinear in the size of the database, with a factor dependent only
on its intrinsic dimensionality. We demonstrate that our methods provide
substantial speedups on a range of datasets and hardware platforms. In
particular, we present results on a 48-core server machine, on graphics
hardware, and on a multicore desktop.
Specious rules: an efficient and effective unifying method for removing misleading and uninformative patterns in association rule mining
We present theoretical analysis and a suite of tests and procedures for
addressing a broad class of redundant and misleading association rules we call
specious rules. Specious dependencies, also known as spurious,
apparent, or illusory associations, refer to a well-known
phenomenon where marginal dependencies are merely products of interactions with
other variables and disappear when conditioned on those variables.
The most extreme example is Yule-Simpson's paradox, where two variables
show positive dependence in the marginal contingency table but negative
dependence in all partial tables defined by different levels of a confounding
factor. It is
accepted wisdom that in data of any nontrivial dimensionality it is infeasible
to control for all of the exponentially many possible confounds of this nature.
In this paper, we consider the problem of specious dependencies in the context
of statistical association rule mining. We define specious rules and show they
offer a unifying framework which covers many types of previously proposed
redundant or misleading association rules. After theoretical analysis, we
introduce practical algorithms for detecting and pruning out specious
association rules efficiently under many key goodness measures, including
mutual information and exact hypergeometric probabilities. We demonstrate that
the procedure greatly reduces the number of associations discovered, providing
an elegant and effective solution to the problem of association mining
discovering large numbers of misleading and redundant rules.
Comment: Note: This is a corrected version of the paper published in SDM'17.
In the equation on page 4, the range of the sum has been corrected.
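The Yule-Simpson reversal described in this abstract can be made concrete with a small numeric sketch. The counts below are invented purely for illustration and do not come from the paper; the helper names (`risk_difference`, `marginal_table`) are likewise hypothetical. The dependence measure used here is a simple difference of conditional proportions, not one of the goodness measures the paper works with.

```python
# Illustrative counts only (invented for this sketch, not taken from the paper).
# For each level of a confounder Z, map x -> (count with Y=1, total count).
strata = {
    "z0": {1: (720, 900), 0: (90, 100)},
    "z1": {1: (10, 100), 0: (180, 900)},
}


def risk_difference(table):
    """Return P(Y=1 | X=1) - P(Y=1 | X=0) for one 2x2 table."""
    (y1, n1), (y0, n0) = table[1], table[0]
    return y1 / n1 - y0 / n0


def marginal_table(strata):
    """Collapse over the confounder by summing counts across its levels."""
    return {
        x: (
            sum(t[x][0] for t in strata.values()),
            sum(t[x][1] for t in strata.values()),
        )
        for x in (0, 1)
    }


# Dependence within each level of Z (the partial tables).
partial = {z: risk_difference(t) for z, t in strata.items()}
# Dependence after ignoring Z (the marginal table).
marginal = risk_difference(marginal_table(strata))

print("partial:", partial)
print("marginal:", marginal)
```

With these counts, the marginal difference is positive while the difference inside every level of Z is negative, which is the sign reversal that conditioning on the confounder exposes.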
GIANT: Globally Improved Approximate Newton Method for Distributed Optimization
For a distributed computing environment, we consider the empirical risk
minimization problem and propose a distributed and communication-efficient
Newton-type optimization method. At every iteration, each worker locally finds
an Approximate NewTon (ANT) direction, which is sent to the main driver. The
main driver, then, averages all the ANT directions received from workers to
form a Globally Improved ANT (GIANT) direction. GIANT is highly
communication efficient and naturally exploits the trade-offs between local
computations and global communications in that more local computations result
in fewer overall rounds of communications. Theoretically, we show that GIANT
enjoys an improved convergence rate as compared with first-order methods and
existing distributed Newton-type methods. Further, and in sharp contrast with
many existing distributed Newton-type methods, as well as popular first-order
methods, a highly advantageous practical feature of GIANT is that it only
involves one tuning parameter. We conduct large-scale experiments on a computer
cluster and, empirically, demonstrate the superior performance of GIANT.
Comment: Fixed some typos. Improved writing.
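The averaging scheme in this abstract can be sketched on a toy ridge-regression problem, with workers simulated as in-process data shards. This is a minimal sketch under stated assumptions, not the authors' implementation: it assumes the driver first averages local gradients into a global gradient, that each worker then solves exactly against its own local Hessian, and that shards are equal-sized. The function names and the synthetic data are hypothetical.

```python
import numpy as np


def local_gradient(X, y, w, lam):
    """Gradient of the local objective 0.5*mean((Xw - y)^2) + 0.5*lam*||w||^2."""
    return X.T @ (X @ w - y) / len(y) + lam * w


def local_hessian(X, lam):
    """Hessian of the same local objective (constant in w for ridge)."""
    return X.T @ X / X.shape[0] + lam * np.eye(X.shape[1])


def averaged_newton_step(shards, w, lam):
    # Assumed first round: the driver averages the workers' local gradients.
    g = np.mean([local_gradient(X, y, w, lam) for X, y in shards], axis=0)
    # Each worker forms an approximate Newton direction from its local Hessian.
    directions = [np.linalg.solve(local_hessian(X, lam), g) for X, _ in shards]
    # The driver averages the local directions and takes the step.
    return w - np.mean(directions, axis=0)


# Synthetic data, invented for this sketch.
rng = np.random.default_rng(0)
d, n_per_worker, n_workers, lam = 5, 1000, 4, 1e-2
w_true = rng.normal(size=d)
shards = []
for _ in range(n_workers):
    X = rng.normal(size=(n_per_worker, d))
    y = X @ w_true + 0.1 * rng.normal(size=n_per_worker)
    shards.append((X, y))

# Reference solution: exact ridge minimizer on the pooled data.
X_all = np.vstack([X for X, _ in shards])
y_all = np.concatenate([y for _, y in shards])
w_star = np.linalg.solve(
    X_all.T @ X_all / len(y_all) + lam * np.eye(d),
    X_all.T @ y_all / len(y_all),
)

w = np.zeros(d)
errors = []
for _ in range(10):
    w = averaged_newton_step(shards, w, lam)
    errors.append(np.linalg.norm(w - w_star))

print(errors)
```

Each iteration here costs two communication rounds (one for gradients, one for directions), and the only parameter passed to the step besides the data and iterate is the regularization weight `lam`, which belongs to the objective rather than to the update rule.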