Greedy Column Subset Selection for Large-scale Data Sets
In today's information systems, the availability of massive amounts of data
necessitates the development of fast and accurate algorithms to summarize these
data and represent them in a succinct format. One crucial problem in big data
analytics is the selection of representative instances from large and
massively-distributed data, which is formally known as the Column Subset
Selection (CSS) problem. Solving this problem enables data analysts to gain
insight into the data and explore its hidden structure. The
selected instances can also be used for data preprocessing tasks such as
learning a low-dimensional embedding of the data points or computing a low-rank
approximation of the corresponding matrix. This paper presents a fast and
accurate greedy algorithm for large-scale column subset selection. The
algorithm minimizes an objective function which measures the reconstruction
error of the data matrix based on the subset of selected columns. The paper
first presents a centralized greedy algorithm for column subset selection which
depends on a novel recursive formula for calculating the reconstruction error
of the data matrix. The paper then presents a MapReduce algorithm which selects
a few representative columns from a matrix whose columns are massively
distributed across several commodity machines. The algorithm first learns a
concise representation of all columns using random projection, and it then
solves a generalized column subset selection problem at each machine, in which a
subset of columns is selected from the sub-matrix on that machine such that
the reconstruction error of the concise representation is minimized. The paper
demonstrates the effectiveness and efficiency of the proposed algorithm through
an empirical evaluation on benchmark data sets.
Comment: Under consideration for publication in Knowledge and Information Systems
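To make the selection criterion concrete, here is a minimal sketch of greedy column subset selection in Python, using the naive Frobenius reconstruction-error objective the abstract describes. It does not reproduce the paper's recursive error formula or the MapReduce variant; it re-solves a least-squares problem per candidate column, purely for illustration, and the matrix sizes are made up.

```python
import numpy as np

def greedy_css(A, k):
    """Greedily pick k columns of A minimizing the Frobenius reconstruction
    error ||A - P_S A||_F^2, where P_S projects onto the span of the selected
    columns. Naive re-evaluation at each step; the paper's recursive formula
    avoids this recomputation."""
    n = A.shape[1]
    selected = []
    for _ in range(k):
        best_j, best_err = None, np.inf
        for j in range(n):
            if j in selected:
                continue
            S = A[:, selected + [j]]
            # Project A onto the column span of S via least squares.
            coeffs, *_ = np.linalg.lstsq(S, A, rcond=None)
            err = np.linalg.norm(A - S @ coeffs) ** 2
            if err < best_err:
                best_j, best_err = j, err
        selected.append(best_j)
    return selected

rng = np.random.default_rng(0)
A = rng.standard_normal((50, 20)) @ rng.standard_normal((20, 100))
print(greedy_css(A, 5))
```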
Far-Field Compression for Fast Kernel Summation Methods in High Dimensions
We consider fast kernel summations in high dimensions: given a large set of
points in $d$ dimensions (with $d$ large) and a pair-potential function (the
{\em kernel} function), we compute a weighted sum of all pairwise kernel
interactions for each point in the set. Direct summation is equivalent to a
(dense) matrix-vector multiplication and scales quadratically with the number
of points. Fast kernel summation algorithms reduce this cost to log-linear or
linear complexity.
Treecodes and Fast Multipole Methods (FMMs) deliver tremendous speedups by
constructing approximate representations of interactions of points that are far
from each other. In algebraic terms, these representations correspond to
low-rank approximations of blocks of the overall interaction matrix. Existing
approaches require an excessive number of kernel evaluations as the dimension
$d$ and the number of points in the dataset increase.
To address this issue, we use a randomized algebraic approach in which we
first sample the rows of a block and then construct its approximate, low-rank
interpolative decomposition. We examine the feasibility of this approach
theoretically and experimentally. We provide a new theoretical result showing a
tighter bound on the reconstruction error from uniformly sampling rows than the
existing state-of-the-art. We demonstrate that our sampling approach is
competitive with existing (but prohibitively expensive) methods from the
literature. We also construct kernel matrices for the Laplacian, Gaussian, and
polynomial kernels -- all commonly used in physics and data analysis. We
explore the numerical properties of blocks of these matrices, and show that
they are amenable to our approach. Depending on the data set, our randomized
algorithm can successfully compute low rank approximations in high dimensions.
We report results for data sets with ambient dimensions from four to 1,000.
Comment: 43 pages, 21 figures
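A rough sketch of the row-sampling idea in Python, under simplifying assumptions not taken from the paper: sample rows of a well-separated Gaussian-kernel block uniformly, take a pivoted QR of the sampled sub-block to get skeleton columns and an interpolation matrix, and reuse both for the full block. Point counts, bandwidth, and sampling budget are illustrative.

```python
import numpy as np
from scipy.linalg import qr

def sampled_column_id(K, rank, n_samples, rng):
    """Low-rank approximation of a kernel block K from uniformly sampled rows:
    pivoted QR of the sampled sub-block K_s picks skeleton columns J and an
    interpolation matrix T with K_s ~= K_s[:, J] @ T, which is then reused for
    the full block: K ~= K[:, J] @ T."""
    m, n = K.shape
    rows = rng.choice(m, size=n_samples, replace=False)
    Ks = K[rows, :]
    Q, R, piv = qr(Ks, mode='economic', pivoting=True)
    J = piv[:rank]
    # Columns of T in pivot order are [I | R11^+ R12]; undo the pivoting.
    tail = np.linalg.lstsq(R[:rank, :rank], R[:rank, rank:], rcond=None)[0]
    T = np.empty((rank, n))
    T[:, piv] = np.hstack([np.eye(rank), tail])
    return J, T

rng = np.random.default_rng(1)
X = rng.standard_normal((300, 10))        # source points
Y = rng.standard_normal((200, 10))
Y[:, 0] += 5.0                            # well-separated target points
D2 = ((Y[:, None, :] - X[None, :, :]) ** 2).sum(-1)
K = np.exp(-D2 / 100.0)                   # smooth Gaussian-kernel block
J, T = sampled_column_id(K, rank=20, n_samples=60, rng=rng)
print(np.linalg.norm(K - K[:, J] @ T) / np.linalg.norm(K))
```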
Estimation Considerations in Contextual Bandits
Contextual bandit algorithms are sensitive to the estimation method of the
outcome model as well as the exploration method used, particularly in the
presence of rich heterogeneity or complex outcome models, which can lead to
difficult estimation problems along the path of learning. We study a
consideration for the exploration vs. exploitation framework that does not
arise in multi-armed bandits but is crucial in contextual bandits: the way
exploration and exploitation are conducted in the present affects the bias and
variance in the potential outcome model estimation in subsequent stages of
learning. We develop parametric and non-parametric contextual bandits that
integrate balancing methods from the causal inference literature in their
estimation, making them less prone to estimation bias. We provide the
first regret bound analyses for contextual bandits with balancing in the domain
of linear contextual bandits that match the state of the art regret bounds. We
demonstrate the strong practical advantage of balanced contextual bandits on a
large number of supervised learning datasets and on a synthetic example that
simulates model mis-specification and prejudice in the initial training data.
Additionally, we develop contextual bandits with simpler assignment policies by
leveraging sparse model estimation methods from the econometrics literature and
demonstrate empirically that in the early stages they can improve the rate of
learning and decrease regret.
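One way to picture the balancing idea is inverse-propensity weighting in a linear contextual bandit with epsilon-greedy assignment, where the assignment propensities are known exactly. The sketch below is a simplified stand-in for the paper's balanced estimators; dimensions, noise levels, and the IPW-weighted ridge form are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
d, n_arms, T, eps = 5, 3, 2000, 0.1
theta = rng.standard_normal((n_arms, d))     # unknown true arm parameters

# Per-arm inverse-propensity-weighted ridge statistics:
# A_a = I + sum w x x^T,  b_a = sum w r x.
A = np.stack([np.eye(d) for _ in range(n_arms)])
b = np.zeros((n_arms, d))

for t in range(T):
    x = rng.standard_normal(d)
    est = np.array([np.linalg.solve(A[a], b[a]) @ x for a in range(n_arms)])
    greedy = int(np.argmax(est))
    a = int(rng.integers(n_arms)) if rng.random() < eps else greedy
    # Assignment propensity is known exactly under epsilon-greedy.
    p = eps / n_arms + (1.0 - eps if a == greedy else 0.0)
    r = theta[a] @ x + 0.1 * rng.standard_normal()
    w = 1.0 / p                              # balancing via IPW
    A[a] += w * np.outer(x, x)
    b[a] += w * r * x

# Per-arm parameter estimation error.
print([round(float(np.linalg.norm(np.linalg.solve(A[a], b[a]) - theta[a])), 3)
       for a in range(n_arms)])
```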
Optimal CUR Matrix Decompositions
The CUR decomposition of an $m \times n$ matrix $A$ finds an $m \times c$
matrix $C$ with a subset of $c < n$ columns of $A$, together with an
$r \times n$ matrix $R$ with a subset of $r < m$ rows of $A$, as well as a
$c \times r$ low-rank matrix $U$ such that the matrix $CUR$ approximates the
matrix $A$, that is, $\|A - CUR\|_F^2 \le (1+\epsilon)\|A - A_k\|_F^2$, where
$\|\cdot\|_F$ denotes the Frobenius norm and $A_k$ is the best $m \times n$
matrix of rank $k$ constructed via the SVD. We present input-sparsity-time and
deterministic algorithms for constructing such a CUR decomposition where
$c = O(k/\epsilon)$, $r = O(k/\epsilon)$, and $\mathrm{rank}(U) = k$. Up to
constant factors, our algorithms are simultaneously optimal in $c$, $r$, and
$\mathrm{rank}(U)$.
Comment: small revision in lemma 4
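Given chosen column and row subsets, the middle factor that is Frobenius-optimal for those choices is $U = C^+ A R^+$. The sketch below uses uniformly random column and row subsets purely for illustration; the paper's contribution is choosing them with optimal cardinality, which is not reproduced here.

```python
import numpy as np

def cur(A, col_idx, row_idx):
    """Given chosen column and row indices, form C and R and the
    Frobenius-optimal middle factor U = C^+ A R^+, so CUR ~= A."""
    C = A[:, col_idx]
    R = A[row_idx, :]
    U = np.linalg.pinv(C) @ A @ np.linalg.pinv(R)
    return C, U, R

rng = np.random.default_rng(3)
A = rng.standard_normal((100, 15)) @ rng.standard_normal((15, 80))  # rank 15
C, U, R = cur(A,
              rng.choice(80, 30, replace=False),
              rng.choice(100, 30, replace=False))
print(np.linalg.norm(A - C @ U @ R) / np.linalg.norm(A))
```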
Restricted Boltzmann machine to determine the input weights for extreme learning machines
The Extreme Learning Machine (ELM) is a single-hidden layer feedforward
neural network (SLFN) learning algorithm that can learn effectively and
quickly. The ELM training phase assigns the input weights and bias randomly and
does not change them in the whole process. Although the network works well, the
random weights in the input layer can make the algorithm less effective and
impact on its performance. Therefore, we propose a new approach to determine
the input weights and bias for the ELM using the restricted Boltzmann machine
(RBM), which we call RBM-ELM. We compare our new approach with a well-known
approach to improve the ELM and a state-of-the-art algorithm to select the
weights for the ELM. The results show that the RBM-ELM outperforms both
methodologies and achieves better performance than the ELM.
Comment: 14 pages, 7 figures and 5 tables
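A minimal sketch of the RBM-then-ELM pipeline, under assumptions not in the abstract (binary inputs, CD-1 training, a sigmoid hidden layer, ridge-regularized output weights): train a small Bernoulli RBM, reuse its weights and hidden biases as the ELM's input layer, then solve the output weights in closed form.

```python
import numpy as np

rng = np.random.default_rng(4)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_rbm(X, n_hidden, epochs=50, lr=0.05):
    """Bernoulli RBM trained with 1-step contrastive divergence (CD-1).
    Returns the weight matrix and hidden biases reused by the ELM."""
    n_vis = X.shape[1]
    W = 0.1 * rng.standard_normal((n_vis, n_hidden))
    bh, bv = np.zeros(n_hidden), np.zeros(n_vis)
    for _ in range(epochs):
        ph = sigmoid(X @ W + bh)
        h = (rng.random(ph.shape) < ph).astype(float)
        pv = sigmoid(h @ W.T + bv)
        ph2 = sigmoid(pv @ W + bh)
        W += lr * (X.T @ ph - pv.T @ ph2) / len(X)
        bh += lr * (ph - ph2).mean(0)
        bv += lr * (X - pv).mean(0)
    return W, bh

def elm_fit(X, Y, W, bh, reg=1e-3):
    """ELM with RBM-initialized input weights: the hidden layer is
    sigmoid(X W + bh); output weights are solved in closed form."""
    H = sigmoid(X @ W + bh)
    return np.linalg.solve(H.T @ H + reg * np.eye(H.shape[1]), H.T @ Y)

# Toy binary-input classification problem (assumed data, for illustration).
X = (rng.random((400, 20)) < 0.5).astype(float)
y = (X[:, :10].sum(1) > X[:, 10:].sum(1)).astype(float)[:, None]
W, bh = train_rbm(X, n_hidden=40)
beta = elm_fit(X, y, W, bh)
acc = (((sigmoid(X @ W + bh) @ beta) > 0.5) == y).mean()
print(f"train accuracy: {acc:.2f}")
```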
Deterministic Sampling of Sparse Trigonometric Polynomials
One can recover sparse multivariate trigonometric polynomials from few
randomly taken samples with high probability (as shown by Kunis and Rauhut). We
give a deterministic sampling of multivariate trigonometric polynomials
inspired by Weil's exponential sum. Our sampling can produce a deterministic
matrix satisfying the statistical restricted isometry property, and also nearly
optimal Grassmannian frames. We show that one can exactly reconstruct every
$M$-sparse multivariate trigonometric polynomial with fixed degree and of
length $D$ from the determinant sampling $X$, using orthogonal matching
pursuit, where $\# X$ is a prime number greater than $(M \log D)^2$. This
result is almost optimal within the $(\log D)^2$ factor. The simulations show
that the deterministic sampling can offer reconstruction performance similar
to that of random sampling.
Comment: 9 pages
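For reference, orthogonal matching pursuit itself is short. The sketch below recovers an $M$-sparse vector from a random partial Fourier matrix, which stands in for the paper's deterministic determinant sampling; the sizes are arbitrary.

```python
import numpy as np

def omp(Phi, y, sparsity):
    """Orthogonal matching pursuit: greedily pick the column most correlated
    with the residual, then refit on the selected support."""
    support, residual = [], y.copy()
    for _ in range(sparsity):
        j = int(np.argmax(np.abs(Phi.conj().T @ residual)))
        support.append(j)
        sub = Phi[:, support]
        coef, *_ = np.linalg.lstsq(sub, y, rcond=None)
        residual = y - sub @ coef
    x = np.zeros(Phi.shape[1], dtype=Phi.dtype)
    x[support] = coef
    return x

# M-sparse recovery from a random partial Fourier matrix (a stand-in for
# the deterministic Weil-sum sampling in the paper).
rng = np.random.default_rng(5)
D, n_samples, M = 256, 60, 5
rows = rng.choice(D, n_samples, replace=False)
F = np.exp(2j * np.pi * np.outer(rows, np.arange(D)) / D) / np.sqrt(n_samples)
x_true = np.zeros(D, dtype=complex)
x_true[rng.choice(D, M, replace=False)] = rng.standard_normal(M)
x_hat = omp(F, F @ x_true, M)
print(np.linalg.norm(x_hat - x_true))
```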
Iteratively Reweighted Approaches to Sparse Composite Regularization
Motivated by the observation that a given signal $x$ admits sparse
representations in multiple dictionaries $\Psi_d$, but with varying levels of
sparsity across dictionaries, we propose two new algorithms for the
reconstruction of (approximately) sparse signals from noisy linear
measurements. Our first algorithm, Co-L1, extends the well-known lasso
algorithm from the L1 regularizer $\|\Psi x\|_1$ to composite regularizers of
the form $\sum_d \lambda_d \|\Psi_d x\|_1$ while self-adjusting the
regularization weights $\lambda_d$. Our second algorithm, Co-IRW-L1, extends
the well-known
iteratively reweighted L1 algorithm to the same family of composite
regularizers. We provide several interpretations of both algorithms: i)
majorization-minimization (MM) applied to a non-convex log-sum-type penalty,
ii) MM applied to an approximate $\ell_0$-type penalty, iii) MM applied to
Bayesian MAP inference under a particular hierarchical prior, and iv)
variational expectation-maximization (VEM) under a particular prior with
deterministic unknown parameters. A detailed numerical study suggests that our
proposed algorithms yield significantly improved recovery SNR when compared to
their non-composite L1 and IRW-L1 counterparts.
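A heavily simplified sketch of the reweighting loop: assume the dictionaries are disjoint coordinate blocks, so the composite regularizer $\sum_d \lambda_d \|x_d\|_1$ separates and each weighted-L1 subproblem can be solved with plain ISTA; after each solve, every block's weight is reset from that block's current coefficient magnitudes, in the spirit of the log-sum MM interpretation. This is not the paper's Co-L1; all constants and the block structure are illustrative assumptions.

```python
import numpy as np

def soft(v, t):
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def co_l1_blocks(A, y, blocks, tau=0.05, eps=0.05, n_outer=8, n_ista=300):
    """Co-L1-style loop under a strong simplification: 'dictionaries' are
    disjoint coordinate blocks, so the composite penalty is separable and
    each weighted-L1 subproblem is solved by plain ISTA."""
    n = A.shape[1]
    x = np.zeros(n)
    lam = tau * np.ones(n)
    step = 1.0 / np.linalg.norm(A, 2) ** 2
    for _ in range(n_outer):
        for _ in range(n_ista):
            x = soft(x - step * (A.T @ (A @ x - y)), step * lam)
        for blk in blocks:
            lam[blk] = tau / (np.abs(x[blk]).mean() + eps)  # MM-style reweight
    return x

rng = np.random.default_rng(6)
n, m = 100, 60
blocks = [np.arange(0, 50), np.arange(50, 100)]
x_true = np.zeros(n)
x_true[rng.choice(50, 2, replace=False)] = 1.0         # block 1: very sparse
x_true[50 + rng.choice(50, 10, replace=False)] = 1.0   # block 2: less sparse
A = rng.standard_normal((m, n)) / np.sqrt(m)
y = A @ x_true + 0.01 * rng.standard_normal(m)
x_hat = co_l1_blocks(A, y, blocks)
print("recovery error:", np.linalg.norm(x_hat - x_true))
```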
Motion Planning of Uncertain Ordinary Differential Equation Systems
This work presents a novel motion planning framework, rooted in nonlinear programming theory, that treats uncertain fully- and under-actuated dynamical systems described by ordinary differential equations. Uncertainty in multibody dynamical systems comes from various sources, such as system parameters, initial conditions, sensor and actuator noise, and external forcing. Treatment of uncertainty in design is of paramount practical importance because all real-life systems are affected by it, and poor robustness and suboptimal performance result if it is not accounted for in a given design. In this work, uncertainties are modeled using Generalized Polynomial Chaos and are solved quantitatively using a least-squares collocation method. The computational efficiency of this approach enables the inclusion of uncertainty statistics in the nonlinear programming optimization process. As such, the proposed framework allows the user to pose, and answer, new design questions related to uncertain dynamical systems.
Specifically, the new framework is explained in the context of forward, inverse, and hybrid dynamics formulations. The forward dynamics formulation, applicable to both fully and under-actuated systems, prescribes deterministic actuator inputs which yield uncertain state trajectories. The inverse dynamics formulation is the dual of the forward dynamics formulation and is applicable only to fully actuated systems; deterministic state trajectories are prescribed and yield uncertain actuator inputs. The inverse dynamics formulation is more computationally efficient, as it requires only algebraic evaluations and completely avoids numerical integration. Finally, the hybrid dynamics formulation is applicable to under-actuated systems, where it leverages the benefits of inverse dynamics for actuated joints and forward dynamics for unactuated joints; it prescribes actuated state and unactuated input trajectories which yield uncertain unactuated states and actuated inputs.
The benefits of the ability to quantify uncertainty when planning the motion of multibody dynamic systems are illustrated through several case studies. The resulting designs determine optimal motion plans, subject to deterministic and statistical constraints, for all possible systems within the probability space.
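The least-squares collocation step can be illustrated on a scalar decay ODE with an uncertain rate, where the per-node solve happens to be available in closed form: sample collocation nodes for a standard-normal germ, solve the system at each node, fit probabilists'-Hermite polynomial chaos coefficients by least squares, and read the moments off the coefficients. Orders, node counts, and the 10% rate perturbation are assumptions for illustration.

```python
import numpy as np
from math import factorial
from numpy.polynomial.hermite_e import hermevander

rng = np.random.default_rng(7)
k0, t_end, P, n_colloc = 1.0, 1.0, 4, 50

# Collocation nodes for the standard-normal germ xi; the uncertain
# decay rate is k = k0 * (1 + 0.1 * xi).
xi = rng.standard_normal(n_colloc)
x_end = np.exp(-k0 * (1.0 + 0.1 * xi) * t_end)   # per-node solve (exact here)

# Least-squares fit of probabilists'-Hermite PC coefficients.
Psi = hermevander(xi, P)                          # (n_colloc, P+1) basis matrix
c, *_ = np.linalg.lstsq(Psi, x_end, rcond=None)

# Moments follow from orthogonality: E[He_n^2] = n! under N(0, 1).
mean_pc = c[0]
var_pc = sum(c[n] ** 2 * factorial(n) for n in range(1, P + 1))

# Monte Carlo check of the PC moments.
x_mc = np.exp(-k0 * (1.0 + 0.1 * rng.standard_normal(200_000)) * t_end)
print(mean_pc, x_mc.mean(), var_pc, x_mc.var())
```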
Approximate Policy Iteration with a Policy Language Bias: Solving Relational Markov Decision Processes
We study an approach to policy selection for large relational Markov Decision
Processes (MDPs). We consider a variant of approximate policy iteration (API)
that replaces the usual value-function learning step with a learning step in
policy space. This is advantageous in domains where good policies are easier to
represent and learn than the corresponding value functions, which is often the
case for the relational MDPs we are interested in. In order to apply API to
such problems, we introduce a relational policy language and corresponding
learner. In addition, we introduce a new bootstrapping routine for goal-based
planning domains, based on random walks. Such bootstrapping is necessary for
many large relational MDPs, where reward is extremely sparse, as API is
ineffective in such domains when initialized with an uninformed policy. Our
experiments show that the resulting system is able to find good policies for a
number of classical planning domains and their stochastic variants by solving
them as extremely large relational MDPs. The experiments also point to some
limitations of our approach, suggesting future work.
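The API variant can be pictured with a tabular stand-in for the relational policy learner: on a small deterministic gridworld, each iteration labels every state with the action preferred by one-step lookahead plus a rollout of the current policy, and the "learner" here is just a lookup table rather than a relational policy language. Grid size, horizon, and iteration count are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(8)
N = 8                         # N x N gridworld, goal at (N-1, N-1)
ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0)]

def step(s, a):
    r, c = s[0] + a[0], s[1] + a[1]
    return (min(max(r, 0), N - 1), min(max(c, 0), N - 1))

def rollout(policy, s, horizon=40):
    """Cost-to-go of following `policy` from s (steps until goal)."""
    for t in range(horizon):
        if s == (N - 1, N - 1):
            return t
        s = step(s, ACTIONS[policy[s]])
    return horizon

# Policy represented directly as a state -> action table; a learned
# relational policy would replace this table in the paper's setting.
policy = {(r, c): int(rng.integers(4)) for r in range(N) for c in range(N)}

for _ in range(15):
    new_policy = {}
    for s in policy:
        # One-step lookahead + rollout of the current policy estimates Q(s, a).
        q = [1 + rollout(policy, step(s, a)) for a in ACTIONS]
        new_policy[s] = int(np.argmin(q))
    policy = new_policy

print("steps from (0,0):", rollout(policy, (0, 0)))
```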
A Near-Optimal Sampling Strategy for Sparse Recovery of Polynomial Chaos Expansions
Compressive sampling has become a widely used approach to construct
polynomial chaos surrogates when the number of available simulation samples is
limited. Originally, these expensive simulation samples would be obtained at
random locations in the parameter space. It was later shown that the choice of
sample locations could significantly impact the accuracy of resulting
surrogates. This motivated new sampling strategies or design-of-experiment
approaches, such as coherence-optimal sampling, which aim at improving the
coherence property. In this paper, we propose a sampling strategy that can
identify near-optimal sample locations that lead to improvement in
local-coherence property and also enhancement of cross-correlation properties
of measurement matrices. We provide theoretical motivations for the proposed
sampling strategy along with several numerical examples that show that our
near-optimal sampling strategy produces substantially more accurate results,
compared to other sampling strategies.
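One crude way to see why sample placement matters: greedily grow a sample set from a candidate pool so as to keep the mutual coherence of the resulting Legendre measurement matrix small, and compare against uniform random placement. This greedy heuristic is a stand-in, not the paper's near-optimal strategy; pool size, polynomial degree, and budget are arbitrary.

```python
import numpy as np
from numpy.polynomial.legendre import legvander

def mutual_coherence(Phi):
    """Largest off-diagonal inner product between normalized columns."""
    G = Phi / np.linalg.norm(Phi, axis=0)
    M = np.abs(G.T @ G)
    np.fill_diagonal(M, 0.0)
    return M.max()

rng = np.random.default_rng(9)
deg, n_pool, n_pick = 10, 500, 40
pool = rng.uniform(-1, 1, n_pool)           # candidate sample locations
V = legvander(pool, deg)                    # Legendre measurement rows

chosen = [int(rng.integers(n_pool))]
while len(chosen) < n_pick:
    best_j, best_mu = None, np.inf
    for j in range(n_pool):
        if j in chosen:
            continue
        mu = mutual_coherence(V[chosen + [j], :])
        if mu < best_mu:
            best_j, best_mu = j, mu
    chosen.append(best_j)

rand = rng.choice(n_pool, n_pick, replace=False)
print("greedy:", mutual_coherence(V[chosen, :]),
      "random:", mutual_coherence(V[rand, :]))
```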