Query-driven learning for predictive analytics of data subspace cardinality
Fundamental to many predictive analytics tasks is the ability to estimate the cardinality (number of data items) of multi-dimensional data subspaces, defined by query selections over datasets. This is crucial for data analysts dealing with, e.g., interactive data subspace exploration, data subspace visualization, and query processing optimization. However, in many modern data systems, predictive analytics may be (i) too costly financially, e.g., in clouds, (ii) unreliable, e.g., in modern Big Data query engines, where accurate statistics are difficult to obtain or maintain, or (iii) infeasible, e.g., because of privacy constraints. We contribute a novel, query-driven, function estimation model of analyst-defined data subspace cardinality. The proposed estimation model is highly accurate in terms of prediction and accommodates the well-known selection queries: multi-dimensional range and distance-nearest neighbors (radius) queries. Our function estimation model: (i) quantizes the vectorial query space, by learning the analysts’ access patterns over a data space, (ii) associates query vectors with the corresponding cardinalities of the analyst-defined data subspaces, (iii) abstracts and employs query vectorial similarity to predict the cardinality of an unseen/unexplored data subspace, and (iv) identifies and adapts to possible changes of the query subspaces based on the theory of optimal stopping. The proposed model is decentralized, facilitating the scaling-out of such predictive analytics queries. The research significance of the model lies in that (i) it is an attractive solution when data-driven statistical techniques are undesirable or infeasible, (ii) it offers a scale-out, decentralized training solution, (iii) it is applicable to different selection query types, and (iv) it offers a performance that is superior to that of data-driven approaches.
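Steps (i) to (iii) of the abstract can be illustrated with a minimal sketch. It is not the authors' model: the class name `QueryCardinalityModel`, the `radius` threshold for spawning a new prototype, and the inverse-distance weighting are all assumptions made for illustration, and the optimal-stopping change detection of step (iv) is left out. A radius query is assumed to be encoded as a vector (center coordinates followed by the radius).

```python
import math


class QueryCardinalityModel:
    """Toy query-driven estimator (illustrative only, not the paper's model).

    Keeps prototypes of past query vectors, each paired with the running
    mean of the cardinalities observed for the queries it absorbed.
    """

    def __init__(self, radius):
        # Hypothetical threshold: a query farther than this from every
        # prototype starts a new prototype.
        self.radius = radius
        self.prototypes = []  # each entry: [vector, mean_cardinality, count]

    def observe(self, query, cardinality):
        """Step (i)/(ii): quantize the query space and attach cardinalities."""
        nearest, nearest_dist = None, math.inf
        for proto in self.prototypes:
            d = math.dist(proto[0], query)
            if d < nearest_dist:
                nearest, nearest_dist = proto, d
        if nearest is None or nearest_dist > self.radius:
            self.prototypes.append([list(query), float(cardinality), 1])
            return
        # Running-mean update of both the prototype and its cardinality.
        nearest[2] += 1
        rate = 1.0 / nearest[2]
        nearest[0] = [w + rate * (x - w) for w, x in zip(nearest[0], query)]
        nearest[1] += rate * (cardinality - nearest[1])

    def predict(self, query):
        """Step (iii): similarity-weighted average over prototypes."""
        if not self.prototypes:
            raise ValueError("no queries observed yet")
        num = den = 0.0
        for vector, cardinality, _count in self.prototypes:
            d = math.dist(vector, query)
            if d == 0.0:
                return cardinality  # exact match with a prototype
            weight = 1.0 / (d * d)  # assumed inverse-squared-distance weight
            num += weight * cardinality
            den += weight
        return num / den
```

The estimator never touches the underlying data: it learns only from (query, cardinality) pairs, which is what makes a query-driven approach usable when data statistics are costly or unavailable.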
Distributed Regression in Sensor Networks: Training Distributively with Alternating Projections
Wireless sensor networks (WSNs) have attracted considerable attention in
recent years and motivate a host of new challenges for distributed signal
processing. The problem of distributed or decentralized estimation has often
been considered in the context of parametric models. However, the success of
parametric methods is limited by the appropriateness of the strong statistical
assumptions made by the models. In this paper, a more flexible nonparametric
model for distributed regression is considered that is applicable in a variety
of WSN applications including field estimation. Here, starting with the
standard regularized kernel least-squares estimator, a message-passing
algorithm for distributed estimation in WSNs is derived. The algorithm can be
viewed as an instantiation of the successive orthogonal projection (SOP)
algorithm. Various practical aspects of the algorithm are discussed and several
numerical simulations validate the potential of the approach.
Comment: To appear in the Proceedings of the SPIE Conference on Advanced
Signal Processing Algorithms, Architectures and Implementations XV, San
Diego, CA, July 31 - August 4, 200
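The starting point named in the abstract, the standard regularized kernel least-squares estimator, can be sketched in its centralized form. This is not the paper's distributed message-passing algorithm: it assumes all sensor measurements are available in one place, a Gaussian kernel, and the objective sum_i (f(x_i) - y_i)^2 + lam * ||f||^2, whose minimizer is f(x) = sum_i c_i k(x_i, x) with (K + lam * I) c = y. The function names and the `bandwidth` parameter are illustrative.

```python
import math


def gaussian_kernel(a, b, bandwidth=1.0):
    """Gaussian (RBF) kernel between two points given as tuples."""
    sq_dist = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def solve_linear_system(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x


def fit_kernel_rls(X, y, lam=1e-3, bandwidth=1.0):
    """Return coefficients c solving (K + lam * I) c = y."""
    n = len(X)
    K = [[gaussian_kernel(X[i], X[j], bandwidth) + (lam if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    return solve_linear_system(K, y)


def predict_kernel_rls(X, coeffs, x, bandwidth=1.0):
    """Evaluate f(x) = sum_i c_i * k(x_i, x)."""
    return sum(c * gaussian_kernel(xi, x, bandwidth)
               for c, xi in zip(coeffs, X))
```

In a field-estimation setting, `X` would hold sensor positions and `y` the readings. A small `lam` nearly interpolates the readings, while a large `lam` shrinks the estimate toward zero.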
Fast ADMM Algorithm for Distributed Optimization with Adaptive Penalty
We propose new methods to speed up convergence of the Alternating Direction
Method of Multipliers (ADMM), a common optimization tool in the context of
large scale and distributed learning. The proposed method accelerates the speed
of convergence by automatically deciding the constraint penalty needed for
parameter consensus in each iteration. In addition, we also propose an
extension of the method that adaptively determines the maximum number of
iterations to update the penalty. We show that this approach effectively leads
to an adaptive, dynamic network topology underlying the distributed
optimization. The utility of the new penalty update schemes is demonstrated on
both synthetic and real data, including a computer vision application of
distributed structure from motion.
Comment: 8-page manuscript, 2-page appendix, 5 figures
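To make the role of the constraint penalty concrete, here is a minimal sketch of consensus ADMM with a per-iteration penalty update. It does not reproduce the update schemes proposed in the paper: it uses the widely known residual-balancing heuristic, a global (not network-structured) consensus variable, and a toy problem where node i minimizes 0.5 * (x - a_i)^2, so the consensus solution is the mean of the a_i. The parameters `mu`, `tau`, and the fixed cap `max_adapt` on penalty updates are assumptions; the paper determines that cap adaptively.

```python
import math


def admm_consensus_mean(a, rho=1.0, mu=10.0, tau=2.0,
                        max_adapt=50, max_iter=2000, tol=1e-8):
    """Consensus ADMM (scaled form) with residual-balancing penalty updates.

    Returns (z, rho, iterations): the consensus value, the final penalty,
    and the number of iterations performed.
    """
    n = len(a)
    z = 0.0
    u = [0.0] * n  # scaled dual variables, one per node
    for k in range(1, max_iter + 1):
        # Local updates: closed-form minimizer of
        # 0.5 * (x - a_i)^2 + (rho / 2) * (x - z + u_i)^2.
        x = [(ai + rho * (z - ui)) / (1.0 + rho) for ai, ui in zip(a, u)]
        # Consensus update: average of x_i + u_i.
        z_old = z
        z = sum(xi + ui for xi, ui in zip(x, u)) / n
        # Dual updates.
        u = [ui + xi - z for ui, xi in zip(u, x)]
        # Primal and dual residual norms.
        r = math.sqrt(sum((xi - z) ** 2 for xi in x))
        s = rho * math.sqrt(n) * abs(z - z_old)
        if r < tol and s < tol:
            return z, rho, k
        # Residual balancing, applied only for the first max_adapt iterations.
        # The scaled duals are rescaled so that rho * u_i stays unchanged.
        if k <= max_adapt:
            if r > mu * s:
                rho *= tau
                u = [ui / tau for ui in u]
            elif s > mu * r:
                rho /= tau
                u = [ui * tau for ui in u]
    return z, rho, max_iter
```

Freezing the penalty after a bounded number of updates is a common safeguard for convergence when the penalty varies across iterations.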