Distributed Stochastic Optimization of the Regularized Risk
Many machine learning algorithms minimize a regularized risk, and stochastic
optimization is widely used for this task. When working with massive data, it
is desirable to perform stochastic optimization in parallel. Unfortunately,
many existing stochastic optimization algorithms cannot be parallelized
efficiently. In this paper we show that one can rewrite the regularized risk
minimization problem as an equivalent saddle-point problem, and propose an
efficient distributed stochastic optimization (DSO) algorithm. We prove the
algorithm's rate of convergence; remarkably, our analysis shows that the
algorithm scales almost linearly with the number of processors. We also verify
with empirical evaluations that the proposed algorithm is competitive with
other parallel, general purpose stochastic and batch optimization algorithms
for regularized risk minimization.
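As a concrete illustration of the saddle-point reformulation (not the distributed DSO algorithm itself), the hinge loss max(0, z) can be rewritten as the maximum over α in [0, 1] of αz, which turns linear SVM training into a min-max problem amenable to stochastic primal-dual updates. The sketch below is a minimal single-machine version; the toy data and step sizes are assumptions.

```python
import numpy as np

# Regularized risk: min_w  (lam/2)||w||^2 + (1/m) sum_i max(0, 1 - y_i <x_i, w>)
# Saddle point:     min_w max_{a in [0,1]^m}  (lam/2)||w||^2
#                        + (1/m) sum_i a_i (1 - y_i <x_i, w>)
rng = np.random.default_rng(0)
m, d, lam = 200, 5, 0.1
X = rng.normal(size=(m, d))
y = np.sign(X @ rng.normal(size=d))

w = np.zeros(d)
a = np.zeros(m)            # dual variables, one per training example
eta_w, eta_a = 0.05, 0.5   # assumed step sizes

for t in range(20000):
    i = rng.integers(m)
    # dual ascent step on a_i, projected back onto [0, 1]
    a[i] = np.clip(a[i] + eta_a * (1.0 - y[i] * X[i] @ w), 0.0, 1.0)
    # primal descent step on w (stochastic gradient of the saddle objective)
    g = lam * w - a[i] * y[i] * X[i]
    w -= eta_w / np.sqrt(t + 1) * g

risk = lam / 2 * w @ w + np.mean(np.maximum(0.0, 1.0 - y * (X @ w)))
print(f"regularized risk after training: {risk:.4f}")
```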
Decomposition Algorithms for Stochastic Programming on a Computational Grid
We describe algorithms for two-stage stochastic linear programming with
recourse and their implementation on a grid computing platform. In particular,
we examine serial and asynchronous versions of the L-shaped method and a
trust-region method. The parallel platform of choice is the dynamic,
heterogeneous, opportunistic platform provided by the Condor system. The
algorithms are of master-worker type (with the workers being used to solve
second-stage problems), and the MW runtime support library (which supports
master-worker computations) is key to the implementation. Computational results
are presented on large sample average approximations of problems from the
literature.
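For intuition about the L-shaped method, here is a minimal serial sketch on a toy newsvendor problem whose second-stage value and subgradient are available in closed form, so each "worker" evaluation is a one-liner; the costs, scenario demands, and tolerance are assumptions, and the implementation described above instead solves genuine second-stage linear programs on Condor workers.

```python
import numpy as np
from scipy.optimize import linprog

# Toy two-stage problem (newsvendor): order quantity x >= 0 at unit cost c;
# demand d is random; the second stage sells min(x, d) at price p.
#   min_x  c*x + E[Q(x, d)],   with  Q(x, d) = -p * min(x, d)
c, p = 1.0, 3.0
demands = np.array([2.0, 5.0, 7.0, 11.0])   # equally likely scenarios

def recourse_and_subgrad(x):
    """Expected second-stage value E[Q(x, d)] and a subgradient in x."""
    vals = -p * np.minimum(x, demands)
    subs = np.where(x <= demands, -p, 0.0)
    return vals.mean(), subs.mean()

cuts = []                                # optimality cuts: theta >= q_k + g_k*(x - x_k)
x, theta = 0.0, -p * demands.max()       # valid initial lower bound on theta
for it in range(50):
    q, g = recourse_and_subgrad(x)       # the "worker" step
    if q <= theta + 1e-8:                # no cut improves the master: optimal
        break
    cuts.append((g, q - g * x))
    # Master: min c*x + theta  s.t.  g_k*x - theta <= -intercept_k
    A = [[gk, -1.0] for gk, _ in cuts]
    b = [-ik for _, ik in cuts]
    res = linprog([c, 1.0], A_ub=A, b_ub=b,
                  bounds=[(0, None), (-p * demands.max(), None)])
    x, theta = res.x

total = c * x + recourse_and_subgrad(x)[0]
print(f"optimal order quantity x = {x:.2f}, total cost = {total:.2f}")
```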
One-Class Conditional Random Fields for Sequential Anomaly Detection
Sequential anomaly detection is a challenging problem due to the one-class nature of the data (i.e., data is collected from only one class) and the temporal dependence in sequential data. We present One-Class Conditional Random Fields (OCCRF) for sequential anomaly detection that learn from a one-class dataset and capture the temporal dependence structure, in an unsupervised fashion. We propose a hinge loss in a regularized risk minimization framework that maximizes the margin by which each sequence is classified as "normal" rather than "abnormal." This allows our model to accept most (but not all) of the training data as normal, yet keeps the solution space tight. Experimental results on a number of real-world datasets show our model outperforming several baselines. We also report an exploratory study on detecting abnormal organizational behavior in enterprise social networks.
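As a rough illustration of the one-class hinge-loss idea, stripped of the temporal CRF structure that is the paper's actual contribution, one can push the score of every training point above a fixed margin while regularizing the weights; the margin, step size, and synthetic data below are assumptions.

```python
import numpy as np

# One-class hinge loss on plain feature vectors (illustration only):
#   min_w  (lam/2)||w||^2 + (1/m) sum_i max(0, rho - <w, x_i>)
# Most training points end up scoring above rho ("normal"); low scores
# at test time are flagged as anomalies.
rng = np.random.default_rng(0)
X = rng.normal(loc=2.0, size=(300, 4))   # one-class training data (assumed)
lam, rho, eta, m = 0.1, 1.0, 0.1, len(X)

w = np.zeros(X.shape[1])
for t in range(5000):
    i = rng.integers(m)
    # subgradient of the sampled hinge term plus the regularizer
    g = lam * w - (X[i] if X[i] @ w < rho else 0.0)
    w -= eta / np.sqrt(t + 1) * g

normal, odd = np.full(4, 2.0), np.full(4, -2.0)
print("scores (higher = more normal):", normal @ w, odd @ w)
```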
Convex Optimization for Big Data
This article reviews recent advances in convex optimization algorithms for
Big Data, which aim to reduce the computational, storage, and communications
bottlenecks. We provide an overview of this emerging field, describe
contemporary approximation techniques like first-order methods and
randomization for scalability, and survey the important role of parallel and
distributed computation. The new Big Data algorithms are based on surprisingly
simple principles and attain staggering accelerations even on classical
problems.
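Two of the surveyed ingredients, first-order updates and randomization, are already visible in randomized coordinate descent for least squares, sketched below on assumed toy data; each step touches a single column of the data matrix, which is what makes such methods attractive at scale.

```python
import numpy as np

# Randomized coordinate descent for min_x (1/2)||Ax - b||^2:
# each iteration is a cheap first-order update of one random coordinate.
rng = np.random.default_rng(0)
n, d = 500, 50
A = rng.normal(size=(n, d))
b = A @ rng.normal(size=d)

x = np.zeros(d)
r = A @ x - b                    # maintained residual, O(n) per update
col_sq = (A ** 2).sum(axis=0)    # per-coordinate curvature constants
for _ in range(20 * d):
    j = rng.integers(d)
    step = (A[:, j] @ r) / col_sq[j]   # exact minimization along coordinate j
    x[j] -= step
    r -= step * A[:, j]

print("residual norm:", np.linalg.norm(A @ x - b))
```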
Efficient Stochastic Programming in Julia
We present StochasticPrograms.jl, a user-friendly and powerful open-source
framework for stochastic programming written in the Julia language. The
framework includes both modeling tools and structure-exploiting optimization
algorithms. Stochastic programming models can be efficiently formulated using
expressive syntax and models can be instantiated, inspected, and analyzed
interactively. The framework scales seamlessly to distributed environments.
Small instances of a model can be run locally to ensure correctness, while
larger instances are automatically distributed in a memory-efficient way onto
supercomputers or clouds and solved using parallel optimization algorithms.
These structure-exploiting solvers are based on variations of the classical
L-shaped and progressive-hedging algorithms. We provide a concise mathematical
background for the various tools and constructs available in the framework,
along with code listings exemplifying their usage. Both software innovations
related to the implementation of the framework and algorithmic innovations
related to the structured solvers are highlighted. We conclude by demonstrating
strong scaling properties of the distributed algorithms on numerical benchmarks
in a multi-node setup.
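To make the progressive-hedging iteration concrete, here is a minimal scenario-decomposition sketch (in Python rather than Julia) on a toy quadratic problem whose subproblems have closed-form solutions; the penalty parameter and scenario values are assumptions, and the framework's structured solvers are of course far more general.

```python
import numpy as np

# Progressive hedging for min_x E[(x - xi)^2] over scenarios xi_s:
# each scenario solves its own penalized subproblem, then the scenario
# solutions are averaged and dual weights push them toward consensus.
xi = np.array([1.0, 2.0, 6.0, 7.0])    # equally likely scenarios (assumed)
rho = 1.0                               # assumed penalty parameter
w = np.zeros_like(xi)                   # dual weights, one per scenario
x_bar = xi.mean()

for it in range(100):
    # scenario subproblem: argmin_x (x - xi_s)^2 + w_s*x + (rho/2)(x - x_bar)^2,
    # solved in closed form by setting the derivative to zero
    x = (2 * xi - w + rho * x_bar) / (2 + rho)
    x_bar = x.mean()
    w += rho * (x - x_bar)              # dual update toward consensus

print("consensus decision:", x_bar)    # converges to mean(xi) = 4.0
```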
Incremental proximal methods for large scale convex optimization
We consider the minimization of a sum ∑_{i=1}^{m} f_i(x) consisting of a large
number of convex component functions f_i. For this problem, incremental methods
consisting of gradient or subgradient iterations applied to single components have
proved very effective. We propose new incremental methods, consisting of proximal
iterations applied to single components, as well as combinations of gradient, subgradient,
and proximal iterations. We provide a convergence and rate of convergence
analysis of a variety of such methods, including some that involve randomization in
the selection of components. We also discuss applications in a few contexts, including
signal processing and inference/machine learning.
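A minimal sketch of the incremental proximal iteration, using components f_i(x) = (1/2)(x - a_i)^2 whose proximal operator has a closed form; the data, step-size schedule, and iteration count are assumptions.

```python
import numpy as np

# Incremental proximal method for min_x sum_i f_i(x) with
# f_i(x) = (1/2)(x - a_i)^2.  Each iteration applies the proximal
# operator of a single randomly selected component:
#   x_{k+1} = prox_{alpha f_i}(x_k)
#           = argmin_z f_i(z) + (1/(2*alpha)) (z - x_k)^2
#           = (x_k + alpha * a_i) / (1 + alpha)      (closed form here)
rng = np.random.default_rng(0)
a = rng.normal(size=1000)     # component data; the minimizer is mean(a)
x = 0.0

for k in range(50000):
    i = rng.integers(len(a))
    alpha = 10.0 / (k + 1)    # diminishing stepsize, as in the analysis
    x = (x + alpha * a[i]) / (1 + alpha)

print(f"x = {x:.4f}, mean(a) = {a.mean():.4f}")
```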