62 research outputs found
Exponentiated Subgradient Algorithm for Online Optimization under the Random Permutation Model
Online optimization problems arise in many resource allocation tasks, where
the future demands for each resource and the associated utility functions
change over time and are not known a priori, yet resources need to be allocated
at every point in time despite the future uncertainty. In this paper, we
consider online optimization problems with general concave utilities. We modify
and extend an online optimization algorithm proposed by Devanur et al. for
linear programming to this general setting. The model we use for the arrival of
the utilities and demands is known as the random permutation model, where a
fixed collection of utilities and demands are presented to the algorithm in
random order. We prove that under this model the algorithm achieves a
competitive ratio of under a near-optimal assumption that the
bid to budget ratio is , where
is the number of resources, while enjoying a significantly lower computational
cost than the optimal algorithm proposed by Kesselheim et al. We draw a
connection between the proposed algorithm and subgradient methods used in
convex optimization. In addition, we present numerical experiments that
demonstrate the performance and speed of this algorithm in comparison to
existing algorithms.
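The connection to subgradient methods can be made concrete with a small sketch. The update below is a generic exponentiated (multiplicative) subgradient step on dual resource prices; the function name and the per-round `budgets`/`consumption` variables are illustrative assumptions, not the paper's exact algorithm:

```python
import numpy as np

def exponentiated_subgradient_step(prices, consumption, budgets, eta):
    """One multiplicative (exponentiated-gradient) update of dual prices.

    prices      : current nonnegative price per resource
    consumption : resources consumed by the current request
    budgets     : per-round budget available for each resource
    eta         : step size

    The subgradient of the dual is (consumption - budgets): prices rise
    on over-used resources and decay on under-used ones.
    """
    return prices * np.exp(eta * (consumption - budgets))

# Toy run: resource 0 is over budget, resource 1 is under budget.
prices = np.array([1.0, 1.0])
updated = exponentiated_subgradient_step(
    prices,
    consumption=np.array([2.0, 0.0]),
    budgets=np.array([1.0, 1.0]),
    eta=0.1,
)
```

Because the update is multiplicative, prices stay nonnegative automatically, which is what makes this family of methods attractive for resource-allocation duals.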
Non-Uniform Stochastic Average Gradient Method for Training Conditional Random Fields
We apply stochastic average gradient (SAG) algorithms for training
conditional random fields (CRFs). We describe a practical implementation that
uses structure in the CRF gradient to reduce the memory requirement of this
linearly-convergent stochastic gradient method, propose a non-uniform sampling
scheme that substantially improves practical performance, and analyze the rate
of convergence of the SAGA variant under non-uniform sampling. Our experimental
results reveal that our method often significantly outperforms existing methods
in terms of the training objective, and performs as well or better than
optimally-tuned stochastic gradient methods in terms of test error. Comment: AISTATS 2015, 24 pages.
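To illustrate the non-uniform sampling idea, here is a minimal SAGA sketch on a least-squares toy problem (not the CRF setting of the paper), sampling each example with probability proportional to its Lipschitz constant and correcting with an importance weight so the update stays unbiased. All names are illustrative:

```python
import numpy as np

def saga_nonuniform(A, b, steps=5000, seed=0):
    """SAGA on the losses 0.5*(a_i.x - b_i)^2 with sampling probability
    proportional to L_i = ||a_i||^2.

    The gradient table stores one past gradient per example; non-uniform
    sampling requires the 1/(n * p_j) importance weight so that the
    expected update direction equals the full mean gradient.
    """
    rng = np.random.default_rng(seed)
    n, d = A.shape
    L = np.sum(A * A, axis=1)            # per-example Lipschitz constants
    p = L / L.sum()                      # non-uniform sampling distribution
    eta = 1.0 / (3.0 * L.max())          # conservative step size
    x = np.zeros(d)
    table = (A @ x - b)[:, None] * A     # stored gradients, one per example
    table_mean = table.mean(axis=0)
    for _ in range(steps):
        j = rng.choice(n, p=p)
        g_new = (A[j] @ x - b[j]) * A[j]
        # unbiased SAGA direction with importance weight 1/(n p_j)
        direction = (g_new - table[j]) / (n * p[j]) + table_mean
        x -= eta * direction
        table_mean += (g_new - table[j]) / n
        table[j] = g_new
    return x

# Toy consistent system: the minimizer is exactly [1, -2].
rng0 = np.random.default_rng(1)
A = rng0.normal(size=(6, 2))
b = A @ np.array([1.0, -2.0])
x_hat = saga_nonuniform(A, b)
```

The paper's CRF implementation additionally exploits structure in the gradient to shrink the memory of the per-example table; the dense table above is the generic version.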
Inverse Optimization for Routing Problems
We propose a method for learning decision-makers' behavior in routing
problems using Inverse Optimization (IO). The IO framework falls into the
supervised learning category and builds on the premise that the target behavior
is an optimizer of an unknown cost function. This cost function is to be
learned through historical data, and in the context of routing problems, can be
interpreted as the routing preferences of the decision-makers. In this view,
the main contributions of this study are to propose an IO methodology with a
hypothesis function, loss function, and stochastic first-order algorithm
tailored to routing problems. We further test our IO approach in the Amazon
Last Mile Routing Research Challenge, where the goal is to learn models that
replicate the routing preferences of human drivers, using thousands of
real-world routing examples. Our final IO-learned routing model achieves a
score that ranks 2nd among the 48 models that qualified for the final
round of the challenge. Our results showcase the flexibility and real-world
potential of the proposed IO methodology for learning from decision-makers'
decisions in routing problems.
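The IO premise can be sketched in a few lines. If the observed route minimizes an unknown linear cost, one stochastic first-order step moves the cost vector along a subgradient of a suboptimality loss. The candidate-route setup and names below are hypothetical, not the challenge's actual hypothesis or loss function:

```python
import numpy as np

def io_subgradient_update(theta, x_obs, candidates, eta):
    """One stochastic first-order step for an inverse-optimization loss.

    The suboptimality loss is theta.x_obs - min_x theta.x over candidate
    routes; its subgradient is x_obs - x_star, where x_star is the route
    that is cheapest under the current cost vector theta.
    """
    x_star = min(candidates, key=lambda x: theta @ x)
    return theta - eta * (x_obs - x_star)

# Toy instance: three candidate routes described by feature vectors
# (e.g. mileage per road type); the observed driver prefers route 0.
routes = [np.array([1.0, 0.0]), np.array([0.0, 1.0]), np.array([0.5, 0.5])]
theta = np.array([1.0, 0.2])   # initial costs make route 1 look cheapest
for _ in range(50):
    theta = io_subgradient_update(theta, x_obs=routes[0],
                                  candidates=routes, eta=0.1)
```

After a few updates the learned costs make the observed route (weakly) optimal, at which point the subgradient vanishes and the iteration is stationary.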
Online Matrix Completion with Side Information
We give an online algorithm and prove novel mistake and regret bounds for
online binary matrix completion with side information. The mistake bounds we
prove are of the form . The term is
analogous to the usual margin term in SVM (perceptron) bounds. More
specifically, if we assume that there is some factorization of the underlying
matrix into where the rows of are interpreted
as "classifiers" in and the rows of as "instances" in
, then is the maximum (normalized) margin over all
factorizations consistent with the observed matrix. The
quasi-dimension term measures the quality of side information. In the
presence of vacuous side information, . However, if the side
information is predictive of the underlying factorization of the matrix, then
in an ideal case, where is the number of distinct row
factors and is the number of distinct column factors. We additionally
provide a generalization of our algorithm to the inductive setting. In this
setting, we provide an example where the side information is not directly
specified in advance. For this example, the quasi-dimension is now bounded
by
Graph Matching via convex relaxation to the simplex
This paper addresses the Graph Matching problem, which consists of finding
the best possible alignment between two input graphs, and has many applications
in computer vision, network deanonymization and protein alignment. A common
approach to tackle this problem is through convex relaxations of the NP-hard
Quadratic Assignment Problem (QAP).
Here, we introduce a new convex relaxation onto the unit simplex and develop
an efficient mirror descent scheme with closed-form iterations for solving this
problem. Under the correlated Gaussian Wigner model, we show that the simplex
relaxation admits a unique solution with high probability. In the noiseless
case, this is shown to imply exact recovery of the ground truth permutation.
Additionally, we establish a novel sufficiency condition for the input matrix
in standard greedy rounding methods, which is less restrictive than the
commonly used 'diagonal dominance' condition. We use this condition to show
exact one-step recovery of the ground truth (holding almost surely) via the
mirror descent scheme, in the noiseless setting. We also use this condition to
obtain significantly improved conditions for the GRAMPA algorithm [Fan et al.
2019] in the noiseless setting.
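The closed-form iteration enabled by the simplex constraint can be illustrated with entropic mirror descent, under which each step is a multiplicative update followed by renormalization, so every iterate stays feasible without a projection. The sketch below uses a toy linear objective rather than the paper's QAP relaxation:

```python
import numpy as np

def mirror_descent_simplex(grad, x0, eta, steps):
    """Entropic mirror descent on the unit simplex.

    With the negative-entropy mirror map the update has the closed form
    x <- x * exp(-eta * grad(x)), renormalized to sum to one, so no
    explicit projection onto the simplex is needed.
    """
    x = x0.copy()
    for _ in range(steps):
        x = x * np.exp(-eta * grad(x))
        x /= x.sum()
    return x

# Toy objective f(x) = c . x: the minimizer over the simplex is the
# vertex corresponding to the smallest cost entry.
c = np.array([3.0, 1.0, 2.0])
x = mirror_descent_simplex(lambda x: c, np.full(3, 1.0 / 3.0),
                           eta=0.5, steps=100)
```

For a linear objective the iterates concentrate exponentially fast on the best vertex, which is the mechanism a greedy rounding step can then exploit.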
Bundle methods for regularized risk minimization with applications to robust learning
Supervised learning in general, and regularized risk minimization in particular, is about solving an optimization problem that is jointly defined by a performance measure and a set of labeled training examples. The outcome of learning, a model, is then used mainly for predicting the labels of unlabeled examples in the testing environment. In real-world scenarios, a typical learning process often involves solving a sequence of similar problems with different parameters before a final model is identified. For learning to be successful, the final model must be produced in a timely manner, and the model should be robust to (mild) irregularities in the testing environment. The purpose of this thesis is to investigate ways to speed up the learning process and improve the robustness of the learned model. We first develop a batch convex optimization solver specialized to regularized risk minimization, based on standard bundle methods. The solver inherits two main properties of the standard bundle methods. First, it is capable of solving both differentiable and non-differentiable problems, so its implementation can be reused for different tasks with minimal modification. Second, the optimization is easily amenable to parallel and distributed computation settings; this makes the solver highly scalable in the number of training examples. Unlike the standard bundle methods, however, the solver does not have extra parameters that need careful tuning. Furthermore, we prove that the solver has a faster convergence rate. The solver is also very efficient at computing approximate regularization paths and at model selection. We also present a convex risk formulation for incorporating invariances and prior knowledge into the learning problem. This formulation generalizes many existing approaches for robust learning in the setting of insufficient or noisy training examples and covariate shift.
Lastly, we extend a non-convex risk formulation for binary classification to structured prediction. Empirical results show that the model obtained with this risk formulation is robust to outliers in the training examples.
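As a rough illustration of the bundle (cutting-plane) idea behind the solver, the sketch below accumulates affine lower bounds on a 1-D regularized risk and re-minimizes the regularizer plus the pointwise maximum of the cuts. Solving the master problem by grid search is a simplification for illustration only; practical bundle solvers solve it via a small quadratic program:

```python
import numpy as np

def bundle_method_1d(risk, risk_grad, lam=1.0, w0=0.0, iters=20):
    """Cutting-plane solver for min_w lam/2 * w^2 + risk(w), w scalar.

    Each iteration adds the affine lower bound (cut)
        risk(w) >= risk(w_t) + risk_grad(w_t) * (w - w_t)
    and minimizes the regularizer plus the max of all cuts. For this
    1-D sketch the master problem is minimized over a dense grid.
    """
    cuts = []                                    # (slope, intercept) pairs
    w = w0
    grid = np.linspace(-10.0, 10.0, 20001)
    for _ in range(iters):
        g = risk_grad(w)
        cuts.append((g, risk(w) - g * w))
        lower = np.max([a * grid + b for a, b in cuts], axis=0)
        master = 0.5 * lam * grid ** 2 + lower
        w = grid[np.argmin(master)]
    return w

# Toy regularized risk: hinge loss of one example, risk(w) = max(0, 1 - w);
# the true minimizer of 0.5*w^2 + max(0, 1 - w) is w = 1.
w_star = bundle_method_1d(lambda w: max(0.0, 1.0 - w),
                          lambda w: -1.0 if w < 1.0 else 0.0, lam=1.0)
```

The same cut-collection mechanism works whether or not the risk is differentiable, which is the reuse property the thesis highlights.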