Generalized Bundle Methods
We study a class of generalized bundle methods for which the stabilizing term can be any closed convex function satisfying certain properties. This setting covers several algorithms from the literature that have so far been regarded as distinct. Under different hypotheses on the stabilizing term and/or the function to be minimized, we prove finite termination, asymptotic convergence, and finite convergence to an optimal point, with or without limits on the number of serious steps and/or requiring the proximal parameter to go to infinity. The convergence proofs leave a high degree of freedom in the crucial implementation features of the algorithm, i.e., the management of the bundle of subgradients (β-strategy) and of the proximal parameter (t-strategy). We extensively exploit a dual view of bundle methods, which are shown to be a dual ascent approach to a nonlinear problem in an appropriate dual space, where nonlinear subproblems are approximately solved at each step with an inner linearization approach. This allows us to precisely characterize the changes in the subproblems during the serious steps, since the dual problem is not tied to the local concept of ε-subdifferential. For some of the proofs, a generalization of inf-compactness, called *-compactness, is required; this concept is related to that of asymptotically well-behaved functions.
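As a rough sketch of the setting (our notation, not necessarily the paper's): given the cutting-plane model \(\hat f_{\mathcal B}\) built from the bundle \(\mathcal B\) of subgradient information and a closed convex stabilizing term \(\mathcal D_t\), the generalized stabilized master problem at the current point \(\bar x\) reads
\[
d^* \in \operatorname*{argmin}_{d} \; \hat f_{\mathcal B}(\bar x + d) + \mathcal D_t(d),
\qquad
\hat f_{\mathcal B}(x) = \max_{i \in \mathcal B} \{ f(x_i) + \langle g_i, x - x_i \rangle \};
\]
choosing \(\mathcal D_t(d) = \tfrac{1}{2t}\|d\|_2^2\) recovers the classical proximal bundle method, while indicator functions of trust regions recover trust-region-like variants.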
Standard Bundle Methods: Untrusted Models and Duality
We review the basic ideas underlying the vast family of algorithms for nonsmooth convex optimization known as "bundle methods". In a nutshell, these approaches are based on constructing models of the function, but the lack of continuity of first-order information implies that these models cannot be trusted, not even close to an optimum. Therefore, many different forms of stabilization have been proposed to try to avoid being led to areas where the model is so inaccurate as to result in almost useless steps. In the development of these methods, duality arguments are useful, if not outright necessary, to better analyze the behaviour of the algorithms. Also, in many relevant applications the function at hand is itself a dual one, so that duality allows algorithmic concepts and results to be mapped back into a "primal space" where they can be exploited; in turn, structure in that space can be exploited to improve the algorithms' behaviour, e.g., by developing better models. We present an updated picture of the many developments around the basic idea along at least three different axes: the form of the stabilization, the form of the model, and the approximate evaluation of the function.
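For concreteness, a minimal sketch of the duality alluded to above, in the standard proximal case (our notation): with linearization errors \(\alpha_i = f(\bar x) - f(x_i) - \langle g_i, \bar x - x_i \rangle \ge 0\), the stabilized master problem \(\min_d \hat f_{\mathcal B}(\bar x + d) + \tfrac{1}{2t}\|d\|_2^2\) has the dual quadratic program
\[
\min_{\theta \in \Delta} \; \frac{t}{2} \Big\| \sum_{i \in \mathcal B} \theta_i g_i \Big\|_2^2 + \sum_{i \in \mathcal B} \theta_i \alpha_i,
\qquad
d^* = -t \sum_{i \in \mathcal B} \theta_i^* g_i,
\]
where \(\Delta\) is the unit simplex; it is this dual object that, when \(f\) is itself a dual function, can be read in the "primal space" mentioned above.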
Stabilized Benders methods for large-scale combinatorial optimization, with application to data privacy
The Cell Suppression Problem (CSP) is a challenging Mixed-Integer Linear Problem arising in statistical tabular data protection. Medium-sized instances of CSP involve thousands of binary variables and millions of continuous variables and constraints. However, CSP has the typical structure that allows application of the renowned Benders' decomposition method: once the "complicating" binary variables are fixed, the problem decomposes into a large set of linear subproblems on the "easy" continuous ones. This allows projecting away the easy variables, reducing the problem to a master problem in the complicating ones where the value functions of the subproblems are approximated with the standard cutting-plane approach. Hence, Benders' decomposition suffers from the same drawbacks as the cutting-plane method, i.e., oscillation and slow convergence, compounded with the fact that the master problem is combinatorial. To overcome these drawbacks we present a stabilized Benders decomposition whose master is restricted to a neighborhood of successful candidates by local branching constraints, which are dynamically adjusted, and even dropped, during the iterations. Our experiments with randomly generated and real-world CSP instances with up to 3600 binary variables, 90M continuous variables and 15M inequality constraints show that our approach is competitive with both the current state-of-the-art (cutting-plane-based) code for cell suppression and the Benders implementation in CPLEX 12.7. In some instances, stabilized Benders is able to quickly provide a very good solution in less than one minute, while the other approaches were not able to find any feasible solution in one hour.
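For illustration only (generic local branching, not necessarily the exact form used in the paper): given the incumbent binary vector \(\hat y\), the master problem can be restricted to a Hamming-distance neighborhood of radius \(k\) by the single linear constraint
\[
\sum_{j : \hat y_j = 1} (1 - y_j) + \sum_{j : \hat y_j = 0} y_j \le k;
\]
enlarging \(k\), or dropping the constraint altogether, corresponds to the dynamic adjustment of the neighborhood during the iterations.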
Bicriteria data compression
The advent of massive datasets (and the consequent design of high-performing distributed storage systems) has reignited the interest of the scientific and engineering community towards the design of lossless data compressors which achieve effective compression ratio and very efficient decompression speed. Lempel-Ziv's LZ77 algorithm is the de facto choice in this scenario because of its decompression speed and its flexibility in trading decompression speed versus compressed-space efficiency. Each of the existing implementations offers a trade-off between space occupancy and decompression speed, so software engineers have to content themselves with picking the one which comes closest to the requirements of the application at hand. Starting from these premises, and for the first time in the literature, we address in this paper the problem of trading optimally, and in a principled way, the consumption of these two resources by introducing the Bicriteria LZ77-Parsing problem, which formalizes in a principled way what data compressors have traditionally approached by means of heuristics. The goal is to determine an LZ77 parsing which minimizes the space occupancy in bits of the compressed file, provided that the decompression time is bounded by a fixed amount (or vice versa). This way, the software engineer can set their space (or time) requirements and then derive the LZ77 parsing which optimizes the decompression speed (or the space occupancy, respectively). We solve this problem efficiently in O(n log^2 n) time and optimal linear space within a small, additive approximation, by proving and deploying some specific structural properties of the weighted graph derived from the possible LZ77 parsings of the input file. A preliminary set of experiments shows that our novel proposal dominates all the highly engineered competitors, hence offering a win-win situation in theory and practice.
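A compact way to state the problem (our notation): let \(G\) be the graph whose vertices are the positions of the input file and whose edges are the admissible LZ77 phrases, each edge \(e\) carrying a space cost \(s(e)\) (bits of its encoding) and a time cost \(t(e)\) (cost of decompressing that phrase). The Bicriteria LZ77-Parsing problem then asks for a path \(\pi\) from the first to the last position achieving
\[
\min_{\pi} \sum_{e \in \pi} s(e) \quad \text{subject to} \quad \sum_{e \in \pi} t(e) \le T,
\]
i.e., a resource-constrained shortest path on the parsing graph, with the roles of space and time interchangeable.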
Optimization Methods: an Applications-Oriented Primer
Effectively sharing resources requires solving complex decision problems. This requires constructing a mathematical model of the underlying system, and then applying appropriate mathematical methods to find an optimal solution of the model, which is ultimately translated into actual decisions. The development of mathematical tools for solving optimization problems dates back to Newton and Leibniz, but it has tremendously accelerated since the advent of digital computers. Today, optimization is an interdisciplinary subject, lying at the interface between management science, computer science, mathematics and engineering. This chapter offers an introduction to the main theoretical and software tools that are nowadays available to practitioners to solve the kind of optimization problems that are most likely to be encountered in the context of this book. Using, as a case study, a simplified version of the bike sharing problem, we guide the reader through the discussion of modelling and algorithmic issues, concentrating on methods for solving optimization problems to proven optimality.
A Stabilized Structured Dantzig-Wolfe Decomposition Method
We discuss an algorithmic scheme, which we call the stabilized structured Dantzig-Wolfe decomposition method, for solving large-scale structured linear programs. It can be applied when the subproblem of the standard Dantzig-Wolfe approach admits an alternative master model amenable to column generation, other than the standard one in which there is a variable for each of the extreme points and extreme rays of the corresponding polyhedron. Stabilization is achieved by the same techniques developed for the standard Dantzig-Wolfe approach, and it is equally useful for improving performance, as shown by computational results obtained on an application to the multicommodity capacitated network design problem.
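For reference, a schematic version of the standard master mentioned above (our notation): with extreme points \(\{x_p\}\) and extreme rays \(\{r_q\}\) of the subproblem polyhedron, the classical Dantzig-Wolfe master is
\[
\min \; \sum_p (c^\top x_p)\, \lambda_p + \sum_q (c^\top r_q)\, \mu_q
\quad \text{s.t.} \quad
\sum_p (A x_p)\, \lambda_p + \sum_q (A r_q)\, \mu_q = b, \;\;
\sum_p \lambda_p = 1, \;\; \lambda, \mu \ge 0;
\]
the structured variant replaces this particular reformulation with any alternative description of the subproblem that can still be handled by column generation.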
Prim-based Support-Graph Preconditioners for Min-Cost Flow Problems
Support-graph preconditioners have been shown to be a valuable tool for the iterative solution, via a Preconditioned Conjugate Gradient method, of the KKT systems that must be solved at each iteration of an Interior Point algorithm for the solution of Min Cost Flow problems. These preconditioners extract a proper triangulated subgraph, with "large" weight, of the original graph: in practice, trees and Brother-Connected Trees (BCTs) of depth two have been shown to be the most computationally efficient families of subgraphs. In the literature, approximate versions of the Kruskal algorithm for maximum-weight spanning trees have most often been used for choosing the subgraphs; Prim-based approaches have been used for trees, but no comparison has ever been reported. We propose Prim-based heuristics for BCTs, which require nontrivial modifications w.r.t. the previously proposed Kruskal-based approaches, and present a computational comparison of the different approaches, which shows that Prim-based heuristics are most often preferable to Kruskal-based ones.
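As a minimal illustration of the Prim-based growing strategy (a generic maximum-weight spanning tree heuristic sketched under our own assumptions, not the actual BCT code of the paper):

import heapq

def prim_max_spanning_tree(n, adj, root=0):
    # Grow a maximum-weight spanning tree from `root`.
    # n   : number of nodes
    # adj : adjacency list, adj[u] = [(weight, v), ...]
    # Returns the list of tree edges (u, v, weight).
    in_tree = [False] * n
    in_tree[root] = True
    heap = [(-w, root, v) for (w, v) in adj[root]]  # max-heap via negated weights
    heapq.heapify(heap)
    tree = []
    while heap and len(tree) < n - 1:
        neg_w, u, v = heapq.heappop(heap)
        if in_tree[v]:
            continue
        in_tree[v] = True
        tree.append((u, v, -neg_w))
        for (w, z) in adj[v]:
            if not in_tree[z]:
                heapq.heappush(heap, (-w, v, z))
    return tree

In the preconditioning context, the edge weights would come from the Min Cost Flow graph, and the growth rule would be modified to produce depth-two BCTs rather than plain spanning trees.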
Incremental bundle methods using upper models
We propose a family of proximal bundle methods for minimizing sum-structured convex nondifferentiable functions which require two slightly uncommon assumptions that are satisfied in many relevant applications: Lipschitz continuity of the functions, and oracles which also produce upper estimates on the function values. In exchange, the methods: i) use upper models of the functions that allow estimating function values at points where the oracle has not been called; ii) provide the oracles with more information about when the function computation can be interrupted, possibly diminishing their cost; iii) allow skipping oracle calls entirely for some of the component functions, not only at "null steps" but also at "serious steps"; iv) provide explicit and reliable a posteriori estimates of the quality of the obtained solutions; v) work with all possible combinations of different assumptions on how the oracles deal with not being able to compute the function with arbitrary accuracy. We also discuss the introduction of constraints (or, more generally, of easy components) and the use of (partly) aggregated models.
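A possible way to picture the upper model in (i), relying only on the stated assumptions (our sketch): if each component \(f_k\) is \(L_k\)-Lipschitz and the oracle has returned upper estimates \(\overline f_k(x_i) \ge f_k(x_i)\) at the previously visited points \(x_i\), then
\[
f_k(x) \;\le\; \min_{i \in \mathcal B_k} \big\{ \overline f_k(x_i) + L_k \|x - x_i\| \big\}
\]
is a valid upper model of \(f_k\), which can be evaluated at a new iterate without calling the oracle of \(f_k\) and therefore supports skipping oracle calls at both null and serious steps.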
Dynamic smoothness parameter for fast gradient methods
We present and computationally evaluate a variant of the fast gradient method by Nesterov that is capable of exploiting information, even if approximate, about the optimal value of the problem. This information is available in some applications, among them the computation of bounds for hard integer programs. We show that dynamically changing the smoothness parameter of the algorithm using this information results in a better convergence profile of the algorithm in practice.
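To fix ideas (standard Nesterov smoothing, our notation; the paper's actual update rule may differ): the nonsmooth function is replaced by a smooth approximation \(f_\mu\) whose gradient is Lipschitz with constant proportional to \(1/\mu\), and reaching absolute accuracy \(\varepsilon\) classically requires choosing \(\mu\) of the order of \(\varepsilon\), e.g. \(\mu = \varepsilon / (2D)\) with \(D\) a bound on the prox-term. When an estimate \(\hat f^*\) of the optimal value is available, the a priori target \(\varepsilon\) can be replaced by the current estimated gap \(f(x_k) - \hat f^*\), so that \(\mu\) is large far from the optimum, where a coarse approximation suffices, and shrinks as the iterates approach it.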
QoS Routing with worst-case delay constraints: models, algorithms and performance analysis
In a network where weighted fair-queueing schedulers are used at each link, a flow is guaranteed an end-to-end worst-case delay which depends on the rate reserved for it at each link it traverses. Therefore, it is possible to compute resource-constrained paths that meet target delay constraints, and optimize some key performance metrics (e.g., minimize the overall reserved rate, maximize the remaining capacity at bottleneck links, etc.). Despite the large amount of literature that has appeared on weighted fair-queueing schedulers since the mid '90s, this has so far been done only for a single type of scheduler, probably because the complexity of solving the problem in general appeared forbidding. In this paper, we formulate and solve the optimal path computation and resource allocation problem for a broad category of weighted fair-queueing schedulers, from those emulating a Generalized Processor Sharing fluid server to variants of Deficit Round Robin. We classify schedulers according to their latency expressions, and show that a significant divide exists between those where routing a new flow affects the performance of existing flows, and those for which this does not happen. For the former, explicit admission control constraints are required to ensure that existing flows still meet their deadlines afterwards. However, despite this major difference and the differences among categories of schedulers, the problem can always be formulated as a Mixed-Integer Second-Order Cone Problem (MI-SOCP), and be solved to optimality in split-second times even in fairly large networks.
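As a concrete anchor for the latency expressions mentioned above (a standard network-calculus bound for rate-latency schedulers; the per-scheduler details differ): a flow shaped by a leaky bucket \((\sigma, \rho)\), with reserved rate \(r_i \ge \rho\) and scheduler latency \(\theta_i\) at each traversed link \(i\), has worst-case end-to-end delay
\[
D \;\le\; \frac{\sigma}{\min_i r_i} + \sum_{i \in \text{path}} \theta_i ;
\]
since \(\theta_i\) typically decreases with the reserved rate \(r_i\) (roughly \(L/r_i + L/C_i\) for GPS-emulating schedulers, with \(L\) the maximum packet size and \(C_i\) the link capacity), imposing \(D \le \delta\) while optimizing the reserved rates gives rise to the nonlinear constraints that the MI-SOCP formulation captures.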