580 research outputs found

Models and Matheuristics for Large-Scale Combinatorial Optimization Problems

Author: Gnägi Mario
Publication venue: Universität Bern

Combinatorial optimization deals with efficiently determining an optimal (or at least a good) decision among a finite set of alternatives. In business administration, such combinatorial optimization problems arise in, e.g., portfolio selection, project management, data analysis, and logistics. These optimization problems have in common that the set of alternatives becomes very large as the problem size increases, and therefore an exhaustive search of all alternatives may require a prohibitively long computation time. Moreover, due to their combinatorial nature, no closed-form solutions to these problems exist. In practice, a common approach to tackle combinatorial optimization problems is to formulate them as mathematical models and to solve them using a mathematical programming solver (cf., e.g., Bixby et al. 1999, Achterberg et al. 2020). For small-scale problem instances, the mathematical models comprise a manageable number of variables and constraints such that mathematical programming solvers are able to devise optimal solutions within a reasonable computation time. For large-scale problem instances, the number of variables and constraints becomes very large, which considerably extends the computation time required to find an optimal solution. Therefore, despite the continuously improving performance of mathematical programming solvers and computing hardware, the availability of mathematical models that are efficient in terms of the number of variables and constraints used is of crucial importance. Another frequently used approach to combinatorial optimization problems is the use of matheuristics. Matheuristics decompose the considered optimization problem into subproblems, which are then formulated as mathematical models and solved with the help of a mathematical programming solver.
Matheuristics are particularly suitable for situations where it is required to find a good, but not necessarily an optimal solution within a short computation time, since the speed of the solution process can be controlled by choosing an appropriate size of the subproblems. This thesis consists of three papers on large-scale combinatorial optimization problems. We consider a portfolio optimization problem in finance, a scheduling problem in project management, and a clustering problem in data analysis. For these problems, we present novel mathematical models that require a relatively small number of variables and constraints, and we develop matheuristics that are based on novel problem-decomposition strategies. In extensive computational experiments, the proposed models and matheuristics performed favorably compared to state-of-the-art models and solution approaches from the literature. In the first paper, we consider the problem of determining a portfolio for an enhanced index-tracking fund. Enhanced index-tracking funds aim to replicate the returns of a particular financial stock-market index as closely as possible while outperforming that index by a small positive excess return. Additionally, we consider various real-life constraints that may be imposed by investors, stock exchanges, or investment guidelines. Since enhanced index-tracking funds are particularly attractive to investors if the index comprises a large number of stocks and thus is well diversified, it is of particular interest to tackle large-scale problem instances. For this problem, we present two matheuristics that consist of a novel construction matheuristic, and two different improvement matheuristics that are based on the concepts of local branching (cf. Fischetti and Lodi 2003) and iterated greedy heuristics (cf., e.g., Ruiz and Stützle 2007). 
Moreover, both matheuristics are based on a novel mathematical model for which we provide insights that allow numerous redundant variables and constraints to be removed. We tested both matheuristics in a computational experiment on problem instances that are based on large stock-market indices with up to 9,427 constituents. It turns out that our matheuristics yield better portfolios than benchmark approaches in terms of out-of-sample risk-return characteristics. In the second paper, we consider the problem of scheduling a set of precedence-related project activities, each of which requires some time and scarce resources during its execution. For each activity, alternative execution modes are given, which differ in the duration and the resource requirements of the activity. We seek a start time and an execution mode for each activity, such that all precedence relationships are respected, the required amount of each resource does not exceed its prescribed capacity, and the project makespan is minimized. For this problem, we present two novel mathematical models, in which the number of variables remains constant when the range of the activities' durations and thus also the planning horizon is increased. Moreover, we enhance the performance of the proposed mathematical models by eliminating some symmetric solutions from the search space and by adding some redundant sequencing constraints for activities that cannot be processed in parallel. In a computational experiment based on instances consisting of activities with durations ranging from one up to 260 time units, the proposed models consistently outperformed all reference models from the literature. In the third paper, we consider the problem of grouping similar objects into clusters, where the similarity between a pair of objects is determined by a distance measure based on some features of the objects.
In addition, we consider constraints that impose a maximum capacity for the clusters, since the size of the clusters is often restricted in practical clustering applications. Furthermore, practical clustering applications are often characterized by a very large number of objects to be clustered. For this reason, we present a matheuristic based on novel problem-decomposition strategies that are specifically designed for large-scale problem instances. The proposed matheuristic comprises two phases. In the first phase, we decompose the considered problem into a series of generalized assignment problems, and in the second phase, we decompose the problem into subproblems that comprise groups of clusters only. In a computational experiment, we tested the proposed matheuristic on problem instances with up to 498,378 objects. The proposed matheuristic consistently outperformed the state-of-the-art approach on medium- and large-scale instances, while matching the performance for small-scale instances. Although we considered three specific optimization problems in this thesis, the proposed models and matheuristics can be adapted to related optimization problems with only minor modifications. Examples of such related optimization problems are the UCITS-constrained index-tracking problem (cf., e.g., Strub and Trautmann 2019), which consists of determining the portfolio of an investment fund that must comply with regulatory restrictions imposed by the European Union, the multi-site resource-constrained project scheduling problem (cf., e.g., Laurent et al. 2017), which comprises the scheduling of a set of project activities that can be executed at alternative sites, or constrained clustering problems with must-link and cannot-link constraints (cf., e.g., González-Almagro et al. 2020).

BORIS Theses

Randomized heuristics for the Capacitated Clustering Problem

Author: Campos Vicente; Landa-Silva Dario; Marti Rafael; Martinez-Gavara Anna
Publication venue: Elsevier BV
Publication date: 01/11/2017

In this paper, we investigate the adaptation of the Greedy Randomized Adaptive Search Procedure (GRASP) and Iterated Greedy methodologies to the Capacitated Clustering Problem (CCP). In particular, we focus on the effect of the balance between randomization and greediness on the performance of these multi-start heuristic search methods when solving this NP-hard problem. The former is a memory-less approach that constructs independent solutions, while the latter is a memory-based method that constructs linked solutions, obtained by partially rebuilding previous ones. Both are based on the combination of greediness and randomization in the constructive process, and are coupled with a subsequent local search phase. We propose these two multi-start methods and their hybridization and compare their performance on the CCP. Additionally, we propose a heuristic based on the mathematical programming formulation of this problem, which constitutes a so-called matheuristic. We also implement a classical randomized method based on simulated annealing to complete the picture of randomized heuristics. Our extensive experimentation reveals that Iterated Greedy performs better than GRASP on this problem, and improved outcomes are obtained when both methods are hybridized and coupled with the matheuristic. In fact, the hybridization is able to outperform the best approaches previously published for the CCP. This study shows that memory-based construction is an effective mechanism within multi-start heuristic search techniques.
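The contrast between building independent solutions and partially rebuilding a previous one can be illustrated with a minimal Python sketch of the iterated greedy idea. Everything in it is an assumption made for illustration and not the authors' implementation: the function names, the simplified objective (total pairwise benefit inside clusters, with only an upper capacity per cluster), the parameters `alpha` and `destroy`, and the omission of the local search phase.

```python
import random

def total_benefit(assign, benefit):
    """Sum of pairwise benefits over pairs placed in the same cluster."""
    n = len(assign)
    return sum(benefit[i][j]
               for i in range(n) for j in range(i + 1, n)
               if assign[i] is not None and assign[i] == assign[j])

def greedy_insert(assign, free, benefit, weight, cap, k, alpha, rng):
    """Insert the objects in `free` one by one via a restricted candidate list."""
    load = [0] * k
    for i, c in enumerate(assign):
        if c is not None:
            load[c] += weight[i]
    for i in free:
        moves = []
        for c in range(k):
            if load[c] + weight[i] <= cap:
                # Gain = benefit between object i and the objects already in c.
                gain = sum(benefit[i][j] for j, cj in enumerate(assign) if cj == c)
                moves.append((gain, c))
        if not moves:
            return False  # no cluster has room for object i
        moves.sort(reverse=True)
        best, worst = moves[0][0], moves[-1][0]
        # alpha = 0 is purely greedy, alpha = 1 is purely random.
        threshold = best - alpha * (best - worst)
        chosen = rng.choice([c for gain, c in moves if gain >= threshold])
        assign[i] = chosen
        load[chosen] += weight[i]
    return True

def iterated_greedy(benefit, weight, cap, k, iters=200, destroy=0.3,
                    alpha=0.2, seed=0):
    """Repeatedly remove a fraction of the objects and reinsert them greedily."""
    rng = random.Random(seed)
    n = len(weight)
    order = list(range(n))
    rng.shuffle(order)
    current = [None] * n
    if not greedy_insert(current, order, benefit, weight, cap, k, alpha, rng):
        raise ValueError("construction failed; capacities too tight for this sketch")
    value = total_benefit(current, benefit)
    for _ in range(iters):
        cand = current[:]
        removed = rng.sample(range(n), max(1, int(destroy * n)))
        for i in removed:
            cand[i] = None
        if greedy_insert(cand, removed, benefit, weight, cap, k, alpha, rng):
            cand_value = total_benefit(cand, benefit)
            if cand_value >= value:  # keep the rebuilt solution if not worse
                current, value = cand, cand_value
    return current, value
```

A GRASP-style variant under the same assumptions would call `greedy_insert` on an empty assignment in every iteration instead of reusing `current`, which is the memory-less behaviour the abstract describes.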

Nottingham ePrints

Nottingham eTheses

Repository@Nottingham

Data-Collection for the Sloan Digital Sky Survey: a Network-Flow Heuristic

Author: Agarwal; Bar-Ilan; Brönnimann; Cornuéjols; Crease; Erkut; F.Miller Maley; Fisher; Francis; Francis; Frederickson; Goldberg; Gurevich; He; Johnson; Johnson; Love; Lovász; Maass; Marchetti-Spaccamela; Megiddo; Megiddo; Megiddo; Neal Young; Nemhauser; Papadimitriou; Papadimitriou; Plotkin; Robert Lupton; Savani; Shetty; Sridharan
Publication venue: Elsevier BV
Publication date: 17/05/2002

The goal of the Sloan Digital Sky Survey is "to map in detail one-quarter of the entire sky, determining the positions and absolute brightnesses of more than 100 million celestial objects". The survey will be performed by taking "snapshots" through a large telescope. Each snapshot can capture up to 600 objects from a small circle of the sky. This paper describes the design and implementation of the algorithm that is being used to determine the snapshots so as to minimize their number. The problem is NP-hard in general; the algorithm described is a heuristic, based on Lagrangian relaxation and min-cost network flow. It gets within 5-15% of a naive lower bound, whereas using a "uniform" cover only gets within 25-35%. Comment: proceedings version appeared in ACM-SIAM Symposium on Discrete Algorithms (1998).

arXiv.org e-Print Archive

Crossref

Practical Minimum Cut Algorithms

Author: Henzinger Monika; Noe Alexander; Schulz Christian; Strash Darren
Publication venue: Association for Computing Machinery (ACM)
Publication date: 27/08/2017

The minimum cut problem for an undirected edge-weighted graph asks us to divide its set of nodes into two blocks while minimizing the weight sum of the cut edges. Here, we introduce a linear-time algorithm to compute near-minimum cuts. Our algorithm is based on cluster contraction using label propagation and Padberg and Rinaldi's contraction heuristics [SIAM Review, 1991]. We give both sequential and shared-memory parallel implementations of our algorithm. Extensive experiments on both real-world and generated instances show that our algorithm finds the optimal cut on nearly all instances significantly faster than other state-of-the-art algorithms while our error rate is lower than that of other heuristic algorithms. In addition, our parallel algorithm shows good scalability.
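To make the problem statement and the role of contraction concrete, the following Python sketch computes cut weights, finds a minimum cut by brute force on a tiny graph, and merges two nodes. It is an illustration under stated assumptions only: the function names and the example graph are hypothetical, the brute-force search is exponential, and neither label propagation nor the Padberg-Rinaldi tests are implemented.

```python
from itertools import combinations

def cut_weight(edges, block):
    """Total weight of edges with exactly one endpoint in `block`."""
    return sum(w for u, v, w in edges if (u in block) != (v in block))

def brute_force_min_cut(nodes, edges):
    """Try every proper subset containing the first node (tiny graphs only)."""
    nodes = list(nodes)
    first, rest = nodes[0], nodes[1:]
    best = None
    for r in range(len(rest)):  # always leave at least one node outside
        for extra in combinations(rest, r):
            block = {first, *extra}
            w = cut_weight(edges, block)
            if best is None or w < best[0]:
                best = (w, block)
    return best

def contract(edges, u, v):
    """Merge node v into u, dropping self-loops and summing parallel edges."""
    merged = {}
    for a, b, w in edges:
        a = u if a == v else a
        b = u if b == v else b
        if a != b:
            key = (min(a, b), max(a, b))
            merged[key] = merged.get(key, 0) + w
    return [(a, b, w) for (a, b), w in merged.items()]

# Hypothetical example: two heavy triangles joined by one light edge.
edges = [(0, 1, 3), (1, 2, 3), (0, 2, 3),
         (3, 4, 3), (4, 5, 3), (3, 5, 3),
         (2, 3, 1)]
weight, block = brute_force_min_cut(range(6), edges)
smaller = contract(edges, 0, 1)  # nodes 0 and 1 lie on the same side of the cut
```

In this example the minimum cut has weight 1 and separates the two triangles. Contracting nodes 0 and 1, which lie on the same side of that cut, yields a smaller graph with the same minimum cut value, and this is the property that contraction-based approaches rely on.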

arXiv.org e-Print Archive

Crossref

The Unreasonable Success of Local Search: Geometric Optimization

Author: Cohen-Addad Vincent; Mathieu Claire
Publication date: 09/09/2015

What is the effectiveness of local search algorithms for geometric problems in the plane? We prove that local search with neighborhoods of magnitude $1/\epsilon^c$ is an approximation scheme for the following problems in the Euclidean plane: TSP with random inputs, Steiner tree with random inputs, facility location (with worst-case inputs), and bicriteria $k$-median (also with worst-case inputs). The randomness assumption is necessary for TSP.
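As a small, familiar instance of local search for Euclidean TSP, the Python sketch below runs 2-opt, which exchanges two tour edges at a time until no exchange shortens the tour. This is an assumption chosen for illustration: it uses a fixed neighborhood of size two instead of the larger neighborhoods analyzed in the paper, it carries none of the paper's guarantees, and the function names are hypothetical.

```python
import math

def tour_length(points, tour):
    """Length of the closed tour visiting `points` in the order given by `tour`."""
    n = len(tour)
    return sum(math.dist(points[tour[i]], points[tour[(i + 1) % n]])
               for i in range(n))

def two_opt(points, tour):
    """Apply improving 2-exchange moves until the tour is locally optimal."""
    tour = tour[:]
    n = len(tour)
    improved = True
    while improved:
        improved = False
        for i in range(n - 1):
            for j in range(i + 2, n):
                if i == 0 and j == n - 1:
                    continue  # these two tour edges share a city
                a, b = points[tour[i]], points[tour[i + 1]]
                c, d = points[tour[j]], points[tour[(j + 1) % n]]
                # Change in length if edges (a,b),(c,d) become (a,c),(b,d).
                delta = (math.dist(a, c) + math.dist(b, d)
                         - math.dist(a, b) - math.dist(c, d))
                if delta < -1e-12:
                    tour[i + 1:j + 1] = reversed(tour[i + 1:j + 1])
                    improved = True
    return tour

# Unit square visited in a self-crossing order; 2-opt removes the crossing.
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
fixed = two_opt(square, [0, 2, 1, 3])
```

On the unit square, the crossing tour of length 2 + 2√2 is repaired to the perimeter tour of length 4 after a single exchange.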

arXiv.org e-Print Archive

CiteSeerX