32 research outputs found

    From Massive Parallelization to Quantum Computing: Seven Novel Approaches to Query Optimization

    Get PDF
    The goal of query optimization is to map a declarative query (describing the data to generate) to a query plan (describing how to generate that data) with optimal execution cost. Query optimization is required to support declarative query interfaces. It is a core problem in the area of database systems and has received tremendous attention in the research community, starting with an initial publication in 1979. In this thesis, we revisit the query optimization problem, motivated by several developments that change the context of query optimization and are not yet reflected in prior literature. First, advances in query execution platforms and processing techniques have changed that context. Novel provisioning models and processing techniques such as Cloud computing, crowdsourcing, or approximate processing make it possible to trade between different execution cost metrics (e.g., execution time versus monetary execution fees in the case of Cloud computing). This makes it necessary to compare alternative execution plans according to multiple cost metrics during query optimization. While this is a common scenario nowadays, the literature on query optimization with multiple cost metrics (a generalization of the classical problem variant with one execution cost metric) is surprisingly sparse. Whereas prior methods take hours to optimize even moderately sized queries when considering multiple cost metrics, we propose a multitude of approaches that make query optimization in such scenarios practical. A second development that we address in this thesis is the availability of novel software and hardware platforms that can be exploited for optimization. We show that integer programming solvers, massively parallel clusters (which nowadays are commonly used for query execution), and adiabatic quantum annealers enable us to solve query optimization problem instances that are far beyond the capabilities of prior approaches. In summary, we propose seven novel approaches to query optimization that significantly increase the size of the problem instances that can be addressed (measured by the query size and by the number of considered execution cost metrics). These approaches fall into three broad categories: moving query optimization before run time to relax constraints on optimization time; trading optimization time for relaxed optimality guarantees (leading to approximation schemes, incremental algorithms, and randomized algorithms for query optimization with multiple cost metrics); and reducing optimization time by leveraging novel software and hardware platforms (integer programming solvers, massively parallel clusters, and adiabatic quantum annealers). These approaches are novel either because they address novel problem variants of query optimization introduced in this thesis, because they are the first of their kind for their respective problem variant (e.g., we propose the first randomized algorithm for query optimization with multiple cost metrics), or because they have never been used for optimization problems in the database domain (e.g., this is the first time that quantum computing is used to solve a database-specific optimization problem).
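
    The shift from one cost metric to several changes the nature of the optimizer's output: instead of a single cheapest plan, it must retain every plan that is not dominated on all metrics. The sketch below illustrates that Pareto-frontier idea on a handful of hypothetical plans scored on execution time and Cloud fees; plan names and cost vectors are invented for illustration, and this is not an algorithm from the thesis.

        # A minimal sketch of multi-metric plan comparison: with several cost
        # metrics there is in general no single cheapest plan, so the optimizer
        # keeps the Pareto-optimal ones.

        def dominates(a, b):
            """True if cost vector a is no worse than b everywhere, better somewhere."""
            return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

        def pareto_frontier(plans):
            """Keep the plans whose cost vector no other plan dominates."""
            return {name: cost for name, cost in plans.items()
                    if not any(dominates(other, cost)
                               for o, other in plans.items() if o != name)}

        # Hypothetical plans scored as (execution time in s, Cloud fee in $).
        plans = {"hash-join": (12.0, 0.40), "sort-merge": (9.0, 0.90),
                 "nested-loop": (30.0, 0.10), "broadcast": (9.5, 0.95)}
        print(pareto_frontier(plans))
        # "broadcast" drops out: "sort-merge" is both faster and cheaper.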

    On High-Performance Benders-Decomposition-Based Exact Methods with Application to Mixed-Integer and Stochastic Problems

    Get PDF
    ABSTRACT: Stochastic integer programming (SIP) combines the difficulty of uncertainty and non-convexity, and constitutes a class of extremely challenging problems to solve. Efficiently solving SIP problems is of high importance due to their vast applicability. Therefore, the primary focus of this dissertation is on solution methods for SIPs. We consider two-stage SIPs and present several enhanced decomposition algorithms for solving them. Our main goal is to develop new decomposition schemes and several acceleration techniques to enhance the classical decomposition methods, which can lead to efficiently solving various SIP problems to optimality. In the first essay of this dissertation, we present a state-of-the-art survey of the Benders decomposition algorithm. We provide a taxonomy of the algorithmic enhancements and the acceleration strategies of this algorithm to synthesize the literature, and to identify shortcomings, trends, and potential research directions. In addition, we discuss the use of Benders decomposition to develop efficient (meta-)heuristics, describe the limitations of the classical algorithm, and present extensions enabling its application to a broader range of problems. Next, we develop various techniques to overcome some of the main shortfalls of the Benders decomposition algorithm. We propose the use of cutting planes, partial decomposition, heuristics, stronger cuts, and warm-start strategies to alleviate the numerical challenges arising from instabilities, primal inefficiencies, weak optimality/feasibility cuts, and weak linear relaxations. We test the proposed strategies on benchmark instances of stochastic network design problems. Numerical experiments illustrate the computational efficiency of the proposed techniques. In the third essay of this dissertation, we propose a new, high-performance decomposition approach, called the Benders dual decomposition method. The development of this method is based on a specific reformulation of the Benders subproblems, where local copies of the master variables are introduced and then priced out into the objective function. We show that the proposed method significantly alleviates the primal and dual shortfalls of the Benders decomposition method and that it is closely related to the Lagrangian dual decomposition method. Computational results on various SIP problems show the superiority of this method over the classical decomposition methods as well as CPLEX 12.7. Finally, we study the parallelization of the Benders decomposition method to extend its computational reach to larger SIP instances. The available parallel variants of this method implement a rigid synchronization among the master and slave processors and therefore suffer from significant load imbalance when applied to SIP problems, mainly because the hard mixed-integer master problem can take hours to optimize. We thus propose an asynchronous parallel Benders method in a branch-and-cut framework.
    Relaxing the synchronization requirements, however, entails convergence and efficiency problems, which we address by introducing several acceleration techniques and search strategies. In particular, we propose the use of artificial subproblems, cut generation, cut aggregation, cut management, and cut propagation. The results indicate that our algorithm reaches higher speedup rates than the conventional synchronized methods and is several orders of magnitude faster than CPLEX 12.7.
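
    As background for the decomposition theme running through the essays, the toy loop below sketches classical Benders decomposition, the baseline the dissertation enhances, not its new methods: a master problem over the first-stage variable accumulates optimality cuts derived from the duals of a second-stage subproblem until the lower and upper bounds meet. The problem instance and all names are invented for illustration.

        # A self-contained sketch of the classical Benders loop on a toy instance:
        #   min_x  x + Q(x),   x integer in [0, 5]
        #   Q(x) = min_y { 3y : y >= 4 - x, y >= 0 }   (second stage)
        # The subproblem dual yields optimality cuts of the form theta >= pi * (4 - x).

        def solve_subproblem(x):
            """Second-stage cost and the dual multiplier of the constraint y >= 4 - x."""
            if 4 - x > 0:
                return 3.0 * (4 - x), 3.0   # constraint binding: dual equals y's cost
            return 0.0, 0.0                 # constraint slack: dual is zero

        def benders(max_iters=20, tol=1e-9):
            cuts = []                                   # stored dual multipliers pi
            for it in range(1, max_iters + 1):
                # Master: min x + theta  s.t.  theta >= pi * (4 - x) for all cuts.
                # Enumerated here; a realistic master is a MIP handed to a solver.
                lower, x = min((x + max([0.0] + [pi * (4 - x) for pi in cuts]), x)
                               for x in range(6))
                cost, pi = solve_subproblem(x)          # true recourse cost and dual
                upper = x + cost
                if upper - lower <= tol:                # bounds met: x is optimal
                    return x, upper, it
                cuts.append(pi)                         # add the violated cut
            return x, upper, max_iters

        print(benders())   # -> (4, 4.0, 2): x = 4 is optimal after one cut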

    Enhancing Productivity and Performance Portability of General-Purpose Parallel Programming

    Get PDF
    This work focuses on compiler and run-time techniques for improving the productivity and the performance portability of general-purpose parallel programming. More specifically, we focus on shared-memory task-parallel languages, where the programmer explicitly exposes parallelism in the form of short tasks that may outnumber the cores by orders of magnitude. The compiler, the run-time, and the platform (henceforth the system) are responsible for harnessing this unpredictable amount of parallelism, which can vary from none to excessive, towards efficient execution. The challenge arises from the aspiration to support fine-grained irregular computations and nested parallelism. This work goes further by also aspiring to lay the foundations for efficiently supporting declarative code, where the programmer exposes all available parallelism using high-level language constructs such as parallel loops, reducers, or futures. The appeal of declarative code is twofold for general-purpose programming: it is often easier for the programmer, who does not have to worry about the granularity of the exposed parallelism, and it achieves better performance portability by avoiding overfitting to the small range of platforms and inputs for which the programmer coarsens the code. Furthermore, PRAM algorithms, an important class of parallel algorithms, naturally lend themselves to declarative programming, so supporting it is a necessary condition for capitalizing on the wealth of PRAM theory. Unfortunately, declarative codes often expose such an overwhelming number of fine-grained tasks that existing systems fail to deliver performance. Our contributions can be partitioned into three components. First, we tackle the issue of coarsening, which declarative code leaves to the system. We identify two goals of coarsening and advocate tackling them separately, using static compiler transformations for one and dynamic run-time approaches for the other. Additionally, we present evidence that the current practice of burdening the programmer with coarsening either leads to codes with poor performance portability or to a significantly increased programming effort; this is a "show-stopper" for general-purpose programming. To compare the performance portability of the approaches, we define an experimental framework and two metrics, and we demonstrate that our approaches are preferable. We close the chapter on coarsening by presenting compiler transformations that automatically coarsen some types of very fine-grained codes. Second, we propose Lazy Scheduling, an innovative run-time scheduling technique that infers the platform load at run time using information that is already maintained. Based on the inferred load, Lazy Scheduling adapts the amount of available parallelism it exposes for parallel execution and thus saves the parallelism overheads that existing approaches pay. We implement Lazy Scheduling and present experimental results on four different platforms. The results show that Lazy Scheduling is vastly superior for declarative codes and competitive, if not better, for coarsened codes. Moreover, Lazy Scheduling is also superior in terms of performance portability, supporting our thesis that it is possible to achieve reasonable efficiency and performance portability with declarative codes. Finally, we also implement Lazy Scheduling on XMT, an experimental manycore platform developed at the University of Maryland, which was designed to support codes derived from PRAM algorithms.
    On XMT, we harness the existing hardware support for scheduling flat parallelism and compose it with Lazy Scheduling, which supports nested parallelism. In the resulting hybrid scheduler, the hardware and software work in synergy to overcome each other's weaknesses. We show the performance composability of the hardware and software schedulers, both in an abstract cost model and experimentally, as the hybrid always performs better than the software scheduler alone. Furthermore, the cost model is validated by using it to predict whether it is preferable to execute a code sequentially, with outer parallelism, or with nested parallelism, depending on the input, the available hardware parallelism, and the calling context of the parallel code.
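
    The core decision Lazy Scheduling makes can be conveyed with a small sketch: defer the inline-versus-parallel choice to spawn time and consult a load signal the runtime already maintains. The sketch below is only an analogy of that idea, not the thesis's scheduler; it leans on ThreadPoolExecutor's internal work queue as a stand-in load signal (a private, CPython-specific detail used purely for illustration), and the class and parameter names are invented.

        # Illustrative analogy of lazy scheduling: expose a task for parallel
        # execution only when an already-maintained load signal suggests spare
        # capacity; otherwise run it inline and skip the spawn overhead.
        from concurrent.futures import Future, ThreadPoolExecutor

        def _completed(value):
            """Wrap an inline result so callers always receive a Future."""
            f = Future()
            f.set_result(value)
            return f

        class LazySpawner:
            def __init__(self, workers=4, backlog_limit=None):
                self.pool = ThreadPoolExecutor(max_workers=workers)
                self.backlog_limit = backlog_limit or workers

            def spawn(self, fn, *args):
                # Inferred load = pending tasks in the pool's queue; a real
                # task runtime would consult its own per-worker queues instead.
                if self.pool._work_queue.qsize() >= self.backlog_limit:
                    return _completed(fn(*args))    # saturated: run inline
                return self.pool.submit(fn, *args)  # spare capacity: go parallel

        sched = LazySpawner()
        results = [sched.spawn(pow, 2, n) for n in range(10_000)]
        print(sum(f.result() for f in results))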

    Stochastic programming for City Logistics: new models and methods

    Get PDF
    The need for mobility that emerged in the last decades has led to an impressive increase in the number of vehicles as well as to a saturation of transportation infrastructures. Consequently, traffic congestion, accidents, transportation delays, and polluting emissions are some of the most recurrent concerns transportation and city managers have to deal with. However, simply building new infrastructures may not be sustainable because of their cost, the land usage they require, which is usually lacking in metropolitan regions, and their negative impact on the environment. Therefore, a different way of improving the performance of transportation systems while enhancing travel safety has to be found, in order to make the transportation of people and goods more efficient and to support its key role in the economic development of a city or a whole country. The concept of City Logistics (CL) is being developed to answer this need. CL focuses on reducing the number of vehicles operating in the city while controlling their dimensions and characteristics. CL solutions improve not only the transportation system but the whole logistics system within an urban area, integrating the interests of the several stakeholders involved. This global view challenges researchers to develop planning models, methods, and decision support tools for optimizing the structures and activities of the transportation system. In particular, this leads researchers to the definition of strategic and tactical problems belonging to well-known problem classes, including the network design problem, the vehicle routing problem (VRP), the traveling salesman problem (TSP), and the bin packing problem (BPP), which typically act as sub-problems of the overall CL system optimization. When long planning horizons are involved, these problems become stochastic and thus must explicitly take into account the different sources of uncertainty that can affect the transportation system. For these reasons, and because of the large scale of CL systems, the optimization problems arising in the urban context are very challenging. Their solution requires investigations in mathematical and combinatorial optimization methods as well as the implementation of efficient exact and heuristic algorithms. However, contributions answering these challenges are still limited in number. This work contributes to filling this gap in the literature, both by providing modeling frameworks for new planning problems in the CL context and by developing new, effective heuristics for the two-stage formulations of these problems. Three stochastic problems are proposed in the context of CL: the stochastic variable cost and size bin packing problem (SVCSBPP), the multi-handler knapsack problem under uncertainty (MHKPu), and the multi-path traveling salesman problem with stochastic travel times (mpTSPs). The SVCSBPP arises in supply-chain management, in which companies outsource their logistics activities to a third-party logistics firm (3PL). The procurement of sufficient capacity, expressed in terms of vehicles, containers, or space in a warehouse for varying periods of time to satisfy the demand, plays a crucial role. The SVCSBPP focuses on the relation between a company and its logistics capacity provider and on the tactical-planning problem of determining the quantity of capacity units to secure for the next period of activity.
    The SVCSBPP is the first attempt to introduce a stochastic variant of the variable cost and size bin packing problem (VCSBPP), considering uncertainty not only in the demand to deliver but also in the renting costs of the different bins and their availability. A large number of real-life situations can be satisfactorily modeled as an MHKPu, in particular in last-mile delivery. Last-mile delivery may involve different sequences of consolidation operations, each handled by different workers with different skill levels and reliability. Improper management of consolidation operations can delay the operations, reducing the overall profit of the deliveries. Thus, given a set of potential logistics handlers and a set of items to deliver, characterized by volume and random profit, the MHKPu consists in finding a subset of items that maximizes the expected total profit. The profit is given by the sum of a deterministic profit and a stochastic profit oscillation, with unknown probability distribution, due to the random handling costs of the handlers. The mpTSPs arises mainly in City Logistics applications. Cities offer several services, such as garbage collection, periodic delivery of goods in urban grocery distribution, and bike sharing services. These services require the planning of fixed and periodic tours that will be used for one to several weeks. However, the enlarged time horizon as well as the strong dynamic changes in travel times due to traffic congestion and other nuisances typical of urban transportation induce the presence of multiple paths with stochastic travel times. Given a graph characterized by a set of nodes connected by arcs, the mpTSPs considers that, for every pair of nodes, multiple paths between the two nodes are present. Each path is characterized by a random travel time. As in the standard TSP, the aim of the problem is to find the Hamiltonian cycle minimizing the expected total cost. These planning problems have been formulated as two-stage integer stochastic programs with recourse. Discretization methods are usually applied to approximate the probability distribution of the random parameters. The resulting approximated program becomes a deterministic linear program with integer decision variables of generally very large dimensions, beyond the reach of exact methods. Therefore, heuristics are required. For the MHKPu, we apply extreme value theory and derive a deterministic approximation, while for the SVCSBPP and the mpTSPs we introduce effective and accurate heuristics based on progressive hedging (PH) ideas. PH mitigates the computational difficulty associated with large problem instances by decomposing the stochastic program by scenario. When effective heuristic techniques exist for solving individual scenarios, as is the case for the SVCSBPP and the mpTSPs, PH further reduces the computational effort of solving the scenario subproblems by means of a commercial solver. In particular, we propose a series of specific strategies to accelerate the search and to efficiently address the symmetry of solutions, including an aggregated consensual solution, heuristic penalty adjustments, and a bundle fixing technique. Yet, even as solution methods become more powerful, combinatorial problems in the CL context remain very large and difficult to solve. Thus, in order to significantly enhance the computational efficiency, these heuristics implement parallel schemes.
    To provide a complete analysis of the proposed problems, we perform extensive numerical experiments on a large set of instances of various dimensions, including realistic settings derived from real applications in urban areas, and combinations of different levels of variability and correlation in the stochastic parameters. The campaign includes assessing the efficiency of the meta-heuristics, evaluating the benefit of explicitly considering uncertainty, analyzing the impact of problem characteristics and the structure of solutions, and evaluating the robustness of the solutions when used as a decision tool. The numerical analysis indicates that the stochastic programs have significant effects in terms of both economic impact (e.g., cost reduction) and operations management (e.g., prediction of the capacity needed by the firm). The proposed methodologies outperform the use of commercial solvers, even when small instances are considered; in fact, they find good solutions in manageable computing time. This makes these heuristics a strategic tool that can be incorporated into larger decision support systems for CL.
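
    The scenario decomposition behind PH can be seen in miniature on a toy capacity problem where each scenario subproblem has a closed-form solution. The sketch below implements only the textbook PH loop (penalized scenario solves, consensus averaging, multiplier updates); the thesis's acceleration strategies, such as aggregated consensual solutions, penalty adjustment, and bundle fixing, are not shown, and the instance is invented.

        # A minimal progressive-hedging loop on an invented two-stage toy:
        # choose capacity x before demand d_s is revealed, paying (x - d_s)^2 in
        # scenario s. Each penalized scenario subproblem
        #   min_x (x - d_s)^2 + w_s x + (rho / 2) (x - xbar)^2
        # has the closed-form minimizer used below.

        def progressive_hedging(demands, probs, rho=1.0, iters=100, tol=1e-8):
            w = [0.0] * len(demands)                      # scenario multipliers
            xbar = sum(p * d for p, d in zip(probs, demands))
            for _ in range(iters):
                # Scenario solves: decomposed, independent, hence parallelizable.
                xs = [(2 * d - wi + rho * xbar) / (2 + rho)
                      for d, wi in zip(demands, w)]
                xbar = sum(p * x for p, x in zip(probs, xs))      # consensus
                w = [wi + rho * (x - xbar) for wi, x in zip(w, xs)]
                if all(abs(x - xbar) < tol for x in xs):          # scenarios agree
                    break
            return xbar

        # Three invented demand scenarios with probabilities 0.3 / 0.5 / 0.2;
        # PH converges to the expected demand, 10.2, the true minimizer here.
        print(progressive_hedging([8.0, 10.0, 14.0], [0.3, 0.5, 0.2]))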

    Matlab

    Get PDF
    This book is a collection of 19 excellent works presenting different applications of several MATLAB tools that can be used for educational, scientific, and engineering purposes. Chapters include tips and tricks for programming and developing Graphical User Interfaces (GUIs), power system analysis, control systems design, system modelling and simulations, parallel processing, optimization, signal and image processing, finite difference solutions, geosciences, and portfolio insurance. Thus, readers from a range of professional fields will benefit from its content.

    High performance computing and communications: FY 1995 implementation plan

    Full text link

    High performance computing and communications: FY 1997 implementation plan

    Full text link

    High performance computing and communications: FY 1996 implementation plan

    Full text link