60 research outputs found

    An examination of heuristic algorithms for the travelling salesman problem

    The role of heuristics in combinatorial optimization is discussed. Published heuristics for the Travelling Salesman Problem (TSP) were reviewed, and morphological boxes were used to develop new heuristics for the TSP. New and published heuristics were programmed for symmetric TSPs in which the triangle inequality holds, and were tested on a microcomputer. The best of the quickest heuristics was the furthest insertion heuristic, finding tours 3 to 9% above the best known solutions (2 minutes for 100 nodes). Better results were found by longer-running heuristics, e.g. the cheapest angle heuristic (CCAO), which found tours 0 to 6% above the best known solutions (80 minutes for 100 nodes). The savings heuristic found the best results overall, but took more than 2 hours to complete. Of the new heuristics, the MST path algorithm at times improved on the results of the furthest insertion heuristic while taking the same time as the CCAO. The study indicated that there is little likelihood of improving on present methods unless a fundamentally new approach is discovered. Finally, a case study using TSP heuristics to aid the planning of grid surveys is described.
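
    As a concrete illustration of the construction heuristics compared above, the following is a minimal Python sketch of the furthest insertion heuristic. It assumes Euclidean points (so the triangle inequality holds, as in the study) and is a generic textbook rendering, not the thesis's implementation.

```python
import math

def furthest_insertion(points):
    """Furthest-insertion construction heuristic for the symmetric TSP.

    points: list of (x, y) coordinates; distances are Euclidean, so the
    triangle inequality holds. Returns a tour as a list of point indices.
    """
    n = len(points)
    dist = [[math.dist(points[i], points[j]) for j in range(n)]
            for i in range(n)]

    # Start with the two mutually furthest nodes.
    i0, j0 = max(((i, j) for i in range(n) for j in range(i + 1, n)),
                 key=lambda p: dist[p[0]][p[1]])
    tour = [i0, j0]
    remaining = set(range(n)) - {i0, j0}

    while remaining:
        # Select the remaining node furthest from the current tour ...
        k = max(remaining, key=lambda v: min(dist[v][t] for t in tour))
        # ... and insert it where it lengthens the tour the least.
        m = len(tour)
        pos = min(range(m),
                  key=lambda p: dist[tour[p]][k]
                                + dist[k][tour[(p + 1) % m]]
                                - dist[tour[p]][tour[(p + 1) % m]])
        tour.insert(pos + 1, k)
        remaining.remove(k)
    return tour
```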

    Engineering of Algorithms for Very Large k-Partitioning


    Spectral partitioning with multiple eigenvectors

    The graph partitioning problem is to divide the vertices of a graph into disjoint clusters so as to minimize the total cost of the edges cut by the clusters. A spectral partitioning heuristic uses the graph's eigenvectors to construct a geometric representation of the graph (e.g., a linear ordering), which is subsequently partitioned. Our main result shows that when all the eigenvectors are used, graph partitioning reduces to a new vector partitioning problem. This result implies that as many eigenvectors as are practically possible should be used to construct a solution. This philosophy is in contrast to that of the widely used spectral bipartitioning (SB) heuristic (which uses only a single eigenvector) and of several previous multi-way partitioning heuristics [8, 11, 17, 27, 38] (which use k eigenvectors to construct k-way partitionings). Our result motivates a simple ordering heuristic that is a multiple-eigenvector extension of SB. This heuristic not only significantly outperforms recursive SB, but can also yield excellent multi-way VLSI circuit partitionings compared to [1, 11]. Our experiments suggest that the vector partitioning perspective opens the door to new and effective partitioning heuristics. The present paper updates and improves a preliminary version of this work [5].
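
    For context, here is a minimal sketch of the single-eigenvector SB heuristic that the paper argues should be extended to multiple eigenvectors. The dense Laplacian and median split are illustrative assumptions, not the authors' code.

```python
import numpy as np

def spectral_bipartition(adjacency):
    """Classic spectral bipartitioning (SB): split a graph using the
    Fiedler vector (second-smallest eigenvector) of its Laplacian.

    adjacency: dense symmetric (n x n) edge-weight matrix.
    Returns a boolean mask assigning each vertex to one of two parts.
    """
    degrees = adjacency.sum(axis=1)
    laplacian = np.diag(degrees) - adjacency
    # eigh returns eigenpairs in ascending order; index 0 is the
    # trivial constant vector, index 1 is the Fiedler vector SB uses.
    _, vecs = np.linalg.eigh(laplacian)
    fiedler = vecs[:, 1]
    # A median split balances the two sides; the paper's point is that
    # using more columns of `vecs` gives a richer vector representation.
    return fiedler >= np.median(fiedler)
```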

    Engineering an Approximation Scheme for Traveling Salesman in Planar Graphs

    We present an implementation of a linear-time approximation scheme for the traveling salesman problem on planar graphs with edge weights. We observe that the theoretical algorithm involves constants that are too large for practical use. Our implementation, which is not subject to the theoretical algorithm's guarantee, can quickly find good tours in very large planar graphs.

    A survey on graph partitioning approach to spectral clustering

    Cluster analysis is an unsupervised technique for grouping related objects without considering their label or class. Objects belonging to the same cluster are relatively more homogeneous than objects in other clusters. Cluster analysis is applied in areas such as gene expression analysis, galaxy formation, natural language processing, and image segmentation. The clustering problem can be formulated as a graph cut problem in which a suitable objective function has to be optimized. This study uses different graph cluster formulations based on graph cut and partitioning problems. A special class of graph clustering algorithms, known as spectral clustering algorithms, is used for the study. Two widely used spectral clustering algorithms are applied to illustrate solutions to these problems. These algorithms are generally based on the eigendecomposition of Laplacian matrices of either weighted or unweighted graphs.
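
    A minimal sketch of the kind of pipeline the survey covers: embed the vertices using the k smallest eigenvectors of the symmetric normalized Laplacian, then cluster the embedding with k-means (Ng-Jordan-Weiss style). The dense-matrix setup and parameter choices here are illustrative assumptions, not a specific algorithm from the survey.

```python
import numpy as np
from scipy.linalg import eigh
from sklearn.cluster import KMeans

def spectral_clustering(adjacency, k):
    """k-way spectral clustering via the normalized Laplacian.

    adjacency: dense symmetric (n x n) weight matrix; k: cluster count.
    Returns an array of cluster labels, one per vertex.
    """
    degrees = adjacency.sum(axis=1)
    d_inv_sqrt = np.diag(1.0 / np.sqrt(np.maximum(degrees, 1e-12)))
    # Symmetric normalized Laplacian: L_sym = I - D^{-1/2} A D^{-1/2}.
    l_sym = np.eye(len(adjacency)) - d_inv_sqrt @ adjacency @ d_inv_sqrt
    # Embedding from the k smallest eigenpairs of L_sym.
    _, vecs = eigh(l_sym, subset_by_index=[0, k - 1])
    # Row-normalize the embedding, then cluster it with k-means.
    rows = vecs / np.maximum(np.linalg.norm(vecs, axis=1, keepdims=True),
                             1e-12)
    return KMeans(n_clusters=k, n_init=10).fit_predict(rows)
```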

    Locality-Aware Parallel Sparse Matrix-Vector and Matrix-Transpose-Vector Multiplication on Many-Core Processors

    Sparse matrix-vector and matrix-transpose-vector multiplication (SpMMTV), repeatedly performed as z ← Aᔀx and y ← Az (or y ← Aw) for the same sparse matrix A, is a kernel operation widely used in various iterative solvers. One important optimization for serial SpMMTV is reusing A-matrix nonzeros, which halves the memory bandwidth requirement. However, thread-level parallelization of SpMMTV that reuses A-matrix nonzeros necessitates concurrent writes to the same output-vector entries. These concurrent writes can be handled in two ways, via atomic updates or via thread-local temporary output vectors that undergo a reduction operation, neither of which is efficient or scalable on processors with many cores and complicated cache-coherency protocols. In this work, we identify five quality criteria for efficient and scalable thread-level parallelization of SpMMTV that utilizes one-dimensional (1D) matrix partitioning. We also propose two locality-aware 1D partitioning methods, which achieve reuse of A-matrix nonzeros and intermediate z-vector entries, exploit locality in accessing x-, y-, and z-vector entries, and reduce the number of concurrent writes to the same output-vector entries. These two methods utilize rowwise and columnwise singly bordered block-diagonal (SB) forms of A. We evaluate our methods on a wide range of sparse matrices. Experiments on the 60-core cache-coherent Intel Xeon Phi processor validate the identified quality criteria and the proposed methods in practice. The results also show that the performance improvement from reusing A-matrix nonzeros compensates for the overhead of concurrent writes through the proposed SB-based methods.
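
    The nonzero-reuse idea can be made concrete with a serial sketch: computing z ← Aᔀx and y ← Az column by column lets each column's nonzeros be loaded from memory once and used twice. This is an illustrative rendering of the reuse principle only, not the paper's threaded, SB-partitioned implementation.

```python
import numpy as np
from scipy.sparse import csc_matrix

def fused_spmmtv(A: csc_matrix, x):
    """Serial sketch of nonzero reuse in SpMMTV: z = A^T x and y = A z
    in one column-by-column sweep over A.

    For each column j, the nonzeros are read once: first to finish
    z_j = (A^T x)_j, then immediately reused (from cache) to scatter
    z_j into y = A z. In the threaded setting of the paper, the
    scattered y-updates become concurrent writes, which its SB-based
    1D partitionings are designed to localize.
    """
    m, n = A.shape
    y = np.zeros(m)
    z = np.zeros(n)
    for j in range(n):
        lo, hi = A.indptr[j], A.indptr[j + 1]
        rows, vals = A.indices[lo:hi], A.data[lo:hi]
        zj = float(vals @ x[rows])   # z_j gathered over column j
        z[j] = zj
        y[rows] += vals * zj         # same nonzeros reused for y = A z
    return z, y
```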

    Query estimation techniques in database systems

    The effectiveness of query optimization in database systems critically depends on the system's ability to assess the execution costs of different query execution plans. For this purpose, the sizes and data distributions of the intermediate results generated during plan execution need to be estimated as accurately as possible. This estimation requires the maintenance of statistics on the data stored in the database, which are referred to as data synopses. While the problem of query cost estimation has received significant attention for over a decade, it has remained an open issue in practice, because most previous techniques have focused on isolated aspects of the problem, such as minimizing the estimation error for a single type of query and a single data distribution, whereas database management systems generally need to support a wide range of queries over a number of datasets. In this thesis I introduce a new technique for query result estimation, which extends existing techniques in that it offers estimation for all combinations of the three major database operators: selection, projection, and join. The approach is based on separate and independent approximations of the attribute values contained in a dataset and of their frequencies. Through the use of space-filling curves, the approach extends to multi-dimensional data while maintaining its accuracy and computational properties. The resulting estimation accuracy is competitive with specialized techniques and superior to the histogram techniques currently implemented in commercial database management systems. Because data synopses reside in main memory, they compete for available space with the database cache and query execution buffers. Consequently, the memory available to data synopses needs to be used efficiently. This results in a physical design problem for data synopses: determining the best set of synopses for a given combination of datasets, queries, and available memory. This thesis introduces a formalization of the problem and efficient algorithmic solutions. All discussed techniques are evaluated with regard to their overhead and resulting estimation accuracy on a variety of synthetic and real-life datasets.
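
    The space-filling-curve step can be illustrated with a Z-order (Morton) encoding, which interleaves coordinate bits so that a one-dimensional synopsis over the keys approximates the multi-dimensional distribution. The choice of curve and the encoding below are hypothetical examples, not necessarily what the thesis implements.

```python
def morton_key(coords, bits=16):
    """Map a multi-dimensional tuple of non-negative ints (< 2**bits)
    to a one-dimensional key by bit interleaving (Z-order curve).

    This is an illustrative assumption: any space-filling curve that
    roughly preserves locality would serve the same role of reducing
    multi-dimensional data to one dimension for a synopsis.
    """
    key = 0
    for bit in range(bits):                  # interleave one bit per round
        for dim, c in enumerate(coords):
            key |= ((c >> bit) & 1) << (bit * len(coords) + dim)
    return key

# Nearby points in 2-D tend to get nearby keys, so a 1-D synopsis over
# the keys approximates the 2-D distribution:
assert morton_key((0, 0)) < morton_key((1, 1)) < morton_key((2, 2))
```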
