11 research outputs found
Streaming, Local, and MultiLevel (Hyper)Graph Decomposition
(Hyper)Graph decomposition is a family of problems that aim to break down large (hyper)graphs into smaller sub(hyper)graphs for easier analysis. It matters because it enables efficient computation on large and complex (hyper)graphs, such as social networks, chemical compounds, and computer networks. This dissertation explores several types of (hyper)graph decomposition problems, including graph partitioning, hypergraph partitioning, local graph clustering, process mapping, and signed graph clustering. Our main focus is on streaming algorithms, local algorithms, and multilevel algorithms. For streaming, we contribute highly efficient and effective algorithms for (hyper)graph partitioning and process mapping. For local algorithms, we propose sub-linear algorithms that detect high-quality local communities around a given seed node in a graph based on the distribution of a given motif. For multilevel algorithms, we engineer high-quality algorithms for process mapping and signed graph clustering. We provide a thorough discussion of each algorithm along with experimental results demonstrating their superiority over existing state-of-the-art techniques.
The results show that the proposed algorithms achieve improved performance and better solutions in various metrics, making them highly promising for practical applications. Overall, this dissertation showcases the effectiveness of advanced combinatorial algorithmic techniques in solving challenging (hyper)graph decomposition problems.
FREIGHT: Fast Streaming Hypergraph Partitioning
Partitioning the vertices of a (hyper)graph into k roughly balanced blocks such that few (hyper)edges run between blocks is a key problem for large-scale distributed processing. A current trend for partitioning huge (hyper)graphs with low computational resources is the use of streaming algorithms. In this work, we propose FREIGHT: a Fast stREamInG Hypergraph parTitioning algorithm, which is an adaptation of the widely known graph-based algorithm Fennel. By using an efficient data structure, we make the overall running time of FREIGHT linearly dependent on the pin-count of the hypergraph and the memory consumption linearly dependent on the numbers of nets and blocks. The results of our extensive experimentation showcase the promising performance of FREIGHT as a highly efficient and effective solution for streaming hypergraph partitioning. Our algorithm demonstrates running time competitive with the Hashing algorithm, differing by at most a factor of four on three fourths of the instances. Significantly, our findings highlight the superiority of FREIGHT over all existing (buffered) streaming algorithms and even the in-memory algorithm HYPE, with respect to both cut-net and connectivity measures. This indicates that our proposed algorithm is a promising hypergraph partitioning tool to tackle the challenge posed by large-scale and dynamic data processing.
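To illustrate the general idea of one-pass, Fennel-style assignment on a hypergraph, here is a minimal Python sketch. It is not the FREIGHT implementation: the function name `stream_partition`, the use of the pin count in place of Fennel's edge count when computing `alpha`, and the simple per-net set of blocks are all assumptions made for illustration. Each arriving vertex goes to the block that already holds pins of the most of its nets, minus a penalty that grows with the block size.

```python
from math import ceil, sqrt

def stream_partition(vertex_stream, num_vertices, num_pins, k, gamma=1.5, eps=0.03):
    """Assign streamed hypergraph vertices to k blocks in a single pass.

    vertex_stream yields (vertex, nets) pairs, where nets lists the ids of
    the nets (hyperedges) that contain the vertex.
    """
    # Fennel's alpha, with the pin count standing in for the edge count
    # (an assumption of this sketch, not a documented FREIGHT parameter).
    alpha = sqrt(k) * num_pins / num_vertices ** 1.5
    max_size = ceil((1 + eps) * num_vertices / k)  # balance constraint
    sizes = [0] * k
    net_blocks = {}   # net id -> set of blocks that already hold one of its pins
    assignment = {}
    for v, nets in vertex_stream:
        # gain[b] = number of nets of v that already have a pin in block b
        gain = [0] * k
        for e in nets:
            for b in net_blocks.get(e, ()):
                gain[b] += 1
        best, best_score = None, float("-inf")
        for b in range(k):
            if sizes[b] + 1 > max_size:
                continue  # block is full
            score = gain[b] - alpha * gamma * sizes[b] ** (gamma - 1)
            if score > best_score:
                best, best_score = b, score
        assignment[v] = best
        sizes[best] += 1
        for e in nets:
            net_blocks.setdefault(e, set()).add(best)
    return assignment
```

The sketch trades attraction to blocks that already contain neighboring pins against a size penalty, and never revisits a decision, which is what keeps the work per vertex proportional to its number of pins times the blocks touched.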
Buffered Streaming Edge Partitioning
Addressing the challenges of processing massive graphs, which are prevalent in diverse fields such as social, biological, and technical networks, we introduce HeiStreamE and FreightE, two (buffered) streaming algorithms designed for efficient edge partitioning of large-scale graphs. HeiStreamE utilizes an adapted Split-and-Connect graph model and a Fennel-based multilevel partitioning scheme, while FreightE partitions a hypergraph representation of the input graph. Besides ensuring superior solution quality, these approaches also overcome the limitations of existing algorithms by maintaining linear dependency on the graph size in both time and memory complexity, with no dependence on the number of partition blocks. Our comprehensive experimental analysis demonstrates that HeiStreamE outperforms current streaming algorithms and the re-streaming algorithm 2PS in partitioning quality (replication factor), and is more memory-efficient for real-world networks where the number of edges is far greater than the number of vertices. Further, FreightE is shown to compute partitions quickly and efficiently, particularly for higher numbers of partition blocks.
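The replication factor used as the quality measure above can be computed with a few lines of Python. This is a minimal sketch using the common definition (the average number of distinct blocks in which a vertex appears); the function name `replication_factor` and the dictionary input format are assumptions chosen for illustration.

```python
from collections import defaultdict

def replication_factor(edge_blocks):
    """Average number of blocks that each vertex is replicated across.

    edge_blocks maps each edge (u, v) to the block the edge was assigned to.
    """
    blocks_of = defaultdict(set)  # vertex -> blocks holding one of its edges
    for (u, v), block in edge_blocks.items():
        blocks_of[u].add(block)
        blocks_of[v].add(block)
    if not blocks_of:
        return 0.0
    return sum(len(blocks) for blocks in blocks_of.values()) / len(blocks_of)
```

For the path a-b-c-d with edges (a, b) and (b, c) in block 0 and (c, d) in block 1, only vertex c is replicated, so the value is 5 / 4 = 1.25. A value of 1 means no vertex is split across blocks.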
High-Quality Hierarchical Process Mapping
Partitioning graphs into blocks of roughly equal size such that few edges run between blocks is a frequently needed operation when processing graphs on a parallel computer. When the topology of a distributed system is known, an important task is then to map the blocks of the partition onto the processors such that the overall communication cost is reduced. We present novel multilevel algorithms that integrate graph partitioning and process mapping. Important ingredients of our algorithms include fast label propagation, more localized local search, initial partitioning, as well as a compressed data structure to compute processor distances without storing a distance matrix. Moreover, our algorithms are able to exploit a given hierarchical structure of the distributed system under consideration. Experiments indicate that our algorithms speed up the overall mapping process and, due to the integrated multilevel approach, also find much better solutions in practice. For example, one configuration of our algorithm yields mapping quality similar to the previous state of the art for large numbers of partitions while being a factor of 9.3 faster. Compared to the currently fastest iterated multilevel mapping algorithm Scotch, we obtain 16% better solutions while investing slightly more running time.
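One way to evaluate a mapping on a hierarchical system without a distance matrix is to derive each processor distance from the processor ids alone. The Python sketch below assumes a regular hierarchy described by per-level fan-outs and per-level distances, with processing elements numbered consecutively; the names `hierarchy_distance` and `communication_cost` are hypothetical, and the code is not the compressed data structure from the paper. It also counts each undirected communication edge once, which is a further assumption.

```python
def hierarchy_distance(p, q, fanouts, distances):
    """Distance between processing elements p and q in a regular hierarchy.

    fanouts[i] is the number of level-i units per level-(i+1) unit, and
    distances[i] is the cost of communicating across level i.
    """
    if p == q:
        return 0
    group = 1
    for fanout, dist in zip(fanouts, distances):
        group *= fanout
        # p and q share the same unit at this level: this is their closest link
        if p // group == q // group:
            return dist
    raise ValueError("processing elements lie outside the described hierarchy")

def communication_cost(comm_edges, mapping, fanouts, distances):
    """Sum of communication volume times distance over all (u, v, weight) edges."""
    return sum(
        weight * hierarchy_distance(mapping[u], mapping[v], fanouts, distances)
        for u, v, weight in comm_edges
    )
```

With fan-outs `[2, 2]` and distances `[1, 10]` (two cores per processor, two processors per node), the edges `[("a", "b", 5), ("b", "c", 1), ("a", "c", 2)]` cost 35 when a, b, c are mapped to elements 0, 1, 2, and 62 when they are mapped to 0, 2, 1, so placing the heavily communicating pair on the same processor pays off.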
Open Problems in (Hyper)Graph Decomposition
Large networks are useful in a wide range of applications. Sometimes problem instances are composed of billions of entities. Decomposing and analyzing these structures helps us gain new insights about our surroundings. Even if the final application concerns a different problem (such as traversal, finding paths, trees, and flows), decomposing large graphs is often an important subproblem for complexity reduction or parallelization. This report is a summary of discussions that happened at Dagstuhl seminar 23331 on "Recent Trends in Graph Decomposition" and presents currently open problems and future directions in the area of (hyper)graph decomposition.
Gamma Deployment Problem in Grids: Complexity and a new Integer Linear Programming Formulation
Made available in DSpace on 2019-08-14T20:11:25Z (GMT).
Vehicular ad hoc networks are one of the most significant components of intelligent transportation systems. They have the potential to ease traffic management, lower accident rates, and provide other solutions for smart cities. One of the main challenges in vehicular ad hoc networks is choosing the best places to deploy the communication infrastructure, known as roadside units (RSUs). This thesis deals with the Gamma Deployment Problem, which consists of deploying the minimum number of roadside units on a road network such that the Gamma Deployment metric is met. Under this metric, at least a given fraction of the vehicles passing through the road network must be covered, i.e., each covered vehicle must meet at least one roadside unit within every predetermined time interval of its trip. In this thesis, I propose a formal treatment based on graph-theoretical concepts and provide a proof that the decision version of the Gamma Deployment Problem in Grids is NP-complete. In addition, I expose an issue with the multi-flow integer linear programming formulation present in the literature and propose a slight correction for it. I also introduce a new integer linear programming formulation based on set covering and provide a proof that the polytope associated with its linear programming relaxation is contained in the polytope associated with the linear programming relaxation of the multi-flow formulation. Finally, computational experiments with a commercial optimizer show that the set covering formulation substantially outperforms the multi-flow formulation regarding linear programming relaxation gap and running time.
Local Motif Clustering via (Hyper)Graph Partitioning
Local clustering consists of finding a good cluster around a seed node in a graph. Recently, local motif clustering has been proposed: it is a local clustering approach based on motifs rather than edges. Since this approach is recent, most algorithms to solve it are extensions of statistical and numerical methods previously used for local clustering, while combinatorial approaches are still few and simple. In this work, we build a (hyper)graph to represent the motif distribution around the seed node. We solve this model using sophisticated (hyper)graph partitioners. On average, our algorithm computes clusters six times faster and three times better than the state of the art for the triangle motif.
Faster Local Motif Clustering via Maximum Flows
Local clustering aims to identify a cluster within a given graph that includes a designated seed node or a significant portion of a group of seed nodes. This cluster should be well-characterized, i.e., it should have a high number of internal edges and a low number of external edges. In this work, we propose SOCIAL, a novel algorithm for local motif clustering which optimizes for motif conductance based on a local hypergraph model representation of the problem and an adapted version of the max-flow quotient-cut improvement algorithm (MQI). In our experiments with the triangle motif, SOCIAL produces local clusters with an average motif conductance 1.7% lower than the state of the art, while being up to multiple orders of magnitude faster.
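For readers unfamiliar with the objective, the following Python sketch evaluates triangle motif conductance for a given cluster by brute force. It assumes the usual definition (triangles cut by the cluster, divided by the smaller of the two sides' triangle-endpoint volumes) and is only a way to score a cluster; it is neither SOCIAL nor MQI. The names `triangles` and `motif_conductance` and the adjacency-dictionary input are assumptions for illustration.

```python
from itertools import combinations

def triangles(adj):
    """List every triangle once as (u, v, w) with u < v < w.

    adj maps each node to the set of its neighbors in an undirected graph;
    node ids must be mutually comparable.
    """
    found = []
    for u in adj:
        higher = sorted(n for n in adj[u] if n > u)
        for v, w in combinations(higher, 2):
            if w in adj[v]:
                found.append((u, v, w))
    return found

def motif_conductance(adj, cluster):
    """Triangle motif conductance of the node set `cluster`."""
    cluster = set(cluster)
    cut = vol_in = vol_out = 0
    for tri in triangles(adj):
        inside = sum(node in cluster for node in tri)
        vol_in += inside        # triangle endpoints inside the cluster
        vol_out += 3 - inside   # triangle endpoints outside the cluster
        if 0 < inside < 3:
            cut += 1            # triangle spans both sides
    denom = min(vol_in, vol_out)
    return cut / denom if denom else float("inf")
```

On a graph with triangles (0, 1, 2), (2, 3, 4), and (3, 4, 5), the cluster {0, 1, 2} cuts one triangle and has inside volume 4 and outside volume 5, giving a motif conductance of 1 / 4 = 0.25. Lower values indicate a cluster that splits fewer motif instances.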