Algorithms for Vertex-Weighted Matching in Graphs
A matching M in a graph is a subset of edges such that no two edges in M are incident on the same vertex. Matching is a fundamental combinatorial problem with applications in many contexts: high-performance computing, bioinformatics, network switch design, web technologies, etc. Examples in the first context include sparse linear systems of equations, where matchings are used to place large matrix elements on or close to the diagonal, to compute the block triangular decomposition of sparse matrices, to construct sparse bases for the null space or column space of under-determined matrices, and to coarsen graphs in multi-level graph partitioning algorithms. In the first part of this thesis, we develop exact and approximation algorithms for vertex-weighted matching, an under-studied variant of the weighted matching problem. We propose three exact algorithms, three half-approximation algorithms, and a two-thirds-approximation algorithm. We exploit inherent properties of this problem, such as lexicographical orders, decomposition into sub-problems, and the reachability property, not only to design efficient algorithms but also to provide simple proofs of their correctness. In the second part of this thesis, we describe work on a new parallel half-approximation algorithm for weighted matching. Algorithms for computing optimal matchings are not amenable to parallelism, and hence we consider approximation algorithms here. We extend the existing work on a parallel half-approximation algorithm for weighted matching and provide an analysis of its time complexity. We support the theoretical observations with experimental results obtained with MatchBoxP, a toolkit designed and implemented in C++ and MPI using modern software engineering techniques. The work in this thesis has resulted in a better understanding of matching theory, a functional public-domain software toolkit, and a model of the sparsest basis problem as a vertex-weighted matching problem.
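To make the definitions above concrete, here is a minimal Python sketch. It is not one of the thesis's algorithms and is unrelated to MatchBoxP: it uses the generic observation that a vertex-weighted matching can be scored by giving each edge the sum of its endpoint weights, and then applies the textbook greedy heuristic, which is a half-approximation for edge-weighted matching. The function names and the toy graph are hypothetical.

```python
from typing import Dict, List, Tuple

Edge = Tuple[str, str]


def is_matching(edges: List[Edge]) -> bool:
    """Return True if no two edges share an endpoint."""
    seen = set()
    for u, v in edges:
        if u == v or u in seen or v in seen:
            return False
        seen.update((u, v))
    return True


def greedy_vertex_weighted_matching(
    edges: List[Edge], weight: Dict[str, float]
) -> List[Edge]:
    """Greedy half-approximation (illustrative, not the thesis's method).

    Each edge is scored by the sum of its endpoint weights, then edges are
    scanned from heaviest to lightest and kept when both endpoints are free.
    """
    ordered = sorted(edges, key=lambda e: -(weight[e[0]] + weight[e[1]]))
    matched = set()
    matching = []
    for u, v in ordered:
        if u != v and u not in matched and v not in matched:
            matching.append((u, v))
            matched.update((u, v))
    return matching


def matching_weight(matching: List[Edge], weight: Dict[str, float]) -> float:
    """Total weight of the matched vertices."""
    return sum(weight[u] + weight[v] for u, v in matching)


# Hypothetical toy instance: a path a-b-c-d with heavy middle vertices.
toy_weight = {"a": 1.0, "b": 5.0, "c": 5.0, "d": 1.0}
toy_edges = [("a", "b"), ("b", "c"), ("c", "d")]
toy_matching = greedy_vertex_weighted_matching(toy_edges, toy_weight)
```

On this path the greedy scan keeps only the middle edge, while the optimal matching takes the two outer edges and covers all four vertices, so the example also shows why greedy is only an approximation.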
MinT-Net: Novel and Scalable Network-enabled Comparative Tools for Stress Studies of Microbiomes in Transition
Community detection is the process of analyzing graphs to distinguish different groups of nodes from one another. A community is defined as a group of nodes that are closely related to one another but loosely related to the other nodes in the network. These communities exist within the species, gene, and protein networks of a microbiome. Many different algorithms have been developed to detect these communities. The project as a whole is intended to track communities in dynamic networks using known community detection algorithms. An initial effort implemented several community detection algorithms to compare community quality against computational time, focusing on the Girvan-Newman algorithm and the Louvain algorithm. Trials were run on assortative planted partition models to test the accuracy of the algorithms with respect to their computational time. In these trials, the Louvain algorithm proved not only faster but also more accurate at detecting communities in models with less assortativity. The accuracy and efficiency of the Louvain algorithm are promising for its future use in dynamic community detection in networks that model microbiomes in transition. Preliminary detection efforts on dynamic networks with community structure were performed on models using the framework of the Chinese Restaurant stochastic process. These efforts attempted to track community structure over time, utilizing the Jaccard index and Pointwise Mutual Information (PMI). Leveraging these preliminary results, we plan to develop a set of formal rules to track communities in dynamic graphs.
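As an illustration of how the Jaccard index can link communities across two snapshots of a dynamic network, the sketch below pairs each earlier community with its most similar later community. The abstract does not describe the project's actual tracking rules, so the threshold, the function names, and the toy data are all assumptions.

```python
from typing import Dict, Hashable, Iterable, Set


def jaccard(a: Iterable[Hashable], b: Iterable[Hashable]) -> float:
    """Jaccard index |A ∩ B| / |A ∪ B|; defined as 1.0 for two empty sets."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def link_communities(
    prev: Dict[str, Set[int]],
    curr: Dict[str, Set[int]],
    threshold: float = 0.5,  # hypothetical cut-off, not taken from the source
) -> Dict[str, str]:
    """Map each earlier community to its best later match above the threshold."""
    links = {}
    for name, members in prev.items():
        best = max(curr, key=lambda c: jaccard(members, curr[c]), default=None)
        if best is not None and jaccard(members, curr[best]) >= threshold:
            links[name] = best
    return links


# Hypothetical snapshots: community A mostly survives, community B dissolves.
snapshot_t0 = {"A": {1, 2, 3, 4}, "B": {5, 6}}
snapshot_t1 = {"X": {1, 2, 3}, "Y": {7, 8}}
links = link_communities(snapshot_t0, snapshot_t1)
```

Here community A is linked to X because they share three of four members, while B has no counterpart above the threshold and is left unlinked.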
Streaming Matching and Edge Cover in Practice
Graph algorithms with polynomial space and time requirements often become infeasible for massive graphs with billions of edges or more. State-of-the-art approaches therefore employ approximate serial, parallel, and distributed algorithms to tackle these challenges. However, such approaches require storing the entire graph in memory and thus need access to costly computing resources such as clusters and supercomputers. In this paper, we present practical streaming approaches that use limited memory to solve two prototypical problems on massive graphs: maximum weighted matching and minimum weighted edge cover. For matching, we conduct a thorough computational study of two semi-streaming algorithms, including a recent breakthrough result by Paz and Schwartzman [SODA, 2017] that achieves a 1/(2+ε)-approximation of the weight while using O(n log W / ε) memory (here n is the number of vertices and W is the maximum edge weight). Empirically, we show that the semi-streaming algorithms produce matchings whose weight is close to that of the best 1/2-approximate offline algorithm while requiring less time and an order of magnitude less memory.
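The local-ratio idea behind one-pass semi-streaming matching can be sketched as follows. This is a simplified illustration in the spirit of the Paz and Schwartzman algorithm, not the implementation evaluated in the paper: the exact threshold rule, the function name, and the toy stream are assumptions. Each vertex keeps a potential; an edge is stored only when its weight exceeds (1+ε) times the sum of its endpoint potentials, and the stored edges are unwound in reverse order to build the matching.

```python
from collections import defaultdict
from typing import Hashable, Iterable, List, Tuple

WeightedEdge = Tuple[Hashable, Hashable, float]


def local_ratio_streaming_matching(
    edge_stream: Iterable[WeightedEdge], eps: float = 0.1
) -> List[WeightedEdge]:
    """One pass over the stream, then a greedy unwind of the kept edges."""
    potential = defaultdict(float)  # per-vertex potential, initially 0
    stack: List[WeightedEdge] = []

    for u, v, w in edge_stream:
        if u == v:
            continue  # ignore self-loops
        covered = potential[u] + potential[v]
        if w <= (1 + eps) * covered:
            continue  # edge is not heavy enough to be worth storing
        gain = w - covered
        potential[u] += gain
        potential[v] += gain
        stack.append((u, v, w))

    # Unwind: the most recently stored edges get priority.
    matched = set()
    matching: List[WeightedEdge] = []
    while stack:
        u, v, w = stack.pop()
        if u not in matched and v not in matched:
            matching.append((u, v, w))
            matched.update((u, v))
    return matching


# Hypothetical toy stream: a path a-b-c-d with a heavy middle edge.
toy_stream = [("a", "b", 1.0), ("b", "c", 3.0), ("c", "d", 1.0)]
toy_result = local_ratio_streaming_matching(toy_stream, eps=0.1)
```

In this run the edge (c, d) is never stored because the potentials of c and d already account for its weight, which is how the method keeps memory small; only the stored edges and the per-vertex potentials are retained.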
For minimum weighted edge cover, we develop three novel semi-streaming algorithms. Two of these algorithms require a single pass through the input graph, use O(n log n) memory, and provide a 2-approximation guarantee on the objective. We also leverage a relationship between approximate maximum weighted matching and approximate minimum weighted edge cover to develop a two-pass (3/2+ε)-approximate algorithm with the memory requirement of Paz and Schwartzman’s semi-streaming matching algorithm. These streaming approaches are compared against the state-of-the-art 3/2-approximate offline algorithm.
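One well-known relationship between matching and edge cover can be sketched as below; whether the paper uses exactly this construction is an assumption. A first pass records the lightest edge at each vertex, with weight μ(v). A second pass computes a matching under the reduced weights μ(u) + μ(v) − w(u, v), and every vertex left uncovered then receives its lightest edge. The offline greedy matching used here is only a stand-in for a streaming matching routine, and all names are hypothetical. The input is assumed to have no self-loops and no isolated vertices.

```python
from typing import Dict, Hashable, List, Tuple

WeightedEdge = Tuple[Hashable, Hashable, float]


def edge_cover_via_matching(edges: List[WeightedEdge]) -> List[WeightedEdge]:
    """Build an edge cover from a matching on reduced weights (illustrative)."""
    # Pass 1: lightest incident edge for every vertex.
    lightest: Dict[Hashable, WeightedEdge] = {}
    for u, v, w in edges:
        for x in (u, v):
            if x not in lightest or w < lightest[x][2]:
                lightest[x] = (u, v, w)

    def reduced(e: WeightedEdge) -> float:
        u, v, w = e
        return lightest[u][2] + lightest[v][2] - w

    # Pass 2: greedy matching on edges with positive reduced weight
    # (stand-in for a semi-streaming matching algorithm).
    candidates = sorted((e for e in edges if reduced(e) > 0), key=reduced, reverse=True)
    covered = set()
    cover: List[WeightedEdge] = []
    for u, v, w in candidates:
        if u not in covered and v not in covered:
            cover.append((u, v, w))
            covered.update((u, v))

    # Every vertex still uncovered takes its lightest incident edge.
    for x, e in lightest.items():
        if x not in covered:
            cover.append(e)
            covered.update(e[:2])
    return cover


# Hypothetical toy instance: a star with center c and three leaves.
star = [("c", "x", 1.0), ("c", "y", 2.0), ("c", "z", 3.0)]
star_cover = edge_cover_via_matching(star)
```

On the star, every edge is needed to cover its leaf, so the sketch returns all three edges; on a path with unit weights it returns the two outer edges.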
The semi-streaming matching algorithms and the novel edge cover algorithms proposed in this paper can process graphs with several billion edges in under 30 minutes using 6 GB of memory, which is at least an order of magnitude improvement over the offline (non-streaming) algorithms. For the largest graph, the best alternative offline parallel approximation algorithm (GPA+ROMA) could not finish in three hours even while employing hundreds of processors and 1 TB of memory. We also demonstrate an application of the semi-streaming algorithms by computing a matching using linearly bounded memory on intersection graphs derived from three machine learning datasets; the existing offline algorithms could not complete on one of these datasets because their memory requirement exceeded 1 TB.
