
    Exact Covers via Determinants

    Given a k-uniform hypergraph on n vertices, partitioned into k equal parts such that every hyperedge includes one vertex from each part, the k-dimensional matching problem asks whether there is a disjoint collection of hyperedges that covers all vertices. We show it can be solved by a randomized polynomial-space algorithm in time O*(2^(n(k-2)/k)). The O*() notation hides factors polynomial in n and k. When we drop the partition constraint and permit arbitrary hyperedges of cardinality k, we obtain the exact cover by k-sets problem. We show it can be solved by a randomized polynomial-space algorithm in time O*(c_k^n), where c_3 = 1.496, c_4 = 1.642, c_5 = 1.721, and provide a general bound for larger k. Both results substantially improve on the previous best algorithms for these problems, especially for small k, and follow from the new observation that Lovász's perfect matching detection via determinants (1979) admits an embedding in the recently proposed inclusion-exclusion counting scheme for set covers, despite its inability to count the perfect matchings.
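    The determinant trick the abstract builds on can be illustrated in its simplest setting. Below is a minimal sketch, assuming a bipartite graph rather than the paper's k-partite hypergraphs, of Lovász-style randomized matching detection: substitute random field elements into the Edmonds matrix and test whether its determinant is nonzero modulo a prime. A nonzero determinant certifies a perfect matching; a zero determinant is wrong only with small probability (Schwartz-Zippel).

```python
# A minimal sketch (not the paper's algorithm) of Lovász-style randomized
# perfect-matching detection for a *bipartite* graph: substitute random
# field elements into the Edmonds matrix and test whether its determinant
# is nonzero modulo a prime.
import random

P = 2_147_483_647  # a large prime; the false-negative probability is about n / P

def det_mod_p(mat, p=P):
    """Determinant of a square matrix over GF(p) via Gaussian elimination."""
    n = len(mat)
    a = [row[:] for row in mat]
    det = 1
    for col in range(n):
        pivot = next((r for r in range(col, n) if a[r][col] % p), None)
        if pivot is None:
            return 0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det
        det = det * a[col][col] % p
        inv = pow(a[col][col], p - 2, p)          # modular inverse of the pivot
        for r in range(col + 1, n):
            factor = a[r][col] * inv % p
            for c in range(col, n):
                a[r][c] = (a[r][c] - factor * a[col][c]) % p
    return det % p

def has_perfect_matching(n, edges):
    """edges: pairs (i, j) between left vertices 0..n-1 and right vertices 0..n-1."""
    m = [[0] * n for _ in range(n)]
    for i, j in edges:
        m[i][j] = random.randrange(1, P)   # one random nonzero entry per edge
    # nonzero determinant => a matching exists (always);
    # zero determinant    => "no matching", correct with high probability
    return det_mod_p(m) != 0

# Example: a 3 x 3 bipartite graph with the perfect matching {0-0, 1-2, 2-1}.
print(has_perfect_matching(3, [(0, 0), (1, 2), (2, 1), (0, 1)]))  # True (w.h.p.)
```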

    Exact Tests via Complete Enumeration: A Distributed Computing Approach

    The analysis of categorical data often leads to the analysis of a contingency table. For large samples, asymptotic approximations are sufficient when calculating p-values, but for small samples the tests can be unreliable. In these situations an exact test should be considered, which bases the test on the exact distribution of the test statistic. Sampling techniques can be used to estimate the distribution. Alternatively, the distribution can be found by complete enumeration. A new algorithm is developed that enables a model to be defined by a model matrix; all tables that satisfy the model are then found. This provides a more efficient enumeration mechanism for complex models and extends the range of models that can be tested. The technique can lead to large calculations, and a distributed version of the algorithm is developed that enables a number of machines to work efficiently on the same problem.
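    As a small, concrete instance of an exact test via complete enumeration (not the thesis's model-matrix algorithm), the sketch below enumerates every 2x2 table with the observed margins, weights each by its hypergeometric probability under the null, and sums the probabilities of all tables at least as extreme as the observed one.

```python
# A toy exact test by complete enumeration for a 2x2 contingency table
# (two-sided Fisher's exact test): enumerate all tables with the same
# margins and accumulate the null probabilities of the extreme ones.
from math import comb

def exact_p_value(table):
    (a, b), (c, d) = table
    r1, r2 = a + b, c + d          # row margins
    c1 = a + c                     # first column margin
    n = r1 + r2

    def prob(x):
        # probability of the table with x in the top-left cell, margins fixed
        return comb(r1, x) * comb(r2, c1 - x) / comb(n, c1)

    p_obs = prob(a)
    p_value = 0.0
    # complete enumeration: every feasible value of the top-left cell
    for x in range(max(0, c1 - r2), min(r1, c1) + 1):
        if prob(x) <= p_obs + 1e-12:   # "at least as extreme" as observed
            p_value += prob(x)
    return p_value

# Example: the classic tea-tasting table.
print(exact_p_value([[3, 1], [1, 3]]))   # ~0.486
```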

    Communication Efficient Algorithms for Generating Massive Networks

    Massive complex systems are prevalent throughout all of our lives, from biological systems such as the human genome to technological networks such as Facebook or Twitter. Rapid advances in technology allow us to gather more and more data connected to these systems. Analyzing and extracting this huge amount of information is a crucial task for a variety of scientific disciplines. A common abstraction for handling complex systems is networks (graphs) made up of entities and their relationships. For example, we can represent wireless ad hoc networks in terms of nodes and their connections with each other. We then identify the nodes as vertices and their connections as edges between the vertices. This abstraction allows us to develop algorithms that are independent of the underlying domain. Designing algorithms for massive networks is a challenging task that requires thorough analysis and experimental evaluation. A major hurdle for this task is the scarcity of publicly available large-scale datasets. To approach this issue, we can make use of network generators [21]. These generators allow us to produce synthetic instances that exhibit properties found in many real-world networks. In this thesis we develop a set of novel graph generators with a focus on scalability. In particular, we cover the classic Erdős–Rényi model, random geometric graphs and random hyperbolic graphs. These models represent different real-world systems, from the aforementioned wireless ad-hoc networks [40] to social networks [44]. We ensure scalability by making use of pseudorandomization via hash functions and redundant computations. The resulting network generators are communication agnostic, i.e., they require no communication. This allows us to generate massive instances of up to 2^43 vertices and 2^47 edges in less than 22 minutes on 32,768 processors. In addition to proving theoretical bounds for each generator, we perform an extensive experimental evaluation. We cover both their sequential performance and their scaling behavior. We are able to show that our algorithms are competitive with state-of-the-art implementations found in network analysis libraries. Additionally, our generators exhibit near-optimal scaling behavior for large instances. Finally, we show that pseudorandomization has little to no measurable impact on the quality of our generated instances.
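    The hash-based pseudorandomization idea can be sketched for the simplest of the covered models. The toy G(n, p) generator below, a minimal sketch under assumed names and a block distribution of vertices (not the thesis's implementation), re-derives the coin flip for every edge from a hash of (seed, u, v), so two processing elements that both touch a pair agree on it with no communication, at the cost of redundant computation.

```python
# A minimal sketch of "pseudorandomization via hash functions": every
# processing element (PE) recomputes the randomness for an edge from a hash
# of (seed, u, v), so PEs that both consider the pair {u, v} make the same
# coin flip without exchanging any messages.
import hashlib

def edge_coin(seed, u, v, p):
    """Deterministic 'random' decision for the undirected pair {u, v}."""
    u, v = min(u, v), max(u, v)
    h = hashlib.blake2b(f"{seed}:{u}:{v}".encode(), digest_size=8).digest()
    x = int.from_bytes(h, "big") / 2**64     # uniform in [0, 1), identical on every PE
    return x < p

def local_edges(seed, n, p, rank, num_pes):
    """Edges incident to the vertices owned by this PE (block distribution)."""
    lo = rank * n // num_pes
    hi = (rank + 1) * n // num_pes
    return [(u, v) for u in range(lo, hi)
                   for v in range(n) if u != v and edge_coin(seed, u, v, p)]

# Two PEs generating a 6-vertex G(n, 0.3) graph independently; the decisions
# they make for shared pairs agree by construction, with no communication.
print(local_edges(42, 6, 0.3, rank=0, num_pes=2))
print(local_edges(42, 6, 0.3, rank=1, num_pes=2))
```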

    Austrian High-Performance-Computing meeting (AHPC2020)

    This booklet is a collection of abstracts presented at the AHPC conference.

    Finding Statistically Significant Communities in Networks

    Community structure is one of the main structural features of networks, revealing both their internal organization and the similarity of their elementary units. Despite the large variety of methods proposed to detect communities in graphs, there is a great need for multi-purpose techniques able to handle different types of datasets and the subtleties of community structure. In this paper we present OSLOM (Order Statistics Local Optimization Method), the first method capable of detecting clusters in networks while accounting for edge directions, edge weights, overlapping communities, hierarchies and community dynamics. It is based on the local optimization of a fitness function expressing the statistical significance of clusters with respect to random fluctuations, which is estimated with tools of Extreme and Order Statistics. OSLOM can be used alone or as a refinement procedure for partitions/covers delivered by other techniques. We have also implemented sequential algorithms combining OSLOM with other fast techniques, so that the community structure of very large networks can be uncovered. Our method has performance comparable to the best existing algorithms on artificial benchmark graphs. Several applications to real networks are shown as well. OSLOM is implemented in freely available software (http://www.oslom.org), and we believe it will be a valuable tool in the analysis of networks.
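    As a rough illustration of scoring cluster membership against random fluctuations (a toy score, not OSLOM's actual fitness function or its order-statistics machinery), the sketch below computes the probability, under a configuration-model-style null, that a vertex of degree k sends at least k_in of its edges into a candidate community; a small tail probability marks the attachment as statistically significant.

```python
# Toy significance score for a vertex's attachment to a candidate community:
# hypergeometric tail probability that, when the vertex's k stubs are matched
# uniformly among two_m stub endpoints (m_c of which lie in the community),
# at least k_in of them land inside the community.
from math import comb

def attachment_p_value(k_in, k, m_c, two_m):
    """P[X >= k_in] under the hypergeometric null described above."""
    total = comb(two_m, k)
    tail = sum(comb(m_c, x) * comb(two_m - m_c, k - x)
               for x in range(k_in, min(k, m_c) + 1))
    return tail / total

# A vertex of degree 6 sending 5 edges into a community holding 20 of the
# graph's 200 stub endpoints: very unlikely under the null.
print(attachment_p_value(k_in=5, k=6, m_c=20, two_m=200))  # ~3e-5
```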

    Learning with mixtures of trees

    Get PDF
    Thesis (Ph.D.) -- Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1999. Includes bibliographical references (p. 125-129). By Marina Meilă-Predoviciu. Ph.D.