Optimal Query Complexity for Reconstructing Hypergraphs
In this paper we consider the problem of reconstructing a hidden weighted
hypergraph of constant rank using additive queries. We prove the following: Let
$G$ be a weighted hidden hypergraph of constant rank with $n$ vertices and $m$
hyperedges. For any $m$ there exists a non-adaptive algorithm that finds the
edges of $G$ and their weights using $O(m\log n/\log m)$
additive queries. This solves the open problem in [S. Choi, J. H. Kim. Optimal
Query Complexity Bounds for Finding Graphs. {\em STOC}, 749--758,~2008].
When the weights of the hypergraph are integers whose magnitude is bounded in
terms of the rank $r$ of the hypergraph (and therefore, in particular, for
unweighted hypergraphs), there exists a non-adaptive algorithm that finds the
edges of $G$ and their weights using a number of additive queries matching the
information-theoretic lower bound.
By the information-theoretic lower bound, the above query complexities are tight.
ALGORITHMS FOR MASSIVE, EXPENSIVE, OR OTHERWISE INCONVENIENT GRAPHS
A long-standing assumption common in algorithm design is that any part of the input is accessible at any time for unit cost. However, as we work with increasingly large data sets, or as we build smaller devices, we must revisit this assumption. In this thesis, I present some of my work on graph algorithms designed for circumstances where traditional assumptions about inputs do not apply. 1. Classical graph algorithms require direct access to the input graph, and this is not feasible when the graph is too large to fit in memory. For computation on massive graphs we consider the dynamic streaming graph model. Given an input graph defined as a stream of edge insertions and deletions, our goal is to approximate properties of this graph using space that is sublinear in the size of the stream. In this thesis, I present algorithms for approximating vertex connectivity, hypergraph edge connectivity, maximum coverage, unique coverage, and temporal connectivity in graph streams. 2. In certain applications the input graph is not explicitly represented, but its edges may be discovered via queries which require costly computation or measurement. I present two open-source systems which solve real-world problems via graph algorithms that may access their inputs only through costly edge queries. Mesh is a memory manager which compacts memory efficiently by finding an approximate graph matching subject to stringent time and edge-query restrictions. PathCache is an efficiently scalable network measurement platform that outperforms the current state of the art.
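The dynamic streaming model described above can be illustrated with a minimal sketch (not one of the thesis's algorithms): edges arrive as insertions and deletions, and the algorithm maintains only a small summary, here vertex degrees in O(n) space, which is sublinear in the length of the stream.

```python
# Illustrative sketch of the dynamic streaming graph model: edges arrive
# as insertions ('+') and deletions ('-'), and we maintain a summary
# using O(n) words of space, independent of the stream's length.

def degree_sketch(n, stream):
    """Process a dynamic edge stream over vertices 0..n-1.

    stream yields (op, u, v) with op in {'+', '-'}.
    Returns the degree of every vertex in the final graph.
    """
    deg = [0] * n  # O(n) space, regardless of how long the stream is
    for op, u, v in stream:
        delta = 1 if op == '+' else -1
        deg[u] += delta
        deg[v] += delta
    return deg

# A short stream: insert three edges, then delete one of them.
stream = [('+', 0, 1), ('+', 1, 2), ('+', 0, 2), ('-', 0, 1)]
print(degree_sketch(3, stream))  # [1, 1, 2]
```

Degrees are an easy property to track; the connectivity and coverage problems listed above require substantially more sophisticated sketches under the same space constraint.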
Supervised Hypergraph Reconstruction
We study an issue commonly seen with graph data analysis: many real-world
complex systems involving high-order interactions are best encoded by
hypergraphs; however, their datasets often end up being published or studied
only in the form of their projections (with dyadic edges). To understand this
issue, we first establish a theoretical framework to characterize this issue's
implications and worst-case scenarios. The analysis motivates our formulation
of the new task, supervised hypergraph reconstruction: reconstructing a
real-world hypergraph from its projected graph, with the help of some existing
knowledge of the application domain.
To reconstruct hypergraph data, we start by analyzing hyperedge distributions
in the projection, based on which we create a framework containing two modules:
(1) to handle the enormous search space of potential hyperedges, we design a
sampling strategy with efficacy guarantees that significantly narrows the space
to a smaller set of candidates; (2) to identify hyperedges from the candidates,
we further design a hyperedge classifier in two effective variants that
capture structural features in the projection. Extensive experiments validate
our claims, approach, and extensions. Remarkably, our approach outperforms all
baselines by an order of magnitude in accuracy on hard datasets. Our code and
data can be downloaded from bit.ly/SHyRe.
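The projection the paper sets out to invert can be sketched as a clique expansion: each hyperedge is replaced by the dyadic edges among its vertices, which is exactly where higher-order structure is lost. The example below (illustrative only, not the paper's code) shows two different hypergraphs with the same projection, which is why reconstruction from the projection alone is ambiguous and supervision helps.

```python
# Minimal sketch of projecting a hypergraph onto dyadic (graph) edges.
from itertools import combinations

def project(hyperedges):
    """Return the set of 2-element edges induced by each hyperedge."""
    edges = set()
    for he in hyperedges:
        edges.update(frozenset(p) for p in combinations(sorted(he), 2))
    return edges

# A single 3-vertex hyperedge and a triangle of dyadic edges project
# to the same graph -- the projection cannot distinguish them.
h1 = [{1, 2, 3}]
h2 = [{1, 2}, {2, 3}, {1, 3}]
print(project(h1) == project(h2))  # True
```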
Finding Weighted Graphs by Combinatorial Search
We consider the problem of finding the edges of a hidden weighted graph using a
certain type of query. Let $G$ be a weighted graph with $n$ vertices. In the
most general setting, the $n$ vertices are known and no other information about $G$
is given. The problem is to find all edges of $G$ and their weights using
additive queries, where, for an additive query, one chooses a set of vertices
and asks for the sum of the weights of the edges with both ends in the set. This
model has been used extensively in bioinformatics, including in genome sequencing.
Extending recent results of Bshouty and Mazzawi, and of Choi and Kim, we present a
polynomial-time randomized algorithm that finds the hidden weighted graph when
the number of edges in $G$ is known to be at most $m$ and the weight $w(e)$
of each edge $e$ satisfies $\alpha \leq |w(e)| \leq \beta$ for fixed constants
$\alpha, \beta > 0$. The query complexity of the algorithm is $O(m\log n/\log m)$, which is optimal up to a constant factor.
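The additive-query model described above can be sketched as follows (the oracle only; the reconstruction algorithm itself is far more involved): a query names a vertex set S and receives the total weight of the hidden edges with both endpoints in S.

```python
# Hedged sketch of an additive-query oracle over a hidden weighted graph.
# The hidden graph is represented as a dict mapping frozenset({u, v}) to
# the edge weight; the algorithm being simulated would see only answers.

def additive_query(weights, S):
    """Return the total weight of edges with both endpoints in S."""
    S = set(S)
    return sum(w for e, w in weights.items() if e <= S)  # e <= S: subset test

# Hidden graph: edges (0,1) weight 2, (1,2) weight 5, (0,3) weight 1.
hidden = {frozenset({0, 1}): 2, frozenset({1, 2}): 5, frozenset({0, 3}): 1}
print(additive_query(hidden, {0, 1, 2}))  # 7: edges (0,1) and (1,2)
print(additive_query(hidden, {0, 3}))     # 1: edge (0,3) only
```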
Computing Exact Minimum Cuts Without Knowing the Graph
We give query-efficient algorithms for the global min-cut and the s-t cut problem in unweighted, undirected graphs. Our oracle model is inspired by the submodular function minimization problem:
on query $S \subseteq V$, the oracle returns the size of the cut between $S$ and $V \setminus S$.
We provide algorithms computing an exact minimum $s$-$t$ cut in $G$ with $\tilde{O}(n^{5/3})$ queries, and computing an exact global minimum cut of $G$ with only $\tilde{O}(n)$ queries (while learning the graph requires $\tilde{\Theta}(n^2)$ queries).
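The cut-query oracle from the abstract can be sketched as follows (the oracle only, for a hidden unweighted graph; the query-efficient algorithms themselves are not shown): on query S, return the number of edges crossing between S and its complement.

```python
# Minimal sketch of a cut-query oracle over a hidden unweighted graph.

def cut_query(edges, S):
    """Return the number of edges with exactly one endpoint in S."""
    S = set(S)
    return sum(1 for u, v in edges if (u in S) != (v in S))

# Hidden 4-cycle 0-1-2-3-0.
cycle = [(0, 1), (1, 2), (2, 3), (3, 0)]
print(cut_query(cycle, {0}))     # 2: edges (0,1) and (3,0) cross
print(cut_query(cycle, {0, 1}))  # 2: edges (1,2) and (3,0) cross
```

Note that a cut query is a special case of evaluating the (submodular) cut function, which is why the oracle model is naturally compared to submodular function minimization.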
Integration of Heterogeneous Databases: Discovery of Meta-Information and Maintenance of Schema-Restructuring Views
In today's networked world, information is widely distributed across many independent databases in heterogeneous formats. Integrating such information is a difficult task and has been addressed by several projects. However, previous integration solutions, such as the EVE-Project, have several shortcomings. Database contents and structure change frequently, and users often have incomplete information about the data content and structure of the databases they use. When information from several such insufficiently described sources is to be extracted and integrated, two problems have to be solved: How can we discover the structure and contents of and interrelationships among unknown databases, and how can we provide durable integration views over several such databases? In this dissertation, we have developed solutions for those key problems in information integration. The first part of the dissertation addresses the fact that knowledge about the interrelationships between databases is essential for any attempt at solving the information integration problem. We present an algorithm, called FIND2, based on the clique-finding problem in graphs and k-uniform hypergraphs, to discover redundancy relationships between two relations. Furthermore, the algorithm is enhanced by heuristics that significantly reduce the search space when necessary. Extensive experimental studies on the algorithm, both with and without heuristics, illustrate its effectiveness on a variety of real-world data sets. The second part of the dissertation addresses the durable view problem and presents the first algorithm for incremental view maintenance in schema-restructuring views. Such views are essential for the integration of heterogeneous databases. They are typically defined in schema-restructuring query languages like SchemaSQL, which can transform schema into data and vice versa, making traditional view maintenance based on differential queries impossible.
Based on an existing algebra for SchemaSQL, we present an update propagation algorithm that propagates updates along the query algebra tree and prove its correctness. We also propose optimizations of our algorithm and present experimental results showing its benefits over view recomputation.
Quantum machine learning: a classical perspective
Recently, increased computational power and data availability, as well as
algorithmic advances, have led machine learning techniques to impressive
results in regression, classification, data-generation and reinforcement
learning tasks. Despite these successes, the proximity to the physical limits
of chip fabrication alongside the increasing size of datasets are motivating a
growing number of researchers to explore the possibility of harnessing the
power of quantum computation to speed up classical machine learning algorithms.
Here we review the literature in quantum machine learning and discuss
perspectives for a mixed readership of classical machine learning and quantum
computation experts. Particular emphasis will be placed on clarifying the
limitations of quantum algorithms, how they compare with their best classical
counterparts and why quantum resources are expected to provide advantages for
learning problems. Learning in the presence of noise and certain
computationally hard problems in machine learning are identified as promising
directions for the field. Practical questions, like how to upload classical
data into quantum form, will also be addressed.