Graph Reconstruction with a Betweenness Oracle
Graph reconstruction algorithms seek to learn a hidden graph by repeatedly querying a black-box oracle for information about the graph structure. Perhaps the most well studied and applied version of the problem uses a distance oracle, which can report the shortest path distance between any pair of nodes.
We introduce and study the betweenness oracle, where bet(a, m, z) is true iff m lies on a shortest path between a and z. This oracle is strictly weaker than a distance oracle, in the sense that a betweenness query can be simulated by a constant number of distance queries, but not vice versa. Despite this, we are able to develop betweenness reconstruction algorithms that match the current state of the art for distance reconstruction, and even improve it for certain types of graphs. We obtain the following algorithms: (1) reconstruction of general graphs in O(n^2) queries, (2) reconstruction of degree-bounded graphs in ~O(n^{3/2}) queries, and (3) reconstruction of geodetic degree-bounded graphs in ~O(n) queries.
In addition to addressing a fundamental graph-theoretic problem with some natural applications, our new results shed light on some avenues for progress in the distance reconstruction problem.
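The claim that a betweenness query can be simulated by a constant number of distance queries can be illustrated with a short sketch. It assumes a connected, unweighted, undirected graph and uses the standard identity that m lies on a shortest a-z path exactly when d(a, m) + d(m, z) = d(a, z). The helper names `bfs_distances` and `make_distance_oracle` are hypothetical and only stand in for the black-box oracle; they are not taken from the paper.

```python
from collections import deque


def bfs_distances(adj, source):
    """Unweighted shortest-path distances from `source` (illustrative helper)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def make_distance_oracle(adj):
    """Wrap a known graph as a distance oracle; stands in for the hidden graph."""
    cache = {}

    def dist(u, v):
        if u not in cache:
            cache[u] = bfs_distances(adj, u)
        return cache[u][v]

    return dist


def bet(dist, a, m, z):
    """Simulate one betweenness query with three distance queries."""
    return dist(a, m) + dist(m, z) == dist(a, z)


# Example graph (an assumption for illustration): the 4-cycle 0-1-2-3-0.
adj = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
dist = make_distance_oracle(adj)
```

On this 4-cycle, both `bet(dist, 0, 1, 2)` and `bet(dist, 0, 3, 2)` hold, because the cycle has two shortest paths from 0 to 2, while `bet(dist, 0, 2, 1)` does not.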
A Simple Algorithm for Graph Reconstruction
How efficiently can we find an unknown graph using distance queries between
its vertices? We assume that the unknown graph is connected, unweighted, and
has bounded degree. The goal is to find every edge in the graph. This problem
admits a reconstruction algorithm based on multi-phase Voronoi-cell
decomposition and using distance queries.
In our work, we analyze a simple reconstruction algorithm. We show that, on
random -regular graphs, our algorithm uses distance
queries. As by-products, we can reconstruct those graphs using
queries to an all-distances oracle or queries to a betweenness
oracle, and we bound the metric dimension of those graphs by .
Our reconstruction algorithm has a very simple structure, and is highly
parallelizable. On general graphs of bounded degree, our reconstruction
algorithm has subquadratic query complexity.
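For orientation, the distance-query model can be sketched with the trivial quadratic baseline: query every pair of vertices and keep the pairs at distance 1. This is not the algorithm analyzed in the abstract above, which is subquadratic; it only shows what "finding every edge with distance queries" means. The class `DistanceOracle` and the function `reconstruct_brute_force` are hypothetical names introduced for illustration.

```python
from collections import deque
from itertools import combinations


class DistanceOracle:
    """Illustrative oracle over a known graph that counts the queries it serves."""

    def __init__(self, adj):
        self._adj = adj
        self.queries = 0

    def query(self, u, v):
        self.queries += 1
        # Plain BFS from u until v is reached.
        dist = {u: 0}
        queue = deque([u])
        while queue:
            x = queue.popleft()
            if x == v:
                return dist[x]
            for y in self._adj[x]:
                if y not in dist:
                    dist[y] = dist[x] + 1
                    queue.append(y)
        raise ValueError("graph is assumed to be connected")


def reconstruct_brute_force(vertices, oracle):
    """Baseline: one query per unordered pair; an edge is a pair at distance 1."""
    return {
        frozenset((u, v))
        for u, v in combinations(vertices, 2)
        if oracle.query(u, v) == 1
    }


# Example graph (an assumption for illustration): the path 0-1-2-3.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
oracle = DistanceOracle(adj)
edges = reconstruct_brute_force(sorted(adj), oracle)
```

The baseline always spends n(n-1)/2 queries, which is the cost that subquadratic reconstruction algorithms aim to beat on bounded-degree graphs.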
Optimal distance query reconstruction for graphs without long induced cycles
Let be an -vertex connected graph of maximum degree .
Given access to and an oracle that given two vertices , returns
the shortest path distance between and , how many queries are needed to
reconstruct ? We give a simple deterministic algorithm to reconstruct trees
using distance queries and show that even
randomised algorithms need to use at least
queries in expectation. The best previous lower bound was an
information-theoretic lower bound of . Our lower
bound also extends to related query models including distance queries for
phylogenetic trees, membership queries for learning partitions and path queries
in directed trees.
We extend our deterministic algorithm to reconstruct graphs without induced
cycles of length at least using queries, which
includes various graph classes of interest such as chordal graphs, permutation
graphs and AT-free graphs. Since the previously best known randomised algorithm
for chordal graphs uses queries in expectation, we both
get rid of the randomness and get the optimal dependency in for chordal
graphs and various other graph classes.
Finally, we build on an algorithm of Kannan, Mathieu, and Zhou [ICALP, 2015]
to give a randomised algorithm for reconstructing graphs of treelength
using queries in expectation. Comment: 35 pages
Tight query complexity bounds for learning graph partitions
Given a partition of a graph into connected components, the membership oracle
asserts whether any two vertices of the graph lie in the same component or not.
We prove that for , learning the components of an -vertex
hidden graph with components requires at least
membership queries. Our result improves on the best known information-theoretic
bound of queries, and exactly matches the query complexity of
the algorithm introduced by [Reyzin and Srivastava, 2007] for this problem.
Additionally, we introduce an oracle, with access to which one can learn the
number of components of in asymptotically fewer queries than learning the
full partition, thus answering another question posed by the same authors.
Lastly, we introduce a more applicable version of this oracle, and prove
asymptotically tight bounds of queries for both learning
and verifying an -edge hidden graph using it. Comment: Accepted for presentation at the 35th Annual Conference of Learning
Theory, 202
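The membership-oracle model described above can be made concrete with a simple representative-based strategy: keep one representative per component found so far and compare each new vertex against the representatives until one matches. This is a sketch under stated assumptions and is not claimed to be the exact procedure of the cited algorithm. The names `learn_partition`, `same_component`, and the `hidden` labelling are hypothetical.

```python
def learn_partition(vertices, same_component):
    """Learn the components using only pairwise membership queries.

    Each vertex is compared with the first member of every component found
    so far, so it costs at most one query per known component.
    """
    components = []
    queries = 0
    for v in vertices:
        for comp in components:
            queries += 1
            if same_component(comp[0], v):
                comp.append(v)
                break
        else:
            # No representative matched: v starts a new component.
            components.append([v])
    return components, queries


# Hidden partition used only to answer queries (an assumption for illustration).
hidden = {0: "A", 1: "A", 2: "B", 3: "A", 4: "B"}


def same_component(u, v):
    return hidden[u] == hidden[v]


components, queries = learn_partition(range(5), same_component)
```

With k components, no vertex is ever compared against more than k representatives, so the total number of queries is at most n times k.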
Phantom cascades: The effect of hidden nodes on information diffusion
Research on information diffusion generally assumes complete knowledge of the
underlying network. However, in the presence of factors such as increasing
privacy awareness, restrictions on application programming interfaces (APIs)
and sampling strategies, this assumption rarely holds in the real world which
in turn leads to an underestimation of the size of information cascades. In
this work we study the effect of hidden network structure on information
diffusion processes. We characterise information cascades through activation
paths traversing visible and hidden parts of the network. We quantify diffusion
estimation error while varying the amount of hidden structure in five empirical
and synthetic network datasets and demonstrate the effect of topological
properties on this error. Finally, we suggest practical recommendations for
practitioners and propose a model to predict the cascade size with minimal
information regarding the underlying network. Comment: Preprint submitted to Elsevier Computer Communication
Bipartite Graph Pre-training for Unsupervised Extractive Summarization with Graph Convolutional Auto-Encoders
Pre-trained sentence representations are crucial for identifying significant
sentences in unsupervised document extractive summarization. However, the
traditional two-step paradigm of pre-training and sentence ranking creates a
gap due to differing optimization objectives. To address this issue, we argue
that utilizing pre-trained embeddings derived from a process specifically
designed to optimize cohesive and distinctive sentence representations helps
rank significant sentences. To do so, we propose a novel graph pre-training
auto-encoder to obtain sentence embeddings by explicitly modelling
intra-sentential distinctive features and inter-sentential cohesive features
through sentence-word bipartite graphs. These pre-trained sentence
representations are then utilized in a graph-based ranking algorithm for
unsupervised summarization. Our method achieves leading performance among
unsupervised summarization frameworks by providing summary-worthy sentence
representations. It surpasses heavy BERT- or RoBERTa-based sentence
representations in downstream tasks. Comment: Accepted by the 2023 Conference on Empirical Methods in Natural
Language Processing (EMNLP 2023)
Mapping Networks via Parallel kth-Hop Traceroute Queries
δ(v,w), which return the name of the kth vertex on a shortest path from v to w, where δ(v,w) is the distance between v and w, that is, the number of edges in a shortest path from v to w. The traceroute command is often used for network mapping applications and the study of the connectivity of networks, and it has been studied theoretically with respect to the biases it introduces for network mapping when only a subset of nodes in the network can be the source of traceroute queries. In this paper, we provide efficient network mapping algorithms that are based on kth-hop traceroute queries. Our results include an algorithm that runs in a constant number of parallel rounds with a subquadratic number of queries under reasonable assumptions about the sampling coverage of the nodes that may issue kth-hop traceroute queries. In addition, we introduce a number of new algorithmic techniques, including a high-probability parametric parallelization of a graph clustering technique of Thorup and Zwick, which may be of independent interest.
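A kth-hop traceroute query as described above can be simulated on a known graph to see what a single answer looks like. The sketch assumes an undirected, unweighted, connected graph, counts hops so that k = 0 returns v itself and k equal to the distance between v and w returns w, and breaks ties between shortest paths by taking the smallest neighbor label. These conventions and the function names are assumptions for illustration, not details from the paper.

```python
from collections import deque


def bfs_distances(adj, source):
    """Unweighted shortest-path distances from `source` (illustrative helper)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for x in adj[u]:
            if x not in dist:
                dist[x] = dist[u] + 1
                queue.append(x)
    return dist


def kth_hop(adj, v, w, k):
    """Return the kth vertex on one shortest path from v to w."""
    dist_to_w = bfs_distances(adj, w)
    if not 0 <= k <= dist_to_w[v]:
        raise ValueError("k must lie between 0 and the distance from v to w")
    current = v
    for _ in range(k):
        # Step to a neighbor that is exactly one hop closer to w.
        current = min(
            u for u in adj[current] if dist_to_w[u] == dist_to_w[current] - 1
        )
    return current


# Example graph (an assumption for illustration): the path 0-1-2-3.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
first_hop = kth_hop(adj, 0, 3, 1)
last_hop = kth_hop(adj, 0, 3, 3)
```

A mapping algorithm only sees answers like `first_hop` and `last_hop`; the graph itself stays hidden behind the query interface.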