On the Properties of Gromov Matrices and their Applications in Network Inference
The spanning tree heuristic is a commonly adopted procedure in network
inference and estimation. It allows one to generalize an inference method
developed for trees, which is usually based on a statistically rigorous
approach, to a heuristic procedure for general graphs by (usually randomly)
choosing a spanning tree in the graph to apply the approach developed for
trees. However, the number of spanning trees in a dense graph is intractably
large. In this paper, we represent a weighted tree with a matrix, which we call
a Gromov matrix. We propose a method that constructs a family of Gromov
matrices using convex combinations, which can be used for inference and
estimation instead of a randomly selected spanning tree. This procedure
increases the size of the candidate set and hence enhances the performance of
the classical spanning tree heuristic. At the same time, our scheme relies only
on simple algebraic constructions using matrices, and hence remains
computationally tractable. We discuss some applications in network inference
and estimation to demonstrate the usefulness of the proposed method.
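The convex-combination idea can be sketched in a few lines. The abstract does not define the matrix entries, so this sketch assumes they are Gromov products with respect to a base vertex s, that is (d(s,u) + d(s,v) - d(u,v)) / 2 under the tree metric d. The function names, the 4-cycle example, and the mixing weight are all hypothetical.

```python
def tree_distances(adj, src):
    """Path lengths from src to every vertex of a weighted tree."""
    dist = {src: 0.0}
    stack = [src]
    while stack:
        u = stack.pop()
        for v, w in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + w
                stack.append(v)
    return dist


def gromov_matrix(edges, n, base):
    """Matrix of Gromov products w.r.t. `base` (assumed entry definition)."""
    adj = {i: [] for i in range(n)}
    for u, v, w in edges:
        adj[u].append((v, w))
        adj[v].append((u, w))
    d = [tree_distances(adj, s) for s in range(n)]
    return [[0.5 * (d[base][u] + d[base][v] - d[u][v]) for v in range(n)]
            for u in range(n)]


def convex_combination(mats, weights):
    """Entrywise convex combination of equally sized matrices."""
    assert all(w >= 0 for w in weights) and abs(sum(weights) - 1.0) < 1e-12
    n = len(mats[0])
    return [[sum(w * m[i][j] for m, w in zip(mats, weights)) for j in range(n)]
            for i in range(n)]


# Two spanning trees of the unit-weight 4-cycle 0-1-2-3-0 (hypothetical example).
tree_a = [(0, 1, 1.0), (1, 2, 1.0), (2, 3, 1.0)]   # drops edge 3-0
tree_b = [(0, 1, 1.0), (0, 3, 1.0), (3, 2, 1.0)]   # drops edge 1-2
g_a = gromov_matrix(tree_a, n=4, base=0)
g_b = gromov_matrix(tree_b, n=4, base=0)
g_mix = convex_combination([g_a, g_b], [0.5, 0.5])
```

Sweeping the weight between 0 and 1 yields a one-parameter family of matrices interpolating between the two trees, which is the sense in which the candidate set grows beyond individual spanning trees.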
Estimating Infection Sources in Networks Using Partial Timestamps
We study the problem of identifying infection sources in a network based on
the network topology, and a subset of infection timestamps. In the case of a
single infection source in a tree network, we derive the maximum likelihood
estimator of the source and the unknown diffusion parameters. We then introduce
a new heuristic involving an optimization over a parametrized family of Gromov
matrices to develop a single source estimation algorithm for general graphs.
Simulations demonstrate that our approach achieves better estimation accuracy
than the breadth-first search tree heuristic commonly adopted in the
literature and several other benchmark algorithms, even though these
benchmarks require more information, such as the diffusion parameters. We next develop a
multiple sources estimation algorithm for general graphs, which first
partitions the graph into source candidate clusters, and then applies our
single source estimation algorithm to each cluster. We show that if the graph
is a tree, then each source candidate cluster contains at least one source.
Simulations using synthetic and real networks, and experiments using real-world
data suggest that our proposed algorithms are able to estimate the true
infection source(s) to within a small number of hops with a small portion of
the infection timestamps being observed.
Comment: 15 pages, 15 figures, accepted by IEEE Transactions on Information
Forensics and Security
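To make the problem setting concrete, the following is a minimal sketch of single-source estimation from partial timestamps. It is not the maximum likelihood estimator or the Gromov-matrix method of the abstract. It is a simple stand-in that assumes infection times grow roughly linearly with breadth-first-search hop distance from the source, and scores each candidate by a least-squares residual. All function names and the example graph are hypothetical.

```python
from collections import deque


def hop_distances(adj, src):
    """Breadth-first-search hop counts from src to every reachable node."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def fit_residual(ds, ts):
    """Squared error of the best line t = t0 + mu * d with mu >= 0."""
    n = len(ds)
    mean_d = sum(ds) / n
    mean_t = sum(ts) / n
    sdd = sum((d - mean_d) ** 2 for d in ds)
    sdt = sum((d - mean_d) * (t - mean_t) for d, t in zip(ds, ts))
    # Clamp the slope: infection times should not decrease with distance.
    slope = max(sdt / sdd, 0.0) if sdd > 0 else 0.0
    return sum((t - mean_t - slope * (d - mean_d)) ** 2 for d, t in zip(ds, ts))


def estimate_source(adj, observed):
    """Return the candidate whose hop distances best explain the timestamps."""
    nodes = list(observed)
    best_node, best_res = None, float("inf")
    for s in adj:
        dist = hop_distances(adj, s)
        res = fit_residual([dist[v] for v in nodes], [observed[v] for v in nodes])
        if res < best_res:
            best_node, best_res = s, res
    return best_node


# Path graph 0-1-2-3-4 with the infection started at node 2 (toy data);
# only three of the five timestamps are observed.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
partial_timestamps = {0: 2.0, 1: 1.0, 4: 2.0}
estimated = estimate_source(path, partial_timestamps)
```

On a general graph the hop distances come from one breadth-first search tree per candidate, which is the kind of single-tree choice that the abstract's parametrized family of Gromov matrices is meant to improve on.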
D$^2$: Decentralized Training over Decentralized Data
When training a machine learning model using multiple workers, each of which
collects data from its own data sources, it is most useful if the
data collected by different workers can be {\em unique} and {\em different}.
Ironically, recent analysis of decentralized parallel stochastic gradient
descent (D-PSGD) relies on the assumption that the data hosted on different
workers are {\em not too different}. In this paper, we ask the question: {\em
Can we design a decentralized parallel stochastic gradient descent algorithm
that is less sensitive to the data variance across workers?} We
present D$^2$, a novel decentralized parallel stochastic gradient descent
algorithm designed for large data variance among workers (imprecisely,
"decentralized" data). The core of D$^2$ is a variance reduction extension of
the standard D-PSGD algorithm, which improves the convergence rate from
$O\left(\frac{\sigma}{\sqrt{nT}} + \frac{(n\zeta^2)^{1/3}}{T^{2/3}}\right)$ to
$O\left(\frac{\sigma}{\sqrt{nT}}\right)$, where $\zeta^2$
denotes the variance among data on different workers. As a result, D$^2$ is
robust to data variance among workers. We empirically evaluated D$^2$ on image
classification tasks where each worker has access to only the data of a limited
set of labels, and find that D$^2$ significantly outperforms D-PSGD.
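The effect of data variance among workers can be seen on a toy problem. The abstract does not state the update rules, so the sketch below assumes the commonly cited forms: D-PSGD as x_{t+1} = W x_t - gamma * g_t, and the variance reduction extension as x_{t+1} = W (2 x_t - x_{t-1} - gamma * g_t + gamma * g_{t-1}) after one plain first step. Exact gradients replace stochastic ones for clarity, and the ring topology, local objectives, and step size are hypothetical choices.

```python
def mix(W, x):
    """One gossip step: each worker takes a weighted average of its neighbours."""
    n = len(x)
    return [sum(W[i][j] * x[j] for j in range(n)) for i in range(n)]


def grad(x, b):
    """Local gradients of f_i(x) = 0.5 * (x - b_i)^2, one scalar per worker."""
    return [xi - bi for xi, bi in zip(x, b)]


def dpsgd(W, b, gamma, steps):
    """Assumed D-PSGD update with exact (noise-free) gradients."""
    x = [0.0] * len(b)
    for _ in range(steps):
        g = grad(x, b)
        wx = mix(W, x)
        x = [wi - gamma * gi for wi, gi in zip(wx, g)]
    return x


def d2(W, b, gamma, steps):
    """Assumed variance-reduced update that reuses the previous iterate and gradient."""
    x_prev = [0.0] * len(b)
    g_prev = grad(x_prev, b)
    # First step: plain gradient step followed by mixing.
    x = mix(W, [xi - gamma * gi for xi, gi in zip(x_prev, g_prev)])
    for _ in range(steps - 1):
        g = grad(x, b)
        half = [2 * xi - xp - gamma * gi + gamma * gp
                for xi, xp, gi, gp in zip(x, x_prev, g, g_prev)]
        x_prev, g_prev, x = x, g, mix(W, half)
    return x


# Ring of four workers; each local optimum b_i differs (large data variance).
W = [[0.50, 0.25, 0.00, 0.25],
     [0.25, 0.50, 0.25, 0.00],
     [0.00, 0.25, 0.50, 0.25],
     [0.25, 0.00, 0.25, 0.50]]
b = [0.0, 2.0, 4.0, 10.0]          # global optimum is the mean, 4.0
x_dpsgd = dpsgd(W, b, gamma=0.1, steps=300)
x_d2 = d2(W, b, gamma=0.1, steps=300)
```

With a constant step size, the D-PSGD workers in this toy setup settle at different points around the global optimum, while the variance-reduced iterates agree on it. This mirrors the claimed robustness to data variance but is no substitute for the paper's analysis or experiments.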