23,175 research outputs found

    Non-linear Log-Sobolev inequalities for the Potts semigroup and applications to reconstruction problems

    Full text link
    Consider a Markov process with state space $[k]$, which jumps continuously to a new state chosen uniformly at random and regardless of the previous state. The collection of transition kernels (indexed by time $t\ge 0$) is the Potts semigroup. Diaconis and Saloff-Coste computed the maximum of the ratio of the relative entropy and the Dirichlet form, obtaining the constant $\alpha_2$ in the $2$-log-Sobolev inequality ($2$-LSI). In this paper, we obtain the best possible non-linear inequality relating entropy and the Dirichlet form (i.e., the $p$-NLSI, $p\ge 1$). As an example, we show $\alpha_1 = 1+\frac{1+o(1)}{\log k}$. The more precise NLSIs have been shown by Polyanskiy and Samorodnitsky to imply various geometric and Fourier-analytic results. Beyond the Potts semigroup, we also analyze Potts channels, i.e., Markov transition matrices $[k]\times[k]$ constant on and off the diagonal. (The Potts semigroup corresponds to a (ferromagnetic) subset of matrices with positive second eigenvalue.) By integrating the $1$-NLSI we obtain a new strong data processing inequality (SDPI), which in turn allows us to improve results on reconstruction thresholds for Potts models on trees. A special case is the problem of reconstructing the color of the root of a $k$-colored tree given knowledge of the colors of all the leaves. We show that to have a non-trivial reconstruction probability the branching number of the tree should be at least $\frac{\log k}{\log k - \log(k-1)} = (1-o(1))\,k\log k$. This extends previous results (of Sly and Bhatnagar et al.) to general trees, and avoids the need for any specialized arguments. Similarly, we improve the state of the art on the reconstruction threshold for the stochastic block model with $k$ balanced groups, for all $k\ge 3$. These improvements advocate information-theoretic methods as a useful complement to the conventional techniques originating from statistical physics.
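    As a quick numerical sanity check of the two equivalent forms of the branching-number bound quoted above (our own sketch, not part of the paper), one can evaluate $\frac{\log k}{\log k - \log(k-1)}$ against $k\log k$ for a few values of $k$:

    ```python
    import math

    # Numerical check (ours, not from the paper): the exact branching-number bound
    # log(k) / (log(k) - log(k-1)) versus its asymptotic form k*log(k).
    for k in [3, 10, 100, 10_000, 10**6]:
        exact = math.log(k) / (math.log(k) - math.log(k - 1))
        asymptotic = k * math.log(k)
        print(f"k = {k:>8}   exact = {exact:14.2f}   k*log(k) = {asymptotic:14.2f}   ratio = {exact / asymptotic:.4f}")
    ```

    The ratio tends to 1 as $k$ grows, illustrating the $(1-o(1))\,k\log k$ asymptotic.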

    Community detection and stochastic block models: recent developments

    Full text link
    The stochastic block model (SBM) is a random graph model with planted clusters. It is widely employed as a canonical model to study clustering and community detection, and generally provides a fertile ground to study the statistical and computational tradeoffs that arise in network and data sciences. This note surveys the recent developments that establish the fundamental limits for community detection in the SBM, both with respect to information-theoretic and computational thresholds, and for various recovery requirements such as exact, partial and weak recovery (a.k.a. detection). The main results discussed are the phase transition for exact recovery at the Chernoff-Hellinger threshold, the phase transition for weak recovery at the Kesten-Stigum threshold, the optimal distortion-SNR tradeoff for partial recovery, the learning of the SBM parameters, and the gap between information-theoretic and computational thresholds. The note also covers some of the algorithms developed in the quest of achieving the limits, in particular two-round algorithms via graph-splitting, semi-definite programming, linearized belief propagation, and classical and nonbacktracking spectral methods. A few open problems are also discussed.
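    The abstract does not state the thresholds explicitly. Purely as an illustration for the symmetric SBM with $k$ balanced communities, the commonly cited forms of the two thresholds mentioned above can be checked as follows; this is our paraphrase of the standard statements, not a quote from the note:

    ```python
    import math

    def exact_recovery_solvable(a: float, b: float, k: int) -> bool:
        """Symmetric SBM with k balanced communities and edge probabilities
        a*log(n)/n inside and b*log(n)/n across communities: exact recovery is
        commonly stated to be solvable iff |sqrt(a) - sqrt(b)| > sqrt(k)
        (Chernoff-Hellinger threshold, symmetric case)."""
        return abs(math.sqrt(a) - math.sqrt(b)) > math.sqrt(k)

    def above_kesten_stigum(a: float, b: float, k: int) -> bool:
        """Sparse symmetric SBM with edge probabilities a/n inside and b/n across:
        the Kesten-Stigum threshold for weak recovery (detection) is
        (a - b)^2 > k * (a + b)."""
        return (a - b) ** 2 > k * (a + b)

    if __name__ == "__main__":
        print(exact_recovery_solvable(a=9, b=1, k=2))   # True: |3 - 1| = 2 > sqrt(2)
        print(above_kesten_stigum(a=5, b=1, k=2))       # True: 16 > 12
    ```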

    Broadcasting on Random Directed Acyclic Graphs

    Full text link
    We study a generalization of the well-known model of broadcasting on trees. Consider a directed acyclic graph (DAG) with a unique source vertex $X$, and suppose all other vertices have indegree $d\geq 2$. Let the vertices at distance $k$ from $X$ be called layer $k$. At layer $0$, $X$ is given a random bit. At layer $k\geq 1$, each vertex receives $d$ bits from its parents in layer $k-1$, which are transmitted along independent binary symmetric channel edges, and combines them using a $d$-ary Boolean processing function. The goal is to reconstruct $X$ with probability of error bounded away from $1/2$ using the values of all vertices at an arbitrarily deep layer. This question is closely related to models of reliable computation and storage, and information flow in biological networks. In this paper, we analyze randomly constructed DAGs, for which we show that broadcasting is only possible if the noise level is below a certain degree- and function-dependent critical threshold. For $d\geq 3$, and random DAGs with layer sizes $\Omega(\log k)$ and majority processing functions, we identify the critical threshold. For $d=2$, we establish a similar result for NAND processing functions. We also prove a partial converse for odd $d\geq 3$ illustrating that the identified thresholds are impossible to improve by selecting different processing functions if the decoder is restricted to using a single vertex. Finally, for any noise level, we construct explicit DAGs (using expander graphs) with bounded degree and layer sizes $\Theta(\log k)$ admitting reconstruction. In particular, we show that such DAGs can be generated in deterministic quasi-polynomial time or randomized polylogarithmic time in the depth. These results portray a doubly-exponential advantage for storing a bit in DAGs compared to trees, where $d=1$ but layer sizes must grow exponentially with depth in order to enable broadcasting.
    Comment: 33 pages, double column format. arXiv admin note: text overlap with arXiv:1803.0752
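    A minimal toy simulation of the layered model with majority processing can make the threshold behavior tangible. This is our own sketch, not the paper's construction or analysis: layer 0 is collapsed to the single source vertex, the layer size is held constant rather than growing as $\Omega(\log k)$, and the depth and noise levels below are arbitrary choices.

    ```python
    import random

    def broadcast_random_dag(depth: int, layer_size: int, d: int, delta: float, rng: random.Random) -> float:
        """Toy simulation: broadcast one bit through a random layered DAG. Every
        vertex in layer k picks d parents uniformly at random from layer k-1,
        receives their bits through independent BSC(delta) edges, and stores the
        majority of the received bits (d odd). Returns the fraction of vertices
        in the last layer that still hold the source bit."""
        source_bit = rng.randint(0, 1)
        layer = [source_bit]                       # layer 0: the source vertex X
        for _ in range(depth):
            new_layer = []
            for _ in range(layer_size):
                received = []
                for _ in range(d):
                    parent_bit = rng.choice(layer)                 # random parent in previous layer
                    received.append(parent_bit ^ (rng.random() < delta))  # BSC(delta) edge
                new_layer.append(1 if sum(received) > d // 2 else 0)      # majority processing
            layer = new_layer
        return sum(b == source_bit for b in layer) / layer_size

    if __name__ == "__main__":
        rng = random.Random(0)
        for delta in (0.05, 0.20, 0.40):
            agree = broadcast_random_dag(depth=50, layer_size=64, d=3, delta=delta, rng=rng)
            print(f"delta = {delta:.2f}: fraction of last layer agreeing with source = {agree:.2f}")
    ```

    At low noise the last layer stays strongly correlated with the source bit, while at high noise the agreement fraction drifts toward 1/2, consistent with the existence of a critical threshold.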

    Efficient size estimation and impossibility of termination in uniform dense population protocols

    Full text link
    We study uniform population protocols: networks of anonymous agents whose pairwise interactions are chosen at random, where each agent uses an identical transition algorithm that does not depend on the population size $n$. Many existing polylog$(n)$ time protocols for leader election and majority computation are nonuniform: to operate correctly, they require all agents to be initialized with an approximate estimate of $n$ (specifically, the exact value $\lfloor \log n \rfloor$). Our first main result is a uniform protocol for calculating $\log(n) \pm O(1)$ with high probability in $O(\log^2 n)$ time and $O(\log^4 n)$ states ($O(\log \log n)$ bits of memory). The protocol is converging but not terminating: it does not signal when the estimate is close to the true value of $\log n$. If it could be made terminating, this would allow composition with protocols, such as those for leader election or majority, that require a size estimate initially, to make them uniform (though with a small probability of failure). We do show how our main protocol can be indirectly composed with others in a simple and elegant way, based on the leaderless phase clock, demonstrating that those protocols can in fact be made uniform. However, our second main result implies that the protocol cannot be made terminating, a consequence of a much stronger result: a uniform protocol for any task requiring more than constant time cannot be terminating even with probability bounded above 0, if infinitely many initial configurations are dense: any state present initially occupies $\Omega(n)$ agents. (In particular, no leader is allowed.) Crucially, the result holds no matter the memory or time permitted. Finally, we show that with an initial leader, our size-estimation protocol can be made terminating with high probability, with the same asymptotic time and space bounds.
    Comment: Using leaderless phase clock
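    The paper's protocol is not reproduced here. As a rough illustration of how anonymous agents can converge to a $\log n \pm O(1)$ estimate without knowing $n$, the classic maximum-propagation idea (our choice of example, not necessarily the paper's mechanism) works as follows: each agent counts fair-coin heads before the first tail, and interacting pairs epidemically spread the maximum, which concentrates at $\log_2 n + O(1)$.

    ```python
    import random

    def estimate_log2_n(n: int, num_interactions: int, rng: random.Random) -> int:
        """Toy illustration (not the protocol from the paper): each anonymous agent
        starts with a geometric sample (fair-coin heads before the first tail); the
        maximum of n such samples is log2(n) + O(1) with high probability. Random
        pairwise interactions spread the maximum by epidemic, so all agents converge
        to the same estimate without any agent knowing n."""
        def geometric_heads() -> int:
            count = 0
            while rng.random() < 0.5:
                count += 1
            return count

        agents = [geometric_heads() for _ in range(n)]
        for _ in range(num_interactions):
            i, j = rng.sample(range(n), 2)                  # uniformly random pair of distinct agents
            agents[i] = agents[j] = max(agents[i], agents[j])  # both adopt the larger value
        return agents[0]

    if __name__ == "__main__":
        rng = random.Random(1)
        n = 1024
        # Theta(n log n) interactions correspond to O(log n) parallel time.
        print(f"n = {n}, true log2(n) = 10, estimate = {estimate_log2_n(n, num_interactions=200_000, rng=rng)}")
    ```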

    Incompatibility boundaries for properties of community partitions

    Get PDF
    We prove the incompatibility of certain desirable properties of community partition quality functions. Our results generalize the impossibility result of [Kleinberg 2003] by considering sets of weaker properties. In particular, we use an alternative notion to solve the central issue of the consistency property. (The latter means that modifying the graph in a way consistent with a partition should not have counterintuitive effects.) Our results clearly show that community partition methods should not be expected to perfectly satisfy all ideally desired properties. We then proceed to show that this incompatibility no longer holds when slightly relaxed versions of the properties are considered, and we in fact provide examples of simple quality functions satisfying these relaxed properties. An experimental study of these quality functions shows behavior comparable to established methods in some situations, but more debatable results in others. This suggests that defining a notion of a good partition into communities probably requires imposing additional properties.
    Comment: 17 pages, 3 figures
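    The abstract does not name the quality functions it studies. Purely as an example of what a community partition quality function is, here is the standard Newman-Girvan modularity, which is not one of the functions proposed in the paper:

    ```python
    from collections import defaultdict

    def modularity(edges, partition):
        """Newman-Girvan modularity of a partition, shown only as an example of a
        community partition quality function (NOT one of the paper's functions).
        `edges` is a list of undirected (u, v) pairs and `partition` maps each node
        to its community label. Q = intra_edges/m - sum_c (d_c / 2m)^2."""
        m = len(edges)
        degree = defaultdict(int)
        intra = 0                                   # edges inside communities
        for u, v in edges:
            degree[u] += 1
            degree[v] += 1
            if partition[u] == partition[v]:
                intra += 1
        community_degree = defaultdict(int)         # total degree per community
        for node, deg in degree.items():
            community_degree[partition[node]] += deg
        expected = sum(d * d for d in community_degree.values()) / (4 * m * m)
        return intra / m - expected

    if __name__ == "__main__":
        # Two triangles joined by a single bridge edge.
        edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
        good = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
        bad = {0: "A", 1: "B", 2: "A", 3: "B", 4: "A", 5: "B"}
        print(modularity(edges, good), ">", modularity(edges, bad))
    ```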

    Edge Label Inference in Generalized Stochastic Block Models: from Spectral Theory to Impossibility Results

    Get PDF
    The classical setting of community detection consists of networks exhibiting a clustered structure. To more accurately model real systems we consider a class of networks (i) whose edges may carry labels and (ii) which may lack a clustered structure. Specifically, we assume that nodes possess latent attributes drawn from a general compact space, and that edges between two nodes are randomly generated and labeled according to some unknown distribution as a function of their latent attributes. Our goal is then to infer the edge label distributions from a partially observed network. We propose a computationally efficient spectral algorithm and show that it allows for asymptotically correct inference when the average node degree is as low as logarithmic in the total number of nodes. Conversely, if the average node degree is below a specific constant threshold, we show that no algorithm can achieve better inference than guessing without using the observations. As a byproduct of our analysis, we show that our model provides a general procedure to construct random graph models with a spectrum asymptotic to a pre-specified eigenvalue distribution, such as a power-law distribution.
    Comment: 17 pages
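    The paper's algorithm is not reproduced here. A generic sketch of the kind of spectral step alluded to above, embedding nodes via the leading eigenvectors of an adjacency matrix (or, with labeled edges, a per-label or weighted adjacency matrix), might look like the following; all parameter choices and the toy two-group graph are ours.

    ```python
    import numpy as np

    def spectral_embedding(adjacency: np.ndarray, dim: int) -> np.ndarray:
        """Generic spectral step (a sketch of the flavor of algorithm, not the
        paper's exact procedure): embed each node using the eigenvectors of the
        adjacency matrix associated with its `dim` largest-magnitude eigenvalues."""
        eigenvalues, eigenvectors = np.linalg.eigh(adjacency)   # symmetric matrix
        order = np.argsort(-np.abs(eigenvalues))[:dim]          # top |eigenvalue| indices
        return eigenvectors[:, order] * eigenvalues[order]      # scaled node coordinates

    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        n = 200
        z = rng.integers(0, 2, size=n)                          # two latent groups (toy example)
        p = np.where(z[:, None] == z[None, :], 0.10, 0.02)      # planted edge probabilities
        a = np.triu(rng.random((n, n)) < p, k=1).astype(float)
        a = a + a.T                                             # symmetric adjacency matrix
        x = spectral_embedding(a, dim=2)
        print(x.shape)                                          # (200, 2)
    ```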