Phase Diagram and Approximate Message Passing for Blind Calibration and Dictionary Learning
We consider dictionary learning and blind calibration for signals and
matrices created from a random ensemble. We study the mean-squared error in the
limit of large signal dimension using the replica method and unveil the
appearance of phase transitions delimiting impossible, possible-but-hard and
possible inference regions. We also introduce an approximate message passing
algorithm that asymptotically matches the theoretical performance, and show
through numerical tests that, for the calibration problem, it performs very
well at tractable system sizes.
Comment: 5 pages
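To fix ideas, the sketch below shows the basic AMP iteration for the ordinary sparse linear model y = A x with a known matrix and a soft-thresholding denoiser; the blind-calibration and dictionary-learning algorithms of the abstract extend this scheme to the case where the matrix itself must be learned, which this toy code does not attempt. The threshold value and problem sizes are illustrative assumptions.

```python
# A minimal sketch of a generic AMP iteration for y = A x with sparse x,
# using soft thresholding as the denoiser.  This is NOT the blind-calibration
# / dictionary-learning AMP of the abstract (where A itself is unknown); it
# only illustrates the message-passing structure such algorithms build on.
import numpy as np

def soft_threshold(v, theta):
    """Entrywise soft-thresholding denoiser."""
    return np.sign(v) * np.maximum(np.abs(v) - theta, 0.0)

def amp_sparse(y, A, theta=0.1, n_iter=30):
    """Basic AMP for y = A x with a sparse x and a fixed soft threshold."""
    m, n = A.shape
    x = np.zeros(n)     # current estimate of the signal
    z = y.copy()        # residual carrying the Onsager correction
    for _ in range(n_iter):
        pseudo = x + A.T @ z                      # effective observation of x
        x_new = soft_threshold(pseudo, theta)
        # Onsager reaction term: (n/m) times the average derivative
        # of the denoiser at the current point
        onsager = (np.abs(x_new) > 0).mean() * (n / m)
        z = y - A @ x_new + onsager * z
        x = x_new
    return x

# toy usage on a synthetic sparse problem (sizes are arbitrary assumptions)
rng = np.random.default_rng(0)
m, n, k = 250, 500, 25
A = rng.normal(size=(m, n)) / np.sqrt(m)
x_true = np.zeros(n)
x_true[rng.choice(n, size=k, replace=False)] = rng.normal(size=k)
y = A @ x_true
x_hat = amp_sparse(y, A)
```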
Reweighted belief propagation and quiet planting for random K-SAT
We study the random K-satisfiability problem using a partition function where
each solution is reweighted according to the number of variables that satisfy
every clause. We apply belief propagation and the related cavity method to the
reweighted partition function. This allows us to obtain several new results on
the properties of the random K-satisfiability problem. In particular, the
reweighting allows us to introduce a planted ensemble that generates instances
that are, in some region of parameters, equivalent to random instances. We are
hence able to generate, at the same time, a typical random SAT instance and one
of its solutions. We study the relation between clustering and belief
propagation fixed points, and we give direct evidence for the existence of
purely entropic (rather than energetic) barriers between clusters in some
region of parameters in the random K-satisfiability problem. We exhibit, in
some large planted instances, solutions with a non-trivial whitening core; such
solutions were known to exist but were so far never found on very large
instances. Finally, we discuss algorithmic hardness of such planted instances
and we determine a region of parameters in which planting leads to satisfiable
benchmarks that, to the best of our knowledge, are the hardest known.
Comment: 23 pages, 4 figures, revised for readability, stability expression corrected
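For concreteness, here is a minimal sketch of the plain planting construction for random K-SAT: fix a hidden assignment and draw clauses uniformly among those it satisfies, so every instance comes with a known solution. The "quiet" planting discussed above additionally reweights the ensemble so that, in a range of clause densities, planted instances become equivalent to purely random ones; that reweighting is not reproduced in this toy code.

```python
# A minimal sketch of planted random K-SAT: clauses are drawn uniformly
# among those satisfied by a hidden assignment, so the instance ships with
# one of its solutions.  Parameters below are illustrative.
import random

def planted_ksat(n_vars, n_clauses, k=3, seed=0):
    """Draw a K-SAT instance whose clauses all satisfy a hidden assignment."""
    rng = random.Random(seed)
    # hidden assignment: True/False for variables 1..n_vars
    assignment = {v: rng.random() < 0.5 for v in range(1, n_vars + 1)}
    clauses = []
    while len(clauses) < n_clauses:
        variables = rng.sample(range(1, n_vars + 1), k)
        # attach a random sign to each variable to form the literals
        clause = [v if rng.random() < 0.5 else -v for v in variables]
        # keep the clause only if the hidden assignment satisfies it
        if any((lit > 0) == assignment[abs(lit)] for lit in clause):
            clauses.append(clause)
    return clauses, assignment

# toy usage: the planted assignment satisfies every clause by construction
clauses, solution = planted_ksat(n_vars=100, n_clauses=400)
assert all(any((lit > 0) == solution[abs(lit)] for lit in c) for c in clauses)
```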
Non-adaptive pooling strategies for detection of rare faulty items
We study non-adaptive pooling strategies for detection of rare faulty items.
Given a binary sparse N-dimensional signal x, how should one construct a sparse
binary MxN pooling matrix F such that the signal can be reconstructed from the
smallest possible number M of measurements y=Fx? We show that a very low number
of measurements suffices for a random spatially coupled design of the pools F.
Our design might find application in genetic screening or compressed genotyping.
We also show that our results are robust to uncertainty in the matrix F when
some of its elements are erroneous.
Comment: 5 pages
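As an illustration, the following sketch builds one possible sparse, spatially coupled binary pooling matrix: items and pools are split into L blocks and each pool block only draws items from a narrow band of neighbouring item blocks, so information can propagate along the chain. The block count, band width and per-entry pooling probability are illustrative assumptions, not the tuned construction of the paper.

```python
# A minimal sketch of a sparse, spatially coupled pooling design.
import numpy as np

def coupled_pooling_matrix(N, M, L=8, w=2, p=0.3, seed=0):
    """Sparse binary MxN pooling matrix with a band-like block coupling."""
    rng = np.random.default_rng(seed)
    F = np.zeros((M, N), dtype=int)
    col_blocks = np.array_split(np.arange(N), L)   # item blocks
    row_blocks = np.array_split(np.arange(M), L)   # pool blocks
    for b, rows in enumerate(row_blocks):
        # this pool block may only touch items from blocks b-w .. b
        allowed = np.concatenate(
            [col_blocks[c] for c in range(max(0, b - w), b + 1)])
        for r in rows:
            F[r, allowed] = rng.random(allowed.size) < p
    return F

# toy usage: y counts how many faulty items fall into each pool
N, M, n_faulty = 1000, 200, 10
F = coupled_pooling_matrix(N, M)
x = np.zeros(N, dtype=int)
x[np.random.default_rng(1).choice(N, size=n_faulty, replace=False)] = 1
y = F @ x
```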
Clustering from Sparse Pairwise Measurements
We consider the problem of grouping items into clusters based on a few random
pairwise comparisons between the items. We introduce three closely related
algorithms for this task: a belief propagation algorithm approximating the
Bayes optimal solution, and two spectral algorithms based on the
non-backtracking and Bethe Hessian operators. For the case of two symmetric
clusters, we conjecture that these algorithms are asymptotically optimal in
that they detect the clusters as soon as it is information theoretically
possible to do so. We substantiate this claim for one of the spectral
approaches we introduce.
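As a concrete illustration of the second spectral approach, the sketch below clusters a graph of pairwise measurements with the Bethe Hessian operator H(r) = (r^2 - 1) I + D - r A, reading off the two groups from the sign of the eigenvector attached to the most negative eigenvalue. The choice r = sqrt(average degree) and the toy similarity graph are assumptions made for illustration, not the paper's exact prescription for weighted measurements.

```python
# A minimal sketch of Bethe Hessian spectral clustering for two groups.
import numpy as np

def bethe_hessian_labels(A, r=None):
    """Two-group labels from the Bethe Hessian H(r) = (r^2-1)I + D - rA."""
    n = A.shape[0]
    d = A.sum(axis=1)                    # degrees
    if r is None:
        r = np.sqrt(d.mean())            # heuristic choice of r (assumption)
    H = (r**2 - 1) * np.eye(n) + np.diag(d) - r * A
    eigvals, eigvecs = np.linalg.eigh(H)
    # sign of the eigenvector with the smallest eigenvalue splits the groups
    return np.where(eigvecs[:, 0] >= 0, 1, -1)

# toy usage on a noisy two-group similarity graph
rng = np.random.default_rng(0)
n = 200
truth = np.repeat([1, -1], n // 2)
p_in, p_out = 0.08, 0.02
prob = np.where(np.equal.outer(truth, truth), p_in, p_out)
A = (rng.random((n, n)) < prob).astype(float)
A = np.triu(A, 1)
A = A + A.T                              # symmetric, no self-loops
labels = bethe_hessian_labels(A)
```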
Compressed Sensing of Approximately-Sparse Signals: Phase Transitions and Optimal Reconstruction
Compressed sensing is designed to measure sparse signals directly in a
compressed form. However, most signals of interest are only "approximately
sparse", i.e. even though the signal contains only a small fraction of relevant
(large) components the other components are not strictly equal to zero, but are
only close to zero. In this paper we model the approximately sparse signal with
a Gaussian distribution of small components, and we study its compressed
sensing with dense random matrices. We use replica calculations to determine
the mean-squared error of the Bayes-optimal reconstruction for such signals, as
a function of the variance of the small components, the density of large
components and the measurement rate. We then use the G-AMP algorithm and we
quantify the region of parameters for which this algorithm achieves optimality
(for large systems). Finally, we show that in the region where G-AMP with
homogeneous measurement matrices is not optimal, a special "seeding" design of
a spatially-coupled measurement matrix allows optimality to be restored.
Comment: 8 pages, 10 figures
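The sketch below builds one possible "seeded", spatially coupled Gaussian measurement matrix of the kind referred to above: an L x L block structure whose first block row measures at a higher rate (the seed) and whose remaining block rows couple only to a few neighbouring block columns. All block sizes, rates and coupling variances are illustrative assumptions rather than the values studied in the paper.

```python
# A minimal sketch of a seeded, spatially coupled Gaussian measurement matrix.
import numpy as np

def seeded_matrix(n, L=8, alpha_seed=0.6, alpha_bulk=0.3, J1=1.0, J2=0.2, seed=0):
    """Block matrix: a higher-rate seed block plus banded block coupling."""
    rng = np.random.default_rng(seed)
    n_block = n // L                                  # columns per block
    m_blocks = [int(alpha_seed * n_block)] + [int(alpha_bulk * n_block)] * (L - 1)
    block_rows = []
    for r, m_r in enumerate(m_blocks):
        row = []
        for c in range(L):
            if c == r:
                var = 1.0        # diagonal blocks
            elif c == r - 1:
                var = J1         # coupling to the previous block ("backward")
            elif c == r + 1:
                var = J2         # coupling to the next block ("forward")
            else:
                var = 0.0        # no coupling outside the band
            row.append(rng.normal(scale=np.sqrt(var / n), size=(m_r, n_block))
                       if var > 0 else np.zeros((m_r, n_block)))
        block_rows.append(np.hstack(row))
    return np.vstack(block_rows)

F = seeded_matrix(n=1024)
print(F.shape)   # overall measurement rate lies between alpha_bulk and alpha_seed
```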
Champs Conditionnels Aléatoires pour l'Annotation d'Arbres
With a view toward the transformation of semi-structured XML documents, we
address the problem of annotating such documents by statistical learning from
examples of already annotated documents. To model the probability of an
annotation given a document, we work within the framework of conditional random
fields. This model has already proven effective for the annotation of sequences;
here we adapt it to ordered trees of unbounded arity. We study the expressiveness
of the newly introduced model by comparing it to stochastic tree automata
(probabilistic regular tree grammars). We also present in detail the algorithm
for finding the most probable annotation and the inference algorithm for this
model. These algorithms are implemented in a Tree CRF library written in Java.
This preliminary work will allow us to subsequently study applications of the
model to document transformation.
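To illustrate the kind of inference such a model requires, the sketch below runs a max-sum (Viterbi-style) dynamic program that finds a most probable labelling of a tree under node scores and parent-child edge scores. It is only a toy stand-in: the Tree CRF library mentioned in the abstract is written in Java and handles ordered trees of unbounded arity with richer feature functions; the data structures and score functions here are illustrative assumptions.

```python
# A minimal sketch of max-sum decoding of a most probable tree labelling.
from dataclasses import dataclass, field

@dataclass
class Node:
    name: str                 # node names assumed unique in this toy example
    children: list = field(default_factory=list)

def best_annotation(root, labels, node_score, edge_score):
    """Return (score, {node name: label}) maximizing the sum of scores."""
    def down(node):
        # table[l] = best score of node's subtree if node takes label l;
        # back[l] = the corresponding best label for each child
        child_results = [down(c) for c in node.children]
        table, back = {}, {}
        for l in labels:
            total, choices = node_score(node, l), []
            for child_table, _ in child_results:
                best_lc = max(labels,
                              key=lambda lc: edge_score(l, lc) + child_table[lc])
                total += edge_score(l, best_lc) + child_table[best_lc]
                choices.append(best_lc)
            table[l], back[l] = total, choices
        return table, (back, child_results)

    table, info = down(root)
    best_root = max(labels, key=lambda l: table[l])

    assignment = {}
    def up(node, label, info):
        assignment[node.name] = label
        back, child_results = info
        for child, lc, (_, cinfo) in zip(node.children, back[label], child_results):
            up(child, lc, cinfo)
    up(root, best_root, info)
    return table[best_root], assignment

# toy usage: annotate a 3-node tree with two labels
tree = Node("root", [Node("a"), Node("b")])
labels = ["header", "item"]
node_score = lambda n, l: 1.0 if (n.name == "root") == (l == "header") else 0.0
edge_score = lambda lp, lc: 0.5 if lp != lc else 0.0
score, annotation = best_annotation(tree, labels, node_score, edge_score)
```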