30 research outputs found

    Improving variational methods via pairwise linear response identities

    Get PDF
    Inference methods are often formulated as variational approximations: these approximations allow easy evaluation of statistics by marginalization or linear response, but these estimates can be inconsistent. We show that by introducing constraints on covariance, one can ensure consistency of linear response with the variational parameters, and in so doing inference of marginal probability distributions is improved. For the Bethe approximation and its generalizations, improvements are achieved with simple choices of the constraints. The approximations are presented as variational frameworks; iterative procedures related to message passing are provided for finding the minima.
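As a concrete illustration of the linear response idea (though not of the paper's covariance-constrained scheme), the sketch below computes connected correlations by differentiating the naive mean-field fixed point of a small Ising model; the function names and parameter values are illustrative. The diagonal of the linear response matrix generally disagrees with the marginalization estimate 1 - m_i^2, which is exactly the kind of inconsistency the paper's constraints are designed to remove.

```python
import numpy as np

def mean_field_ising(J, h, tol=1e-10, max_iter=10_000, damping=0.5):
    """Naive mean-field magnetizations: m_i = tanh(h_i + sum_j J_ij m_j)."""
    m = np.zeros(len(h))
    for _ in range(max_iter):
        m_new = damping * m + (1 - damping) * np.tanh(h + J @ m)
        if np.max(np.abs(m_new - m)) < tol:
            return m_new
        m = m_new
    return m

def linear_response_correlations(J, m):
    """Connected correlations chi_ij = dm_i/dh_j = [(I - D J)^{-1} D]_ij,
    with D = diag(1 - m_i^2), obtained by differentiating the fixed point."""
    D = np.diag(1.0 - m**2)
    return np.linalg.solve(np.eye(len(m)) - D @ J, D)

rng = np.random.default_rng(0)
n = 6
J = rng.normal(0, 0.2, (n, n)); J = (J + J.T) / 2; np.fill_diagonal(J, 0)
h = rng.normal(0, 0.1, n)

m = mean_field_ising(J, h)
chi = linear_response_correlations(J, m)
# The inconsistency the paper targets: marginalization gives the variance
# <s_i^2> - m_i^2 = 1 - m_i^2, while linear response gives chi_ii;
# without extra constraints the two generally disagree.
print(np.diag(chi) - (1 - m**2))
```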

    Message Passing for Optimization and Control of Power Grid: Model of Distribution System with Redundancy

    Full text link
    We use a power grid model with M generators and N consumption units to optimize the grid and its control. Each consumer demand is drawn from a predefined finite-size-support distribution, thus simulating the instantaneous load fluctuations. Each generator has a maximum power capability. A generator is not overloaded if the sum of the loads of consumers connected to it does not exceed its maximum production. In the standard grid each consumer is connected only to its designated generator, while we consider a more general organization of the grid allowing each consumer to select one generator, depending on the load, from a predefined consumer-dependent and sufficiently small set of generators which can all serve the load. The model grid is interconnected in a graph with loops, drawn from an ensemble of random bipartite graphs, while each allowed configuration of loaded links represents a set of graph covering trees. Losses, the reactive character of the grid, and the transmission-level connections between generators (and many other details relevant to a realistic power grid) are ignored in this proof-of-principles study. We focus on the asymptotic limit and show that the interconnects allow significant expansion of the parameter domains for which the probability of a generator overload is asymptotically zero. Our construction explores the formal relation between the problem of grid optimization and the modern theory of sparse graphical models. We also design heuristic algorithms that achieve the asymptotically optimal selection of loaded links. We conclude by discussing the ability of this approach to include other effects, such as more realistic modeling of the power grid and related optimization and control algorithms.
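A minimal sketch of the model setup may help: M generators, N consumers, each consumer restricted to a small allowed set of generators, with a greedy least-loaded heuristic standing in for the paper's message-passing-based link selection. Everything here (function names, the uniform demand distribution, the capacity value) is an assumption for illustration only.

```python
import random

def build_grid(M, N, k, seed=0):
    """Allowed sets: each of N consumers may draw power from a random
    subset of k generators out of M (a sparse random bipartite graph)."""
    rng = random.Random(seed)
    return [rng.sample(range(M), k) for _ in range(N)]

def greedy_assign(M, allowed, loads, capacity):
    """Greedy stand-in for the optimal link selection: route each consumer
    to its currently least-loaded allowed generator; count overloads."""
    total = [0.0] * M
    for gens, load in zip(allowed, loads):
        g = min(gens, key=lambda j: total[j])
        total[g] += load
    return sum(t > capacity for t in total)

M, N, k = 50, 200, 3
rng = random.Random(1)
allowed = build_grid(M, N, k)
loads = [rng.uniform(0.0, 1.0) for _ in range(N)]   # bounded-support demand
# Average load per generator is ~2.0; having k > 1 choices (interconnects)
# shrinks the overload probability versus a fixed one-to-one assignment.
print("overloaded generators:", greedy_assign(M, allowed, loads, 2.6))
```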

    Using Constraint Satisfaction Techniques and Variational Methods for Probabilistic Reasoning

    Get PDF
    This thesis presents a number of research contributions pertaining to the theme of creating efficient probabilistic reasoning systems based on graphical models of real-world problems from relational domains. These models arise in a variety of scientific and engineering applications; the theme thus impacts several sub-disciplines of Artificial Intelligence. Commonly, most of these problems have expressive graphical models that translate into large probabilistic networks involving determinism and cycles. Such graphical models frequently represent a bottleneck for any probabilistic inference system and weaken its accuracy and scalability. Conceptually, our research hypothesizes and confirms that: First, constraint satisfaction techniques and variational methods can be exploited to yield accurate and scalable algorithms for probabilistic inference in the presence of cycles and determinism. Second, some intrinsic parts of the structure of the graphical model can turn out to be beneficial to probabilistic inference on large networks, instead of posing a significant challenge to it. Third, proper re-parameterization of the graphical model can provide its structure with characteristics that we can use to improve probabilistic inference. The first major contribution of this thesis is the formulation of a novel message-passing approach to inference in an extended factor graph that combines constraint satisfaction techniques with variational methods. In contrast to standard message passing, it formulates the message-passing structure as steps of variational expectation maximization. Thus it has new marginal update rules that increase a lower bound at each marginal update in a way that avoids overshooting a fixed point. Moreover, in its expectation step, we leverage the local structures in the factor graph by using generalized arc consistency to perform a variational mean-field approximation. The second major contribution is the formulation of a novel two-stage strategy that uses the determinism present in the graphical model's structure to improve the scalability of probabilistic inference. In this strategy, we take into account the fact that if the underlying model involves mandatory constraints as well as preferences, then it is potentially wasteful to allocate memory for all constraints in advance when performing inference. To avoid this, we start by relaxing preferences and performing inference with hard constraints only. This helps avoid irrelevant computations involving preferences and reduces the effective size of the graphical network. Finally, we develop a novel family of message-passing algorithms for inference in an extended factor graph, parameterized by a smoothing parameter.
This family allows one to find the "backbones" of a cluster that contains potentially optimal solutions. The cluster's backbones are not only portions of the optimal solutions; they can also be exploited to scale up MAP inference by iteratively fixing them to reduce the complex parts until the network is simplified into one that can be solved accurately using any conventional MAP inference method. We then describe lazy variants of this family of algorithms. One limiting case of our approach corresponds to lazy survey propagation, which is itself a novel method that can yield state-of-the-art performance. We provide a thorough empirical evaluation using real-world applications. Our experiments demonstrate improvements in the accuracy, convergence, and scalability of all our proposed algorithms and strategies over existing state-of-the-art inference algorithms.
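The backbone-fixing strategy can be summarized by a generic decimation loop. This is a schematic reconstruction, not the thesis's implementation: `marginals_fn` and `map_solver` are hypothetical placeholders for the smoothing-parameterized message-passing family and a conventional MAP solver, and the toy marginals exist only to make the sketch runnable.

```python
def decimate_map(marginals_fn, map_solver, threshold=0.9):
    """Generic backbone decimation: run approximate inference, freeze the
    variables whose max-marginal clears `threshold` (the backbone), and
    repeat on the reduced model; finish with a conventional MAP solver."""
    fixed = {}
    while True:
        marg = marginals_fn(fixed)               # approximate max-marginals
        backbone = {v: max(p, key=p.get) for v, p in marg.items()
                    if v not in fixed and max(p.values()) >= threshold}
        if not backbone:                         # no confident variable left
            return map_solver(fixed)
        fixed.update(backbone)

# Tiny smoke test with hand-made marginals (a real use would plug in the
# message-passing routine): two confident variables, one ambiguous.
def toy_marginals(fixed):
    base = {"a": {0: 0.97, 1: 0.03},
            "b": {0: 0.02, 1: 0.98},
            "c": {0: 0.55, 1: 0.45}}
    return {v: p for v, p in base.items() if v not in fixed}

def toy_solver(fixed):
    return {"c": 0, **fixed}                     # pretend exact MAP on rest

print(decimate_map(toy_marginals, toy_solver))   # {'c': 0, 'a': 0, 'b': 1}
```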

    Belief Propagation for Min-Cost Network Flow: Convergence and Correctness

    Get PDF
    Distributed, iterative algorithms operating with minimal data structure while performing little computation per iteration are popularly known as message passing in the recent literature. Belief propagation (BP), a prototypical message-passing algorithm, has gained a lot of attention across disciplines, including communications, statistics, signal processing, and machine learning, as an attractive, scalable, general-purpose heuristic for a wide class of optimization and statistical inference problems. Despite its empirical success, the theoretical understanding of BP is far from complete. With the goal of advancing the state of the art of our understanding of BP, we study its performance in the context of the capacitated minimum-cost network flow problem, a cornerstone in the development of the theory of polynomial-time algorithms for optimization problems and widely used in the practice of operations research. As the main result of this paper, we prove that BP converges to the optimal solution in pseudopolynomial time, provided that the optimal solution of the underlying network flow problem instance is unique and the problem parameters are integral. We further provide a simple modification of BP to obtain a fully polynomial-time randomized approximation scheme (FPRAS) without requiring uniqueness of the optimal solution. This is the first instance where BP is proved to have fully polynomial running time. Our results thus provide a theoretical justification for the viability of BP as an attractive method to solve an important class of optimization problems. (Supported by NSF CAREER project CNS-0546590, an NSERC Postdoctoral Fellowship, NSF EMT project CCF-0829893, and NSF CMMI-0726733.)
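To make the setting concrete, here is a toy min-sum BP run on a four-node capacitated min-cost flow instance, with edge flows as variables and conservation constraints as node factors. It is a didactic sketch (brute-force factor minimization over small integer domains), not the paper's algorithm or its FPRAS modification, and the instance and iteration count are invented for illustration.

```python
import itertools
import numpy as np

# Toy capacitated min-cost flow: route 2 units from node 0 to node 3.
# edges: (tail, head, capacity, unit_cost); variable x_i in {0..cap_i}.
edges = [(0, 1, 2, 1.0), (0, 2, 2, 2.0), (1, 3, 2, 1.0), (2, 3, 2, 1.0)]
supply = {0: 2, 3: -2}                      # net outflow demanded per node
nodes = {0, 1, 2, 3}

dom = [np.arange(c + 1) for (_, _, c, _) in edges]
inc = {v: [i for i, (a, b, _, _) in enumerate(edges) if v in (a, b)]
       for v in nodes}

def sgn(v, i):
    """+1 if edge i leaves node v, -1 if it enters (conservation sign)."""
    return 1 if edges[i][0] == v else -1

# min-sum BP: msg[(v, i)] is the message from node-factor v to edge i.
msg = {(v, i): np.zeros(len(dom[i])) for v in nodes for i in inc[v]}
for _ in range(50):
    new = {}
    for v in nodes:
        for i in inc[v]:
            others = [j for j in inc[v] if j != i]
            out = np.full(len(dom[i]), np.inf)
            # brute-force minimization over the other incident flows,
            # subject to flow conservation at v (fine for tiny degrees);
            # edges[j][0] + edges[j][1] - v is the other endpoint of j
            for combo in itertools.product(*(dom[j] for j in others)):
                cost = sum(edges[j][3] * x
                           + msg[(edges[j][0] + edges[j][1] - v, j)][x]
                           for j, x in zip(others, combo))
                bal = sum(sgn(v, j) * x for j, x in zip(others, combo))
                for xi in dom[i]:
                    if sgn(v, i) * xi + bal == supply.get(v, 0):
                        out[xi] = min(out[xi], cost)
            new[(v, i)] = out - out.min()       # normalize for stability
    msg = new

# Edge beliefs combine the unit cost with both endpoint messages.
flow = [int(np.argmin(edges[i][3] * dom[i]
                      + msg[(edges[i][0], i)] + msg[(edges[i][1], i)]))
        for i in range(len(edges))]
print("BP flow per edge:", flow)  # unique optimum routes both units 0->1->3
```

The instance has a unique integral optimum, matching the hypotheses under which the paper proves convergence.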

    Graphical models beyond standard settings: lifted decimation, labeling, and counting

    Get PDF
    With increasing complexity and growing problem sizes in AI and Machine Learning, inference and learning are still major issues in Probabilistic Graphical Models (PGMs). On the other hand, many problems are specified in such a way that symmetries arise from the underlying model structure. Exploiting these symmetries during inference, referred to as "lifted inference", has led to significant efficiency gains. This thesis provides enhanced versions of several known algorithms that turn out to be liftable as well, thereby applying lifting in "non-standard" settings and extending the understanding of the applicability of lifted inference and of lifting in general. Among various other experiments, it is shown how lifted inference, in combination with an innovative Web-based data-harvesting pipeline, is used to label author-paper pairs with geographic information in online bibliographies. This results in a large-scale transnational bibliography containing affiliation information over time for roughly one million authors. Analyzing this dataset reveals the importance of understanding count data. Although counting is done literally everywhere, mainstream PGMs have largely neglected count data. In the case where the ranges of the random variables are defined over the natural numbers, crude approximations to the true distribution are often made by discretization or a Gaussian assumption. To handle count data, Poisson Dependency Networks (PDNs) are introduced, a new class of non-standard PGMs that naturally handles count data.
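A sketch of the PDN idea under common dependency-network assumptions: each count variable gets its own Poisson regression on the remaining variables, fitted independently, and a pseudo-Gibbs sampler generates new counts. The synthetic data and hyperparameters are illustrative, and scikit-learn's `PoissonRegressor` is used here as a stand-in for whatever learner the thesis employs.

```python
import numpy as np
from sklearn.linear_model import PoissonRegressor

rng = np.random.default_rng(0)

# Synthetic count data: three dependent count variables (toy stand-in).
n = 500
z = rng.poisson(3.0, n)
X = np.column_stack([z + rng.poisson(1.0, n),
                     rng.poisson(2.0, n),
                     z + rng.poisson(2.0, n)])

# A Poisson Dependency Network: one Poisson GLM per variable, conditioned
# on all the others and learned independently, as in dependency networks.
models = []
for i in range(X.shape[1]):
    others = np.delete(X, i, axis=1)
    models.append(PoissonRegressor(alpha=1e-3, max_iter=1000)
                  .fit(others, X[:, i]))

def gibbs_sample(models, x0, sweeps=100):
    """Ordered pseudo-Gibbs sampler: resample each count from its
    conditional Poisson, given current values of the other variables."""
    x = np.asarray(x0, dtype=float)
    for _ in range(sweeps):
        for i, m in enumerate(models):
            rate = m.predict(np.delete(x, i).reshape(1, -1))[0]
            x[i] = rng.poisson(rate)
    return x.astype(int)

print("generated counts:", gibbs_sample(models, X[0]))
```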

    Statistical physics of subgraph identification problem

    Get PDF

    Statistical physics of neural systems

    Get PDF
    The ability to process and store information is considered a characteristic trait of intelligent systems. In biological neural networks, learning is strongly believed to take place at the synaptic level, in terms of modulation of synaptic efficacy. It can thus be interpreted as the expression of a collective phenomenon, emerging when neurons connect to each other and constitute a complex network of interactions. In this work, we represent learning as an optimization problem, implementing a local search, in the synaptic space, for specific configurations, known as solutions, that make a neural network able to accomplish a series of different tasks. For instance, we would like the network to adapt the strength of its synaptic connections so as to be capable of classifying a series of objects, assigning to each object its corresponding class label. A series of experiments has suggested that synapses may exploit only a small number of synaptic states for encoding information, a feature known to make learning in neural networks a challenging task. Extending the large deviation analysis performed in the extreme case of binary synaptic couplings, we prove the existence of regions of the phase space where solutions are organized in extremely dense clusters. This picture turns out to be invariant to the tuning of all the parameters of the model. Solutions within the clusters are more robust to noise, thus enhancing the learning performance. This has inspired the design of new learning algorithms and has clarified the effectiveness of previously proposed ones. We further provide quantitative evidence that the gain achievable by considering a greater number of available synaptic states for encoding information is consistent only up to a very few bits, in line with the above-mentioned experimental results. Besides the challenge of low-precision synaptic connections, it is also known that the neuronal environment is extremely noisy; whether stochasticity can enhance or worsen learning performance is currently a matter of debate. In this work, we consider a neural network model where the synaptic connections are random variables, sampled according to a parametrized probability distribution. We prove that this source of stochasticity naturally drives the search towards regions of the phase space with high densities of solutions. These regions are directly accessible by means of gradient descent strategies over the parameters of the synaptic coupling distribution. We further set up a statistical physics analysis through which we show that solutions in the dense regions are characterized by robustness and good generalization performance. Stochastic neural networks are also capable of building abstract representations of input stimuli and then generating new input samples according to the inferred statistics of the input signal. In this regard, we propose a new learning rule, called Delayed Correlation Matching (DCM), which, by matching time-delayed activity correlations, makes a neural network able to store patterns of neuronal activity. When hidden neuronal states are considered, the DCM learning rule is also able to train Restricted Boltzmann Machines as generative models. We further require the DCM learning rule to fulfil biological constraints such as locality, sparseness of the neural coding, and Dale's principle. While retaining all these biological requirements, the DCM learning rule proves effective for different network topologies, both in on-line learning regimes and in the presence of correlated patterns. We further show that it is also able to prevent the creation of spurious attractor states.
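Since the abstract does not spell out the update rule, the following is only a schematic reconstruction of the DCM idea: weights are nudged so that the network's one-step (delayed) correlation with a clamped state matches the data's, which for pattern storage reduces to a delta-rule-like, local update. Network size, learning rate, and the tanh transfer function are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
N, P = 100, 10                        # neurons, patterns to store
xi = rng.choice([-1.0, 1.0], (P, N))  # random binary activity patterns
W = np.zeros((N, N))

# Schematic DCM step: nudge W so that the model's one-step delayed
# correlation <s(t+1) s(t)^T> matches the clamped (data) correlation;
# with patterns as fixed points, both time steps should show xi.
eta = 0.05
for epoch in range(200):
    for p in rng.permutation(P):
        s_prev = xi[p]                       # clamped state at time t
        s_model = np.tanh(W @ s_prev)        # free one-step prediction
        # correlation mismatch drives a local (pre/post) weight update
        W += eta * np.outer(xi[p] - s_model, s_prev) / N
np.fill_diagonal(W, 0.0)

# Recall test: a noisy cue should relax back onto the stored pattern.
s = np.where(rng.random(N) < 0.15, -xi[0], xi[0])
for _ in range(20):
    s = np.sign(W @ s)
print("overlap with stored pattern:", float(s @ xi[0]) / N)
```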

    Out of equilibrium Statistical Physics of learning

    Get PDF
    In the study of hard optimization problems, it is often infeasible to achieve full analytic control over the dynamics of the algorithmic processes that find solutions efficiently. In many cases, a static approach can provide considerable insight into the dynamical properties of these algorithms: in fact, the geometrical structures found in the energetic landscape can strongly affect the stationary states and the optimal configurations reached by the solvers. In this context, a classical Statistical Mechanics approach, relying on the assumption of the asymptotic realization of a Boltzmann-Gibbs equilibrium, can yield misleading predictions when the studied algorithms comprise stochastic components that effectively drive these processes out of equilibrium. It thus becomes necessary to develop some intuition about the relevant features of the studied phenomena and to build an ad hoc Large Deviation analysis, providing a more targeted and richer description of the geometrical properties of the landscape. The present thesis focuses on the study of learning processes in Artificial Neural Networks, with the aim of introducing an out-of-equilibrium statistical physics framework, based on the introduction of a local entropy potential, for supporting and inspiring algorithmic improvements in the field of Deep Learning, and for developing models of neural computation that can carry both biological and engineering interest.
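One widely used algorithmic face of a local entropy potential is replicated descent, where several coupled replicas bias the search toward dense, flat regions of solutions. The sketch below applies this to a toy perceptron as a hedged illustration of the idea, not the thesis's construction; the data model, coupling strength, and constants are all invented.

```python
import numpy as np

rng = np.random.default_rng(0)
N, P, R = 200, 300, 5                 # inputs, examples, replicas
X = rng.choice([-1.0, 1.0], (P, N))
teacher = rng.normal(size=N)
y = np.sign(X @ teacher)              # teacher-generated labels

# Replicated descent: R replicas each minimize their own perceptron loss
# plus an elastic coupling to the replica mean -- a crude proxy for
# ascending the local-entropy potential (wide, dense regions attract).
w = rng.normal(size=(R, N)) * 0.1
eta, gamma = 0.01, 0.1                # step size, coupling strength
for step in range(2000):
    center = w.mean(axis=0)
    for a in range(R):
        margins = y * (X @ w[a])
        viol = margins < 0                        # misclassified examples
        grad = -(y[viol, None] * X[viol]).sum(axis=0) / P
        w[a] -= eta * (grad + gamma * (w[a] - center))

center = w.mean(axis=0)
print(f"center train accuracy: {(np.sign(X @ center) == y).mean():.3f}")
```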