The cavity approach for Steiner trees packing problems
The Belief Propagation approximation, or cavity method, has been recently
applied to several combinatorial optimization problems in its zero-temperature
implementation, the max-sum algorithm. In particular, recent developments to
solve the edge-disjoint paths problem and the prize-collecting Steiner tree
problem on graphs have shown remarkable results for several classes of graphs
and for benchmark instances. Here we propose a generalization of these
techniques for two variants of the Steiner trees packing problem where multiple
"interacting" trees have to be sought within a given graph. Depending on the
interaction among trees we distinguish the vertex-disjoint Steiner trees
problem, where trees cannot share nodes, from the edge-disjoint Steiner trees
problem, where edges cannot be shared by trees but nodes can be members of
multiple trees. Several practical problems of huge interest in network design
can be mapped into these two variants, for instance, the physical design of
Very Large Scale Integration (VLSI) chips. The formalism described here relies
on two-component edge variables that allow us to formulate a message-passing
algorithm for the vertex-disjoint variant (V-DStP) and two algorithms for the
edge-disjoint variant (E-DStP) that differ in how their computational time
scales with some relevant parameters. We will show that one of the two
formalisms used for the edge-disjoint variant allows us to map the max-sum
update equations into a weighted maximum matching problem over appropriate
bipartite graphs. We developed a heuristic procedure based
on the max-sum equations that shows excellent performance in synthetic networks
(in particular outperforming standard multi-step greedy procedures by large
margins) and on large benchmark instances of VLSI for which the optimal
solution is known, on which the algorithm found the optimum in two cases and
the gap to optimality was never larger than 4%.
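As an illustration of the matching subproblem mentioned above, the sketch below solves a weighted maximum matching on a small bipartite graph by brute force. This is a toy stand-in for the subproblem the max-sum updates reduce to, not the authors' implementation, and the weight matrix is made up:

```python
from itertools import permutations

def max_weight_matching(weights):
    """Brute-force maximum-weight perfect matching on a small
    bipartite graph given as an n x n weight matrix."""
    n = len(weights)
    best_value, best_perm = float("-inf"), None
    for perm in permutations(range(n)):
        value = sum(weights[i][perm[i]] for i in range(n))
        if value > best_value:
            best_value, best_perm = value, perm
    return best_value, best_perm

# Toy 3x3 instance: rows are left vertices, columns right vertices.
W = [[3, 1, 2],
     [2, 4, 6],
     [5, 2, 1]]
value, perm = max_weight_matching(W)
print(value, perm)  # best total weight and the chosen assignment
```

Brute force is exponential in the number of vertices; in practice one would use the Hungarian algorithm for this step, but the factorial search keeps the toy self-contained.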
Statistical mechanics approaches to optimization and inference
Nowadays, methodologies typical of statistical physics are successfully applied to a huge set of problems arising from different research fields. In this thesis I will propose several statistical-mechanics-based models able to deal with two types of problems: optimization and inference. The intrinsic difficulty characterizing both is that, due to their hard combinatorial nature, finding exact solutions would require impractical computations: the time needed, in almost all cases, scales exponentially with relevant parameters of the system and thus the calculations cannot be accomplished in practice. While combinatorial optimization addresses the problem of finding a configuration of variables that minimizes or maximizes an objective function, inference seeks a posteriori the most probable assignment of a set of variables given partial knowledge of the system. Both problems can be re-phrased in a statistical mechanics framework where elementary components of a physical system interact according to the constraints of the original problem. The information at our disposal can be encoded in the Boltzmann distribution of the new variables which, if properly investigated, can provide the solutions to the original problems. As a consequence, the methodologies originally adopted in statistical mechanics to study and, eventually, approximate the Boltzmann distribution can be fruitfully applied to solving inference and optimization problems.
The structure of the thesis follows the path covered during the three years of my Ph.D. At first, I will propose a set of combinatorial optimization problems on graphs, the Prize-Collecting and the Packing of Steiner Trees problems. The tools used to face these hard problems rely on the zero-temperature implementation of the Belief Propagation algorithm, called the Max-Sum algorithm. The second set of problems proposed in this thesis falls under the name of linear estimation problems. One of them, the compressed sensing problem, will guide us in the modelling of these problems within a Bayesian framework, along with the introduction of a powerful algorithm known as Expectation Propagation (or Expectation Consistent in statistical physics). I will propose a similar approach to other challenging problems: the inference of metabolic fluxes, the inverse problem of electro-encephalography, and the reconstruction of tomographic images.
An analytic approximation of the feasible space of metabolic networks
Assuming a steady-state condition within a cell, metabolic fluxes satisfy an
under-determined linear system of stoichiometric equations. Characterizing the
space of fluxes that satisfy such equations along with given bounds (and
possibly additional relevant constraints) is considered of utmost importance
for the understanding of cellular metabolism. Extreme values for each
individual flux can be computed with Linear Programming (as Flux Balance
Analysis), and their marginal distributions can be approximately computed with
Monte-Carlo sampling. Here we present an approximate analytic method for the
latter task based on Expectation Propagation equations that does not involve
sampling and can achieve much better predictions than other existing analytic
methods. The method is iterative, and its computation time is dominated by one
matrix inversion per iteration. With respect to sampling, we show through
extensive simulation that it has some advantages including computation time,
and the ability to efficiently fix empirically estimated distributions of
fluxes.
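As a minimal illustration of the sampling task the method above is compared against, the sketch below rejection-samples the feasible flux polytope of a made-up one-metabolite toy network and estimates a marginal mean. This is plain Monte-Carlo on an invented example, not the Expectation Propagation scheme itself:

```python
import random

random.seed(0)

# Toy network: one metabolite, three fluxes with the stoichiometric
# balance v1 = v2 + v3 and bounds 0 <= v_i <= 1.  Rejection-sample the
# feasible polytope and estimate the marginal mean of v2.
samples = []
while len(samples) < 20000:
    v2, v3 = random.random(), random.random()
    v1 = v2 + v3                  # impose the steady-state balance
    if v1 <= 1.0:                 # keep only fluxes within bounds
        samples.append((v1, v2, v3))

mean_v2 = sum(s[1] for s in samples) / len(samples)
print(round(mean_v2, 2))  # marginal mean of v2 over the feasible region
```

On this triangle-shaped feasible region the marginal of v2 has density 2(1 - v2), so the estimate should approach 1/3; rejection sampling becomes impractical in high dimension, which is exactly why analytic approximations like the one in the abstract are of interest.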
Compressed sensing reconstruction using Expectation Propagation
Many interesting problems in fields ranging from telecommunications to
computational biology can be formalized in terms of large underdetermined
systems of linear equations with additional constraints or regularizers. One of
the most studied ones, the Compressed Sensing problem (CS), consists in finding
the solution with the smallest number of non-zero components of a given system
of linear equations for a known measurement vector and sensing matrix. Here, we
will address the compressed sensing problem within a Bayesian inference
framework where the sparsity constraint is remapped into a singular prior
distribution (called Spike-and-Slab or Bernoulli-Gauss). Solution to the
problem is attempted through the computation of marginal distributions via
Expectation Propagation (EP), an iterative computational scheme originally
developed in Statistical Physics. We will show that this strategy is
comparatively more accurate than the alternatives in solving instances of CS
generated from statistically correlated measurement matrices. For computational
strategies based on the Bayesian framework such as variants of Belief
Propagation, this is to be expected, as they implicitly rely on the hypothesis
of statistical independence among the entries of the sensing matrix. Perhaps
surprisingly, the method also uniformly outperforms all the other
state-of-the-art methods in our tests.
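The Spike-and-Slab (Bernoulli-Gauss) prior mentioned above can be sketched as follows; the Dirac spike is smoothed to a narrow Gaussian for numerical evaluation, and all parameter values are illustrative:

```python
import math

def spike_and_slab_logpdf(x, rho=0.1, sigma=1.0, eps=1e-3):
    """Log-density of a Bernoulli-Gauss (spike-and-slab) prior: with
    probability (1 - rho) the entry sits in a narrow 'spike' at zero
    (the delta is smoothed to width eps), with probability rho it is
    drawn from a zero-mean Gaussian 'slab' of width sigma."""
    def normal_pdf(x, s):
        return math.exp(-0.5 * (x / s) ** 2) / (s * math.sqrt(2 * math.pi))
    return math.log((1 - rho) * normal_pdf(x, eps) + rho * normal_pdf(x, sigma))

def log_prior(vec):
    return sum(spike_and_slab_logpdf(x) for x in vec)

sparse = [0.0, 0.0, 0.0, 1.5]         # mostly-zero vector
dense  = [0.75, 0.75, 0.75, 0.75]     # same L2 norm spread over all entries
print(log_prior(sparse) > log_prior(dense))  # the prior favours sparsity
```

Under such a prior, posterior marginals concentrate mass on mostly-zero solutions, which is how the combinatorial sparsity constraint is remapped into a probabilistic one.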
DCAlign v1.0: Aligning biological sequences using co-evolution models and informed priors
DCAlign is a new alignment method able to cope with the conservation and the
co-evolution signals that characterize the columns of multiple sequence
alignments of homologous sequences. However, the pre-processing steps required
to align a candidate sequence are computationally demanding. In v1.0, we show
how to dramatically reduce the overall computing time by including an empirical
prior over an informative set of variables mirroring the presence of insertions
and deletions.
Optimal metabolic strategies for microbial growth in stationary random environments
In order to grow in any given environment, bacteria need to collect
information about the medium composition and implement suitable growth
strategies by adjusting their regulatory and metabolic degrees of freedom. In
the standard sense, optimal strategy selection is achieved when bacteria grow
at the fastest rate possible in that medium. While this view of optimality is
well suited for cells that have perfect knowledge about their surroundings
(e.g. nutrient levels), things are more involved in uncertain or fluctuating
conditions, especially when changes occur over timescales comparable to (or
shorter than) those required to organize a response. Information theory, however,
provides recipes for how cells can choose the optimal growth strategy under
uncertainty about the stress levels they will face. Here we analyse the
theoretically optimal scenarios for a coarse-grained, experiment-inspired model
of bacterial metabolism for growth in a medium described by the (static)
probability density of a single variable (the 'stress level'). We show that
heterogeneity in growth rates consistently emerges as the optimal response when
the environment is sufficiently complex and/or when perfect adjustment of
metabolic degrees of freedom is not possible (e.g. due to limited resources).
In addition, outcomes close to those achievable with unlimited resources are
often attained effectively with a modest amount of fine-tuning. In other terms,
heterogeneous population structures in complex media may be rather robust with
respect to the amounts of cellular resources available to probe the environment
and adjust reaction rates.
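The intuition that a heterogeneous population can beat any pure strategy in an uncertain environment can be sketched with a Kelly-style bet-hedging toy model. The numbers below are my own illustration, not the paper's coarse-grained metabolic model:

```python
import math

# Two environments (equally likely) and two metabolic phenotypes whose
# growth factors favour opposite environments.
p_env = [0.5, 0.5]
growth = [[2.0, 0.5],   # phenotype 0: fast in env 0, slow in env 1
          [0.5, 2.0]]   # phenotype 1: the mirror image

def long_term_rate(f):
    """Expected log-growth of a population with fraction f of phenotype 0."""
    return sum(p * math.log(f * growth[0][e] + (1 - f) * growth[1][e])
               for e, p in enumerate(p_env))

# Grid search over the population composition.
best_f = max((i / 100 for i in range(101)), key=long_term_rate)
print(best_f)  # a heterogeneous mix (f = 0.5) beats either pure strategy
```

Either pure strategy has a long-term log-growth of exactly zero here, while the 50/50 mix grows at log(1.25) per step, showing how heterogeneity emerges as the optimal response to environmental uncertainty.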
Non-convex image reconstruction via Expectation Propagation
Tomographic image reconstruction can be mapped to a problem of finding
solutions to a large system of linear equations which maximize a function that
includes a priori knowledge regarding features of typical images such
as smoothness or sharpness. This maximization can be performed with standard
local optimization tools when the function is concave, but it is generally
intractable for realistic priors, which are non-concave. We introduce a new
method to reconstruct images obtained from Radon projections by using
Expectation Propagation, which allows us to reframe the problem from a
Bayesian inference perspective. We show, by means of extensive simulations,
that, compared to state-of-the-art algorithms for this task, Expectation
Propagation, paired with very simple but non-log-concave priors, is often able
to reconstruct images with smaller error while using less information per
pixel. We provide estimates for the critical rate of
information per pixel above which recovery is error-free by means of
simulations on ensembles of phantom and real images.
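Why priors are indispensable at low information rates can be seen in a drastically simplified version of the problem, where a 2x2 image is observed only through its row and column sums (a toy stand-in for Radon projections, not the actual reconstruction pipeline):

```python
# A 2x2 "image" observed only through its row and column sums.
def projections(img):
    rows = [sum(r) for r in img]
    cols = [sum(r[j] for r in img) for j in range(2)]
    return rows + cols

img_a = [[1, 0],
         [0, 1]]
img_b = [[0, 1],
         [1, 0]]
print(projections(img_a) == projections(img_b))  # True: data alone cannot decide
```

Both images produce identical projection data, so the linear system is underdetermined and only a prior over typical images (smoothness, sharpness, etc.) can break the tie; this is precisely the role of the non-concave priors in the abstract.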
Aligning biological sequences by exploiting residue conservation and coevolution
Sequences of nucleotides (for DNA and RNA) or amino acids (for proteins) are
central objects in biology. Among the most important computational problems is
that of sequence alignment, i.e. arranging sequences from different organisms
in such a way to identify similar regions, to detect evolutionary relationships
between sequences, and to predict biomolecular structure and function. This is
typically addressed through profile models, which capture
position specificities like conservation in sequences, but assume that
different positions evolve independently. Over recent years, it has been
well established that coevolution of different amino-acid positions is
essential for maintaining three-dimensional structure and function. Modeling
approaches based on inverse statistical physics can capture the coevolution
signal in sequence ensembles; and they are now widely used in predicting
protein structure, protein-protein interactions, and mutational landscapes.
Here, we present DCAlign, an efficient alignment algorithm based on an
approximate message-passing strategy, which is able to overcome the limitations
of profile models, to include coevolution among positions in a general way, and
to be therefore universally applicable to protein- and RNA-sequence alignment
without the need of using complementary structural information. The potential
of DCAlign is carefully explored using well-controlled simulated data, as well
as real protein and RNA sequences.
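The difference between profile models and coupling-aware (Potts/DCA-style) scoring can be sketched on a two-column toy alignment. The fields, couplings, and two-letter alphabet below are invented purely for illustration:

```python
# Toy scoring of aligned sequences: a profile model uses only per-column
# fields h, while a DCA-style (Potts) model adds pairwise couplings J
# between columns.
h = [{"A": 1.0, "C": 0.0},      # column 0: the profile prefers A
     {"A": 0.0, "C": 1.0}]      # column 1: the profile prefers C
J = {(0, 1): {("A", "A"): 0.0, ("A", "C"): 0.0,
              ("C", "A"): 2.0, ("C", "C"): 0.0}}  # reward the covarying pair C-A

def profile_score(seq):
    return sum(h[i][a] for i, a in enumerate(seq))

def potts_score(seq):
    pair = sum(c[(seq[i], seq[j])] for (i, j), c in J.items())
    return profile_score(seq) + pair

print(profile_score("AC"), potts_score("CA"))
```

The covarying sequence "CA" scores zero under the profile model but as high as the consensus "AC" under the Potts score, which is the kind of coevolution signal a profile model cannot see.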
Cannabinoids and their therapeutic applications in mental disorders
Relationship between fitness and heterogeneity in exponentially growing microbial populations
Despite major environmental and genetic differences, microbial metabolic networks are known to generate consistent physiological outcomes across vastly different organisms. This remarkable robustness suggests that, at least in bacteria, metabolic activity may be guided by universal principles. The constrained optimization of evolutionarily motivated objective functions, such as the growth rate, has emerged as the key theoretical assumption for the study of bacterial metabolism. While conceptually and practically useful in many situations, the idea that certain functions are optimized is hard to validate in data. Moreover, it is not always clear how optimality can be reconciled with the high degree of single-cell variability observed in experiments within microbial populations. To shed light on these issues, we develop an inverse modeling framework that connects the fitness of a population of cells (represented by the mean single-cell growth rate) to the underlying metabolic variability through the maximum entropy inference of the distribution of metabolic phenotypes from data. While no clear objective function emerges, we find that, as the medium gets richer, the fitness and inferred variability for Escherichia coli populations follow and slowly approach the theoretically optimal bound defined by the minimal reduction of variability at given fitness. These results suggest that bacterial metabolism may be crucially shaped by a population-level trade-off between growth and heterogeneity.
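The maximum-entropy distribution over discrete metabolic phenotypes at fixed mean growth takes a Boltzmann form, p_k proportional to exp(beta * lambda_k). The sketch below, with made-up growth rates, shows how the single parameter beta interpolates between a uniform, maximally heterogeneous population and one concentrated on the fastest phenotype:

```python
import math

# Illustrative single-cell growth rates lambda_k for three phenotypes.
rates = [0.1, 0.5, 0.9]

def mean_growth(beta):
    """Population mean growth under the maximum-entropy (Boltzmann)
    distribution p_k = exp(beta * lambda_k) / Z."""
    weights = [math.exp(beta * lam) for lam in rates]
    Z = sum(weights)
    return sum(w * lam for w, lam in zip(weights, rates)) / Z

print(round(mean_growth(0.0), 2))   # uniform average of the rates
print(round(mean_growth(50.0), 2))  # approaches the optimal rate 0.9
```

Fitting beta to the observed mean growth is the essence of the inverse-modeling idea: the inferred distribution then quantifies how much heterogeneity is compatible with the measured fitness.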