92,302 research outputs found
Graph theoretic methods for the analysis of structural relationships in biological macromolecules
Subgraph isomorphism and maximum common subgraph isomorphism algorithms from graph theory provide an effective and efficient way of identifying structural relationships between biological macromolecules. They thus provide a natural complement to the pattern matching algorithms that are used in bioinformatics to identify sequence relationships. Examples are provided of the use of graph theory to analyze proteins for which three-dimensional crystallographic or NMR structures are available, focusing on the use of the Bron-Kerbosch clique detection algorithm to identify common folding motifs and of the Ullmann subgraph isomorphism algorithm to identify patterns of amino acid residues. Our methods are also applicable to other types of biological macromolecule, such as carbohydrate and nucleic acid structures.
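The clique detection step named in the abstract can be illustrated with a minimal sketch of the basic Bron-Kerbosch recursion (without pivoting). The toy graph, its node labels, and the function name `bron_kerbosch` are assumptions made for illustration; the abstract does not describe how the authors build their graphs from protein structures.

```python
def bron_kerbosch(adj):
    """Return all maximal cliques of an undirected graph.

    adj maps each node to the set of its neighbours (hypothetical input format).
    """
    cliques = []

    def expand(r, p, x):
        # r: current clique, p: candidates that can extend it,
        # x: nodes already explored (used to avoid reporting duplicates).
        if not p and not x:
            cliques.append(r)
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return cliques


# Toy graph: a triangle A-B-C with a pendant edge C-D.
toy = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C"},
}
found = sorted(sorted(c) for c in bron_kerbosch(toy))
print(found)
```

In a maximum common subgraph setting, the algorithm would typically be run on a correspondence graph built from two structures rather than on a single structure graph; that construction is not shown here.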
A Factor Graph Approach to Automated GO Annotation
As the volume of genomic data grows, computational methods become essential for providing a first glimpse into gene annotations. Automated Gene Ontology (GO) annotation methods based on hierarchical ensemble classification techniques are particularly interesting when interpretability of annotation results is a main concern. In these methods, raw GO-term predictions computed by base binary classifiers are leveraged by checking the consistency of predefined GO relationships. Both formal leveraging strategies, with a main focus on annotation precision, and heuristic alternatives, with a main focus on scalability issues, have been described in the literature. In this contribution, a factor graph approach to the hierarchical ensemble formulation of the automated GO annotation problem is presented. In this formal framework, a core factor graph is first built based on the GO structure and then enriched to take into account the noisy nature of GO-term predictions. Hence, starting from raw GO-term predictions, an iterative message passing algorithm between nodes of the factor graph is used to compute marginal probabilities of target GO-terms. Evaluations on Saccharomyces cerevisiae, Arabidopsis thaliana and Drosophila melanogaster protein sequences from the GO Molecular Function domain showed significant improvements over competing approaches, even when protein sequences were naively characterized by their physicochemical and secondary structure properties or when loose noisy annotation datasets were considered. Based on these promising results and using Arabidopsis thaliana annotation data, we extend our approach to the identification of the most promising molecular function annotations for a set of proteins of unknown function in Solanum lycopersicum.
Affiliation: Spetale, Flavio Ezequiel. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Rosario. Centro Internacional Franco Argentino de Ciencias de la Información y de Sistemas. Universidad Nacional de Rosario. Centro Internacional Franco Argentino de Ciencias de la Información y de Sistemas; Argentina
Affiliation: Krsticevic, Flavia Jorgelina. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Rosario. Centro Internacional Franco Argentino de Ciencias de la Información y de Sistemas. Universidad Nacional de Rosario. Centro Internacional Franco Argentino de Ciencias de la Información y de Sistemas; Argentina
Affiliation: Roda, Fernando. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Rosario. Centro Internacional Franco Argentino de Ciencias de la Información y de Sistemas. Universidad Nacional de Rosario. Centro Internacional Franco Argentino de Ciencias de la Información y de Sistemas; Argentina
Affiliation: Bulacio, Pilar Estela. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Rosario. Centro Internacional Franco Argentino de Ciencias de la Información y de Sistemas. Universidad Nacional de Rosario. Centro Internacional Franco Argentino de Ciencias de la Información y de Sistemas; Argentina
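To make the message passing idea in this abstract concrete, the following is a minimal sketch of sum-product on a toy hierarchy with one parent GO term and several child terms. It assumes a hard "true path" consistency factor (a child cannot be annotated unless its parent is) and treats each classifier score as a unary factor; these modelling choices, the function names, and the scores are illustrative assumptions and not the authors' actual factor graph. On a tree like this a single upward and downward pass is exact, so the result is checked against brute-force enumeration.

```python
from itertools import product

# psi[x_parent][x_child]: the zero forbids an annotated child (1)
# under an unannotated parent (0). Assumed consistency rule.
TRUE_PATH = [[1.0, 0.0],
             [1.0, 1.0]]


def unary(score):
    """Unary factor built from a raw classifier score in [0, 1]."""
    return [1.0 - score, score]


def sum_product_star(parent_score, child_scores):
    """Marginal P(term = 1) for the parent followed by each child."""
    phi_p = unary(parent_score)
    phi_c = [unary(s) for s in child_scores]
    # Upward messages: m_{c->p}(x_p) = sum_{x_c} psi[x_p][x_c] * phi_c[x_c]
    up = [[sum(TRUE_PATH[xp][xc] * phi[xc] for xc in (0, 1)) for xp in (0, 1)]
          for phi in phi_c]
    belief_p = list(phi_p)
    for m in up:
        belief_p = [belief_p[x] * m[x] for x in (0, 1)]
    marginals = [belief_p[1] / sum(belief_p)]
    for i, phi in enumerate(phi_c):
        # Parent belief with child i's own upward message left out.
        cavity = list(phi_p)
        for j, m in enumerate(up):
            if j != i:
                cavity = [cavity[x] * m[x] for x in (0, 1)]
        # Downward message: m_{p->c}(x_c) = sum_{x_p} psi[x_p][x_c] * cavity[x_p]
        down = [sum(TRUE_PATH[xp][xc] * cavity[xp] for xp in (0, 1))
                for xc in (0, 1)]
        belief_c = [phi[x] * down[x] for x in (0, 1)]
        marginals.append(belief_c[1] / sum(belief_c))
    return marginals


def brute_force_marginals(parent_score, child_scores):
    """Reference marginals by enumerating every joint assignment."""
    phis = [unary(parent_score)] + [unary(s) for s in child_scores]
    n = len(phis)
    total, on = 0.0, [0.0] * n
    for x in product((0, 1), repeat=n):
        w = 1.0
        for i, xi in enumerate(x):
            w *= phis[i][xi]
            if i > 0:
                w *= TRUE_PATH[x[0]][xi]
        total += w
        for i, xi in enumerate(x):
            if xi:
                on[i] += w
    return [v / total for v in on]


# Hypothetical scores: weak parent prediction, one strong and one weak child.
result = sum_product_star(0.3, [0.9, 0.2])
reference = brute_force_marginals(0.3, [0.9, 0.2])
print(result)
```

With these assumed factors, a strong child prediction raises the parent's marginal above its raw score, which is the kind of consistency-driven correction the abstract describes in general terms.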
Probabilistic Graphical Model Representation in Phylogenetics
Recent years have seen a rapid expansion of the model space explored in
statistical phylogenetics, emphasizing the need for new approaches to
statistical model representation and software development. Clear communication
and representation of the chosen model is crucial for: (1) reproducibility of
an analysis, (2) model development and (3) software design. Moreover, a
unified, clear and understandable framework for model representation lowers the
barrier for beginners and non-specialists to grasp complex phylogenetic models,
including their assumptions and parameter/variable dependencies.
Graphical modeling is a unifying framework that has gained in popularity in
the statistical literature in recent years. The core idea is to break complex
models into conditionally independent distributions. The strength lies in the
comprehensibility, flexibility, and adaptability of this formalism, and the
large body of computational work based on it. Graphical models are well-suited
to teach statistical models, to facilitate communication among phylogeneticists
and in the development of generic software for simulation and statistical
inference.
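As a brief illustration of breaking a model into conditionally independent distributions, a directed graphical model factorizes the joint density into one conditional term per node given its parents. The notation below (pa(x_i) for the parent set, and θ, τ, D for substitution parameters, tree, and sequence data) is introduced here as an assumption for illustration and is not taken from the abstract.

```latex
p(x_1, \dots, x_n) = \prod_{i=1}^{n} p\bigl(x_i \mid \mathrm{pa}(x_i)\bigr),
\qquad \text{for example} \qquad
p(\theta, \tau, D) = p(\theta)\, p(\tau)\, p(D \mid \theta, \tau).
```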
Here, we provide an introduction to graphical models for phylogeneticists and
extend the standard graphical model representation to the realm of
phylogenetics. We introduce a new graphical model component, tree plates, to
capture the changing structure of the subgraph corresponding to a phylogenetic
tree. We describe a range of phylogenetic models using the graphical model
framework and introduce modules to simplify the representation of standard
components in large and complex models. Phylogenetic model graphs can be
readily used in simulation, maximum likelihood inference, and Bayesian
inference using, for example, Metropolis-Hastings or Gibbs sampling of the
posterior distribution.
Graph Theory and Networks in Biology
In this paper, we present a survey of the use of graph theoretical techniques
in Biology. In particular, we discuss recent work on identifying and modelling
the structure of bio-molecular networks, as well as the application of
centrality measures to interaction networks and research on the hierarchical
structure of such networks and network motifs. Work on the link between
structural network properties and dynamics is also described, with emphasis on
synchronization and disease propagation.
Comment: 52 pages, 5 figures, Survey Paper
Fine Structure of Viral dsDNA Encapsidation
In vivo configurations of dsDNA of bacteriophage viruses in a capsid are
known to form hexagonal chromonic liquid crystal phases. This article studies
the liquid crystal ordering of viral dsDNA in an icosahedral capsid, combining
the chromonic model with that of liquid crystals with variable degree of
orientation. The scalar order parameter of the latter allows us to distinguish
regions of the capsid with well-ordered DNA from the disordered central core.
We employ a state-of-the-art numerical algorithm based on the finite element
method to find equilibrium states of the encapsidated DNA and calculate the
corresponding pressure. With a data-oriented parameter selection strategy, the
method yields phase spaces of the pressure and the radius of the disordered
core, in terms of relevant dimensionless parameters, making the proposed
algorithm a preliminary tool for bacteriophage design. The order parameter
also plays the distinctive role of allowing for non-smooth capsid
domains and of accounting for knot locations of the DNA.
Assortative mixing in close-packed spatial networks
Background
In recent years, interest has grown in expressing complex systems as networks of interacting nodes. Using descriptors from graph theory, it has been possible to classify many diverse systems derived from social and physical sciences alike. In particular, folded proteins as examples of self-assembled complex molecules have also been investigated intensely using these tools. However, we need to develop additional measures to classify different systems, in order to dissect the underlying hierarchy.
Methodology and Principal Findings
In this study, a general analytical relation for the dependence of nearest neighbor degree correlations on degree is derived. Dependence of local clustering on degree is shown to be the sole determining factor of assortative versus disassortative mixing in networks. The characteristics of networks constructed from spatial atomic/molecular systems exemplified by self-organized residue networks built from folded protein structures and block copolymers, atomic clusters and well-compressed polymeric melts are studied. Distributions of statistical properties of the networks are presented. For these densely-packed systems, assortative mixing in the network construction is found to apply, and conditions are derived for a simple linear dependence.
Conclusions
Our analyses (i) reveal patterns that are common to close-packed clusters of atoms/molecules, (ii) identify the type of surface effects prominent in different close-packed systems, and (iii) associate fingerprints that may be used to classify networks with varying types of correlations.
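The nearest neighbor degree correlations discussed in this abstract are commonly summarized as the average neighbor degree of nodes with a given degree k. The sketch below computes that quantity from an edge list; the function name and the star-shaped toy network are assumptions for illustration, and the abstract's analytical relation is not reproduced here. A curve that rises with k indicates assortative mixing, one that falls indicates disassortative mixing.

```python
from collections import defaultdict


def average_neighbor_degree_by_degree(edges):
    """Map each degree k to the mean neighbour degree of nodes with degree k."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    degree = {node: len(nbrs) for node, nbrs in adj.items()}

    sums = defaultdict(float)
    counts = defaultdict(int)
    for node, nbrs in adj.items():
        k = degree[node]
        # Mean degree of this node's neighbours.
        sums[k] += sum(degree[m] for m in nbrs) / k
        counts[k] += 1
    return {k: sums[k] / counts[k] for k in sums}


# Toy star network: one hub joined to four leaves (strongly disassortative).
star = [("hub", f"leaf{i}") for i in range(4)]
knn = average_neighbor_degree_by_degree(star)
print(knn)
```

For a residue network, the edge list would come from a distance cutoff between residues; that construction is outside the scope of this sketch.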
HID-1 controls formation of large dense core vesicles by influencing cargo sorting and trans-Golgi network acidification
Large dense core vesicles (LDCVs) mediate the regulated release of neuropeptides and peptide hormones. They form at the trans-Golgi network (TGN), where their soluble content aggregates to form a dense core, but the mechanisms controlling biogenesis are still not completely understood. Recent studies have implicated the peripheral membrane protein HID-1 in neuropeptide sorting and insulin secretion. Using CRISPR/Cas9, we generated HID-1 KO rat neuroendocrine cells, and we show that the absence of HID-1 results in specific defects in peptide hormone and monoamine storage and regulated secretion. Loss of HID-1 causes a reduction in the number of LDCVs and affects their morphology and biochemical properties, due to impaired cargo sorting and dense core formation. HID-1 KO cells also exhibit defects in TGN acidification together with mislocalization of the Golgi-enriched vacuolar H+-ATPase subunit isoform a2. We propose that HID-1 influences early steps in LDCV formation by controlling dense core formation at the TGN.
Bridging Physics and Biology Teaching through Modeling
As the frontiers of biology become increasingly interdisciplinary, the
physics education community has engaged in ongoing efforts to make physics
classes more relevant to life sciences majors. These efforts are complicated by
the many apparent differences between these fields, including the types of
systems that each studies, the behavior of those systems, the kinds of
measurements that each makes, and the role of mathematics in each field.
Nonetheless, physics and biology are both sciences that rely on observations
and measurements to construct models of the natural world. In the present
theoretical article, we propose that efforts to bridge the teaching of these
two disciplines must emphasize shared scientific practices, particularly
scientific modeling. We define modeling using language common to both
disciplines and highlight how an understanding of the modeling process can help
reconcile apparent differences between the teaching of physics and biology. We
elaborate how models can be used for explanatory, predictive, and functional
purposes and present common models from each discipline demonstrating key
modeling principles. By framing interdisciplinary teaching in the context of
modeling, we aim to bridge physics and biology teaching and to equip students
with modeling competencies applicable across any scientific discipline.
Comment: 10 pages, 2 figures, 3 tables