BacillOndex: An Integrated Data Resource for Systems and Synthetic Biology
BacillOndex is an extension of the Ondex data integration system, providing a semantically annotated, integrated knowledge base for the model Gram-positive bacterium Bacillus subtilis. This application allows a user to mine a variety of B. subtilis data sources, and analyse the resulting integrated dataset, which contains data about genes, gene products and their interactions. The data can be analysed either manually, by browsing using Ondex, or computationally via a Web services interface. We describe the process of creating a BacillOndex instance, and the use of the system for the analysis of single nucleotide polymorphisms in B. subtilis Marburg. The Marburg strain is the progenitor of the widely used laboratory strain B. subtilis 168. We identified 27 SNPs with predictable phenotypic effects, including genetic traits for known phenotypes. We conclude that BacillOndex is a valuable tool for the systems-level investigation of, and hypothesis generation about, this important biotechnology workhorse. Such understanding contributes to our ability to construct synthetic genetic circuits in this organism.
Integration and mining of malaria molecular, functional and pharmacological data: how far are we from a chemogenomic knowledge space?
The organization and mining of malaria genomic and post-genomic data is
highly motivated by the necessity to predict and characterize new biological
targets and new drugs. Biological targets are sought in a biological space
designed from the genomic data from Plasmodium falciparum, but using also the
millions of genomic data from other species. Drug candidates are sought in a
chemical space containing the millions of small molecules stored in public and
private chemolibraries. Data management should therefore be as reliable and
versatile as possible. In this context, we examined five aspects of the
organization and mining of malaria genomic and post-genomic data: 1) the
comparison of protein sequences including compositionally atypical malaria
sequences, 2) the high throughput reconstruction of molecular phylogenies, 3)
the representation of biological processes particularly metabolic pathways, 4)
the versatile methods to integrate genomic data, biological representations and
functional profiling obtained from X-omic experiments after drug treatments and
5) the determination and prediction of protein structures and their molecular
docking with drug candidate structures. Progress toward a grid-enabled
chemogenomic knowledge space is discussed. Comment: 43 pages, 4 figures, to appear in Malaria Journal
Infinite factorization of multiple non-parametric views
Combined analysis of multiple data sources has increasing application interest, in particular for distinguishing shared and source-specific aspects. We extend this rationale of classical canonical correlation analysis into a flexible, generative and non-parametric clustering setting, by introducing a novel non-parametric hierarchical mixture model. The lower level of the model describes each source with a flexible non-parametric mixture, and the top level combines these to describe commonalities of the sources. The lower-level clusters arise from hierarchical Dirichlet Processes, inducing an infinite-dimensional contingency table between the views. The commonalities between the sources are modeled by an infinite block model of the contingency table, interpretable as non-negative factorization of infinite matrices, or as a prior for infinite contingency tables. With Gaussian mixture components plugged in for continuous measurements, the model is applied to two views of genes, mRNA expression and abundance of the produced proteins, to expose groups of genes that are co-regulated in either or both of the views.
Cluster analysis of co-expression is a standard simple way of screening for co-regulation, and the two-view analysis extends the approach to distinguishing between pre- and post-translational regulation.
Systems biology in inflammatory bowel diseases
Purpose of review: Ulcerative colitis (UC) and Crohn’s disease (CD) are the two predominant types of inflammatory bowel disease (IBD), affecting over 1.4 million individuals in the US. IBD results from complex interactions between pathogenic components, including genetic and epigenetic factors, the immune response and the microbiome, through an unknown sequence of events. The purpose of this review is to describe a systems biology approach to IBD as a novel methodology aimed at developing new IBD therapeutics based on the integration of molecular and cellular "omics" data. Recent findings: Recent evidence suggests the presence of genetic, epigenetic, transcriptomic, proteomic and metabolomic alterations in IBD patients. Furthermore, several studies have shown that different cell types, including fibroblasts, epithelial, immune and endothelial cells, together with the intestinal microbiota, are involved in IBD pathogenesis. Novel computational methodologies have been developed to integrate high-throughput molecular data. Summary: A systems biology approach could potentially identify the central regulators (hubs) in the IBD interactome and improve our understanding of the molecular mechanisms involved in IBD pathogenesis. Future IBD therapeutics should be developed on the basis of targeting the central hubs in the IBD network.
Telomeres in Evolution and Development from Biosemiotic Perspective
Telomeres identify natural chromosome ends as being different from broken DNA through differences in their "molecular syntax" (M. Eigen), which determines the functions of reverse transcriptase and its integrated RNA template, telomerase. Although telomeres play a crucial role in the linear chromosome organization of eukaryotic cells, their molecular syntax descended from an ancient retroviral competence. This indicates an early retroviral colonization of large double-stranded DNA viruses, which are putative ancestors of the eukaryotic nucleus.
This talk will demonstrate certain advantages of the biosemiotic approach towards our evolutionary understanding of telomeres: a focus on genetic/genomic structures as language-like text that follows combinatorial (syntactic), context-sensitive (pragmatic) and content-specific (semantic) semiotic rules. From the biosemiotic perspective, genetic/genomic organization is no longer seen as an object of randomly derived alterations (mutations) but as functional innovation coherent with the broad variety of natural genome editing competences of viruses.
