63 research outputs found

    Emerging semantics to link phenotype and environment

    Understanding the interplay between environmental conditions and phenotypes is a fundamental goal of biology. Unfortunately, data that include observations on phenotype and environment are highly heterogeneous and thus difficult to find and integrate. One approach that is likely to improve the status quo involves the use of ontologies to standardize and link data about phenotypes and environments. Specifying and linking data through ontologies will allow researchers to increase the scope and flexibility of large-scale analyses aided by modern computing methods. Investments in this area would advance diverse fields such as ecology, phylogenetics, and conservation biology. While several biological ontologies are well-developed, using them to link phenotypes and environments is rare because of gaps in ontological coverage and limits to interoperability among ontologies and disciplines. In this manuscript, we present (1) use cases from diverse disciplines to illustrate questions that could be answered more efficiently using a robust linkage between phenotypes and environments, (2) two proof-of-concept analyses that show the value of linking phenotypes to environments in fishes and amphibians, and (3) two proposed example data models for linking phenotypes and environments using the extensible observation ontology (OBOE) and the Biological Collections Ontology (BCO); these provide a starting point for the development of a data model linking phenotypes and environments. The final version of this article, as published in PeerJ, can be viewed online at: https://peerj.com/articles/1470

    Genome sequence of adherent-invasive Escherichia coli and comparative genomic analysis with other E. coli pathotypes

    Background: Adherent and invasive Escherichia coli (AIEC) are commonly found in ileal lesions of Crohn's Disease (CD) patients, where they adhere to intestinal epithelial cells and invade into and survive in epithelial cells and macrophages, thereby gaining access to a typically restricted host niche. Colonization leads to strong inflammatory responses in the gut suggesting that AIEC could play a role in CD immunopathology. Despite extensive investigation, the genetic determinants accounting for the AIEC phenotype remain poorly defined. To address this, we present the complete genome sequence of an AIEC, revealing the genetic blueprint for this disease-associated E. coli pathotype. Results: We sequenced the complete genome of E. coli NRG857c (O83:H1), a clinical isolate of AIEC from the ileum of a Crohn's Disease patient. Our sequence data confirmed a phylogenetic linkage between AIEC and extraintestinal pathogenic E. coli causing urinary tract infections and neonatal meningitis. The comparison of the NRG857c AIEC genome with other pathogenic and commensal E. coli allowed for the identification of unique genetic features of the AIEC pathotype, including 41 genomic islands, and unique genes that are found only in strains exhibiting the adherent and invasive phenotype. Conclusions: Up to now, the virulence-like features associated with AIEC are detectable only phenotypically. AIEC genome sequence data will facilitate the identification of genetic determinants implicated in invasion and intracellular growth, as well as enable functional genomic studies of AIEC gene expression during health and disease.

    Systems biology approaches to a rational drug discovery paradigm

    The published manuscript is available at EurekaSelect via http://www.eurekaselect.com/openurl/content.php?genre=article&doi=10.2174/1568026615666150826114524. Prathipati P., Mizuguchi K. Systems biology approaches to a rational drug discovery paradigm. Current Topics in Medicinal Chemistry, 16, 9, 1009. https://doi.org/10.2174/1568026615666150826114524

    Genome-wide expression profiling and bioinformatics analysis of diurnally regulated genes in the mouse prefrontal cortex

    BACKGROUND: The prefrontal cortex is important in regulating sleep and mood. Diurnally regulated genes in the prefrontal cortex may be controlled by the circadian system, by sleep:wake states, or by cellular metabolism or environmental responses. Bioinformatics analysis of these genes will provide insights into a wide range of pathways that are involved in the pathophysiology of sleep disorders and psychiatric disorders with sleep disturbances. RESULTS: We examined gene expression in the mouse prefrontal cortex at four time points during a 24-hour (12-hour light:12-hour dark) cycle using microarrays, and identified 3,890 transcripts corresponding to 2,927 genes with diurnally regulated expression patterns. We show that 16% of the genes identified in our study are orthologs of identified clock, clock-controlled or sleep/wakefulness-induced genes in the mouse liver and suprachiasmatic nucleus, rat cortex and cerebellum, or Drosophila head. The diurnal expression patterns were confirmed for 16 out of 18 genes in an independent set of RNA samples. The diurnal genes fall into eight temporal categories with distinct functional attributes, as assessed by Gene Ontology classification and analysis of enriched transcription factor binding sites. CONCLUSION: Our analysis demonstrates that approximately 10% of transcripts have diurnally regulated expression patterns in the mouse prefrontal cortex. Functional annotation of these genes will be important for the selection of candidate genes for behavioral mutants in the mouse and for genetic studies of disorders associated with anomalies in the sleep:wake cycle and circadian rhythm.

    Next-generation information systems for genomics

    NIH Grant no. HG00739. The advent of next-generation sequencing technologies is transforming biology by enabling individual researchers to sequence the genomes of individual organisms or cells on a massive scale. In order to realize the translational potential of this technology we will need advanced information systems to integrate and interpret this deluge of data. These systems must be capable of extracting the location and function of genes and biological features from genomic data, requiring the coordinated parallel execution of multiple bioinformatics analyses and intelligent synthesis of the results. The resulting databases must be structured to allow complex biological knowledge to be recorded in a computable way, which requires the development of logic-based knowledge structures called ontologies. To visualise and manipulate the results, new graphical interfaces and knowledge acquisition tools are required. Finally, to help understand complex disease processes, these information systems must be equipped with the capability to integrate and make inferences over multiple data sets derived from numerous sources. RESULTS: Here I describe research, design and implementation of some of the components of such a next-generation information system. I first describe the automated pipeline system used for the annotation of the Drosophila genome, and the application of this system in genomic research. This was succeeded by the development of a flexible graph-oriented database system called Chado, which relies on the use of ontologies for structuring data and knowledge. I also describe research to develop, restructure and enhance a number of biological ontologies, adding a layer of logical semantics that increases the computability of these key knowledge sources. The resulting database and ontology collection can be accessed through a suite of tools. Finally, I describe how genome analysis, ontology-based database representation and powerful tools can be combined in order to make inferences about genotype-phenotype relationships within and across species. CONCLUSION: The large volumes of complex data generated by high-throughput genomic and systems biology technology threaten to overwhelm us, unless we can devise better computing tools to assist us with their analysis. Ontologies are key technologies, but many existing ontologies are not interoperable or lack features that make them computable. Here I have shown how concerted ontology, tool and database development can be applied to make inferences of value to translational research.

    The evolutionary role of human-specific genomic events

    In the short evolutionary time since the human-chimpanzee divergence, approximately 6.6 million years ago, humans have acquired a range of traits that are unique among primates. These include tripling brain size, enhanced cognitive abilities, complex culture, descended larynx structure that enables spoken language, longevity, specific diseases, inferior olfaction, and (in some human populations) adult lactase persistence. These traits were likely to have evolved through various genomic mechanisms, among them gene duplications and gene-culture co-evolution. Several studies have estimated the dates for some of these human lineage genomic events. However, no study to date has performed a genome-wide estimate of the dates of all human gene duplications. Moreover, as many of these traits were likely to have evolved via gene-culture co-evolutionary mechanisms, investigating the evolution of one of these human-specific traits, lactase persistence, provides a model example for in-depth future investigations of specific human phenotypes. In this study I have investigated an important class of human-specific genomic events: gene duplications (otherwise known as human inparalogues). I have developed a new bioinformatics approach for detecting human lineage-specific inparalogues and the duplication dates for those genes. I show that human-specific inparalogues are non-randomly distributed among biological function classes, and their duplication event dates are non-randomly distributed on a timeline between the date of the human-chimpanzee split and the present. I have also investigated the evolution of the human-specific polymorphic trait lactase persistence. I have performed a worldwide correlation analysis comparing frequency data on all currently known lactase persistence-associated alleles and the distribution of the lactase persistence phenotype in different human populations. I have also performed a gene-culture co-evolution analysis, employing spatially explicit simulation and Approximate Bayesian Computation to condition simulations on genetic and archaeological data, in order to make inferences on the evolution of lactase persistence and dairying in Europe.

    Systems Analytics and Integration of Big Omics Data

    A “genotype” is essentially an organism's full hereditary information which is obtained from its parents. A “phenotype” is an organism's actual observed physical and behavioral properties. These may include traits such as morphology, size, height, eye color, metabolism, etc. One of the pressing challenges in computational and systems biology is genotype-to-phenotype prediction. This is challenging given the amount of data generated by modern Omics technologies. This “Big Data” is so large and complex that traditional data processing applications are not up to the task. Challenges arise in collection, analysis, mining, sharing, transfer, visualization, archiving, and integration of these data. In this Special Issue, there is a focus on the systems-level analysis of Omics data, recent developments in gene ontology annotation, and advances in biological pathways and network biology. The integration of Omics data with clinical and biomedical data using machine learning is explored. This Special Issue covers new methodologies in the context of gene–environment interactions, tissue-specific gene expression, and how external factors or host genetics impact the microbiome.