389 research outputs found
Evolution and gene regulation of the genomic imprinting mechanism
Genomic imprinting describes an epigenetic mechanism by which genes are active or silent depending on their parental origin. Imprinting exists in plants and mammals, but how this monoallelic expression mechanism has evolved is not understood at the molecular level. Here I describe the mapping, sequencing and analysis of vertebrate orthologous imprinted regions spanning 11.5 Mb of genomic sequence from species with and without genomic imprinting. In eutherian (placental) mammals, imprinting can be regulated by differential DNA methylation, non-coding RNAs, enhancers and insulator elements. The systematic sequence comparison of the IGF2-H19 imprinting cluster in eutherians and marsupials (tammar wallaby and opossum) has revealed the presence of the enigmatic non-coding RNA H19 in marsupials. Furthermore, we have characterised the marsupial H19 expression status and identified key regulatory elements required for the germline imprinting of the neighbouring IGF2 gene. All the major hallmarks of the imprinting mechanism of the IGF2-H19 locus were found to be conserved in therian mammals. In mammals, this imprinting system is therefore the most conserved germline-derived epigenetic mechanism discovered so far.
The high-quality genomic sequences have provided early glimpses of the genomic landscapes for species such as the monotreme platypus and marsupial tammar wallaby, for which little was previously known. Comparative sequence analysis was used to identify candidate regulatory elements in the neighbouring imprinting centre 1 and 2 regions of human chromosome 11p15.5. Nine novel enhancer elements were identified following in vitro gene-reporter assays, and correlation of conserved sequences with recent ENCODE data revealed probable functions for a further 24 elements.
This project has led to the formation of the Sequence Analysis of Vertebrate Orthologous Imprinted Regions (SAVOIR) consortium, and resources developed here are being used by the imprinting community to further our knowledge of the evolution of the genomic imprinting mechanism.
δ13C tracing of dissolved inorganic carbon sources in major world rivers
FAIR principles and the IEDB: short-term improvements and a long-term vision of OBO-foundry mediated machine-actionable interoperability.
The Immune Epitope Database (IEDB), at www.iedb.org, has the mission to make published experimental data relating to the recognition of immune epitopes easily available to the scientific public. By presenting curated data in a searchable database, we have liberated it from the tables and figures of journal articles, making it more accessible and usable by immunologists. Recently, the principles of Findability, Accessibility, Interoperability and Reusability have been formulated as goals that data repositories should meet to enhance the usefulness of their data holdings. Here we examine how the IEDB complies with these principles and identify broad areas of success, but also areas for improvement. We describe short-term improvements to the IEDB that are being implemented now, as well as a long-term vision of true 'machine-actionable interoperability', which we believe will require community agreement on standardization of knowledge representation that can be built on top of the shared use of ontologies.
Petrological and geochemical characteristics of the mafic–ultramafic Americano do Brasil Complex, central Brazil, and the implications for its genesis
The Americano do Brasil Complex occurs in the Neoproterozoic Goias Magmatic Arc, central Brazil. It is composed of two mafic–ultramafic cumulate sequences, intruded into granodioritic gneisses. Although deformed and partially recrystallized by a regional metamorphic overprint, the complex still preserves relict igneous features, such as adcumulate to heteradcumulate textures. The Northern sequence is mostly composed of olivine and olivine-clinopyroxene cumulates, whereas the Southern sequence consists mainly of two-pyroxene cumulate rocks, with plagioclase and olivine cumulates occurring in lesser amounts. The complex has three main orebodies, with textures that range from disseminated to massive sulfide breccias with durchbewegung texture. Thermodynamic modeling using a single picrite parental magma composition can predict cumulate rock compositions and mineral modes similar to all of the observed cumulate rock compositions of the Americano do Brasil Complex. Equilibrium crystallization of the liquid and assimilation-batch-crystallization involving up to 45 % of the host gneisses in the upper crust produce solids similar to the cumulates described in the Northern and Southern sequences, respectively. Modeled pressure–temperature emplacement conditions of the magma were ca. 2.5 kbar and 1310 °C. Both sequences have similar incompatible trace element patterns which, together with the results of the modeling, imply a broadly comagmatic origin.
FYPO: the fission yeast phenotype ontology.
MOTIVATION: To provide consistent computable descriptions of phenotype data, PomBase is developing a formal ontology of phenotypes observed in fission yeast. RESULTS: The fission yeast phenotype ontology (FYPO) is a modular ontology that uses several existing ontologies from the open biological and biomedical ontologies (OBO) collection as building blocks, including the phenotypic quality ontology PATO, the Gene Ontology and Chemical Entities of Biological Interest. Modular ontology development facilitates partially automated, effective organization of detailed phenotype descriptions with complex relationships to each other and to underlying biological phenomena. As a result, FYPO supports sophisticated querying, computational analysis and comparison between different experiments and even between species. AVAILABILITY: FYPO releases are available from the Subversion repository at the PomBase SourceForge project page (https://sourceforge.net/p/pombase/code/HEAD/tree/phenotype_ontology/). The current version of FYPO is also available on the OBO Foundry Web site (http://obofoundry.org/).
Transforming the study of organisms: Phenomic data models and knowledge bases
The rapidly decreasing cost of gene sequencing has resulted in a deluge of genomic data from across the tree of life; however, outside a few model organism databases, genomic data are limited in their scientific impact because they are not accompanied by computable phenomic data. The majority of phenomic data are contained in countless small, heterogeneous phenotypic data sets that are very difficult or impossible to integrate at scale because of variable formats, lack of digitization, and linguistic problems. One powerful solution is to represent phenotypic data using data models with precise, computable semantics, but adoption of semantic standards for representing phenotypic data has been slow, especially in biodiversity and ecology. Some phenotypic and trait data are available in a semantic language from knowledge bases, but these are often not interoperable. In this review, we compare and contrast existing ontology and data models, focusing on nonhuman phenotypes and traits. We discuss barriers to integration of phenotypic data and make recommendations for developing an operationally useful, semantically interoperable phenotypic data ecosystem.
Meeting the cool neighbors. X. Ultracool dwarfs from the 2MASS All-Sky data release
Using data from the 2 Micron All Sky Survey All-Sky Point Source Catalogue, we have extended our census of nearby ultracool dwarfs to cover the full celestial sphere above Galactic latitude of 15°. Starting with an initial catalog of 2,139,484 sources, we have winnowed the sample to 467 candidate late-type M or L dwarfs within 20 pc of the Sun. Fifty-four of those sources already have spectroscopic observations confirming them as late-type dwarfs. We present optical spectroscopy of 376 of the remaining 413 sources, and identify 44 as ultracool dwarfs with spectroscopic distances less than 20 pc. Twenty-five of the 37 sources that lack optical data have near-infrared spectroscopy. Combining the present sample with our previous results and data from the literature, we catalog 94 L dwarf systems within 20 pc. We discuss the distribution of activity, as measured by Hα emission, in this volume-limited sample. We have coupled the present ultracool catalog with data for stars in the northern 8 pc sample and recent (incomplete) statistics for T dwarfs to provide a snapshot of the current 20 pc census as a function of spectral type.
Meeting the Cool Neighbors VII: Spectroscopy of faint, red NLTT dwarfs
We present low-resolution optical spectroscopy and BVRI photometry of 453 candidate nearby stars drawn from the NLTT proper motion catalogue. The stars were selected based on optical/near-infrared colours, derived by combining the NLTT photographic data with photometry from the 2MASS Second Incremental Data Release. Based on the derived photometric and spectroscopic parallaxes, we identify 111 stars as lying within 20 parsecs of the Sun, including 9 stars with formal distance estimates of less than 10 parsecs. A further 53 stars have distance estimates within 1-sigma of our 20-parsec limit. Almost all of those stars are additions to the nearby star census. In total, our NLTT-based survey has so far identified 496 stars likely to be within 20 parsecs, of which 195 are additions to nearby-star catalogues. Most of the newly-identified nearby stars have spectral types between M4 and M8.
Comment: 41 pages, 7 figures
The Sol Genomics Network (solgenomics.net): growing tomatoes using Perl
The Sol Genomics Network (SGN; http://solgenomics.net/) is a clade-oriented database (COD) containing biological data for species in the Solanaceae and their close relatives, with data types ranging from chromosomes and genes to phenotypes and accessions. SGN hosts several genome maps and sequences, including a pre-release of the tomato (Solanum lycopersicum cv Heinz 1706) reference genome. A new transcriptome component has been added to store RNA-seq and microarray data. SGN is also an open source software project, continuously developing and improving a complex system for storing, integrating and analyzing data. All code and development work is publicly visible on GitHub (http://github.com). The database architecture combines SGN-specific schemas and the community-developed Chado schema (http://gmod.org/wiki/Chado) for compatibility with other genome databases. The SGN curation model is community-driven, allowing researchers to add and edit information using simple web tools. Currently, over a hundred community annotators help curate the database. SGN can be accessed at http://solgenomics.net/
KG-COVID-19: A Framework to Produce Customized Knowledge Graphs for COVID-19 Response.
Integrated, up-to-date data about SARS-CoV-2 and COVID-19 is crucial for the ongoing response to the COVID-19 pandemic by the biomedical research community. While rich biological knowledge exists for SARS-CoV-2 and related viruses (SARS-CoV, MERS-CoV), integrating this knowledge is difficult and time-consuming, since much of it is in siloed databases or in textual format. Furthermore, the data required by the research community vary drastically for different tasks; the optimal data for a machine learning task, for example, is much different from the data used to populate a browsable user interface for clinicians. To address these challenges, we created KG-COVID-19, a flexible framework that ingests and integrates heterogeneous biomedical data to produce knowledge graphs (KGs), and applied it to create a KG for COVID-19 response. This KG framework can also be applied to other problems in which siloed biomedical data must be quickly integrated for different research applications, including future pandemics.