172 research outputs found

    The International Nucleotide Sequence Database Collaboration

    Under the International Nucleotide Sequence Database Collaboration (INSDC; http://www.insdc.org), globally comprehensive public domain nucleotide sequence is captured, preserved and presented. The partners of this long-standing collaboration work closely together to provide data formats and conventions that enable consistent data submission to their databases and support regular data exchange around the globe. Clearly defined policy and governance in relation to free access to data and relationships with journal publishers have positioned INSDC databases as a key provider of the scientific record and a core foundation for the global bioinformatics data infrastructure. While growth in sequence data volumes no longer comes as a surprise to INSDC partners, the uptake of next-generation sequencing technology by mainstream science that we have witnessed in recent years brings a step-change to growth, necessarily making a clear mark on INSDC strategy. In this article, we introduce the INSDC, outline data growth patterns and comment on the challenges of increased growth.

    RNA Interference Demonstrates a Role for nautilus in the Myogenic Conversion of Schneider Cells by daughterless

    Schneider SL2 cells activate the myogenic program in response to the ectopic expression of daughterless alone, as indicated by exit from the cell cycle, syncytia formation, and the presence of muscle myosin fibrils. Myogenic conversion can be potentiated by the coexpression of DMEF2 and nautilus with daughterless. In RT-PCR assays Schneider cells express two mesodermal markers, nautilus and DMEF2 mRNAs, as well as very low levels of daughterless mRNA but no twist. Full-length RT-PCR products for nautilus and DMEF2 encode immunoprecipitable proteins. We used RNA-i to demonstrate that both endogenous nautilus expression and DMEF2 expression are required for the myogenic conversion of Schneider cells by daughterless. Coexpression of twist blocks conversion by daughterless but twist dsRNA has no effect. Our results indicate that Schneider cells are of mesodermal origin and that myogenic conversion with ectopic expression of daughterless occurs by raising the levels of daughterless protein sufficiently to allow the formation of nautilus/daughterless heterodimers. The effectiveness of RNA-i is dependent upon protein half-life. Genes encoding proteins with relatively short half-lives (10 h), such as nautilus or HSF, are efficiently silenced, whereas more stable proteins, such as cytoplasmic actin or β-galactosidase, are less amenable to the application of RNA-i. These results support the conclusion that nautilus is a myogenic factor in Drosophila tissue culture cells with a functional role similar to that of vertebrate MyoD. This is discussed with regard to the in vivo functions of nautilus.

    GenBank

    GenBank® is a comprehensive database that contains publicly available nucleotide sequences for more than 240 000 named organisms, obtained primarily through submissions from individual laboratories and batch submissions from large-scale sequencing projects. Most submissions are made using the web-based BankIt or standalone Sequin programs, and accession numbers are assigned by GenBank staff upon receipt. Daily data exchange with the EMBL Data Library in Europe and the DNA Data Bank of Japan ensures worldwide coverage. GenBank is accessible through NCBI's retrieval system, Entrez, which integrates data from the major DNA and protein sequence databases along with taxonomy, genome, mapping, protein structure and domain information, and the biomedical journal literature via PubMed. BLAST provides sequence similarity searches of GenBank and other sequence databases. Complete bimonthly releases and daily updates of the GenBank database are available by FTP. To access GenBank and its related retrieval and analysis services, begin at the NCBI Homepage.

    GenBank

    GenBank® is a comprehensive database that contains publicly available nucleotide sequences for more than 300 000 organisms named at the genus level or lower, obtained primarily through submissions from individual laboratories and batch submissions from large-scale sequencing projects, including whole genome shotgun (WGS) and environmental sampling projects. Most submissions are made using the web-based BankIt or standalone Sequin programs, and accession numbers are assigned by GenBank staff upon receipt. Daily data exchange with the European Molecular Biology Laboratory Nucleotide Sequence Database in Europe and the DNA Data Bank of Japan ensures worldwide coverage. GenBank is accessible through the NCBI Entrez retrieval system, which integrates data from the major DNA and protein sequence databases along with taxonomy, genome, mapping, protein structure and domain information, and the biomedical journal literature via PubMed. BLAST provides sequence similarity searches of GenBank and other sequence databases. Complete bi-monthly releases and daily updates of the GenBank database are available by FTP. To access GenBank and its related retrieval and analysis services, begin at the NCBI homepage: www.ncbi.nlm.nih.gov.

    GenBank

    GenBank® is a comprehensive database that contains publicly available nucleotide sequences for more than 250 000 formally described species. These sequences are obtained primarily through submissions from individual laboratories and batch submissions from large-scale sequencing projects, including whole-genome shotgun (WGS) and environmental sampling projects. Most submissions are made using the web-based BankIt or standalone Sequin programs, and accession numbers are assigned by GenBank staff upon receipt. Daily data exchange with the European Nucleotide Archive (ENA) and the DNA Data Bank of Japan (DDBJ) ensures worldwide coverage. GenBank is accessible through the NCBI Entrez retrieval system, which integrates data from the major DNA and protein sequence databases along with taxonomy, genome, mapping, protein structure and domain information, and the biomedical journal literature via PubMed. BLAST provides sequence similarity searches of GenBank and other sequence databases. Complete bimonthly releases and daily updates of the GenBank database are available by FTP. To access GenBank and its related retrieval and analysis services, begin at the NCBI home page: www.ncbi.nlm.nih.gov.

    BioProject and BioSample databases at NCBI: facilitating capture and organization of metadata

    As the volume and complexity of data sets archived at NCBI grow rapidly, so does the need to gather and organize the associated metadata. Although metadata had previously been collected for some archival databases, there was no centralized approach at NCBI for collecting this information and using it across databases. The BioProject database was recently established to facilitate organization and classification of project data submitted to NCBI, EBI and DDBJ databases. It captures descriptive information about research projects that result in high-volume submissions to archival databases, ties together related data across multiple archives and serves as a central portal by which to inform users of data availability. Concomitantly, the BioSample database is being developed to capture descriptive information about the biological samples investigated in projects. BioProject and BioSample records link to corresponding data stored in archival repositories. Submissions are supported by a web-based Submission Portal that guides users through a series of forms for input of rich metadata describing their projects and samples. Together, these databases offer improved ways for users to query, locate, integrate and interpret the masses of data held in NCBI's archival repositories. The BioProject and BioSample databases are available at http://www.ncbi.nlm.nih.gov/bioproject and http://www.ncbi.nlm.nih.gov/biosample, respectively.

    The Biomolecular Interaction Network Database in PSI-MI 2.5

    The Biomolecular Interaction Network Database (BIND) is a major source of curated biomolecular interactions that has been unmaintained for the last few years, a trend which will eventually result in the loss of a significant amount of unique biomolecular interaction information, mostly as database identifiers become out of date. To help reverse this trend, we converted BIND to a standard format, Proteomics Standard Initiative-Molecular Interaction 2.5, starting from the last curated data release (from 2005) available in a custom XML format, and made the core components (interactions and complexes) plus additional valuable curated information available for download (http://download.baderlab.org/BINDTranslation/). Major work during the conversion process was required to update out-of-date molecule identifiers, resulting in a more comprehensive conversion of BIND, by measures including number of species and interactor types covered, than what is currently accessible elsewhere. This work also highlights issues of data modeling, controlled vocabulary adoption and data cleaning that can serve as a general case study on the future compatibility of interaction databases.

    Genomic Standards Consortium projects

    © The Author(s), 2014. This article is distributed under the terms of the Creative Commons Attribution License. The definitive version was published in Standards in Genomic Sciences 9 (2014): 599-601, doi:10.4056/sigs.5559680. The Genomic Standards Consortium (GSC) is an open-membership community working towards the development, implementation and harmonization of standards in the field of genomics. The mission of the GSC is to improve digital descriptions of genomes, metagenomes and gene marker sequences. The GSC started in late 2005 with the defined task of establishing what is now termed the “Minimum Information about any Sequence” (MIxS) standard [1,2]. As an outgrowth of the activities surrounding the creation and implementation of the MIxS standard, there are now 18 projects within the GSC [3]. These efforts cover an ever-widening range of standardization activities. Given the growth of projects, and to promote transparency, participation and adoption, the GSC has developed a “GSC Project Description Template”. A complete set of GSC Project Descriptions and the template are available on the GSC website. The GSC has an open policy of participation and continues to welcome new efforts. Any projects that facilitate the standard descriptions and exchange of data are potential candidates for inclusion under the GSC umbrella. Areas that expand the scope of the GSC are encouraged. Through these collective activities we hope to help foster the growth of the ‘bioinformatics standards’ community. For more information on the GSC and its range of projects, please see http://gensc.org/.