    RegenBase: a knowledge base of spinal cord injury biology for translational research.

    Spinal cord injury (SCI) research is a data-rich field that aims to identify the biological mechanisms resulting in loss of function and mobility after SCI, as well as develop therapies that promote recovery after injury. SCI experimental methods, data and domain knowledge are locked in the largely unstructured text of scientific publications, making large-scale integration with existing bioinformatics resources and subsequent analysis infeasible. The lack of standard reporting for experiment variables and results also makes experiment replicability a significant challenge. To address these challenges, we have developed RegenBase, a knowledge base of SCI biology. RegenBase integrates curated literature-sourced facts and experimental details, raw assay data profiling the effect of compounds on enzyme activity and cell growth, and structured SCI domain knowledge in the form of the first ontology for SCI, using Semantic Web representation languages and frameworks. RegenBase uses consistent identifier schemes and data representations that enable automated linking among RegenBase statements and also to other biological databases and electronic resources. By querying RegenBase, we have identified novel biological hypotheses linking the effects of perturbagens to observed behavioral outcomes after SCI. RegenBase is publicly available for browsing, querying and download. Database URL: http://regenbase.org

    WormBase 2017: Molting into a new stage

    miRDB: An online database for prediction of functional microRNA targets

    MicroRNAs (miRNAs) are small noncoding RNAs that act as master regulators in many biological processes. miRNAs function mainly by downregulating the expression of their gene targets. Thus, accurate prediction of miRNA targets is critical for characterization of miRNA functions. To this end, we have developed an online database, miRDB, for miRNA target prediction and functional annotations. Recently, we have made major updates to miRDB. Specifically, by employing an improved algorithm for miRNA target prediction, we now present updated transcriptome-wide target prediction data in miRDB, including 3.5 million predicted targets regulated by 7000 miRNAs in five species. Further, we have implemented the new prediction algorithm in a web server, allowing custom target prediction with user-provided sequences. Another new database feature is the prediction of cell-specific miRNA targets. miRDB now hosts the expression profiles of over 1000 cell lines and presents target prediction data that are tailored for specific cell models. Finally, a new web query interface has been added to miRDB for prediction of miRNA functions by integrative analysis of target prediction and Gene Ontology data. All data in miRDB are freely accessible at http://mirdb.org.

    FAIR principles and the IEDB: short-term improvements and a long-term vision of OBO-foundry mediated machine-actionable interoperability.

    The Immune Epitope Database (IEDB), at www.iedb.org, has the mission to make published experimental data relating to the recognition of immune epitopes easily available to the scientific public. By presenting curated data in a searchable database, we have liberated it from the tables and figures of journal articles, making it more accessible and usable by immunologists. Recently, the principles of Findability, Accessibility, Interoperability and Reusability (FAIR) have been formulated as goals that data repositories should meet to enhance the usefulness of their data holdings. Here we examine how the IEDB complies with these principles and identify broad areas of success, but also areas for improvement. We describe short-term improvements to the IEDB that are being implemented now, as well as a long-term vision of true 'machine-actionable interoperability', which we believe will require community agreement on standardization of knowledge representation that can be built on top of the shared use of ontologies.

    MisPred: a resource for identification of erroneous protein sequences in public databases

    Correct prediction of the structure of protein-coding genes of higher eukaryotes is still a difficult task; therefore, public databases are heavily contaminated with mispredicted sequences. The high rate of misprediction has serious consequences because it significantly affects the conclusions that may be drawn from genome-scale sequence analyses of eukaryotic genomes. Here we present the MisPred database and computational pipeline that provide efficient means for the identification of erroneous sequences in public databases. The MisPred database contains a collection of abnormal, incomplete and mispredicted protein sequences from 19 metazoan species identified as erroneous by MisPred quality control tools in the UniProtKB/Swiss-Prot, UniProtKB/TrEMBL, NCBI/RefSeq and EnsEMBL databases. Major releases of the database are automatically generated and updated regularly. The database (http://www.mispred.com) is easily accessible through a simple web interface coupled to a powerful query engine and a standard web service. The content is completely or partially downloadable in a variety of formats.

    ANIA: ANnotation and Integrated Analysis of the 14-3-3 interactome

    The dimeric 14-3-3 proteins dock onto pairs of phosphorylated Ser and Thr residues on hundreds of proteins, and thereby regulate many events in mammalian cells. To facilitate global analyses of these interactions, we developed a web resource named ANIA: ANnotation and Integrated Analysis of the 14-3-3 interactome, which integrates multiple data sets on 14-3-3-binding phosphoproteins. ANIA also pinpoints candidate 14-3-3-binding phosphosites using predictor algorithms, assisted by our recent discovery that the human 14-3-3-interactome is highly enriched in 2R-ohnologues. 2R-ohnologues are proteins in families of two to four, generated by two rounds of whole genome duplication at the origin of the vertebrate animals. ANIA identifies candidate ‘lynchpins’, which are 14-3-3-binding phosphosites that are conserved across members of a given 2R-ohnologue protein family. Other features of ANIA include a link to the catalogue of somatic mutations in cancer database to find cancer polymorphisms that map to 14-3-3-binding phosphosites, which would be expected to interfere with 14-3-3 interactions. We used ANIA to map known and candidate 14-3-3-binding enzymes within the 2R-ohnologue complement of the human kinome. Our projections indicate that 14-3-3s dock onto many more human kinases than has been realized. Guided by ANIA, PAK4, 6 and 7 (p21-activated kinases 4, 6 and 7) were experimentally validated as a 2R-ohnologue family of 14-3-3-binding phosphoproteins. PAK4 binding to 14-3-3 is stimulated by phorbol ester, and involves the ‘lynchpin’ site phosphoSer99 and a major contribution from Ser181. In contrast, PAK6 and PAK7 display strong phorbol ester-independent binding to 14-3-3, with Ser113 critical for the interaction with PAK6. These data point to differential 14-3-3 regulation of PAKs in control of cell morphology. Database URL: https://ania-1433.lifesci.dundee.ac.uk/prediction/webserver/index.p

    A golden age for working with public proteomics data

    Data sharing in mass spectrometry (MS)-based proteomics is becoming common scientific practice, as it already is in other, more mature 'omics' disciplines such as genomics and transcriptomics. We want to highlight that this situation, unprecedented in the field, opens a plethora of opportunities for data scientists. First, we describe in some detail the work already achieved, such as systematic reanalysis efforts. We also explain existing applications of public proteomics data, such as proteogenomics and the creation of spectral libraries and spectral archives. Finally, we discuss the main existing challenges and mention the first attempts to combine public proteomics data with other types of omics data sets.

    Nencki Genomics Database—Ensembl funcgen enhanced with intersections, user data and genome-wide TFBS motifs

    We present the Nencki Genomics Database, which extends the functionality of Ensembl Regulatory Build (funcgen) for three species: human, mouse and rat. The key enhancements over Ensembl funcgen include the following: (i) a user can add private data, analyze them alongside the public data and manage access rights; (ii) inside the database, we provide efficient procedures for computing intersections between regulatory features and for mapping them to the genes. To Ensembl funcgen-derived data, which include data from ENCODE, we add information on conserved non-coding (putative regulatory) sequences, and on genome-wide occurrence of transcription factor binding site motifs from the current versions of two major motif libraries, namely, Jaspar and Transfac. The intersections and mapping to the genes are pre-computed for the public data, and the result of any procedure run on the data added by the users is stored back into the database, thus incrementally increasing the body of pre-computed data. As the Ensembl funcgen schema for the rat is currently not populated, our database is the first database of regulatory features for this frequently used laboratory animal. The database is accessible without registration using the mysql client: mysql -h database.nencki-genomics.org -u public. Registration is required only to add or access private data. A WSDL webservice provides access to the database from any SOAP client, including the Taverna Workbench with a graphical user interface.