19 research outputs found

    Translational bioinformatics in the cloud: an affordable alternative

    With the continued exponential expansion of publicly available genomic data and access to low-cost, high-throughput molecular technologies for profiling patient populations, computational technologies and informatics are becoming vital considerations in genomic medicine. Although cloud computing is being heralded as a key enabling technology for the future of genomic research, available case studies are limited to applications in high-throughput sequence data analysis. The goal of this study was to evaluate the computational and economic characteristics of cloud computing in performing a large-scale data integration and analysis representative of research problems in genomic medicine. We find that the cloud-based analysis compares favorably with a local computational cluster in both performance and cost, suggesting that cloud computing technologies might be a viable resource for facilitating large-scale translational research in genomic medicine.

    BioWarehouse: a bioinformatics database warehouse toolkit

    BACKGROUND: This article addresses the problem of interoperation of heterogeneous bioinformatics databases. RESULTS: We introduce BioWarehouse, an open source toolkit for constructing bioinformatics database warehouses using the MySQL and Oracle relational database managers. BioWarehouse integrates its component databases into a common representational framework within a single database management system, thus enabling multi-database queries using the Structured Query Language (SQL) and facilitating a variety of database integration tasks such as comparative analysis and data mining. BioWarehouse currently supports the integration of a pathway-centric set of databases including ENZYME, KEGG, and BioCyc, and in addition the UniProt, GenBank, NCBI Taxonomy, and CMR databases, and the Gene Ontology. Loader tools, written in the C and Java languages, parse and load these databases into a relational database schema. The loaders also apply a degree of semantic normalization to their respective source data, decreasing semantic heterogeneity. The schema supports the following bioinformatics datatypes: chemical compounds, biochemical reactions, metabolic pathways, proteins, genes, nucleic acid sequences, features on protein and nucleic-acid sequences, organisms, organism taxonomies, and controlled vocabularies. As an application example, we applied BioWarehouse to determine the fraction of biochemically characterized enzyme activities for which no sequences exist in the public sequence databases. The answer is that no sequence exists for 36% of enzyme activities for which EC numbers have been assigned. These gaps in sequence data significantly limit the accuracy of genome annotation and metabolic pathway prediction, and are a barrier for metabolic engineering. Complex queries of this type provide examples of the value of the data warehousing approach to bioinformatics research. CONCLUSION: BioWarehouse embodies significant progress on the database integration problem for bioinformatics.

    A survey of orphan enzyme activities

    BACKGROUND: Using computational database searches, we have demonstrated previously that no gene sequences could be found for at least 36% of enzyme activities that have been assigned an Enzyme Commission number. Here we present a follow-up literature-based survey involving a statistically significant sample of such "orphan" activities. The survey was intended to determine whether sequences for these enzyme activities are truly unknown, or whether these sequences are absent from the public sequence databases but can be found in the literature. RESULTS: We demonstrate that for ~80% of sampled orphans, the absence of sequence data is bona fide. Our analyses further substantiate the notion that many of these enzyme activities play biologically important roles. CONCLUSION: This survey points toward the significant scientific cost of having such a large fraction of characterized enzyme activities disconnected from sequence data. It also suggests that a larger effort, beginning with a comprehensive survey of all putative orphan activities, would resolve nearly 300 artifactual orphans and reconnect a wealth of enzyme research with modern genomics. For these reasons, we propose that a systematic effort to identify the cognate genes of orphan enzymes be undertaken.

    Variation sur couloirs


    Role and Regulation of Cadherin Expression during Skeletal Myoblast Differentiation

    Note: Using a polyclonal anti-cadherin serum, a cadherin was detected in the rat L6 myoblast cell line. Levels of this cadherin peaked when myoblasts began to fuse together in vitro. BUdR, an inhibitor of the program of terminal myogenic differentiation, severely lowered levels of this cadherin. Blockade by anti-cadherin immunoglobulins inhibited myoblast fusion. These data suggest that this cadherin is regulated by the program of terminal differentiation and that it plays a role in myoblast fusion. […]