Toward the use of upper level ontologies for semantically interoperable systems: an emergency management use case
In the context of globalization and knowledge management, information technologies require unprecedented levels of data exchange and sharing to allow collaboration between heterogeneous systems. Yet understanding the semantics of the exchanged data is one of the major challenges. Semantic interoperability can be ensured by capturing knowledge from diverse sources using ontologies and aligning these ontologies using upper level ontologies to arrive at a common shared vocabulary. In this paper, we aim on the one hand to investigate the role of upper level ontologies as a means of enabling the formalization and integration of heterogeneous sources of information, and how they may support interoperability of systems. On the other hand, we present several upper level ontologies and describe how we chose and then used Basic Formal Ontology (BFO) as an upper level ontology and Common Core Ontology (CCO) as a mid-level ontology to develop a modular ontology that defines emergency responders’ knowledge, starting from a firefighters’ module, as a solution to the semantic interoperability problem in emergency management
The CAP cancer protocols – a case study of caCORE based data standards implementation to integrate with the Cancer Biomedical Informatics Grid
BACKGROUND: The Cancer Biomedical Informatics Grid (caBIG™) is a network of individuals and institutions, creating a world wide web of cancer research. An important aspect of this informatics effort is the development of consistent practices for data standards development, using a multi-tier approach that facilitates semantic interoperability of systems. The semantic tiers include (1) information models, (2) common data elements, and (3) controlled terminologies and ontologies. The College of American Pathologists (CAP) cancer protocols and checklists are an important reporting standard in pathology, for which no complete electronic data standard is currently available. METHODS: In this manuscript, we provide a case study of Cancer Common Ontologic Representation Environment (caCORE) data standard implementation of the CAP cancer protocols and checklists model – an existing and complex paper based standard. We illustrate the basic principles, goals and methodology for developing caBIG™ models. RESULTS: Using this example, we describe the process required to develop the model, the technologies and data standards on which the process and models are based, and the results of the modeling effort. We address difficulties we encountered and modifications to caCORE that will address these problems. In addition, we describe four ongoing development projects that will use the emerging CAP data standards to achieve integration of tissue banking and laboratory information systems. CONCLUSION: The CAP cancer checklists can be used as the basis for an electronic data standard in pathology using the caBIG™ semantic modeling methodology
Inferring causal molecular networks: empirical assessment through a community-based effort
Inferring molecular networks is a central challenge in computational biology. However, it has remained unclear whether causal, rather than merely correlational, relationships can be effectively inferred in complex biological settings. Here we describe the HPN-DREAM network inference challenge that focused on learning causal influences in signaling networks. We used phosphoprotein data from cancer cell lines as well as in silico data from a nonlinear dynamical model. Using the phosphoprotein data, we scored more than 2,000 networks submitted by challenge participants. The networks spanned 32 biological contexts and were scored in terms of causal validity with respect to unseen interventional data. A number of approaches were effective and incorporating known biology was generally advantageous. Additional sub-challenges considered time-course prediction and visualization. Our results constitute the most comprehensive assessment of causal network inference in a mammalian setting carried out to date and suggest that learning causal relationships may be feasible in complex settings such as disease states. Furthermore, our scoring approach provides a practical way to empirically assess the causal validity of inferred molecular networks
Inferring causal molecular networks: empirical assessment through a community-based effort
It remains unclear whether causal, rather than merely correlational, relationships in molecular networks can be inferred in complex biological settings. Here we describe the HPN-DREAM network inference challenge, which focused on learning causal influences in signaling networks. We used phosphoprotein data from cancer cell lines as well as in silico data from a nonlinear dynamical model. Using the phosphoprotein data, we scored more than 2,000 networks submitted by challenge participants. The networks spanned 32 biological contexts and were scored in terms of causal validity with respect to unseen interventional data. A number of approaches were effective, and incorporating known biology was generally advantageous. Additional sub-challenges considered time-course prediction and visualization. Our results suggest that learning causal relationships may be feasible in complex settings such as disease states. Furthermore, our scoring approach provides a practical way to empirically assess inferred molecular networks in a causal sense
Recognition of tRNA^(Cys) by the E. coli cysteinyl-tRNA synthetase: in vivo and in vitro studies
A study of the recognition of tRNA^(Cys) by E. coli cysteinyl-tRNA synthetase using in vivo and in vitro methods was performed. All three anticodon nucleotides, the discriminator U73, and some element(s) within the tertiary domain (the D stem/loop, the TΨC stem/loop and extra loop) are important for recognition; the anticodon stem and acceptor stem appear to contain no essential elements. A T7 RNA polymerase transcribed tRNA^(Cys) is a 5.5-fold worse substrate than native tRNA^(Cys) (in terms of the selectivity constant, k_cat/K_m) mainly due to an increase in K_m. This may reflect recognition of modified nucleotides or subtle effects on the folding of the tRNA. The greatest loss of specificity caused by mutation of a single nucleotide occurs when the discriminator U73 is changed; k_cat/K_m declines 3 to 4 orders of magnitude depending on the substitution. Mutations in the wobble nucleotide of the anticodon also cause reductions in the selectivity constant of 3 orders of magnitude, while mutations in the other anticodon nucleotides caused lesser effects. Interestingly, a C35A mutation had no effect on aminoacylation by the cysteinyl-tRNA synthetase. Several amber suppressor tRNAs were constructed whose in vivo identity did not correlate with their in vitro specificity, indicating the need for both types of experiments to understand the factor(s) which maintain tRNA specificity. Future in vitro experiments will attempt to explain the in vivo discrimination between the glycine, phenylalanine, and cysteine tRNAs by the cysteinyl-tRNA synthetase. Finally, these results suggest that the notion that a small set of isoacceptor specific elements define tRNA identity (the so-called "second genetic code") is incorrect. A better model is based on competition between synthetases for tRNA substrates which contain differing amounts of partially overlapping identity determinants
Recognition of tRNA^(Cys) by Escherichia coli Cysteinyl-tRNA Synthetase
A study of the recognition of tRNA^(Cys) by Escherichia coli cysteinyl-tRNA synthetase using in vivo and in vitro methods was performed. All three anticodon nucleotides, the discriminator nucleotide (73), and some elements within the tertiary domain (the D stem/loop, the TΨC stem/loop, and the variable loop) are important for recognition; the anticodon stem and acceptor stem appear to contain no essential elements. A T7 RNA polymerase transcript corresponding to tRNA^(Cys) is only a 5.5-fold worse substrate than native tRNA^(Cys) (in terms of the specificity constant, k_(cat)/K_m), mainly due to an increase in the value of K_m for the transcript. The greatest loss of specificity caused by mutation of a single nucleotide occurs when the discriminator U73 is changed; k_(cat)/K_m declines 3 to 4 orders of magnitude depending on the substitution. Mutations in the wobble nucleotide of the anticodon also cause reductions in the specificity constant of 3 orders of magnitude, while mutations in the other anticodon nucleotides caused lesser effects. Interestingly, a C35A mutation (with the phenylalanine anticodon GAA) had no effect on aminoacylation by the cysteinyl-tRNA synthetase. Several amber suppressor tRNAs were constructed whose in vivo identity did not correlate with their in vitro specificity, indicating the need for both types of experiments to understand the factors which maintain tRNA specificity
Statistics in molecular biology: An example from detection of chimeric 16S rRNA artifacts
Statistical methods have had wide application in molecular biology. Genetic mapping, physical mapping, DNA sequence determination, evolutionary history reconstructions, sequence alignments, and sequence database searches all have substantial statistical components. In this talk we consider 16S rRNA sequences that have been generated during PCR amplification from mixed populations. 16S ribosomal RNA (16S rRNA) has been selected to determine the phylogeny of organisms. It has the advantages of universality, reasonable size (about 1400 bases), and slow evolution, the latter imposed by its extensive secondary structure that maintains contacts between widely separated elements in the primary sequence. Some of the data we consider come from samples drawn from hot springs. Chimeric sequences have been found from these PCR amplifications; that is, part of the sequence is from one species, while another part of the sequence is from some entirely separate species. Clearly, such sequences will caus..
Chimeric Alignment By Dynamic Programming: Algorithm and biological uses
A new nearest-neighbor method for detecting chimeric 16S rRNA artifacts generated during PCR amplification from mixed populations has been developed. The method uses dynamic programming to generate an optimal chimeric alignment, defined as the highest scoring alignment between a query and a concatenation of a 5′ and a 3′ segment from two separate entries from a database of related sequences. Chimeras are detected by studying the scores and form of the chimeric and global sequence alignments. The chimeric alignment method was found to be marginally more effective than k-tuple based nearest-neighbor methods in simulation studies, but its most effective use is in concert with k-tuple methods
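The chimeric alignment defined in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract gives no scoring scheme, so the match, mismatch, and linear gap values below are assumptions, as are all function names, and no constraint is placed on where the two parent segments break. The sketch aligns a query prefix against a prefix of one parent and the remaining query suffix against a suffix of the other parent, then picks the best split.

```python
from itertools import permutations

# Assumed scoring values for illustration; not taken from the paper.
MATCH, MISMATCH, GAP = 1, -1, -2

def prefix_scores(query, ref):
    """F[i][j] = best global alignment score of query[:i] against ref[:j]."""
    n, m = len(query), len(ref)
    F = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        F[i][0] = i * GAP
    for j in range(1, m + 1):
        F[0][j] = j * GAP
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = MATCH if query[i - 1] == ref[j - 1] else MISMATCH
            F[i][j] = max(F[i - 1][j - 1] + sub,
                          F[i - 1][j] + GAP,
                          F[i][j - 1] + GAP)
    return F

def suffix_scores(query, ref):
    """S[i][j] = best global alignment score of query[i:] against ref[j:]."""
    n, m = len(query), len(ref)
    R = prefix_scores(query[::-1], ref[::-1])  # align the reversed strings
    return [[R[n - i][m - j] for j in range(m + 1)] for i in range(n + 1)]

def global_score(query, ref):
    """Ordinary global alignment score of query against the whole of ref."""
    return prefix_scores(query, ref)[len(query)][len(ref)]

def chimeric_score(query, parent5, parent3):
    """Best score of query against parent5[:j] + parent3[k:].

    Returns (score, i, j, k), where i is the query breakpoint.
    """
    F = prefix_scores(query, parent5)
    S = suffix_scores(query, parent3)
    best = None
    for i in range(len(query) + 1):
        left, right = max(F[i]), max(S[i])
        if best is None or left + right > best[0]:
            best = (left + right, i, F[i].index(left), S[i].index(right))
    return best

def best_chimera(query, database):
    """Try every ordered pair of distinct entries; database maps name -> sequence."""
    return max(
        (chimeric_score(query, database[a], database[b])[0], a, b)
        for a, b in permutations(database, 2)
    )
```

With a toy database `{"A": "AAAAAAAAAA", "B": "CCCCCCCCCC"}` and the query `"AAAAACCCCC"`, `best_chimera` picks A as the 5′ parent and B as the 3′ parent with a score of 10, while `global_score` of the query against A alone is 0. A gap of this kind between the chimeric and global scores is the sort of signal the abstract describes using to flag chimeras.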