48 research outputs found

    Alignment of the UMLS semantic network with BioTop: Methodology and assessment

    Motivation: For many years, the Unified Medical Language System (UMLS) semantic network (SN) has been used as an upper-level semantic framework for the categorization of terms from terminological resources in biomedicine. BioTop has recently been developed as an upper-level ontology for the biomedical domain. In contrast to the SN, it is founded upon strict ontological principles, using OWL DL as a formal representation language, which has become standard in the semantic Web. In order to make logic-based reasoning available for the resources annotated or categorized with the SN, a mapping ontology was developed aligning the SN with BioTop. Methods: The theoretical foundations and the practical realization of the alignment are being described, with a focus on the design decisions taken, the problems encountered and the adaptations of BioTop that became necessary. For evaluation purposes, UMLS concept pairs obtained from MEDLINE abstracts by a named entity recognition system were tested for possible semantic relationships. Furthermore, all semantic-type combinations that occur in the UMLS Metathesaurus were checked for satisfiability. Results: The effort-intensive alignment process required major design changes and enhancements of BioTop and brought up s

    Patterns of nucleotide diversity at the regions encompassing the Drosophila insulin-like peptide (dilp) genes: demography vs positive selection in Drosophila melanogaster.

    In Drosophila, the insulin-signaling pathway controls some life history traits, such as fertility and lifespan, and it is considered to be the main metabolic pathway involved in establishing adult body size. Several observations concerning variation in body size in the Drosophila genus are suggestive of its adaptive character. Genes encoding proteins in this pathway are, therefore, good candidates to have experienced adaptive changes and to reveal the footprint of positive selection. The Drosophila insulin-like peptides (DILPs) are the ligands that trigger the insulin-signaling cascade. In Drosophila melanogaster, there are several peptides that are structurally similar to the single mammalian insulin peptide. The footprint of recent adaptive changes on nucleotide variation can be unveiled through the analysis of polymorphism and divergence. With this aim, we have surveyed nucleotide sequence variation at the dilp1-7 genes in a natural population of D. melanogaster. The comparison of polymorphism in D. melanogaster and divergence from D. simulans at different functional classes of the dilp genes provided no evidence of adaptive protein evolution after the split of the D. melanogaster and D. simulans lineages. However, our survey of polymorphism at the dilp gene regions of D. melanogaster has provided some evidence for the action of positive selection at or near these genes. The regions encompassing the dilp1-4 genes and the dilp6 gene stand out as likely affected by recent adaptive events

    Drosophila Genes That Affect Meiosis Duration Are among the Meiosis Related Genes That Are More Often Found Duplicated

    Using a phylogenetic approach, the examination of 33 meiosis/meiosis-related genes in 12 Drosophila species, revealed nine independent gene duplications, involving the genes cav, mre11, meiS332, polo and mtrm. Evidence is provided that at least eight out of the nine gene duplicates are functional. Therefore, the rate at which Drosophila meiosis/meiosis-related genes are duplicated and retained is estimated to be 0.0012 per gene per million years, a value that is similar to the average for all Drosophila genes. It should be noted that by using a phylogenetic approach the confounding effect of concerted evolution, that is known to lead to overestimation of the duplication and retention rate, is avoided. This is an important issue, since even in our moderate size sample, evidence for long-term concerted evolution (lasting for more than 30 million years) was found for the meiS332 gene pair in species of the Drosophila subgenus. Most striking, in contrast to theoretical expectations, is the finding that genes that encode proteins that must follow a close stoichiometric balance, such as polo, mtrm and meiS332 have been found duplicated. The duplicated genes may be examples of gene neofunctionalization. It is speculated that meiosis duration may be a trait that is under selection in Drosophila and that it has different optimal values in different species

    Reuse of terminological resources for efficient ontological engineering in Life Sciences

    This paper is intended to explore how to use terminological resources for ontology engineering. Nowadays there are several biomedical ontologies describing overlapping domains, but there is not a clear correspondence between the concepts that are supposed to be equivalent or just similar. These resources are quite precious but their integration and further development are expensive. Terminologies may support the ontological development in several stages of the lifecycle of the ontology; e.g. ontology integration. In this paper we investigate the use of terminological resources during the ontology lifecycle. We claim that the proper creation and use of a shared thesaurus is a cornerstone for the successful application of the Semantic Web technology within life sciences. Moreover, we have applied our approach to a real scenario, the Health-e-Child (HeC) project, and we have evaluated the impact of filtering and re-organizing several resources. As a result, we have created a reference thesaurus for this project, named HeCTh

    Modeling Structure-Function Relationships in Synthetic DNA Sequences using Attribute Grammars

    Recognizing that certain biological functions can be associated with specific DNA sequences has led various fields of biology to adopt the notion of the genetic part. This concept provides a finer level of granularity than the traditional notion of the gene. However, a method of formally relating how a set of parts relates to a function has not yet emerged. Synthetic biology both demands such a formalism and provides an ideal setting for testing hypotheses about relationships between DNA sequences and phenotypes beyond the gene-centric methods used in genetics. Attribute grammars are used in computer science to translate the text of a program source code into the computational operations it represents. By associating attributes with parts, modifying the value of these attributes using rules that describe the structure of DNA sequences, and using a multi-pass compilation process, it is possible to translate DNA sequences into molecular interaction network models. These capabilities are illustrated by simple example grammars expressing how gene expression rates are dependent upon single or multiple parts. The translation process is validated by systematically generating, translating, and simulating the phenotype of all the sequences in the design space generated by a small library of genetic parts. Attribute grammars represent a flexible framework connecting parts with models of biological function. They will be instrumental for building mathematical models of libraries of genetic constructs synthesized to characterize the function of genetic parts. This formalism is also expected to provide a solid foundation for the development of computer assisted design applications for synthetic biology

    A genome-wide CRISPR screen identifies a restricted set of HIV host dependency factors

    Host proteins are essential for HIV entry and replication and can be important nonviral therapeutic targets. Large-scale RNA interference (RNAi)-based screens have identified nearly a thousand candidate host factors, but there is little agreement among studies and few factors have been validated. Here we demonstrate that a genome-wide CRISPR-based screen identifies host factors in a physiologically relevant cell system. We identify five factors, including the HIV co-receptors CD4 and CCR5, that are required for HIV infection yet are dispensable for cellular proliferation and viability. Tyrosylprotein sulfotransferase 2 (TPST2) and solute carrier family 35 member B2 (SLC35B2) function in a common pathway to sulfate CCR5 on extracellular tyrosine residues, facilitating CCR5 recognition by the HIV envelope. Activated leukocyte cell adhesion molecule (ALCAM) mediates cell aggregation, which is required for cell-to-cell HIV transmission. We validated these pathways in primary human CD4+ T cells through Cas9-mediated knockout and antibody blockade. Our findings indicate that HIV infection and replication rely on a limited set of host-dispensable genes and suggest that these pathways can be studied for therapeutic intervention

    Holding it together: rapid evolution and positive selection in the synaptonemal complex of Drosophila

    Background: The synaptonemal complex (SC) is a highly conserved meiotic structure that functions to pair homologs and facilitate meiotic recombination in most eukaryotes. Five Drosophila SC proteins have been identified and localized within the complex: C(3)G, C(2)M, CONA, ORD, and the newly identified Corolla. The SC is required for meiotic recombination in Drosophila and absence of these proteins leads to reduced crossing over and chromosomal nondisjunction. Despite the conserved nature of the SC and the key role that these five proteins have in meiosis in D. melanogaster, they display little apparent sequence conservation outside the genus. To identify factors that explain this lack of apparent conservation, we performed a molecular evolutionary analysis of these genes across the Drosophila genus. Results: For the five SC components, gene sequence similarity declines rapidly with increasing phylogenetic distance and only ORD and C(2)M are identifiable outside of the Drosophila genus. SC gene sequences have a higher dN/dS (ω) rate ratio than the genome wide average and this can in part be explained by the action of positive selection in almost every SC component. Across the genus, there is significant variation in ω for each protein. It further appears that ω estimates for the five SC components are in accordance with their physical position within the SC. Components interacting with chromatin evolve slowest and components comprising the central elements evolve the most rapidly. Finally, using population genetic approaches, we demonstrate that positive selection on SC components is ongoing. Conclusions: SC components within Drosophila show little apparent sequence homology to those identified in other model organisms due to their rapid evolution. We propose that the Drosophila SC is evolving rapidly due to two combined effects. First, we propose that a high rate of evolution can be partly explained by low purifying selection on protein components whose function is to simply hold chromosomes together. We also propose that positive selection in the SC is driven by its sex-specificity combined with its role in facilitating both recombination and centromere clustering in the face of recurrent bouts of drive in female meiosis

    Evolutionary Diversification of Plant Shikimate Kinase Gene Duplicates

    Shikimate kinase (SK; EC 2.7.1.71) catalyzes the fifth reaction of the shikimate pathway, which directs carbon from the central metabolism pool to a broad range of secondary metabolites involved in plant development, growth, and stress responses. In this study, we demonstrate the role of plant SK gene duplicate evolution in the diversification of metabolic regulation and the acquisition of novel and physiologically essential function. Phylogenetic analysis of plant SK homologs resolves an orthologous cluster of plant SKs and two functionally distinct orthologous clusters. These previously undescribed genes, shikimate kinase-like 1 (SKL1) and -2 (SKL2), do not encode SK activity, are present in all major plant lineages, and apparently evolved under positive selection following SK gene duplication over 400 MYA. This is supported by functional assays using recombinant SK, SKL1, and SKL2 from Arabidopsis thaliana (At) and evolutionary analyses of the diversification of SK-catalytic and -substrate binding sites based on theoretical structure models. AtSKL1 mutants yield albino and novel variegated phenotypes, which indicate SKL1 is required for chloroplast biogenesis. Extant SKL2 sequences show a strong genetic signature of positive selection, which is enriched in a protein–protein interaction module not found in other SK homologs. We also report the first kinetic characterization of plant SKs and show that gene expression diversification among the AtSK inparalogs is correlated with developmental processes and stress responses. This study examines the functional diversification of ancient and recent plant SK gene duplicates and highlights the utility of SKs as scaffolds for functional innovation