    Ontology mapping with auxiliary resources

    Ontology matching: state of the art and future challenges

    After years of research on ontology matching, it is reasonable to consider several questions: is the field of ontology matching still making progress? Is this progress significant enough to pursue some further research? If so, what are the particularly promising directions? To answer these questions, we review the state of the art of ontology matching and analyze the results of recent ontology matching evaluations. These results show a measurable improvement in the field, albeit at a slowing pace. We conjecture that significant improvements can be obtained only by addressing important challenges for ontology matching. We present such challenges with insights on how to approach them, thereby aiming to direct research into the most promising tracks and to facilitate the progress of the field.

    Landscape of standing variation for tandem duplications in Drosophila yakuba and Drosophila simulans

    We have used whole genome paired-end Illumina sequence data to identify tandem duplications in 20 isofemale lines of D. yakuba, and 20 isofemale lines of D. simulans and performed genome wide validation with PacBio long molecule sequencing. We identify 1,415 tandem duplications that are segregating in D. yakuba as well as 975 duplications in D. simulans, indicating greater variation in D. yakuba. Additionally, we observe high rates of secondary deletions at duplicated sites, with 8% of duplicated sites in D. simulans and 17% of sites in D. yakuba modified with deletions. These secondary deletions are consistent with the action of the large loop mismatch repair system acting to remove polymorphic tandem duplication, resulting in rapid dynamics of gain and loss in duplicated alleles and a richer substrate of genetic novelty than has been previously reported. Most duplications are present in only single strains, suggesting deleterious impacts are common. D. simulans shows larger numbers of whole gene duplications in comparison to larger proportions of gene fragments in D. yakuba. D. simulans displays an excess of high frequency variants on the X chromosome, consistent with adaptive evolution through duplications on the D. simulans X or demographic forces driving duplicates to high frequency. We identify 78 chimeric genes in D. yakuba and 38 chimeric genes in D. simulans, as well as 143 cases of recruited non-coding sequence in D. yakuba and 96 in D. simulans, in agreement with rates of chimeric gene origination in D. melanogaster. Together, these results suggest that tandem duplications often result in complex variation beyond whole gene duplications that offers a rich substrate of standing variation that is likely to contribute both to detrimental phenotypes and disease, as well as to adaptive evolutionary change. Comment: Revised version, accepted at Molecular Biology and Evolution.

    An RNA-Seq bioinformatics pipeline for data processing of Arabidopsis thaliana datasets

    Floral transition is a crucial event in the reproductive cycle of a flowering plant during which many genes are expressed that govern the transition phase and regulate the expression and functions of several other genes involved in the process. Identification of additional genes connected to flowering genes is vital since they may regulate flowering genes and vice versa. Through our study, expression values of these additional genes have been found to be similar to those of the flowering genes FLC and LFY in the transition phase. The presented approach plays a crucial role in this discovery. An RNA-Seq computational pipeline was developed for identification of novel genes involved in floral transition from A. thaliana apical shoot meristem time-series data. By intersecting differentially expressed genes from Cuffdiff, DESeq and edgeR methods, 690 genes were identified. Using an FDR cutoff of 0.05, we identified 30 genes involved in glucosinolate and glycosinolate biosynthetic processes as principal regulators in the transition phase which provide protection to plants from herbivores and pathogens during flowering. Additionally, expression profiles of highly connected genes in protein-protein interaction network analysis revealed 76 genes with non-functional association and high correlation to flowering genes FLC and LFY, which suggests their potential and principal role in floral regulation not identified previously in any studies.

    Cooperative Approach for Composite Ontology Mapping

    This paper proposes a cooperative approach for composite ontology mapping. We first present an extended classification of automated ontology matching and propose an automatic composite solution for the matching problem based on cooperation. In our proposal, agents apply individual mapping algorithms and cooperate in order to change their individual results. We assume that the approaches are complementary to each other and their combination produces better results than the individual ones. Next, we compare our model with three state of the art matching systems. The results are promising, especially with regard to precision and recall. Finally, we propose an argumentation formalism as an extension of our initial model. We compare our argumentation model with the matching systems, showing improvements on the results.

    The computational analysis of post-translational modifications

    Post-translational modifications (PTMs) of proteins present a means to increase the proteome size and diversity of an organism through the inclusion of structural elements not encoded at the sequence level alone. Their erroneous inclusion or exclusion has been linked to a variety of diseases and disorders; thus their characterisation has the potential to present viable drug targets. The proliferation of newer high-throughput methods, such as mass spectrometry, to identify such modifications has led to a rapid increase in the number of databases and tools to display and analyse such vast amounts of data effectively. This study covers the development of one such tool, PTM Browser, and the construction of the underlying database that it is based upon. This new database was initially seeded with annotations from the Swiss-Prot and Phospho.ELM resources. The initial database of PTMs was then expanded to include a large repertoire of previously unannotated proteins for a selection of topical species (e.g. Danio rerio and Tetraodon nigroviridis). Orthologue assignments have also been added to the database, to allow for queries to be performed regarding the conservation of modifications between homologous proteins. The PTM Browser tool allows for a full exploration of this new database of PTMs, with a special focus on allowing users to identify modifications that are both shared between and are specific to particular species. This tool is freely available for non-commercial use at the following URL: http://www.ptmbrowser.org. An analysis is presented on the conservation of modifications between members of the tumour suppressor family, p53, using this new tool. This tool has also been used to analyse the conservation of modifications between super-kingdoms and Eukaryote species.

    Predicting protein interface residues using easily accessible on-line resources

    © The Author 2015. Published by Oxford University Press. It has been more than a decade since the completion of the Human Genome Project that provided us with a complete list of human proteins. The next obvious task is to figure out how various parts interact with each other. On that account, we review 10 methods for protein interface prediction, which are freely available as web servers. In addition, we comparatively evaluate their performance on a common data set comprising different quality target structures. We find that using experimental structures and high-quality homology models, structure-based methods outperform those using only protein sequences, with global template-based approaches providing the best performance. For moderate-quality models, sequence-based methods often perform better than those structure-based techniques that rely on fine atomic details. We note that post-processing protocols implemented in several methods quantitatively improve the results only for experimental structures, suggesting that these procedures should be tuned up for computer-generated models. Finally, we anticipate that advanced meta-prediction protocols are likely to enhance interface residue prediction. Notwithstanding further improvements, easily accessible web servers already provide the scientific community with convenient resources for the identification of protein-protein interaction sites.

    EST-derived SSR markers used as anchor loci for the construction of a consensus linkage map in ryegrass (Lolium spp.)

    BACKGROUND: Genetic markers and linkage mapping are basic prerequisites for marker-assisted selection and map-based cloning. In the case of the key grassland species Lolium spp., numerous mapping populations have been developed and characterised for various traits. Although some genetic linkage maps of these populations have been aligned with each other using publicly available DNA markers, the number of common markers among genetic maps is still low, limiting the ability to compare candidate gene and QTL locations across germplasm. RESULTS: A set of 204 expressed sequence tag (EST)-derived simple sequence repeat (SSR) markers has been assigned to map positions using eight different ryegrass mapping populations. Marker properties of a subset of 64 EST-SSRs were assessed in six to eight individuals of each mapping population and revealed 83% of the markers to be polymorphic in at least one population and an average number of alleles of 4.88. EST-SSR markers polymorphic in multiple populations served as anchor markers and allowed the construction of the first comprehensive consensus map for ryegrass. The integrated map was complemented with 97 SSRs from previously published linkage maps and finally contained 284 EST-derived and genomic SSR markers. The total map length was 742 centiMorgan (cM), ranging for individual chromosomes from 70 cM of linkage group (LG) 6 to 171 cM of LG 2. CONCLUSIONS: The consensus linkage map for ryegrass based on eight mapping populations and constructed using a large set of publicly available Lolium EST-SSRs mapped for the first time together with previously mapped SSR markers will allow for consolidating existing mapping and QTL information in ryegrass. Map and markers presented here will prove to be an asset in the development of both molecular breeding of ryegrass and comparative genetics and genomics within grass species.