17 research outputs found

    PSORTb 3.0: improved protein subcellular localization prediction with refined localization subcategories and predictive capabilities for all prokaryotes

    Motivation: PSORTb has remained the most precise bacterial protein subcellular localization (SCL) predictor since it was first made available in 2003. However, the recall needs to be improved and no accurate SCL predictors yet make predictions for archaea, nor differentiate important localization subcategories, such as proteins targeted to a host cell or bacterial hyperstructures/organelles. Such improvements should preferably be encompassed in a freely available web-based predictor that can also be used as a standalone program.

    Supervised ontology to document interlinking

    The value of the growing availability of online documents and ontologies will increase significantly once these two resources become deeply interlinked at the semantic level. We focus our investigation on the automated identification and linking of concepts and relations mentioned in a document that are (or should be) in a domain-specific ontology. Such semantic information can allow for improved navigation of the information space: users can more quickly retrieve documents that mention the relations sought; ontology engineers can enhance concepts with relations extracted from the literature; and more advanced natural language-based applications such as text summarization, textual entailment, and machine reading become ever more possible. In this thesis, we present the task of supervised semantic interlinking of documents to an ontology. We also propose a supervised algorithm that identifies and links concept mentions that are (or should be) in the ontology, and also identifies mentions of binary relations that are (or should be) in the ontology. The resulting system, SDOI, is tested on a novel corpus and ontology from the data mining field on intrinsic measures such as accuracy, and extrinsic measures such as time saved by the annotator in the annotation process. One day many high-value documents and ontologies will be interlinked to each other. This thesis presents a principled step towards that outcome.

    Recognition of Multi-sentence n-ary Subcellular Localization Mentions in Biomedical Abstracts

    Background: Research into semantic relation recognition from text has focused on the identification of binary relations that are contained within one sentence. In the domain of biomedical documents, however, relations of interest can have more than two arguments and can also have their entity mentions located in different sentences. An example of this scenario is the ternary relation of “subcellular localization”, which relates whether an organism’s (O) protein (P) has subcellular location (L) as one of its target destinations. Empirical evidence suggests that approximately one half of the mentions for this ternary relation reside in multi-sentence passages. Results: We introduce a relation recognition algorithm that can detect n-ary relations across multiple sentences in a document, and use the subcellular localization relation as a motivating example. The approach uses a text-graph representation of the entire document that is based on intrasentential edges derived from each sentence’s predicted syntactic parse trees, and on intersentential edges based on either the linking of adjacent sentences or the linking of coreferents, if reliable coreference predictions are available.
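
The text-graph idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes each sentence arrives as a token list plus hypothetical dependency-head indices (-1 marks the root), and it assumes adjacent sentences are linked root to root, a detail the abstract does not specify. The function names and the toy sentences are invented for illustration.

```python
from collections import deque


def build_text_graph(sentences):
    """Build an undirected graph over (sentence_idx, token_idx) nodes.

    sentences: list of (tokens, heads) pairs, where heads[i] is the index of
    token i's syntactic head in the same sentence, or -1 for the root.
    (This input format is an assumption made for the sketch.)
    """
    graph = {}
    roots = []
    for s_idx, (tokens, heads) in enumerate(sentences):
        root = None
        for t_idx in range(len(tokens)):
            graph.setdefault((s_idx, t_idx), set())
        # Intrasentential edges: one edge per head-dependent pair.
        for t_idx, head in enumerate(heads):
            if head < 0:
                root = (s_idx, t_idx)
            else:
                graph[(s_idx, t_idx)].add((s_idx, head))
                graph[(s_idx, head)].add((s_idx, t_idx))
        roots.append(root)
    # Intersentential edges: link the roots of adjacent sentences (assumption).
    for prev_root, next_root in zip(roots, roots[1:]):
        if prev_root is not None and next_root is not None:
            graph[prev_root].add(next_root)
            graph[next_root].add(prev_root)
    return graph


def shortest_path(graph, start, goal):
    """Breadth-first search; returns the node path or None if unreachable."""
    parent = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for neighbour in graph[node]:
            if neighbour not in parent:
                parent[neighbour] = node
                queue.append(neighbour)
    return None


# Toy two-sentence document with hand-written (not predicted) parses.
SENTENCES = [
    (["OmpA", "is", "studied"], [2, 2, -1]),
    (["It", "localizes", "to", "membrane"], [1, -1, 3, 1]),
]
graph = build_text_graph(SENTENCES)
# Path from the protein mention (sentence 0) to the location mention (sentence 1).
path = shortest_path(graph, (0, 0), (1, 3))
```

A path that crosses the sentence boundary, as here, is the kind of structure a classifier could draw features from when the protein and location mentions sit in different sentences. Coreference links, where reliable, would be added as further intersentential edges in the same way.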

    Championing of an LTV model at LTC
