62 research outputs found

    A Maximum-Entropy approach for accurate document annotation in the biomedical domain

    The growing volume of scientific literature on the Web and the absence of efficient tools for classifying and searching documents are the two most important factors that influence the speed of search and the quality of the results. Previous studies have shown that ontologies make it possible to process document and query information at the semantic level, which greatly improves the search for relevant information and moves one step closer to the Semantic Web. A fundamental step in these approaches is the annotation of documents with ontology concepts, which can also be seen as a classification task. In this paper we address this issue for the biomedical domain and present a new automated and robust method, based on a Maximum Entropy approach, for annotating biomedical literature documents with terms from the Medical Subject Headings (MeSH).

    Biomedical word sense disambiguation with ontologies and metadata: automation meets accuracy

    Background: Ontology term labels can be ambiguous and have multiple senses. While this is no problem for human annotators, it is a challenge for automated methods that identify ontology terms in text. Classical approaches to word sense disambiguation use co-occurring words or terms. However, most treat ontologies as simple terminologies, without making use of the ontology structure or the semantic similarity between terms. Another useful source of information for disambiguation is metadata. Here, we systematically compare three approaches to word sense disambiguation, which use ontologies and metadata, respectively. Results: The 'Closest Sense' method assumes that the ontology defines multiple senses of the term. It computes the shortest path of co-occurring terms in the document to one of these senses. The 'Term Cooc' method defines a log-odds ratio for co-occurring terms, including co-occurrences inferred from the ontology structure. The 'MetaData' approach trains a classifier on metadata. It does not require any ontology, but it requires training data, which the other methods do not. To evaluate these approaches we defined a manually curated training corpus of 2600 documents for seven ambiguous terms from the Gene Ontology and MeSH. Across all conditions, the approaches achieve an 80% success rate on average. The 'MetaData' approach performed best with 96% when trained on high-quality data. Its performance deteriorates as the quality of the training data decreases. The 'Term Cooc' approach performs better on Gene Ontology (92% success) than on MeSH (73% success), as MeSH is not a strict is-a/part-of hierarchy but rather a loose is-related-to hierarchy. The 'Closest Sense' approach achieves on average an 80% success rate. Conclusion: Metadata is valuable for disambiguation, but requires high-quality training data. Closest Sense requires no training, but it needs an ontology that is both large and consistently modelled, which are two opposing conditions. Term Cooc achieves greater than 90% success given a consistently modelled ontology. Overall, the results show that well-structured ontologies can play a very important role in improving disambiguation. Availability: The three benchmark datasets created for the purpose of disambiguation are available in Additional file 1 (1471-2105-10-28-S1.txt). Additional file 1: Benchmark datasets used in the experiments. The three corpora (High quality/Low quantity corpus; Medium quality/Medium quantity corpus; Low quality/High quantity corpus) are given in the form of PubMed identifiers (PMID) for True/False cases for the 7 ambiguous terms examined (GO/MeSH/UMLS identifiers are also given).

    "It's a revolving door": Ego-depletion among prisoners with injecting drug use histories as a barrier to post-release success

    Background: People who inject drugs (PWID) are overrepresented among prisoner populations worldwide. This qualitative study used the psychological concept of “ego-depletion” as an exploratory framework to better understand the disproportionate rates of reincarceration among people with injecting drug use histories. The aim was to illuminate mechanisms by which prospects for positive post-release outcomes for PWID are enhanced or constricted. Methods: Participants were recruited from a longitudinal cohort study, SuperMIX, in Victoria, Australia. Eligible participants were invited to participate in an in-depth interview. Inclusion criteria were: aged 18+; lifetime history of injecting drug use; incarcerated for >three months and released from custody <12 months previously. Analysis of 48 interviews examined how concepts relevant to the ego-depletion framework (self-regulation; standards; consequences and mitigators of ego-depletion) manifested in participants’ narratives. Results: Predominantly, participants aimed to avoid a return to problematic drug use and recidivism, and engaged in effortful self-regulation to pursue their post-release goals. Post-release environments were found to diminish self-regulation resources, leading to states of ego-depletion and compromising the capacity to self-regulate according to their ideals. Fatalism, stress, and fatigue associated with the transition period exacerbated ego-depletion. Strategies that mitigated ego-depletion included avoidance of triggering environments; reducing stress through opioid agonist therapy; and fostering positive affect through supportive relationships. Conclusions: Post-release environments are ego-depleting and inconducive to sustaining behavioural changes for PWID leaving prison. Corrections’ behaviourist paradigms take insufficient account of the socio-structural factors impacting on an individual's self-regulation capacities in the context of drug dependence and desistance processes. Breaking the cycles of reincarceration among PWID requires new approaches that moderate ego-depletion and facilitate long-term goal-pursuit.

    GoPubMed: Exploring Pubmed with Ontological Background Knowledge

    With the ever-increasing size of the scientific literature, finding relevant documents and answering questions has become even more of a challenge. Recently, ontologies - hierarchical, controlled vocabularies - have been introduced to annotate genomic data. They can also improve question answering and the selection of relevant documents in literature search. Search engines such as GoPubMed.org use ontological background knowledge to give an overview of large query results and to help answer questions. We review the problems and solutions underlying these next-generation intelligent search engines and give examples of the power of this new search paradigm.

    Restructuring Linear Discussions in Mind Maps by Crowdsourcing
