1,761 research outputs found

    GeneRIF indexing: sentence selection based on machine learning


    Leveraging Machine Learning to Understand How Emotions Influence Equity Related Education: Quasi-Experimental Study

    Background: Teaching and learning about topics such as bias are challenging due to the emotional nature of bias-related discourse. However, emotions can be challenging to study in health professions education for numerous reasons. With the emergence of machine learning and natural language processing, sentiment analysis (SA) has the potential to bridge the gap. Objective: To improve our understanding of the role of emotions in bias-related discourse, we developed and conducted an SA of bias-related discourse among health professionals. Methods: We conducted a 2-stage quasi-experimental study. First, we developed an SA algorithm within an existing archive of interviews with health professionals about bias. SA refers to a mechanism of analysis that evaluates the sentiment of textual data by assigning scores to textual components and calculating and assigning a sentiment value to the text. Next, we applied our SA algorithm to an archive of social media discourse on Twitter that contained equity-related hashtags to compare sentiment among health professionals and the general population. Results: When tested on the initial archive, our SA algorithm was highly accurate compared to human scoring of sentiment. An analysis of bias-related social media discourse demonstrated that health professional tweets (n=555) were less neutral than those of the general population (n=6680) when discussing social issues on professionally associated accounts (χ² [2, n=555]=35.455; P<.001), suggesting that health professionals attach more sentiment to their posts on Twitter than seen in the general population. Conclusions: The finding that health professionals are more likely to show and convey emotions regarding equity-related issues on social media has implications for teaching and learning about sensitive topics related to health professions education. Such emotions must therefore be considered in the design, delivery, and evaluation of equity and bias-related education.
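
    The abstract above describes sentiment analysis as assigning scores to textual components and combining them into a sentiment value for the text. A minimal sketch of that general idea follows. It is not the study's algorithm: the lexicon, the `threshold` value, and the function names are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical toy lexicon mapping words to sentiment scores.
# These entries are illustrative assumptions, not taken from the study.
LEXICON = {
    "good": 1.0, "great": 2.0, "fair": 1.0,
    "bad": -1.0, "unfair": -2.0, "harmful": -2.0,
}

def sentiment_score(text, lexicon=LEXICON):
    """Sum the scores of all lexicon words found in the text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return sum(lexicon.get(token, 0.0) for token in tokens)

def sentiment_label(text, threshold=0.5):
    """Map the summed score to positive, negative, or neutral.

    The threshold is an assumed cut-off, not a published value.
    """
    score = sentiment_score(text)
    if score > threshold:
        return "positive"
    if score < -threshold:
        return "negative"
    return "neutral"
```

    Under these assumptions, a text with no lexicon words is labelled neutral, which is the category the study compares between health professionals and the general population.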

    Gene Ontology density estimation and discourse analysis for automatic GeneRiF extraction

    Background: This paper describes and evaluates a sentence selection engine that extracts a GeneRiF (Gene Reference into Functions) as defined in ENTREZ-Gene based on a MEDLINE record. Inputs for this task include both a gene and a pointer to a MEDLINE reference. In the suggested approach we merge two independent sentence extraction strategies. The first proposed strategy (LASt) uses argumentative features, inspired by discourse-analysis models. The second extraction scheme (GOEx) uses an automatic text categorizer to estimate the density of Gene Ontology categories in every sentence, thus providing a full ranking of all possible candidate GeneRiFs. A combination of the two approaches is proposed, which also aims at reducing the size of the selected segment by filtering out non-content bearing rhetorical phrases. Results: Based on the TREC-2003 Genomics collection for GeneRiF identification, the LASt extraction strategy is already competitive (52.78%). When used in a combined approach, the extraction task clearly shows improvement, achieving a Dice score of over 57% (+10%). Conclusions: Argumentative representation levels and conceptual density estimation using Gene Ontology contents appear complementary for functional annotation in proteomics.

    Cues disseminated by professional associations that represent 5 health care professions across 5 nations : lexical analysis of tweets

    Background: Collaboration across health care professions is critical in efficiently and effectively managing complex and chronic health conditions, yet interprofessional care does not happen automatically. Professional associations have a key role in setting a profession’s agenda, maintaining professional identity, and establishing priorities. The associations’ external communication is commonly undertaken through social media platforms, such as Twitter. Despite the valuable insights potentially available into professional associations through such communication, to date, their messaging has not been examined. Objective: This study aimed to identify the cues disseminated by professional associations that represent 5 health care professions spanning 5 nations. Methods: Using a back-iterative application programming interface methodology, public tweets were sourced from professional associations that represent 5 health care professions that have key roles in community-based health care: general practice, nursing, pharmacy, physiotherapy, and social work. Furthermore, the professional associations spanned Australia, Canada, New Zealand, the United Kingdom, and the United States. A lexical analysis was conducted of the tweets using Leximancer (Leximancer Pty Ltd) to clarify relationships within the discourse. Results: After completing a lexical analysis of 50,638 tweets, 7 key findings were identified. First, the discourse was largely devoid of references to interprofessional care. Second, there was no explicit discourse pertaining to physiotherapists. Third, although all the professions represented in this study support patients, discourse pertaining to general practitioners was most likely to be connected with that pertaining to patients. Fourth, tweets pertaining to pharmacists were most likely to be connected with discourse pertaining to latest and research. Fifth, tweets about social workers were unlikely to be connected with discourse pertaining to health or care. Sixth, notwithstanding a few exceptions, the findings across the different nations were generally similar, suggesting their generality. Seventh and last, tweets pertaining to physiotherapists were most likely to refer to discourse pertaining to profession. Conclusions: The findings indicate that health care professional associations do not use Twitter to disseminate cues that reinforce the importance of interprofessional care. Instead, they largely use this platform to emphasize what they individually deem to be important and advance the interests of their respective professions. Therefore, there is considerable opportunity for professional associations to assert how the profession they represent complements other health care professions and how the professionals they represent can enact interprofessional care for the benefit of patients and carers.

    Medical WordNet: A new methodology for the construction and validation of information resources for consumer health

    A consumer health information system must be able to comprehend both expert and non-expert medical vocabulary and to map between the two. We describe an ongoing project to create a new lexical database called Medical WordNet (MWN), consisting of medically relevant terms used by and intelligible to non-expert subjects and supplemented by a corpus of natural-language sentences that is designed to provide medically validated contexts for MWN terms. The corpus derives primarily from online health information sources targeted to consumers, and involves two sub-corpora, called Medical FactNet (MFN) and Medical BeliefNet (MBN), respectively. The former consists of statements accredited as true on the basis of a rigorous process of validation, the latter of statements which non-experts believe to be true. We summarize the MWN / MFN / MBN project, and describe some of its applications

    Automatically Recognizing Medication and Adverse Event Information From Food and Drug Administration's Adverse Event Reporting System Narratives

    BACKGROUND: The Food and Drug Administration's (FDA) Adverse Event Reporting System (FAERS) is a repository of spontaneously reported adverse drug events (ADEs) for FDA-approved prescription drugs. FAERS reports include both structured reports and unstructured narratives. The narratives often include essential information for evaluation of the severity, causality, and description of ADEs that are not present in the structured data. The timely identification of unknown toxicities of prescription drugs is an important, unsolved problem. OBJECTIVE: The objective of this study was to develop an annotated corpus of FAERS narratives and a biomedical named entity tagger to automatically identify ADE-related information in the FAERS narratives. METHODS: We developed an annotation guideline and annotated medication information and adverse event related entities on 122 FAERS narratives comprising approximately 23,000 word tokens. A named entity tagger using supervised machine learning approaches was built for detecting medication information and adverse event entities using various categories of features. RESULTS: The annotated corpus had an agreement of over .9 Cohen's kappa for medication and adverse event entities. The best performing tagger achieves an overall performance of 0.73 F1 score for detection of medication, adverse event and other named entities. CONCLUSIONS: In this study, we developed an annotated corpus of FAERS narratives and machine learning based models for automatically extracting medication and adverse event information from the FAERS narratives. Our study is an important step towards enriching the FAERS data for postmarketing pharmacovigilance.

    Semi-supervised prediction of protein interaction sentences exploiting semantically encoded metrics

    Protein-protein interaction (PPI) identification is an integral component of many biomedical research and database curation tools. Automation of this task through classification is one of the key goals of text mining (TM). However, labelled PPI corpora required to train classifiers are generally small. In order to overcome this sparsity in the training data, we propose a novel method of integrating corpora that do not contain relevance judgements. Our approach uses a semantic language model to gather word similarity from a large unlabelled corpus. This additional information is integrated into the sentence classification process using kernel transformations and has a re-weighting effect on the training features that leads to an 8% improvement in F-score over the baseline results. Furthermore, we discover that some words which are generally considered indicative of interactions are actually neutralised by this process

    Semantic metadata annotation. Tagging Medline abstracts for enhanced information access.

    Purpose - The object of this study is to develop methods for automatically annotating the argumentative role of sentences in scientific abstracts. Working from Medline abstracts, sentences were classified into four major argumentative roles: objective, method, result, and conclusion. The idea is that, if the role of each sentence can be marked up, then these metadata can be used during information retrieval to seek particular types of information such as novelty, conclusions, methodologies, aims/goals of a scientific piece of work. Design/methodology/approach - Two approaches were tested: linguistic cues and positional heuristics. Linguistic cues are lexico-syntactic patterns modelled as regular expressions implemented in a linguistic parser. Positional heuristics make use of the relative position of a sentence in the abstract to deduce its argumentative class. Findings - The experiments showed that positional heuristics attained a much higher degree of accuracy on Medline abstracts with an F-score of 64 per cent, whereas the linguistic cues only attained an F-score of 12 per cent. This is mostly because sentences from different argumentative roles are not always announced by surface linguistic cues. Research limitations/implications - A limitation to the study was the inability to test other methods to perform this task such as machine learning techniques which have been reported to perform better on Medline abstracts. Also, to compare the results of the study with earlier studies using Medline abstracts, the different argumentative roles present in Medline had to be mapped on to four major argumentative roles. This may have favourably biased the performance of the sentence classification by positional heuristics. Originality/value - To the best of one's knowledge, this study presents the first instance of evaluating linguistic cues and positional heuristics on the same corpus.
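
    The positional heuristic described in the abstract above, deducing a sentence's argumentative class from its relative position, can be sketched as follows. This is an illustration only: the abstract does not state the cut-off points, so the `boundaries` values and the function name `classify_by_position` are assumptions, not the authors' settings.

```python
# The four argumentative roles named in the abstract, in reading order.
ROLES = ["objective", "method", "result", "conclusion"]

def classify_by_position(sentences, boundaries=(0.25, 0.5, 0.75)):
    """Label each sentence by its relative position in the abstract.

    `boundaries` are assumed cut-offs on the relative position i / n;
    the study's actual thresholds are not given in the abstract.
    """
    n = len(sentences)
    labelled = []
    for i, sentence in enumerate(sentences):
        relative_position = i / n  # in [0, 1); loop is skipped when n == 0
        # Count how many boundaries the sentence has passed.
        role_index = sum(relative_position >= b for b in boundaries)
        labelled.append((ROLES[role_index], sentence))
    return labelled
```

    With equal-width boundaries, a four-sentence abstract receives one role per sentence in order. Real abstracts rarely divide this evenly, which is one reason a purely positional rule can mislabel sentences.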

    CREATING A BIOMEDICAL ONTOLOGY INDEXED SEARCH ENGINE TO IMPROVE THE SEMANTIC RELEVANCE OF RETRIEVED MEDICAL TEXT

    Medical Subject Headings (MeSH) is a controlled vocabulary used by the National Library of Medicine to index medical articles, abstracts, and journals contained within the MEDLINE database. Although MeSH imposes uniformity and consistency in the indexing process, it has been proven that using MeSH indices results in only a small increase in precision over free-text indexing. Moreover, studies have shown that the use of controlled vocabularies in the indexing process is not an effective method to increase semantic relevance in information retrieval. To address the need for semantic relevance, we present an ontology-based information retrieval system for the MEDLINE collection that results in a 37.5% increase in precision when compared to free-text indexing systems. The presented system uses the ontology to provide an alternative to text representation for medical articles, to find relationships among co-occurring terms in abstracts, and to index terms that appear in text as well as discovered relationships. The presented system is then compared to existing MeSH and free-text information retrieval systems. This dissertation provides a proof-of-concept for an online retrieval system capable of providing increased semantic relevance when searching through medical abstracts in MEDLINE.