120 research outputs found

    Dynamic summarization of bibliographic-based data

    BACKGROUND: Traditional information retrieval techniques typically return excessive output when directed at large bibliographic databases. Natural Language Processing applications strive to extract salient content from the excessive data. Semantic MEDLINE, a National Library of Medicine (NLM) natural language processing application, highlights relevant information in PubMed data. However, Semantic MEDLINE implements manually coded schemas, accommodating few information needs. Currently, there are only five such schemas, while many more would be needed to realistically accommodate all potential users. The aim of this project was to develop and evaluate a statistical algorithm that automatically identifies relevant bibliographic data; the new algorithm could be incorporated into a dynamic schema to accommodate various information needs in Semantic MEDLINE, and eliminate the need for multiple schemas. METHODS: We developed a flexible algorithm named Combo that combines three statistical metrics, the Kullback-Leibler Divergence (KLD), Riloff's RlogF metric (RlogF), and a new metric called PredScal, to automatically identify salient data in bibliographic text. We downloaded citations from a PubMed search query addressing the genetic etiology of bladder cancer. The citations were processed with SemRep, an NLM rule-based application that produces semantic predications. SemRep output was processed by Combo, in addition to the standard Semantic MEDLINE genetics schema and independently by the two individual KLD and RlogF metrics. We evaluated each summarization method using an existing reference standard within the task-based context of genetic database curation. RESULTS: Combo asserted 74 genetic entities implicated in bladder cancer development, whereas the traditional schema asserted 10 genetic entities; the KLD and RlogF metrics individually asserted 77 and 69 genetic entities, respectively. Combo achieved 61% recall and 81% precision, with an F-score of 0.69. The traditional schema achieved 23% recall and 100% precision, with an F-score of 0.37. The KLD metric achieved 61% recall, 70% precision, with an F-score of 0.65. The RlogF metric achieved 61% recall, 72% precision, with an F-score of 0.66. CONCLUSIONS: Semantic MEDLINE summarization using the new Combo algorithm outperformed a conventional summarization schema in a genetic database curation task. It potentially could streamline information acquisition for other needs without having to hand-build multiple saliency schemas.
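The F-scores quoted in this abstract can be roughly cross-checked from the reported precision and recall. The sketch below assumes the balanced F-score (harmonic mean of precision and recall); the abstract does not state the weighting, and the function name `f_score` is illustrative only. Because the published percentages are rounded, a recomputed value may differ from the reported one in the last digit.

```python
def f_score(precision: float, recall: float) -> float:
    """Balanced F-score: harmonic mean of precision and recall (assumed weighting)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) pairs as reported in the abstract, rounded percentages.
reported = {
    "Combo": (0.81, 0.61),
    "Traditional schema": (1.00, 0.23),
    "KLD": (0.70, 0.61),
    "RlogF": (0.72, 0.61),
}

# Recompute the F-score for each summarization method.
scores = {name: f_score(p, r) for name, (p, r) in reported.items()}

for name, value in scores.items():
    print(f"{name}: F = {value:.3f}")
```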

    Clustering cliques for graph-based summarization of the biomedical research literature

    BACKGROUND: Graph-based notions are increasingly used in biomedical data mining and knowledge discovery tasks. In this paper, we present a clique-clustering method to automatically summarize graphs of semantic predications produced from PubMed citations (titles and abstracts). RESULTS: SemRep was used to extract semantic predications from the citations returned by a PubMed search. Cliques were identified from frequently occurring predications with highly connected arguments filtered by degree centrality. Themes contained in the summary were identified with a hierarchical clustering algorithm based on common arguments shared among cliques. The validity of the clusters in the summaries produced was compared to the Silhouette-generated baseline for cohesion, separation and overall validity. The theme labels were also compared to a reference standard produced with major MeSH headings. CONCLUSIONS: For 11 topics in the testing data set, the overall validity of clusters from the system summary was 10% better than the baseline (43% versus 33%). When compared to the reference standard from MeSH headings, the results for recall, precision and F-score were 0.64, 0.65, and 0.65 respectively
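The clique-identification step described in this abstract might look roughly like the sketch below. Everything specific here is an assumption made for illustration: the toy predications, the frequency cutoff `min_count`, the centrality `threshold`, and the use of the Bron-Kerbosch algorithm for enumerating maximal cliques. The abstract does not say which clique algorithm or cutoffs the authors used, and the hierarchical clustering of cliques into themes is not shown.

```python
from collections import Counter, defaultdict

# Hypothetical (subject, predicate, object) predications; not taken from the paper.
predications = [
    ("GeneA", "ASSOCIATED_WITH", "DiseaseX"),
    ("GeneA", "ASSOCIATED_WITH", "DiseaseX"),
    ("GeneA", "INTERACTS_WITH", "GeneB"),
    ("GeneA", "INTERACTS_WITH", "GeneB"),
    ("GeneB", "ASSOCIATED_WITH", "DiseaseX"),
    ("GeneB", "ASSOCIATED_WITH", "DiseaseX"),
    ("DrugC", "TREATS", "DiseaseX"),
    ("DrugC", "TREATS", "DiseaseX"),
    ("DrugD", "TREATS", "DiseaseY"),
]


def build_graph(preds, min_count=2):
    """Undirected argument graph built from predications seen at least min_count times."""
    counts = Counter(preds)
    graph = defaultdict(set)
    for (subj, _pred, obj), n in counts.items():
        if n >= min_count and subj != obj:
            graph[subj].add(obj)
            graph[obj].add(subj)
    return graph


def filter_by_degree_centrality(graph, threshold):
    """Keep nodes whose degree centrality, degree / (n - 1), meets the threshold."""
    n = len(graph)
    if n < 2:
        return {}
    keep = {v for v, nbrs in graph.items() if len(nbrs) / (n - 1) >= threshold}
    return {v: graph[v] & keep for v in keep}


def maximal_cliques(graph):
    """Enumerate maximal cliques with the basic Bron-Kerbosch recursion."""
    cliques = []

    def expand(r, p, x):
        if not p and not x:
            cliques.append(frozenset(r))
            return
        for v in list(p):
            expand(r | {v}, p & graph[v], x & graph[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(graph), set())
    return cliques


graph = build_graph(predications, min_count=2)
core = filter_by_degree_centrality(graph, threshold=0.5)
cliques = maximal_cliques(core)
print(cliques)
```

With this toy input, the infrequent DrugD predication is dropped by the frequency filter and the weakly connected DrugC node is dropped by the centrality filter, leaving a single three-node clique.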

    A critical review of PASBio's argument structures for biomedical verbs

    BACKGROUND: Propositional representations of biomedical knowledge are a critical component of most aspects of semantic mining in biomedicine. However, the proper set of propositions has yet to be determined. Recently, the PASBio project proposed a set of propositions and argument structures for biomedical verbs. This initial set of representations presents an opportunity for evaluating the suitability of predicate-argument structures as a scheme for representing verbal semantics in the biomedical domain. Here, we quantitatively evaluate several dimensions of the initial PASBio propositional structure repository. RESULTS: We propose a number of metrics and heuristics related to arity, role labelling, argument realization, and corpus coverage for evaluating large-scale predicate-argument structure proposals. We evaluate the metrics and heuristics by applying them to PASBio 1.0. CONCLUSION: PASBio demonstrates the suitability of predicate-argument structures for representing aspects of the semantics of biomedical verbs. Metrics related to theta-criterion violations and to the distribution of arguments are able to detect flaws in semantic representations, given a set of predicate-argument structures and a relatively small corpus annotated with them

    Subtle changes in the flavour and texture of a drink enhance expectations of satiety

    Background: The consumption of liquid calories has been implicated in the development of obesity and weight gain. Energy-containing drinks are often reported to have a weak satiety value: one explanation for this is that because of their fluid texture they are not expected to have much nutritional value. It is important to consider what features of these drinks can be manipulated to enhance their expected satiety value. Two studies investigated the perception of subtle changes in a drink’s viscosity, and the extent to which thick texture and creamy flavour contribute to the generation of satiety expectations. Participants in the first study rated the sensory characteristics of 16 fruit yogurt drinks of increasing viscosity. In study two, a new set of participants evaluated eight versions of the fruit yogurt drink, which varied in thick texture, creamy flavour and energy content, for sensory and hedonic characteristics and satiety expectations. Results: In study one, participants were able to perceive small changes in drink viscosity that were strongly related to the actual viscosity of the drinks. In study two, the thick versions of the drink were expected to be more filling and have a greater expected satiety value, independent of the drink’s actual energy content. A creamy flavour enhanced the extent to which the drink was expected to be filling, but did not affect its expected satiety. Conclusions: These results indicate that subtle manipulations of texture and creamy flavour can increase expectations that a fruit yogurt drink will be filling and suppress hunger, irrespective of the drink’s energy content. A thicker texture enhanced expectations of satiety to a greater extent than a creamier flavour, and may be one way to improve the anticipated satiating value of energy-containing beverages

    Automation of a problem list using natural language processing

    BACKGROUND: The medical problem list is an important part of the electronic medical record in development in our institution. To serve the functions it is designed for, the problem list has to be as accurate and timely as possible. However, the current problem list is usually incomplete and inaccurate, and is often totally unused. To alleviate this issue, we are building an environment where the problem list can be easily and effectively maintained. METHODS: For this project, 80 medical problems were selected for their frequency of use in our future clinical field of evaluation (cardiovascular). We have developed an Automated Problem List system composed of two main components: a background and a foreground application. The background application uses Natural Language Processing (NLP) to harvest potential problem list entries from the list of 80 targeted problems detected in the multiple free-text electronic documents available in our electronic medical record. These proposed medical problems drive the foreground application designed for management of the problem list. Within this application, the extracted problems are proposed to the physicians for addition to the official problem list. RESULTS: The set of 80 targeted medical problems selected for this project covered about 5% of all possible diagnoses coded in ICD-9-CM in our study population (cardiovascular adult inpatients), but about 64% of all instances of these coded diagnoses. The system contains algorithms to detect first document sections, then sentences within these sections, and finally potential problems within the sentences. The initial evaluation of the section and sentence detection algorithms demonstrated a sensitivity and positive predictive value of 100% when detecting sections, and a sensitivity of 89% and a positive predictive value of 94% when detecting sentences. CONCLUSION: The global aim of our project is to automate the process of creating and maintaining a problem list for hospitalized patients and thereby help to guarantee the timeliness, accuracy and completeness of this information
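The three-stage detection described in this abstract (sections, then sentences, then problems) can be illustrated with a small sketch. This is not the authors' NLP system: the header pattern, the sentence splitter, the three-item `TARGET_PROBLEMS` set, and the plain substring matching are all simplifying assumptions, and the sketch does no negation or context handling.

```python
import re

# Hypothetical stand-in for the 80 targeted problems mentioned in the abstract.
TARGET_PROBLEMS = {"atrial fibrillation", "hypertension", "heart failure"}

# Assumed conventions: a section header is an upper-case line ending in a colon,
# and sentences end with '.', '!' or '?' followed by whitespace.
SECTION_HEADER = re.compile(r"^([A-Z][A-Z ]+):\s*$", re.MULTILINE)
SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def split_sections(document):
    """Stage 1: return (header, body) pairs for each detected section."""
    parts = SECTION_HEADER.split(document)
    # parts is [preamble, header1, body1, header2, body2, ...]
    return [(parts[i].strip(), parts[i + 1].strip())
            for i in range(1, len(parts) - 1, 2)]


def split_sentences(body):
    """Stage 2: split a section body into sentences."""
    return [s for s in SENTENCE_END.split(body) if s]


def find_problems(document):
    """Stage 3: list (section, sentence, problem) candidates by substring match."""
    candidates = []
    for header, body in split_sections(document):
        for sentence in split_sentences(body):
            lowered = sentence.lower()
            for problem in sorted(TARGET_PROBLEMS):
                if problem in lowered:
                    candidates.append((header, sentence, problem))
    return candidates


note = """HISTORY:
Patient has a history of hypertension. No chest pain reported.
ASSESSMENT:
New onset atrial fibrillation. Heart failure is suspected.
"""

candidates = find_problems(note)
for candidate in candidates:
    print(candidate)
```

In a system like the one described, candidates of this kind would be proposed to a physician for review rather than added to the problem list automatically.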

    Intracellular chloride concentration influences the GABAA receptor subunit composition

    GABAA receptors (GABAARs) exist as different subtype variants showing unique functional properties and defined spatio-temporal expression patterns. The molecular mechanisms underlying the developmental expression of different GABAARs are largely unknown. The intracellular concentration of chloride ([Cl−]i), the main ion permeating through GABAARs, also undergoes considerable changes during maturation, being higher at early neuronal stages than in adult neurons. Here we investigate the possibility that [Cl−]i could modulate the sequential expression of specific GABAAR subtypes in primary cerebellar neurons. We show that [Cl−]i regulates the expression of α3-1 and δ-containing GABAA receptors, responsible for phasic and tonic inhibition, respectively. Our findings highlight the role of [Cl−]i in tuning the strength of GABAergic responses by acting as an intracellular messenger

    Are decision trees a feasible knowledge representation to guide extraction of critical information from randomized controlled trial reports?

    BACKGROUND: This paper proposes the use of decision trees as the basis for automatically extracting information from published randomized controlled trial (RCT) reports. An exploratory analysis of RCT abstracts is undertaken to investigate the feasibility of using decision trees as a semantic structure. Quality-of-paper measures are also examined. METHODS: A subset of 455 abstracts (randomly selected from a set of 7620 retrieved from Medline from 1998 to 2006) are examined for the quality of RCT reporting, the identifiability of RCTs from abstracts, and the completeness and complexity of RCT abstracts with respect to key decision tree elements. Abstracts were manually assigned to 6 sub-groups distinguishing whether they were primary RCTs versus other design types. For primary RCT studies, we analyzed and annotated the reporting of intervention comparison, population assignment and outcome values. To measure completeness, the frequencies by which complete intervention, population and outcome information are reported in abstracts were measured. A qualitative examination of the reporting language was conducted. RESULTS: Decision tree elements are manually identifiable in the majority of primary RCT abstracts. 73.8% of a random subset was primary studies with a single population assigned to two or more interventions. 68% of these primary RCT abstracts were structured. 63% contained pharmaceutical interventions. 84% reported the total number of study subjects. In a subset of 21 abstracts examined, 71% reported numerical outcome values. CONCLUSION: The manual identifiability of decision tree elements in the abstract suggests that decision trees could be a suitable construct to guide machine summarisation of RCTs. The presence of decision tree elements could also act as an indicator for RCT report quality in terms of completeness and uniformity.
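One way to picture the decision tree elements named in this abstract (a population assigned to interventions, each with an outcome value) is as a small data structure with a completeness check. The class names, field names, and the four completeness criteria below are hypothetical choices made for this sketch; the abstract does not define a schema, and the example trial is invented.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Arm:
    """One branch of the tree: an intervention and what was reported for it."""
    intervention: Optional[str] = None
    n_assigned: Optional[int] = None
    outcome_value: Optional[float] = None


@dataclass
class TrialTree:
    """Root of the tree: a single population assigned to two or more arms."""
    population: Optional[str] = None
    total_subjects: Optional[int] = None
    arms: List[Arm] = field(default_factory=list)


def completeness(tree: TrialTree) -> dict:
    """Report which (assumed) decision tree elements an abstract provides."""
    return {
        "population": tree.population is not None,
        "total_subjects": tree.total_subjects is not None,
        "intervention_comparison": len(tree.arms) >= 2
        and all(a.intervention is not None for a in tree.arms),
        "outcome_values": bool(tree.arms)
        and all(a.outcome_value is not None for a in tree.arms),
    }


# Invented example: the abstract reports an outcome for one arm only.
example = TrialTree(
    population="adults with condition Z",
    total_subjects=120,
    arms=[
        Arm(intervention="drug A", n_assigned=60, outcome_value=0.42),
        Arm(intervention="placebo", n_assigned=60),
    ],
)

report = completeness(example)
print(report)
```

A check of this kind reflects the abstract's suggestion that the presence of decision tree elements could serve as an indicator of reporting completeness.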

    Effects of sulfate starvation on agar polysaccharides of Gracilaria species (Gracilariaceae, Rhodophyta) from Morib, Malaysia

    The effects of sulfate starvation on the agar characteristics of Gracilaria species were investigated by culturing two red algae from Morib, Malaysia, Gracilaria changii and Gracilaria salicornia, in sulfate-free artificial seawater for 5 days. The seaweed samples were collected in October 2012 and March 2013, periods which have significant variation in the amount of rainfall. The agar yields were shown to be independent of sulfate availability, with only 0.60–1.20 % increment in treated G. changii and 0.31–1.40 % increment in treated G. salicornia, while their gel strengths did not increase significantly (approximately 5–7 %) after sulfate starvation for both species. The gelling and melting temperatures did not vary between control and treated samples from both species, except for the treated G. changii collected in March 2013. The gel syneresis index of G. salicornia collected in March 2013 increased significantly after sulfate deprivation. Sulfate starvation introduced some variations in the content of 3,6-anhydrogalactose and total sulfate esters, but the changes did not have a pronounced effect on the physical properties of agar