    Annotating patient clinical records with syntactic chunks and named entities: the Harvey corpus

    The free text notes typed by physicians during patient consultations contain valuable information for the study of disease and treatment. These notes are difficult to process by existing natural language analysis tools since they are highly telegraphic (omitting many words), and contain many spelling mistakes, inconsistencies in punctuation, and non-standard word order. To support information extraction and classification tasks over such text, we describe a de-identified corpus of free text notes, a shallow syntactic and named entity annotation scheme for this kind of text, and an approach to training domain specialists with no linguistic background to annotate the text. Finally, we present a statistical chunking system for such clinical text with a stable learning rate and good accuracy, indicating that the manual annotation is consistent and that the annotation scheme is tractable for machine learning

    Multilingual Word Sense Induction to Improve Web Search Result Clustering

    In [12] a novel approach to Web search result clustering based on Word Sense Induction, i.e. the automatic discovery of word senses from raw text was presented; key to the proposed approach is the idea of, first, automatically inducing senses for the target query and, second, clustering the search results based on their semantic similarity to the word senses induced. In [1] we proposed an innovative Word Sense Induction method based on multilingual data; key to our approach was the idea that a multilingual context representation, where the context of the words is expanded by considering its translations in different languages, may improve the WSI results; the experiments showed a clear performance gain. In this paper we give some preliminary ideas to exploit our multilingual Word Sense Induction method to Web search result clustering

    Establishing an EU-China consortium on traditional Chinese medicine research.

    Traditional Chinese medicine (TCM) is widely used in the European Union (EU) and attracts intense research interest from European scientists. As an emerging area in Europe, TCM research requires collaboration and coordination of actions. Good Practice in Traditional Chinese Medicine Research in the Post-genomic Era, also known as GP-TCM, is the first-ever EU-funded 7th Framework Programme (FP7) coordination action, aiming to inform best practice and harmonise research on the safety and efficacy of TCM through interdisciplinary exchange of experience and expertise among clinicians and scientists. With its increasingly large pool of expertise across 19 countries including 13 EU member states, Australia, Canada, China, Norway, Thailand and the USA, the consortium provides forums and collaboration platforms on quality control, extraction technology, component analysis, toxicology, pharmacology and regulatory issues of Chinese herbal medicine (CHM), as well as on acupuncture studies, with a particular emphasis on the application of a functional genomics approach. The project officially started in May 2009 and by the time of its conclusion in April 2012 a Europe-based academic society dedicated to TCM research will be founded to carry on the mission of GP-TCM.
    Rights: This article is licensed under the BioMed Central licence at http://www.biomedcentral.com/about/license which is similar to the 'Creative Commons Attribution Licence'. In brief you may: copy, distribute, and display the work; make derivative works; or make commercial use of the work, under the following conditions: the original author must be given credit; for any reuse or distribution, it must be made clear to others what the license terms of this work are

    De-identification of primary care electronic medical records free-text data in Ontario, Canada

    BACKGROUND: Electronic medical records (EMRs) represent a potentially rich source of health information for research but the free-text in EMRs often contains identifying information. While de-identification tools have been developed for free-text, none have been developed or tested for the full range of primary care EMR data. METHODS: We used deid open source de-identification software and modified it for an Ontario context for use on primary care EMR data. We developed the modified program on a training set of 1000 free-text records from one group practice and then tested it on two validation sets from a random sample of 700 free-text EMR records from 17 different physicians from 7 different practices in 5 different cities and 500 free-text records from a group practice that was in a different city than the group practice that was used for the training set. We measured the sensitivity/recall, precision, specificity, accuracy and F-measure of the modified tool against manually tagged free-text records to remove patient and physician names, locations, addresses, medical record, health card and telephone numbers. RESULTS: We found that the modified training program performed with a sensitivity of 88.3%, specificity of 91.4%, precision of 91.3%, accuracy of 89.9% and F-measure of 0.90. The validation sets had sensitivities of 86.7% and 80.2%, specificities of 91.4% and 87.7%, precisions of 91.1% and 87.4%, accuracies of 89.0% and 83.8% and F-measures of 0.89 and 0.84 for the first and second validation sets respectively. CONCLUSION: The deid program can be modified to reasonably accurately de-identify free-text primary care EMR records while preserving clinical content.
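
    The F-measures quoted in this abstract can be checked against the reported precision and sensitivity (recall) figures. The sketch below assumes the balanced F-measure (F1, the harmonic mean of precision and recall); the abstract does not say which variant was used, and the helper name `f_measure` is illustrative rather than part of the deid software.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure (F1): harmonic mean of precision and recall.

    Assumption: the abstract's "F-measure" is the balanced F1 variant.
    """
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, sensitivity/recall) pairs as reported in the abstract.
reported = {
    "training": (0.913, 0.883),
    "validation 1": (0.911, 0.867),
    "validation 2": (0.874, 0.802),
}

# Round to two decimals, the precision used in the abstract.
f_scores = {name: round(f_measure(p, r), 2) for name, (p, r) in reported.items()}

for name, score in f_scores.items():
    print(f"{name}: F = {score:.2f}")
```

    Under that assumption the recomputed values round to 0.90, 0.89 and 0.84, matching the figures given in the abstract.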

    Text Mining the History of Medicine

    Historical text archives constitute a rich and diverse source of information, which is becoming increasingly readily accessible, due to large-scale digitisation efforts. However, it can be difficult for researchers to explore and search such large volumes of data in an efficient manner. Text mining (TM) methods can help, through their ability to recognise various types of semantic information automatically, e.g., instances of concepts (places, medical conditions, drugs, etc.), synonyms/variant forms of concepts, and relationships holding between concepts (which drugs are used to treat which medical conditions, etc.). TM analysis allows search systems to incorporate functionality such as automatic suggestions of synonyms of user-entered query terms, exploration of different concepts mentioned within search results or isolation of documents in which concepts are related in specific ways. However, applying TM methods to historical text can be challenging, owing to differences and evolutions in vocabulary, terminology, language structure and style, compared to more modern text. In this article, we present our efforts to overcome the various challenges faced in the semantic analysis of published historical medical text dating back to the mid-19th century. Firstly, we used evidence from diverse historical medical documents from different periods to develop new resources that provide accounts of the multiple, evolving ways in which concepts, their variants and relationships amongst them may be expressed. These resources were employed to support the development of a modular processing pipeline of TM tools for the robust detection of semantic information in historical medical documents with varying characteristics. We applied the pipeline to two large-scale medical document archives covering wide temporal ranges as the basis for the development of a publicly accessible semantically-oriented search system. The novel resources are available for research purposes, while the processing pipeline and its modules may be used and configured within the Argo TM platform

    Enzyme structure dynamics of xylanase I from Trichoderma longibrachiatum

    BACKGROUND: Enzyme dynamics has recently been shown to be crucial for structure-function relationship. Among various structure dynamics analysis platforms, HDX (hydrogen deuterium exchange) mass spectrometry stands out as an efficient and high-throughput way to analyze protein dynamics upon ligand binding. Despite the potential, limited research has employed the HDX mass spec platform to probe regional structure dynamics of enzymes. In particular, the technique has never been used for analyzing cell wall degrading enzymes. We hereby used xylanase as a model to explore the potential of HDX mass spectrometry for studying cell wall degrading enzymes. RESULTS: HDX mass spectrometry revealed significant intrinsic dynamics for the xylanase enzyme. Different regions of the enzymes are differentially stabilized in the apo enzyme. The comparison of substrate-binding enzymes revealed that xylohexaose can significantly stabilize the enzyme. Several regions including those near the reaction centres were significantly stabilized during the xylohexaose binding. As compared to xylohexaose, xylan induced relatively less protection in the enzyme, which may be due to the insolubility of the substrate. The structure relevance of the enzyme dynamics was discussed with reference to the three dimensional structure of the enzyme. HDX mass spectrometry revealed strong dynamics-function relevance and such relevance can be explored for the future enzyme improvement. CONCLUSION: Ligand-binding can lead to the significant stabilization at both regional and global level for enzymes like xylanase. HDX mass spectrometry is a powerful high-throughput platform to identify the key regions protected during the ligand binding and to explore the molecular mechanisms of the enzyme function. The HDX mass spectrometry analysis of cell wall degrading enzymes has provided a novel platform to guide the rational design of enzymes

    Evaluation of Negation and Uncertainty Detection and its Impact on Precision and Recall in Search

    Radiology reports contain information that can be mined using a search engine for teaching, research, and quality assurance purposes. Current search engines look for exact matches to the search term, but they do not differentiate reports in which the search term appears in a positive context (i.e., being present) from those in which the search term appears in the context of negation and uncertainty. We describe RadReportMiner, a context-aware search engine, and compare its retrieval performance with a generic search engine, Google Desktop. We created a corpus of 464 radiology reports which described at least one of five findings (appendicitis, hydronephrosis, fracture, optic neuritis, and pneumonia). Each report was classified by a radiologist as positive (finding described to be present) or negative (finding described to be absent or uncertain). The same reports were then classified by RadReportMiner and Google Desktop. RadReportMiner achieved a higher precision (81%), compared with Google Desktop (27%; p < 0.0001). RadReportMiner had a lower recall (72%) compared with Google Desktop (87%; p = 0.006). We conclude that adding negation and uncertainty identification to a word-based radiology report search engine improves the precision of search results over a search engine that does not take this information into account. Our approach may be useful to adopt into current report retrieval systems to help radiologists to more accurately search for radiology reports

    Using theatre in education in a traditional lecture oriented medical curriculum

    BACKGROUND: Lectures supported by theatrical performance may enhance learning and be an attractive alternative to traditional lectures. This study describes our experience with using theatre in education for medical students since 2001. METHODS: The volunteer students, coached by experienced students, were given a two-week preparation period to write and prepare different dramatized headache scenarios during three supervised meetings. A theatrical performance was followed by a student presentation about history taking and clinical findings in diagnosing headache. Finally, a group discussion led by students dealt with issues raised in the performance. The evaluation of the theatre in education lecture "A Primary Care Approach to Headache" was based on feedback from students. RESULTS: More than 90% of 43 responding students fully agreed with the statement "Theatrical performance made it easier to understand the topic". More than 90% disagreed with the statements "Lecture halls were not appropriate for this kind of interaction" and "Students as teachers were not appropriate". Open-ended questions showed that the lesson was thought of as fun, good and useful by most students. The headache questions in the final exam showed results that were similar to average exam results for other questions. CONCLUSION: Using theatrical performance in medical education was appreciated by most students and may facilitate learning and enhance empathy and team work communication skills.