86 research outputs found

    Foundation, Implementation and Evaluation of the MorphoSaurus System: Subword Indexing, Lexical Learning and Word Sense Disambiguation for Medical Cross-Language Information Retrieval

    In everyday medical practice, which involves a great deal of documentation and search work, the vast majority of textually encoded information is now available electronically. The development of powerful methods for efficient retrieval is therefore of primary importance. Judged from the perspective of medical sublanguage, common text retrieval systems lack morphological functionality (inflection, derivation and compounding), lexical-semantic functionality, and the ability to analyze large document collections across languages. This doctoral thesis presents the theoretical foundations of the MorphoSaurus system (an acronym for morpheme thesaurus). Its methodological core is a thesaurus organized around morphemes of medical expert and lay language, whose entries are linked across languages by semantic relations. Building on this, a procedure is presented that segments (complex) words into morphemes, which are then replaced by language-independent, concept-class-like symbols. The resulting representation is the basis for cross-language, morpheme-oriented text retrieval. In addition to the core technology, a method for the automatic acquisition of lexicon entries is presented, by which existing morpheme lexicons are extended to further languages. Taking cross-language phenomena into account then leads to a novel procedure for resolving semantic ambiguities. The performance of morpheme-oriented text retrieval is tested empirically in extensive, standardized evaluations and compared with common approaches.

    Towards a system of concepts for Family Medicine. Multilingual indexing in General Practice/ Family Medicine in the era of Semantic Web

    University of Liège, Belgium. Faculty of Medicine, Département Universitaire de Médecine Générale, Unité de recherche Soins Primaires et Santé. Doctor in biomedical sciences. Executive summary, by Dr. Marc Jamoulle.
    Introduction: This thesis is about giving visibility to the often overlooked work of family physicians and, consequently, is about grey literature in General Practice and Family Medicine (GP/FM). It often seems that conference organizers do not think of GP/FM as a knowledge-producing discipline that deserves active dissemination. A conference is organized, but not much is done with the knowledge shared at these meetings, so that knowledge cannot be reused or reapplied. This thesis is also about indexing. To retrieve knowledge, indexing is mandatory. We must prepare tools that will automatically index the thousands of abstracts that family doctors produce each year in various languages. Finally, this work is about semantics. It is an introduction to health terminologies, ontologies, semantic data, and linked open data. All are expressions of the next step: the Semantic Web for health care data. Concepts, units of thought expressed by terms, will be our target and must be expressible in multiple languages. Three areas of knowledge are therefore at stake in this study: (i) Family Medicine as a pillar of primary health care, (ii) computational linguistics, and (iii) health information systems.
    Aim: To identify knowledge produced by general practitioners (GPs) by improving the annotation of grey literature in Primary Health Care; to propose an experimental indexing system, acting as a draft for a standardized table of contents of GP/FM; and to improve the searchability of repositories for grey literature in GP/FM.
    Methods: The first step aimed to design the taxonomy by identifying relevant concepts in a compiled corpus of GP/FM texts. We studied the concepts identified in nearly two thousand communications by GPs at conferences. The relevant concepts belong to the fields that focus on GP/FM activities (e.g. teaching, ethics, management or environmental hazard issues). The second step was the development of an on-line, multilingual terminological resource for each category of the resulting taxonomy, named Q-Codes. We designed this terminology in the form of a lightweight ontology, accessible on-line for readers and ready for use by computers of the Semantic Web. It is also fit for the Linked Open Data universe.
    Results: We propose 182 Q-Codes in an on-line multilingual database (10 languages) (www.hetop.eu/Q), each acting as a filter for Medline. Q-Codes are also available in the form of Uniform Resource Identifiers (URIs) and are exportable in Web Ontology Language (OWL). The International Classification of Primary Care (ICPC) is linked to Q-Codes in order to form the Core Content Classification in General Practice/Family Medicine (3CGP). So far, 3CGP is in use by humans in pedagogy, in bibliographic studies, and in indexing congresses, master theses and other forms of grey literature in GP/FM. Use by computers is being tested in automatic classifiers, annotators and natural language processing.
    Discussion: To the best of our knowledge, this is the first attempt to expand the ICPC coding system with an extension for family physician contextual issues, thus covering the non-clinical content of practice. It remains to be proven that our proposed terminology will help in dealing with more complex systems, such as MeSH, to support information storage and retrieval activities. However, this exercise is proposed as a first step in the creation of an ontology of GP/FM and as an opening to the complex world of Semantic Web technologies.
    Conclusion: We expect that the creation of this terminological resource for indexing abstracts and for facilitating Medline searches for general practitioners, researchers and students in medicine will reduce loss of knowledge in the domain of GP/FM. In addition, through better indexing of the grey literature (congress abstracts, master's and doctoral theses), we hope to enhance the accessibility of research results and give visibility to the invisible work of family physicians.

    Medical Informatics

    Information technology has been revolutionizing everyday life, while medical science has been making rapid strides in understanding disease mechanisms, developing diagnostic techniques and effecting successful treatment regimens, even for cases that would have been given a poor prognosis a decade earlier. The confluence of information technology and biomedicine has brought into its ambit additional dimensions of computerized databases for patient conditions, revolutionizing the way health care and patient information is recorded, processed, interpreted and utilized for improving the quality of life. This book consists of seven chapters dealing with three primary issues: medical information acquisition from the perspective of patients and health care professionals, translational approaches from a researcher's point of view, and the application potential as required by clinicians and physicians. The book covers modern issues in Information Technology, Bioinformatics Methods and Clinical Applications. The chapters describe the basic process of acquisition of information in a health system, recent technological developments in biomedicine and the realistic evaluation of medical informatics.

    Mapping of electronic health records in Spanish to the Unified Medical Language System Metathesaurus

    This work presents a preliminary approach to annotating Spanish electronic health records with concepts of the Unified Medical Language System Metathesaurus. The prototype uses Apache Lucene to index the Metathesaurus and generate mapping candidates from input text. In addition, it relies on UKB to resolve ambiguities. The tool has been evaluated by measuring its agreement with MetaMap on two English-Spanish parallel corpora, one consisting of titles and abstracts of papers in the clinical domain, and the other of real electronic health record excerpts.

    Contributions to information extraction for Spanish written biomedical text

    285 p. Healthcare practice and clinical research produce vast amounts of digitised, unstructured data in multiple languages that are currently underexploited, despite their potential applications in, for example, improving healthcare experiences, supporting trainee education, or enabling biomedical research. To automatically transform those contents into relevant, structured information, advanced Natural Language Processing (NLP) mechanisms are required. In NLP, this task is known as Information Extraction. Our work takes place within this growing field of clinical NLP for the Spanish language, as we tackle three distinct problems. First, we compare several supervised machine learning approaches to the problem of sensitive data detection and classification. Specifically, we study the different approaches and their transferability in two corpora, one synthetic and the other authentic. Second, we present and evaluate UMLSmapper, a knowledge-intensive system for biomedical term identification based on the UMLS Metathesaurus. This system recognises and codifies terms without relying on annotated data or external Named Entity Recognition tools. Although technically naive, it performs on par with more evolved systems, and does not exhibit a considerable deviation from other approaches that rely on oracle terms. Finally, we present and exploit a new corpus of real health records manually annotated with negation and uncertainty information: NUBes. This corpus is the basis for two sets of experiments, one on cue and scope detection, and the other on assertion classification. Throughout the thesis, we apply and compare techniques of varying levels of sophistication and novelty, which reflects the rapid advancement of the field.

    Front-Line Physicians' Satisfaction with Information Systems in Hospitals

    Day-to-day operations management in hospital units is difficult because situations vary continuously, several actors are involved and a vast number of information systems are in use. The aim of this study was to describe front-line physicians' satisfaction with the existing information systems needed to support day-to-day operations management in hospitals. A cross-sectional survey was used, and data chosen with stratified random sampling were collected in nine hospitals. Data were analyzed with descriptive and inferential statistical methods. The response rate was 65% (n = 111). The physicians reported that information systems support their decision making to some extent, but that they neither improve access to information nor are tailored for physicians. The respondents also reported that they need to use several information systems to support decision making and that they would prefer one information system to access important information. Improved information access would better support physicians' decision making and has the potential to improve the quality of decisions and speed up the decision-making process. Peer reviewed.