2 research outputs found

    Contributions to information extraction for Spanish written biomedical text

    285 p. Healthcare practice and clinical research produce vast amounts of digitised, unstructured data in multiple languages that are currently underexploited, despite their potential applications in, for example, improving healthcare experiences, supporting trainee education, or enabling biomedical research. To automatically transform those contents into relevant, structured information, advanced Natural Language Processing (NLP) mechanisms are required. In NLP, this task is known as Information Extraction. Our work takes place within this growing field of clinical NLP for the Spanish language, as we tackle three distinct problems. First, we compare several supervised machine learning approaches to the problem of sensitive data detection and classification. Specifically, we study the different approaches and their transferability in two corpora, one synthetic and the other authentic. Second, we present and evaluate UMLSmapper, a knowledge-intensive system for biomedical term identification based on the UMLS Metathesaurus. This system recognises and codifies terms without relying on annotated data or external Named Entity Recognition tools. Although technically naive, it performs on par with more evolved systems, and does not exhibit a considerable deviation from other approaches that rely on oracle terms. Finally, we present and exploit a new corpus of real health records manually annotated with negation and uncertainty information: NUBes. This corpus is the basis for two sets of experiments, one on cue and scope detection, and the other on assertion classification. Throughout the thesis, we apply and compare techniques of varying levels of sophistication and novelty, which reflects the rapid advancement of the field.

    Making Certain: Information and Social Reality

    This dissertation identifies and explains the phenomenon of the production of certainty in information systems. I define this phenomenon pragmatically as instances where practices of justification end upon information systems or their contents. Cases where information systems seem able to produce social reality without reference to the external world indicate that these systems contain facts for determining truth, rather than propositions rendered true or false by the world outside the system. The No Fly list is offered as a running example that both clearly exemplifies the phenomenon and announces the stakes of my project. After an operationalization of key terms and a review of relevant literature, I articulate a research program aimed at characterizing the phenomenon, its major components, and its effects. Notable contributions of the dissertation include:
    • the identification of the production of certainty as a unitary, trans-disciplinary phenomenon;
    • the synthesis of a sociolinguistic method capable of unambiguously a) identifying the presence of this phenomenon and b) distinguishing the respective contributions of systemic and social factors to it; and
    • the development of a taxonomy of certainty that can distinguish between types of certainty production and/or certainty-producing systems.
    The analysis of certainty proposed and advanced here is a potential complement to several existing methods of sociotechnical research. This is demonstrated by applying the analysis of certainty to the complex assemblage of computational timekeeping alongside a more traditional infrastructural inversion. Three subsystems, the tz database, Network Time Protocol, and International Atomic Time, are selected from the assemblage of computational timekeeping for analysis. Each system employs a distinct memory practice, in Bowker’s sense, which licenses the forgetting inherent in the production of the information it contains.
The analysis of certainty expands upon the insights provided by infrastructural inversion to show how the production of certainty through modern computational timekeeping practices shapes the social reality of time. This analysis serves as an example for scholars who encounter the production of certainty in information systems, showing how the proposed theoretical framework can help them account for, understand, and engage with it in their work. The dissertation concludes by identifying other sites amenable to this kind of analysis, including the algorithmic assemblages commonly referred to as Artificial Intelligence. Doctor of Philosoph