2 research outputs found

    Synonym extraction and abbreviation expansion with ensembles of semantic spaces

    Initial Results in the Development of SCAN: a Swedish Clinical Abbreviation Normalizer

    Abstract. Abbreviations are common in clinical documentation, as this type of text is written under time pressure and serves mostly for internal communication. This study applies and extends existing rule-based algorithms developed for English and Swedish abbreviation detection, in order to create an abbreviation detection algorithm for Swedish clinical texts that can identify abbreviations and acronyms and suggest definitions for them. Such an algorithm can be used as a pre-processing step for further information extraction and text mining models, as well as for readability solutions. Through a literature review, a number of heuristics were defined for automatic abbreviation detection. These were used in the construction of the Swedish Clinical Abbreviation Normalizer (SCAN). The heuristics were: a) freely available external resources (a dictionary of general Swedish, a dictionary of medical terms and a dictionary of known Swedish medical abbreviations), b) maximum word lengths (from three to eight characters), and c) heuristics for handling common patterns such as hyphenation. For each token in the text, the algorith
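
    The three kinds of heuristics named in the abstract (dictionary lookup, a maximum word length, and hyphenation handling) can be illustrated with a minimal Python sketch. This is not the SCAN implementation: the word lists are tiny hypothetical stand-ins for the real dictionaries, the function name `detect_abbreviations` is invented for illustration, and splitting hyphenated tokens into parts is only one assumed way of handling hyphenation.

```python
# Hypothetical toy resources; the dictionaries used in the study are not reproduced here.
GENERAL_WORDS = {"patient", "har", "ont", "i", "buken", "och", "planerad"}
MEDICAL_TERMS = {"pneumoni"}
KNOWN_ABBREVIATIONS = {"pat": "patient", "us": "undersökning"}


def detect_abbreviations(text, max_len=8):
    """Return (token, suggested definition or None) pairs for candidate abbreviations.

    max_len is the maximum word length heuristic; the abstract reports
    values from three to eight characters.
    """
    candidates = []
    for raw in text.split():
        # Normalise: drop surrounding punctuation and lowercase the token.
        token = raw.strip(".,;:()").lower()
        if not any(ch.isalpha() for ch in token):
            continue  # skip empty tokens and pure numbers
        # Assumed hyphenation handling: check each hyphen-separated part on its own.
        for part in token.split("-"):
            if not part:
                continue
            if part in KNOWN_ABBREVIATIONS:
                # Known abbreviation: suggest its dictionary definition.
                candidates.append((part, KNOWN_ABBREVIATIONS[part]))
            elif (len(part) <= max_len
                  and part not in GENERAL_WORDS
                  and part not in MEDICAL_TERMS):
                # Short word found in no dictionary: candidate without a definition.
                candidates.append((part, None))
    return candidates


if __name__ == "__main__":
    print(detect_abbreviations("Pat har ont i buken, rtg-us planerad."))
```

    In this sketch, lowering `max_len` makes the detector more conservative, since fewer out-of-dictionary words are flagged as candidates.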