    Personalizing Retrieval of Journal Articles for Patient Care

    this paper and other work in the context of PERSIVAL, we collected a corpus of 29,784 medical articles in full text, either from the web with an automated crawler or via a licensing agreement with Ovid Technologies. The articles appeared in HTML format; we transformed them into XML using a pipeline we developed on the basis of publicly available XML tools. The corpus contains articles from 20 journals in cardiology from 1993 to 2000, comprising roughly 85 million word tokens (cf. Figure 2