    Modelling Relevance towards Multiple Inclusion Criteria when Ranking Patients

    In the medical domain, information retrieval systems can be used to identify cohorts (i.e. patients) required for clinical studies. However, a challenge faced by such search systems is to retrieve the patients whose medical histories cover the inclusion criteria specified in a query, which are often complex and include multiple medical conditions. For example, a query may aim to find patients with both 'lupus nephritis' and 'thrombotic thrombocytopenic purpura'. In a typical best-match retrieval setting, any patient exhibiting all of the inclusion criteria should naturally be ranked higher than a patient that exhibits only a subset, or none, of the criteria. In this work, we extend the two main existing models for ranking patients to take into account the coverage of the inclusion criteria by adapting techniques from recent research into coverage-based diversification. We propose a novel approach for modelling the coverage of the query inclusion criteria within the records of a particular patient, and thereby rank highly those patients whose medical records are likely to cover all of the specified criteria. In particular, our proposed approach estimates the relevance of a patient as a mixture of the probability that the patient is retrieved by a patient ranking model for a given query, and the likelihood that the patient's records cover the query criteria. The latter is measured using the relevance towards each of the criteria stated in the query, represented in the form of sub-queries. We thoroughly evaluate our proposed approach using the test collection provided by the TREC 2011 and 2012 Medical Records track. Our results show significant improvements over existing strong baselines.
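
The mixture described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so everything here is an assumption made for illustration: the linear interpolation with a weight `lam`, the treatment of coverage as the product of per-criterion relevance probabilities (i.e. assuming the criteria are independent), and the function names `coverage`, `mixture_score` and `rank_patients`. It is not the authors' model.

```python
from math import prod


def coverage(criterion_probs):
    """Likelihood that a patient's records cover every criterion.

    Assumption: criteria are independent, so the joint coverage is the
    product of the per-criterion (sub-query) relevance probabilities.
    """
    return prod(criterion_probs)


def mixture_score(ranking_prob, criterion_probs, lam=0.5):
    """Interpolate the ranking-model probability with the coverage likelihood.

    lam is a hypothetical mixing weight in [0, 1]; lam=0 falls back to the
    underlying patient ranking model alone.
    """
    return (1.0 - lam) * ranking_prob + lam * coverage(criterion_probs)


def rank_patients(patients, lam=0.5):
    """Return patient ids sorted by descending mixture score.

    patients maps an id to (ranking_prob, [per-criterion probabilities]).
    """
    scores = {
        pid: mixture_score(p, probs, lam) for pid, (p, probs) in patients.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Illustrative values only: patient "A" matches both criteria reasonably
# well, patient "B" scores higher on the base model but misses one criterion.
patients = {
    "A": (0.6, [0.9, 0.8]),
    "B": (0.7, [0.9, 0.1]),
}
ranking_with_coverage = rank_patients(patients, lam=0.5)
ranking_base_only = rank_patients(patients, lam=0.0)
```

With these made-up numbers, the base model alone would place "B" first, while the mixture promotes "A" because its records are more likely to cover both criteria, which is the behaviour the abstract argues for.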