
    Prioritising references for systematic reviews with RobotAnalyst: A user study

    Screening references is a time-consuming step necessary for systematic reviews and guideline development. Previous studies have shown that human effort can be reduced by using machine learning software to prioritise large reference collections such that most of the relevant references are identified before screening is completed. We describe and evaluate RobotAnalyst, a Web-based software system that combines text-mining and machine learning algorithms for organising references by their content and actively prioritising them based on a relevancy classification model trained and updated throughout the process. We report an evaluation over 22 reference collections (most are related to public health topics) screened using RobotAnalyst with a total of 43 610 abstract-level decisions. The number of references that needed to be screened to identify 95% of the abstract-level inclusions for the evidence review was reduced on 19 of the 22 collections. Significant gains over random sampling were achieved for all reviews conducted with active prioritisation, as compared with only two of five when prioritisation was not used. RobotAnalyst's descriptive clustering and topic modelling functionalities were also evaluated by public health analysts. Descriptive clustering provided more coherent organisation than topic modelling, and the content of the clusters was apparent to the users across a varying number of clusters. This is the first large-scale study using technology-assisted screening to perform new reviews, and the positive results provide empirical evidence that RobotAnalyst can accelerate the identification of relevant studies. The results also highlight the issue of user complacency and the need for a stopping criterion to realise the work savings.
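
    The active prioritisation described above amounts to an iterative classify-rank-screen loop: a relevancy model is retrained on the decisions made so far, and the remaining references are re-ranked so that the most likely inclusions are screened next. The sketch below is a minimal illustration of that loop, not the RobotAnalyst implementation; the TF-IDF features, the logistic-regression classifier, and the hypothetical `screen` callback (standing in for the reviewer's include/exclude decision) are all assumptions.

    ```python
    # Minimal sketch of active-learning reference prioritisation (illustrative,
    # not RobotAnalyst's actual code). `screen(i)` is a hypothetical callback
    # returning the reviewer's include (1) / exclude (0) decision for reference i.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression

    def prioritised_screening(abstracts, screen, seed_indices, batch_size=25):
        """Screen `abstracts` in batches, re-ranking the remainder after each batch."""
        X = TfidfVectorizer(stop_words="english").fit_transform(abstracts)
        labelled = {i: screen(i) for i in seed_indices}       # initial random sample
        unlabelled = set(range(len(abstracts))) - set(labelled)
        while unlabelled and len(set(labelled.values())) < 2:
            i = unlabelled.pop()                              # ensure both classes are present
            labelled[i] = screen(i)
        while unlabelled:
            clf = LogisticRegression(max_iter=1000)
            clf.fit(X[list(labelled)], list(labelled.values()))
            ranked = sorted(unlabelled, reverse=True,
                            key=lambda i: clf.predict_proba(X[i])[0, 1])
            for i in ranked[:batch_size]:                     # screen the most relevant first
                unlabelled.discard(i)
                labelled[i] = screen(i)
        return labelled                                       # reference index -> decision
    ```

    In practice a stopping criterion (for example, halting once an estimated fraction of the inclusions has been found) is what converts this ranking into an actual work saving, which is the issue the abstract raises.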

    Report on the call for feedback about The Scope of the European guidelines for breast cancer screening and diagnosis: European Commission Initiative on Breast Cancer

    In 2015, the European Commission Initiative on Breast Cancer (ECIBC) started the development of the European guidelines for breast cancer screening and diagnosis (henceforth the European Breast Guidelines) under the auspices of the Directorate-General for Health and Food Safety (DG SANTE) and the technical and scientific coordination of the Directorate-General Joint Research Centre (JRC). To support the JRC in this task, a Guidelines Development Group (GDG), consisting of independent experts and individuals, was established. The European Breast Guidelines’ scope (The Scope) represented the first output of the development process of the European Breast Guidelines. Via a public call for feedback, stakeholders and individual citizens were invited to provide their feedback on The Scope. The call for feedback was open from 18 December 2015 to 17 January 2016, and an online questionnaire was made available on the ECIBC web hub via the EU Survey platform. The JRC received a total of 82 valid responses: 40 from individuals from 18 different countries and 42 from organisations from 20 different countries. During a meeting held in Varese (Italy) in March 2016, the GDG discussed the new version of The Scope, which was prepared taking into account the results of the call for feedback. The Scope was finalised and approved by the GDG after some minor editing on 6 September 2016 and was later made publicly available together with this report.

    PICO entity extraction for preclinical animal literature

    BACKGROUND: Natural language processing could assist multiple tasks in systematic reviews to reduce workload, including the extraction of PICO elements such as study populations, interventions, comparators and outcomes. The PICO framework provides a basis for the retrieval and selection for inclusion of evidence relevant to a specific systematic review question, and automatic approaches to PICO extraction have been developed particularly for reviews of clinical trial findings. Given the differences between preclinical animal studies and clinical trials, separate approaches are necessary. Facilitating preclinical systematic reviews will inform the translation from preclinical to clinical research. METHODS: We randomly selected 400 abstracts from the PubMed Central Open Access database which described in vivo animal research and manually annotated these with PICO phrases for Species, Strain, methods of Induction of disease model, Intervention, Comparator and Outcome. We developed a two-stage workflow for preclinical PICO extraction. First, we fine-tuned BERT with different pre-trained modules for PICO sentence classification. Then, after removing the text irrelevant to PICO features, we explored LSTM-, CRF- and BERT-based models for PICO entity recognition. We also explored a self-training approach because of the small training corpus. RESULTS: For PICO sentence classification, BERT models using all pre-trained modules achieved an F1 score of over 80%, and models pre-trained on PubMed abstracts achieved the highest F1 of 85%. For PICO entity recognition, fine-tuning BERT pre-trained on PubMed abstracts achieved an overall F1 of 71%, with satisfactory F1 scores for Species (98%), Strain (70%), Intervention (70%) and Outcome (67%). The scores for Induction and Comparator were less satisfactory, but the F1 for Comparator could be improved to 50% by applying self-training. CONCLUSIONS: Our study indicates that, of the approaches tested, BERT pre-trained on PubMed abstracts is the best for both PICO sentence classification and PICO entity recognition in preclinical abstracts. Self-training yields better performance for identifying comparators and strains.
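
    The first stage of the workflow described above is a sentence-level classifier that filters out text carrying no PICO information before entity recognition is attempted. The sketch below shows what fine-tuning a biomedical BERT checkpoint for that sentence-classification step might look like with Hugging Face transformers; the checkpoint name, the two toy sentences and the binary label scheme are illustrative assumptions, not the authors' actual setup.

    ```python
    # Hedged sketch: fine-tune a biomedical BERT for PICO sentence classification.
    # The checkpoint name and training data are illustrative only.
    import torch
    from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                              Trainer, TrainingArguments)

    checkpoint = "dmis-lab/biobert-base-cased-v1.1"   # assumed biomedical checkpoint
    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    model = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=2)

    sentences = ["Rats were subjected to middle cerebral artery occlusion.",
                 "The study was approved by the local ethics committee."]
    labels = [1, 0]                                   # 1 = sentence contains PICO information

    class PicoSentences(torch.utils.data.Dataset):
        def __init__(self, sentences, labels):
            self.enc = tokenizer(sentences, truncation=True, padding=True)
            self.labels = labels
        def __len__(self):
            return len(self.labels)
        def __getitem__(self, idx):
            item = {k: torch.tensor(v[idx]) for k, v in self.enc.items()}
            item["labels"] = torch.tensor(self.labels[idx])
            return item

    trainer = Trainer(model=model,
                      args=TrainingArguments(output_dir="pico-sentence-clf",
                                             num_train_epochs=3),
                      train_dataset=PicoSentences(sentences, labels))
    trainer.train()
    ```

    The second stage, entity recognition over the sentences kept by this filter, would swap in a token-classification head (for example `AutoModelForTokenClassification`) with per-token PICO labels.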

    Research prioritisation exercises related to the care of children and young people with life-limiting conditions, their parents, and all those who care for them: a systematic scoping review

    Background: In planning high-quality research in any aspect of care for children and young people with life-limiting conditions, it is important to prioritise resources in the most appropriate areas. Aim: To map research priorities identified from existing research prioritisation exercises relevant to infants, children, and young people with life-limiting conditions, in order to inform future research. Design: We undertook a systematic scoping review to identify existing research prioritisation exercises; the protocol is publicly available on the project website. Data sources: The bibliographic databases ASSIA, CINAHL, MEDLINE/MEDLINE In-Process and Embase were searched from 2000. Relevant reference lists and websites were hand-searched. Included were any consultations aimed at identifying research for the benefit of neonates, infants, children and/or young people (birth to age 25 years) with life-limiting, -threatening or -shortening conditions; their families, parents or carers; and/or the professional staff caring for them. Results: Twenty-four research prioritisation exercises met the inclusion criteria, from which 279 research questions or priority areas for health research were identified. The priorities were iteratively mapped onto an evolving framework, informed by WHO classifications. This resulted in the identification of 16 topic areas, 55 sub-topics and 12 sub-sub-topics. Conclusions: There are numerous similar and overlapping research prioritisation exercises related to children and young people with life-limiting conditions. By mapping existing research priorities in the context in which they were set, we highlight areas on which to focus research efforts. Further priority setting is not required at this time unless devoted to ascertaining families’ perspectives.

    Data extraction methods for systematic review (semi)automation: A living systematic review [version 1; peer review: awaiting peer review]

    Background: The reliable and usable (semi)automation of data extraction can support the field of systematic review by reducing the workload required to gather information about the conduct and results of the included studies. This living systematic review examines published approaches for data extraction from reports of clinical studies. Methods: We systematically and continually search MEDLINE, Institute of Electrical and Electronics Engineers (IEEE), arXiv, and the dblp computer science bibliography databases. Full-text screening and data extraction are conducted within an open-source living systematic review application created for the purpose of this review. This iteration of the living review includes publications up to a cut-off date of 22 April 2020. Results: In total, 53 publications are included in this version of our review. Of these, 41 (77%) addressed extraction of data from abstracts, while 14 (26%) used full texts. A total of 48 (90%) publications developed and evaluated classifiers that used randomised controlled trials as the main target texts. Over 30 entities were extracted, with PICOs (population, intervention, comparator, outcome) being the most frequently extracted. A description of their datasets was provided by 49 publications (94%), but only seven (13%) made the data publicly available. Code was made available by 10 (19%) publications, and five (9%) implemented publicly available tools. Conclusions: This living systematic review presents an overview of (semi)automated data-extraction literature of interest to different types of systematic review. We identified a broad evidence base of publications describing data extraction for interventional reviews and a small number of publications extracting epidemiological or diagnostic accuracy data. The lack of publicly available gold-standard data for evaluation, and the lack of application thereof, make it difficult to draw conclusions about which system performs best for each data extraction target. With this living review we aim to review the literature continually.
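
    The continual search that keeps a living review up to date is, at its core, a scheduled query for records added since the previous cut-off date. As a rough illustration only (the query string and date handling are assumptions, not this review's actual search strategy), an incremental PubMed E-utilities request for that purpose could look like the sketch below; MEDLINE, IEEE, arXiv and dblp would each need their own equivalent.

    ```python
    # Minimal sketch of an incremental PubMed search for a living review update.
    # The query and dates are placeholders, not the review's actual strategy.
    import requests

    ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

    def new_pmids(query, since, until, retmax=500):
        """Return PubMed IDs for records indexed between `since` and `until`."""
        params = {"db": "pubmed", "term": query, "retmode": "json",
                  "datetype": "edat", "mindate": since, "maxdate": until,
                  "retmax": retmax}
        resp = requests.get(ESEARCH, params=params, timeout=30)
        resp.raise_for_status()
        return resp.json()["esearchresult"]["idlist"]

    # e.g. new_pmids('"data extraction" AND "systematic review" AND automation',
    #                since="2020/04/22", until="2020/12/31")
    ```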

    Data extraction methods for systematic review (semi)automation: Update of a living systematic review [version 2; peer review: 3 approved]

    Background: The reliable and usable (semi)automation of data extraction can support the field of systematic review by reducing the workload required to gather information about the conduct and results of the included studies. This living systematic review examines published approaches for data extraction from reports of clinical studies. Methods: We systematically and continually search PubMed, ACL Anthology, arXiv, OpenAlex via EPPI-Reviewer, and the dblp computer science bibliography. Full-text screening and data extraction are conducted within an open-source living systematic review application created for the purpose of this review. This living review update includes publications up to December 2022 and OpenAlex content up to March 2023. Results: A total of 76 publications are included in this review. Of these, 64 (84%) addressed extraction of data from abstracts, while 19 (25%) used full texts. A total of 71 (93%) publications developed classifiers for randomised controlled trials. Over 30 entities were extracted, with PICOs (population, intervention, comparator, outcome) being the most frequently extracted. Data are available from 25 (33%) publications, and code from 30 (39%). Six (8%) implemented publicly available tools. Conclusions: This living systematic review presents an overview of (semi)automated data-extraction literature of interest to different types of literature review. We identified a broad evidence base of publications describing data extraction for interventional reviews and a small number of publications extracting epidemiological or diagnostic accuracy data. Between review updates, data and code sharing increased strongly: in the base review, data and code were available for 13% and 19% of publications respectively; within the 23 new publications, these figures increased to 78% and 87%. Compared with the base review, we also observed a research trend away from straightforward data extraction and towards additionally extracting relations between entities or automatic text summarisation. With this living review we aim to review the literature continually.
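
    One of the newer directions noted above, automatic text summarisation, is typically approached with off-the-shelf sequence-to-sequence models. The fragment below is a generic illustration with the Hugging Face pipeline API; the model name and the sample text are assumptions, not something the included publications necessarily used.

    ```python
    # Generic abstractive summarisation sketch, illustrating the trend noted above.
    from transformers import pipeline

    summariser = pipeline("summarization", model="facebook/bart-large-cnn")  # assumed model
    report_text = ("The reliable and usable (semi)automation of data extraction can "
                   "support the field of systematic review by reducing the workload "
                   "required to gather information about the included studies.")
    print(summariser(report_text, max_length=30, min_length=10)[0]["summary_text"])
    ```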