34 research outputs found

    From Raw Data to FAIR Data: The FAIRification Workflow for Health Research

    Background: FAIR (findability, accessibility, interoperability, and reusability) guiding principles seek the reuse of data and other digital research inputs, outputs, and objects (algorithms, tools, and workflows that led to those data) by making them findable, accessible, interoperable, and reusable. GO FAIR, a bottom-up, stakeholder-driven and self-governed initiative, defined a seven-step FAIRification process focusing on data but also indicating the required work for metadata. This FAIRification process aims at addressing the translation of raw datasets into FAIR datasets in a general way, without considering the specific requirements and challenges that may arise when dealing with particular types of data. This work was performed in the scope of the FAIR4Health project, which has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement number 824666.
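At the heart of any FAIRification workflow is enriching a raw dataset with machine-readable metadata. As a minimal sketch (not the FAIR4Health implementation; the field names and URLs are illustrative stand-ins, loosely echoing DataCite/Dublin Core conventions), a small metadata record covering findability, accessibility, and reusability could look like this:

```python
import json

def make_fair_metadata(identifier, title, license_url, access_url, keywords):
    """Bundle the metadata that makes a dataset findable, accessible,
    and reusable: a persistent identifier, a rich description, an
    access location, and a clear usage license.

    All field names here are illustrative, not a formal standard.
    """
    return {
        "identifier": identifier,   # F1: globally unique, persistent ID
        "title": title,             # F2: rich, searchable description
        "keywords": keywords,       # F2: extra search terms
        "accessUrl": access_url,    # A1: retrievable via an open protocol
        "license": license_url,     # R1.1: clear data usage license
    }

# Hypothetical dataset, DOI, and repository URL for illustration only.
record = make_fair_metadata(
    identifier="https://doi.org/10.9999/example",
    title="Synthetic cohort of hypertension patients",
    license_url="https://creativecommons.org/licenses/by/4.0/",
    access_url="https://repo.example.org/datasets/42",
    keywords=["EHR", "hypertension", "synthetic data"],
)
print(json.dumps(record, indent=2))
```

In a real FAIRification pipeline such a record would be registered with a repository so that both the data and the metadata are resolvable from the persistent identifier.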

    Challenges and opportunities beyond structured data in analysis of electronic health records

    Electronic health records (EHRs) contain a wealth of valuable information about individual patients and the whole population. Besides structured data, unstructured data in EHRs can provide additional valuable information, but the analytics processes are complex, time-consuming, and often require excessive manual effort. Among unstructured data, clinical text and images are the two most common and important sources of information. Advanced statistical algorithms in natural language processing, machine learning, deep learning, and radiomics have increasingly been used to analyse clinical text and images. Although many challenges remain unaddressed and can hinder the use of unstructured data, there are clear opportunities for well-designed diagnosis and decision support tools that efficiently incorporate both structured and unstructured data to extract useful information and provide better outcomes. However, access to clinical data is still very restricted due to data sensitivity and ethical issues. Data quality is also an important challenge, requiring methods for improving data completeness, conformity, and plausibility. Further, generalising and explaining the results of machine learning models remain open problems for healthcare. A possible way to improve the quality and accessibility of unstructured data is to develop machine learning methods that can generate clinically relevant synthetic data, and to accelerate further research on privacy-preserving techniques such as de-identification and pseudonymisation of clinical text.
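The three data quality dimensions named above (completeness, conformity, plausibility) can each be operationalised as simple record-level checks. A minimal sketch, with hypothetical field names, date format, and plausibility range (none of these come from the abstract):

```python
import re

# Illustrative record-level data quality checks over EHR-like rows.
# Required fields, the date format, and the heart-rate range are
# assumptions made for this sketch, not a published standard.
REQUIRED_FIELDS = ["patient_id", "birth_date", "heart_rate"]
DATE_RE = re.compile(r"^\d{4}-\d{2}-\d{2}$")  # ISO 8601 calendar date

def completeness(record):
    """Fraction of required fields that are present and non-empty."""
    filled = sum(1 for f in REQUIRED_FIELDS if record.get(f) not in (None, ""))
    return filled / len(REQUIRED_FIELDS)

def conforms(record):
    """Conformity: birth_date matches the expected ISO date format."""
    return bool(DATE_RE.match(record.get("birth_date", "")))

def plausible(record):
    """Plausibility: heart rate falls in a physiologically sane range."""
    hr = record.get("heart_rate")
    return hr is not None and 20 <= hr <= 250

row = {"patient_id": "p1", "birth_date": "1970-01-01", "heart_rate": 400}
# This row is complete and conformant, but the heart rate of 400 bpm
# fails the plausibility check.
print(completeness(row), conforms(row), plausible(row))
```

Separating the dimensions this way makes it possible to report, per dataset, which quality problem dominates rather than a single opaque score.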

    Automated Transformation of Semi-Structured Text Elements

    Interconnected systems such as electronic health records (EHRs) have considerably improved the handling and processing of health information while keeping costs at a controlled level. Since an EHR stores virtually all data in digitized form, personal medical documents are easily and swiftly available when needed. However, multiple formats and differences in the health documents managed by various health care providers severely reduce the efficiency of the data sharing process. This paper presents a rule-based transformation system that converts semi-structured (annotated) text into standardized formats such as HL7 CDA. It identifies relevant information in the input document by analyzing its structure as well as its content, and inserts the required elements into corresponding reusable CDA templates, where the templates are selected according to the CDA document type-specific requirements.
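The rule-based transformation described above can be pictured as: match a labelled section in the annotated input, then fill the slot an applicable rule points at inside a reusable document template. The sketch below is purely illustrative; the element names are simplified stand-ins for HL7 CDA structures, not the real schema or the paper's rule engine:

```python
import re
import xml.etree.ElementTree as ET

# Illustrative rules: a regex over an annotated input line, paired with
# the target element name in the (simplified, CDA-like) template.
RULES = [
    (re.compile(r"^Diagnosis:\s*(.+)$"), "diagnosis"),
    (re.compile(r"^Medication:\s*(.+)$"), "medication"),
]

def transform(annotated_text):
    """Scan annotated lines, apply the first matching rule, and insert
    the captured value into a reusable document template."""
    doc = ET.Element("ClinicalDocument")       # simplified CDA-like root
    body = ET.SubElement(doc, "structuredBody")
    for line in annotated_text.splitlines():
        for pattern, element_name in RULES:
            m = pattern.match(line.strip())
            if m:
                ET.SubElement(body, element_name).text = m.group(1)
                break
    return ET.tostring(doc, encoding="unicode")

xml = transform("Diagnosis: essential hypertension\nMedication: lisinopril")
print(xml)
```

A production system would additionally validate the output against the CDA schema and choose among multiple templates based on the document type, as the paper describes.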

    Data Infrastructure for Medical Research

    While we are witnessing rapid growth in data across the sciences and in many applications, this growth is particularly remarkable in the medical domain, be it because of higher-resolution instruments and diagnostic tools (e.g. MRI), new sources of structured data like activity trackers, the widespread use of electronic health records, and many others. The sheer volume of the data is not, however, the only challenge to be faced when using medical data for research. Other crucial challenges include data heterogeneity, data quality, and data privacy. In this article, we review solutions addressing these challenges by discussing the current state of the art in the areas of data integration, data cleaning, data privacy, and scalable data access and processing in the context of medical data. The techniques and tools we present will give practitioners, computer scientists and medical researchers alike, a starting point to understand the challenges and solutions, and ultimately to analyse medical data and gain better and quicker insights.

    Bringing AI to the clinic: blueprint for a vendor-neutral AI deployment infrastructure

    AI provides tremendous opportunities for improving patient care, but at present there is little evidence of real-world uptake. An important barrier is the lack of well-designed, vendor-neutral and future-proof infrastructures for deployment. Because current AI algorithms are very narrow in scope, a typical hospital is expected to deploy many algorithms concurrently, and maintaining a stand-alone point solution for each of them quickly becomes impractical. A solution to this problem is a dedicated platform for the deployment of AI. Here we describe a blueprint for such a platform and the high-level design and implementation considerations of a system that can be used clinically as well as for research and development. Close collaboration between radiologists, data scientists, software developers and experts in hospital IT, as well as the involvement of patients, is crucial in order to successfully bring AI to the clinic.

    New implementation of data standards for AI research in precision oncology. Experience from EuCanImage

    An unprecedented amount of personal health data, with the potential to revolutionise precision medicine, is generated at healthcare institutions worldwide. The exploitation of such data using artificial intelligence relies on the ability to combine heterogeneous, multicentric, multimodal and multiparametric data, as well as thoughtful representation of knowledge and data availability. Despite these possibilities, significant methodological challenges and ethico-legal constraints still impede the real-world implementation of data models. EuCanImage is an international consortium that aims to develop AI algorithms for precision medicine in oncology and to enable secondary use of the data based on the necessary ethical approvals. The use of well-defined clinical data standards to allow interoperability was a central element within the initiative. The consortium focuses on three different cancer types and addresses seven unmet clinical needs. This article synthesises our experience and procedures for healthcare data interoperability and standardisation.
    Competing Interest Statement: The authors have declared no competing interest.
    Funding Statement: This project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 952103.
    Data Availability: This study describes a new process to harmonize and standardize clinical data. The data will be available upon request to the authors.
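A recurring step in this kind of multicentric harmonisation is mapping site-specific codes onto a shared terminology before pooling data. A minimal sketch, with invented local codes and a made-up target vocabulary (the consortium's actual standards and mappings are not detailed in the abstract):

```python
# Illustrative mapping from heterogeneous, site-specific diagnosis
# codes to a common vocabulary. Both the local codes and the target
# terms are invented for this sketch.
SITE_TO_COMMON = {
    ("site_a", "BC-1"): "breast_carcinoma",
    ("site_b", "MammaCa"): "breast_carcinoma",
    ("site_a", "CRC"): "colorectal_carcinoma",
}

def harmonise(records):
    """Translate each record's local code to the common term; flag
    records with no mapping as UNMAPPED instead of silently guessing,
    so curators can review them."""
    out = []
    for site, code in records:
        term = SITE_TO_COMMON.get((site, code))
        out.append(term if term else "UNMAPPED")
    return out

print(harmonise([("site_a", "BC-1"), ("site_b", "MammaCa"), ("site_b", "??")]))
# → ['breast_carcinoma', 'breast_carcinoma', 'UNMAPPED']
```

Keeping unmapped codes visible, rather than dropping them, preserves an audit trail for the curation process that interoperability efforts like this depend on.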

    CogStack - experiences of deploying integrated information retrieval and extraction services in a large National Health Service Foundation Trust hospital.

    BACKGROUND: Traditional health information systems are generally devised to support clinical data collection at the point of care. However, as the modern information economy expands in scope and permeates the healthcare domain, there is an increasing urgency for healthcare organisations to offer information systems that address the expectations of clinicians, researchers and the business intelligence community alike. Amongst other emergent requirements, the principal unmet need might be defined as the 3R principle (right data, right place, right time): addressing deficiencies in organisational data flow while retaining the strict information governance policies that apply within the UK National Health Service (NHS). Here, we describe our work on creating and deploying a low-cost structured and unstructured information retrieval and extraction architecture within King's College Hospital, the management of governance concerns, and the associated use cases and cost-saving opportunities that such components present. RESULTS: To date, our CogStack architecture has processed over 300 million lines of clinical data, making it available for internal service improvement projects at King's College London. On generated data designed to simulate real-world clinical text, our de-identification algorithm achieved up to 94% precision and up to 96% recall. CONCLUSION: We describe a toolkit which we believe is of huge value to the UK (and wider) healthcare community. It is the only open-source, easily deployable solution designed for the UK healthcare environment, in a landscape populated by expensive proprietary systems. Solutions such as these provide a crucial foundation for the genomic revolution in medicine.
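The precision and recall figures reported for the de-identification algorithm follow the standard definitions: precision = TP/(TP+FP) and recall = TP/(TP+FN), computed against gold-standard annotations. A small illustrative computation over identifier spans (the spans and counts below are made up, not CogStack's evaluation data):

```python
# Standard precision/recall over de-identification decisions, scored
# against gold annotations. Spans below are invented for illustration.
def precision_recall(predicted, gold):
    """predicted, gold: sets of (start, end) character spans flagged as
    identifiers to redact. Exact-match scoring for simplicity."""
    tp = len(predicted & gold)   # spans correctly redacted
    fp = len(predicted - gold)   # spans redacted unnecessarily
    fn = len(gold - predicted)   # identifiers missed entirely
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

gold = {(0, 4), (10, 18), (25, 31), (40, 44)}
pred = {(0, 4), (10, 18), (25, 31), (50, 55)}
p, r = precision_recall(pred, gold)
print(p, r)  # 3 of 4 predictions correct, 3 of 4 identifiers found: 0.75, 0.75
```

For de-identification specifically, recall is usually the critical metric, since a missed identifier is a privacy leak while a spurious redaction only costs some text.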