Towards the development of data governance standards for using clinical free-text data in health research: a position paper

Abstract

Background: Free-text clinical data (such as outpatient letters or nursing notes) represent a vast, untapped source of rich information that, if more accessible for research, would clarify and supplement information coded in structured data fields. Data usually need to be de-identified or anonymised before they can be reused for research, but there is a lack of established guidelines to govern effective de-identification and use of free-text information and avoid damaging data utility as a by-product.

Objective: We set out to work towards data governance standards to integrate with existing frameworks for personal data use, to enable free-text data to be used safely for research for patient/public benefit.

Methods: We outlined (UK) data protection legislation and regulations for context, and conducted a rapid literature review and UK-based case studies to explore data governance models used in working with free-text data. We also engaged with stakeholders, including text mining researchers and the general public, to explore perceived barriers and solutions in working with clinical free-text.

Results: We propose a set of recommendations, including the need: for authoritative guidance on data governance for the reuse of free-text data; to ensure public transparency in data flows and uses; to treat de-identified free-text as potentially identifiable, with use limited to accredited data safe-havens; and to commit to a culture of continuous improvement to understand the relationships between efficacy of de-identification and re-identification risks, so this can be communicated to all stakeholders.

Conclusions: By drawing together the findings of a combination of activities, our unique study has added new knowledge towards the development of data governance standards for the reuse of clinical free-text data for secondary purposes. Whilst working in accord with existing data governance frameworks, there is a need for further work to take forward the recommendations we have proposed, with commitment and investment, to assure and expand the safe reuse of clinical free-text data for public benefit.
