Electronic patient data are associated with many potential benefits, e.g. data sharing, quality assessment, research, and management of patient care. The degree to which patient data are currently available electronically varies. To harvest the potential benefits of electronic data, the data must also be available in a structured format to enable processing by computer applications. Narrative data, however, are typically recorded as free text. As a result, researchers still have to perform the labor-intensive task of reading and interpreting free text in individual electronic medical records. Structuring the medical narrative poses a significant challenge: content and level of detail are often unpredictable and vary per domain (and even per clinician). To support structured recording of medical narratives, we have developed OpenSDE (SDE: structured data entry). OpenSDE is intended for use in both care and research and is therefore designed to accommodate the structured recording of data in settings where the content and order of data entry often cannot be predicted.
The aim of this research project is to investigate the feasibility of using data recorded with OpenSDE for research purposes. Consistency and accuracy of collected data are pivotal for research, and are especially challenging to achieve when data are collected over long periods of time and by different users. This Ph.D. project therefore focuses on pitfalls in data extraction for research purposes, and aims to formulate strategies that improve uniformity in data entry and thereby enhance the reliability of data retrieval.
In this research project we studied:
• The possibility of extracting data recorded with OpenSDE and representing the extracted data in a manner suitable for research purposes.
• The uniformity of recorded data when OpenSDE is used to transcribe data from the same source.
• The origin of differences in representation of semantically identical information.
• Strategies that can improve uniformity in data entry.