3 research outputs found
Using openEHR Archetypes for Automated Extraction of Numerical Information from Clinical Narratives
Up to 80% of medical information is documented as unstructured data, such as clinical reports written in natural language. Such data is called unstructured because the information it contains cannot be retrieved automatically as straightforwardly as from structured data. However, we assume that this flexible kind of documentation will remain a substantial part of a patient’s medical record, so clinical information systems have to deal appropriately with this type of information. At the same time, there are efforts to achieve semantic interoperability between clinical application systems through information modelling concepts like HL7 FHIR or openEHR. Considering this, we propose an approach to transform information documented in unstructured form into openEHR archetypes. Furthermore, we aim to support the field of clinical text mining by recognizing and publishing the connections between openEHR archetypes and heterogeneous phrasings. We have evaluated our method by extracting the values for three openEHR archetypes from unstructured documents in English and German.
Integration of Unstructured Data into a Clinical Data Warehouse for Kidney Transplant Screening – Challenges & Solutions
After kidney transplantation, graft rejection must be prevented. Therefore, a multitude of patient parameters are monitored pre- and postoperatively. To support this process, the Screen Reject research project is developing a data warehouse optimized for kidney rejection diagnostics. In the course of this project, it was discovered that important information is available only as free text rather than structured data and therefore cannot be processed by standard ETL tools, although such processing is necessary to establish a digital expert system for rejection diagnostics. For this reason, data integration has been improved by combining methods from natural language processing with methods from image processing. Based on state-of-the-art data warehousing technologies (Microsoft SSIS), a generic data integration tool has been developed. The tool was evaluated by extracting Banff classifications from 218 pathology reports and HLA mismatches from about 1700 PDF files, both written in German.
Artificially-generated consolidations and balanced augmentation increase performance of U-net for lung parenchyma segmentation on MR images.
Purpose: To improve automated lung segmentation on 2D lung MR images using balanced augmentation and artificially-generated consolidations for training of a convolutional neural network (CNN). Materials and methods: From 233 healthy volunteers and 100 patients, 1891 coronal MR images were acquired. Of these, 1666 images without consolidations were used to build a binary semantic CNN for lung segmentation and 225 images (187 without consolidations, 38 with consolidations) were used for testing. To increase CNN performance of segmenting lung parenchyma with consolidations, balanced augmentation was performed and artificially-generated consolidations were added to all training images. The proposed CNN (CNNBal/Cons) was compared to two other CNNs: CNNUnbal/NoCons, without balanced augmentation and artificially-generated consolidations, and CNNBal/NoCons, with balanced augmentation but without artificially-generated consolidations. Segmentation results were assessed using Sørensen-Dice coefficient (SDC) and Hausdorff distance coefficient. Results: Regarding the 187 MR test images without consolidations, the mean SDC of CNNUnbal/NoCons (92.1 ± 6% (mean ± standard deviation)) was significantly lower compared to CNNBal/NoCons (94.0 ± 5.3%, P = 0.0013) and CNNBal/Cons (94.3 ± 4.1%, P = 0.0001). No significant difference was found between SDC of CNNBal/Cons and CNNBal/NoCons (P = 0.54). For the 38 MR test images with consolidations, SDC of CNNUnbal/NoCons (89.0 ± 7.1%) was not significantly different compared to CNNBal/NoCons (90.2 ± 9.4%, P = 0.53). SDC of CNNBal/Cons (94.3 ± 3.7%) was significantly higher compared to CNNBal/NoCons (P = 0.0146) and CNNUnbal/NoCons (P = 0.001). Conclusions: Expanding training datasets via balanced augmentation and artificially-generated consolidations improved the accuracy of CNNBal/Cons, especially in datasets with parenchymal consolidations. This is an important step towards a robust automated postprocessing of lung MRI datasets in clinical routine.
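For readers unfamiliar with the two segmentation metrics named in this abstract, the following is a minimal sketch of how a Sørensen-Dice coefficient and a symmetric Hausdorff distance can be computed for a pair of binary masks. It is not the authors' evaluation code: the function names, the toy masks, the empty-mask convention, and the choice to measure the Hausdorff distance over all foreground pixels (rather than contours) are assumptions made for illustration.

```python
from math import hypot


def dice_coefficient(pred, truth):
    """Sørensen-Dice coefficient of two equally sized binary masks (nested lists of 0/1)."""
    p = [v for row in pred for v in row]
    t = [v for row in truth for v in row]
    if len(p) != len(t):
        raise ValueError("masks must have the same size")
    overlap = sum(1 for a, b in zip(p, t) if a and b)
    total = sum(1 for a in p if a) + sum(1 for b in t if b)
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * overlap / total


def hausdorff_distance(pred, truth):
    """Symmetric Hausdorff distance (in pixels) between the foreground pixels of two masks."""
    a = [(r, c) for r, row in enumerate(pred) for c, v in enumerate(row) if v]
    b = [(r, c) for r, row in enumerate(truth) for c, v in enumerate(row) if v]
    if not a or not b:
        raise ValueError("both masks need at least one foreground pixel")

    def directed(src, dst):
        # Largest distance from any point in src to its nearest point in dst.
        return max(min(hypot(r1 - r2, c1 - c2) for r2, c2 in dst) for r1, c1 in src)

    return max(directed(a, b), directed(b, a))


# Hypothetical 3x3 masks: the prediction has one extra foreground pixel.
pred = [[0, 1, 1],
        [0, 1, 1],
        [0, 0, 0]]
truth = [[0, 1, 1],
         [0, 1, 0],
         [0, 0, 0]]

print(f"SDC = {dice_coefficient(pred, truth):.3f}")
print(f"Hausdorff distance = {hausdorff_distance(pred, truth):.1f} px")
```

In this toy case the masks share 3 of 4 and 3 foreground pixels, so the SDC is 2·3/(4+3) = 6/7, and the single extra pixel lies one pixel away from the reference mask. The brute-force Hausdorff computation is quadratic in the number of foreground pixels and is meant only to make the definition concrete.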