43 research outputs found
Recommended from our members
Development and Evaluation of an Open Source Software Tool for Deidentification of Pathology Reports
Background: Electronic medical records, including pathology reports, are often used for research purposes. Currently, there are few programs freely available to remove identifiers while leaving the remainder of the pathology report text intact. Our goal was to produce an open source, Health Insurance Portability and Accountability Act (HIPAA) compliant, deidentification tool tailored for pathology reports. We designed a three-step process for removing potential identifiers. The first step is to look for identifiers known to be associated with the patient, such as name, medical record number, pathology accession number, etc. Next, a series of pattern matches look for predictable patterns likely to represent identifying data; such as dates, accession numbers and addresses as well as patient, institution and physician names. Finally, individual words are compared with a database of proper names and geographic locations. Pathology reports from three institutions were used to design and test the algorithms. The software was improved iteratively on training sets until it exhibited good performance. 1800 new pathology reports were then processed. Each report was reviewed manually before and after deidentification to catalog all identifiers and note those that were not removed. Results: 1254 (69.7 %) of 1800 pathology reports contained identifiers in the body of the report. 3439 (98.3%) of 3499 unique identifiers in the test set were removed. Only 19 HIPAA-specified identifiers (mainly consult accession numbers and misspelled names) were missed. Of 41 non-HIPAA identifiers missed, the majority were partial institutional addresses and ages. Outside consultation case reports typically contain numerous identifiers and were the most challenging to deidentify comprehensively. There was variation in performance among reports from the three institutions, highlighting the need for site-specific customization, which is easily accomplished with our tool. Conclusion: We have demonstrated that it is possible to create an open-source deidentification program which performs well on free-text pathology reports
Development and evaluation of an open source software tool for deidentification of pathology reports
BACKGROUND: Electronic medical records, including pathology reports, are often used for research purposes. Currently, there are few programs freely available to remove identifiers while leaving the remainder of the pathology report text intact. Our goal was to produce an open source, Health Insurance Portability and Accountability Act (HIPAA) compliant, deidentification tool tailored for pathology reports. We designed a three-step process for removing potential identifiers. The first step is to look for identifiers known to be associated with the patient, such as name, medical record number, pathology accession number, etc. Next, a series of pattern matches look for predictable patterns likely to represent identifying data; such as dates, accession numbers and addresses as well as patient, institution and physician names. Finally, individual words are compared with a database of proper names and geographic locations. Pathology reports from three institutions were used to design and test the algorithms. The software was improved iteratively on training sets until it exhibited good performance. 1800 new pathology reports were then processed. Each report was reviewed manually before and after deidentification to catalog all identifiers and note those that were not removed. RESULTS: 1254 (69.7 %) of 1800 pathology reports contained identifiers in the body of the report. 3439 (98.3%) of 3499 unique identifiers in the test set were removed. Only 19 HIPAA-specified identifiers (mainly consult accession numbers and misspelled names) were missed. Of 41 non-HIPAA identifiers missed, the majority were partial institutional addresses and ages. Outside consultation case reports typically contain numerous identifiers and were the most challenging to deidentify comprehensively. There was variation in performance among reports from the three institutions, highlighting the need for site-specific customization, which is easily accomplished with our tool. CONCLUSION: We have demonstrated that it is possible to create an open-source deidentification program which performs well on free-text pathology reports
Availability and quality of paraffin blocks identified in pathology archives: A multi-institutional study by the Shared Pathology Informatics Network (SPIN)
BACKGROUND: Shared Pathology Informatics Network (SPIN) is a tissue resource initiative that utilizes clinical reports of the vast amount of paraffin-embedded tissues routinely stored by medical centers. SPIN has an informatics component (sending tissue-related queries to multiple institutions via the internet) and a service component (providing histopathologically annotated tissue specimens for medical research). This paper examines if tissue blocks, identified by localized computer searches at participating institutions, can be retrieved in adequate quantity and quality to support medical researchers. METHODS: Four centers evaluated pathology reports (1990–2005) for common and rare tumors to determine the percentage of cases where suitable tissue blocks with tumor were available. Each site generated a list of 100 common tumor cases (25 cases each of breast adenocarcinoma, colonic adenocarcinoma, lung squamous carcinoma, and prostate adenocarcinoma) and 100 rare tumor cases (25 cases each of adrenal cortical carcinoma, gastro-intestinal stromal tumor [GIST], adenoid cystic carcinoma, and mycosis fungoides) using a combination of Tumor Registry, laboratory information system (LIS) and/or SPIN-related tools. Pathologists identified the slides/blocks with tumor and noted first 3 slides with largest tumor and availability of the corresponding block. RESULTS: Common tumors cases (n = 400), the institutional retrieval rates (all blocks) were 83% (A), 95% (B), 80% (C), and 98% (D). Retrieval rate (tumor blocks) from all centers for common tumors was 73% with mean largest tumor size of 1.49 cm; retrieval (tumor blocks) was highest-lung (84%) and lowest-prostate (54%). Rare tumors cases (n = 400), each institution's retrieval rates (all blocks) were 78% (A), 73% (B), 67% (C), and 84% (D). Retrieval rate (tumor blocks) from all centers for rare tumors was 66% with mean largest tumor size of 1.56 cm; retrieval (tumor blocks) was highest for GIST (72%) and lowest for adenoid cystic carcinoma (58%). CONCLUSION: Assessment shows availability and quality of archival tissue blocks that are retrievable and associated electronic data that can be of value for researchers. This study serves to compliment the data from which uniform use of the SPIN query tools by all four centers will be measured to assure and highlight the usefulness of archival material for obtaining tumor tissues for research
Protective Unfolded Protein Response in Human Pancreatic Beta Cells Transplanted into Mice
Background: There is great interest about the possible contribution of ER stress to the apoptosis of pancreatic beta cells in the diabetic state and with islet transplantation. Methods and Findings: Expression of genes involved in ER stress were examined in beta cell enriched tissue obtained with laser capture microdissection (LCM) from frozen sections of pancreases obtained from non-diabetic subjects at surgery and from human islets transplanted into ICR-SCID mice for 4 wk. Because mice have higher glucose levels than humans, the transplanted beta cells were exposed to mild hyperglycemia and the abnormal environment of the transplant site. RNA was extracted from the LCM specimens, amplified and then subjected to microarray analysis. The transplanted beta cells showed an unfolded protein response (UPR). There was activation of many genes of the IRE-1 pathway that provide protection against the deleterious effects of ER stress, increased expression of ER chaperones and ERAD (ER-associated protein degradation) proteins. The other two arms of ER stress, PERK and ATF-6, had many down regulated genes. Downregulation of EIF2A could protect by inhibiting protein synthesis. Two genes known to contribute to apoptosis, CHOP and JNK, were downregulated. Conclusions: Human beta cells in a transplant site had UPR changes in gene expression that protect against the proapoptotic effects of unfolded proteins
Recommended from our members
Digital pathology and computational image analysis in nephropathology
The emergence of digital pathology - an image-based environment for the acquisition, management and interpretation of pathology information supported by computational techniques for data extraction and analysis - is changing the pathology ecosystem. In particular, by virtue of our new-found ability to generate and curate digital libraries, the field of machine vision can now be effectively applied to histopathological subject matter by individuals who do not have deep expertise in machine vision techniques. Although these novel approaches have already advanced the detection, classification, and prognostication of diseases in the fields of radiology and oncology, renal pathology is just entering the digital era, with the establishment of consortia and digital pathology repositories for the collection, analysis and integration of pathology data with other domains. The development of machine-learning approaches for the extraction of information from image data, allows for tissue interrogation in a way that was not previously possible. The application of these novel tools are placing pathology centre stage in the process of defining new, integrated, biologically and clinically homogeneous disease categories, to identify patients at risk of progression, and shifting current paradigms for the treatment and prevention of kidney diseases