
36 research outputs found

Using Case-Level Context to Classify Cancer Pathology Reports

Author: Alawad Mohammed
Coyle Linda
Durbin Eric B.
Gao Shang
Penberthy Lynne
Ramanathan Arvind
Schaefferkoetter Noah
Tourassi Georgia
Wu Xiao-Cheng
Publication venue: UKnowledge
Publication date: 01/01/2020

Individual electronic health records (EHRs) and clinical reports are often part of a larger sequence; for example, a single patient may generate multiple reports over the trajectory of a disease. In applications such as cancer pathology reports, it is necessary not only to extract information from individual reports, but also to capture aggregate information regarding the entire cancer case based on case-level context from all reports in the sequence. In this paper, we introduce a simple modular add-on for capturing case-level context that is designed to be compatible with most existing deep learning architectures for text classification on individual reports. We test our approach on a corpus of 431,433 cancer pathology reports, and we show that incorporating case-level context significantly boosts classification accuracy across six classification tasks: site, subsite, laterality, histology, behavior, and grade. We expect that with minimal modifications, our add-on can be applied towards a wide range of other clinical text-based tasks.

Directory of Open Access Journals

University of Kentucky

Deep Active Learning for Classifying Cancer Pathology Reports

Author: Alawad Mohammed
Coyle Linda
De Angeli Kevin
Doherty Jennifer
Durbin Eric B.
Gao Shang
Penberthy Lynne
Schaefferkoetter Noah
Stroup Antoinette
Tourassi Georgia
Wu Xiao-Cheng
Yoon Hong-Jun
Publication venue: UKnowledge
Publication date: 09/03/2021

Background: Automated text classification has many important applications in the clinical setting; however, obtaining labelled data for training machine learning and deep learning models is often difficult and expensive. Active learning techniques may mitigate this challenge by reducing the amount of labelled data required to effectively train a model. In this study, we analyze the effectiveness of 11 active learning algorithms on classifying subsite and histology from cancer pathology reports using a Convolutional Neural Network as the text classification model. Results: We compare the performance of each active learning strategy using two differently sized datasets and two different classification tasks. Our results show that on all tasks and dataset sizes, all active learning strategies except diversity-sampling strategies outperformed random sampling, i.e., no active learning. On our large dataset (15K initial labelled samples, adding 15K additional labelled samples each iteration of active learning), there was no clear winner between the different active learning strategies. On our small dataset (1K initial labelled samples, adding 1K additional labelled samples each iteration of active learning), marginal and ratio uncertainty sampling performed better than all other active learning techniques. We found that compared to random sampling, active learning strongly helps performance on rare classes by focusing on underrepresented classes. Conclusions: Active learning can save annotation cost by helping human annotators efficiently and intelligently select which samples to label. Our results show that a dataset constructed using effective active learning techniques requires less than half the amount of labelled data to achieve the same performance as a dataset constructed using random sampling.

University of Kentucky

Health related quality of life in sickle cell patients: The PiSCES project

Author: Donna K McClish
Imoigele P Aisiku
James L Levenson
John D Roberts
Lynne T Penberthy
Susan D Roseff
Viktor E Bovbjerg
Wally R Smith
Publication venue: BioMed Central
Publication date: 01/01/2005

BACKGROUND: Sickle cell disease (SCD) is a chronic disease associated with high degrees of morbidity and increased mortality. Health-related quality of life (HRQOL) among adults with sickle cell disease has not been widely reported. METHODS: We administered the Medical Outcomes Study 36-item Short-Form to 308 patients in the Pain in Sickle Cell Epidemiology Study (PiSCES) to assess HRQOL. Scales included physical function, physical and emotional role function, bodily pain, vitality, social function, mental health, and general health. We compared scores with national norms using t-tests, and with three chronic disease cohorts: asthma, cystic fibrosis and hemodialysis patients using analysis of variance and Dunnett's test for comparison with a control. We also assessed whether SCD specific variables (genotype, pain, crisis and utilization) were independently predictive of SF-36 subscales, controlling for socio-demographic variables using regression. RESULTS: Patients with SCD scored significantly worse than national norms on all subscales except mental health. Patients with SCD had lower HRQOL than cystic fibrosis patients except for mental health. Scores were similar for physical function, role function and mental health as compared to asthma patients, but worse for bodily pain, vitality, social function and general health subscales. Compared to dialysis patients, sickle cell disease patients scored similarly on physical role and emotional role function, social functioning and mental health, worse on bodily pain, general health and vitality and better on physical functioning. Surprisingly, genotype did not influence HRQOL except for vitality. However, scores significantly decreased as pain levels increased. CONCLUSION: SCD patients experience health related quality of life worse than the general population, and in general, their scores were most similar to patients undergoing hemodialysis. Practitioners should regard their HRQOL as severely compromised. 
Interventions in SCD should consider improvements in health related quality of life as important outcomes.

CiteSeerX

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

VCU Scholars Compass

Pain site frequency and location in sickle cell disease: The PiSCES project

Author: Aisiku Imoigele P.
Bovbjerg Viktor E.
Dahman Bassam A.
Levenson James L.
McClish Donna K.
Penberthy Lynne T.
Roberts John D.
Roseff Susan D.
Smith Wally R.
Publication venue
Publication date: 01/01/2009

Treatment options for sickle cell disease (SCD) pain could be tailored to pain locations, but few epidemiologic descriptions of SCD pain location exist, and these are based on few subjects over short time periods. We examined whether SCD pain locations vary by disease genotype, gender, age, frequency of pain, depression, pain crisis or healthcare utilization.

PubMed Central

Carolina Digital Repository

Breast-Cancer-Specific Mortality in Patients Treated Based on the 21-Gene Assay: A SEER Population-Based Study

Author: Baehner Frederick L.
Cress Rosemary
Cronin Kathleen
Deapen Dennis
Glaser Sally L.
Gliner Nathan
Hernandez Brenda Y.
Howe Will
Howlader Nadia
Lynch Charles F.
Miller Dave P.
Mueller Lloyd
Penberthy Lynne
Petkov Valentina I.
Schussler Nicola
Schwartz Ann G.
Schwartz Stephen M.
Shak Steven
Stroup Antoinette
Sweeney Carol
Tucker Thomas C.
Ward Kevin C.
Wiggins Charles
Wu Xiao-Cheng
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2016

The 21-gene Recurrence Score assay is validated to predict recurrence risk and chemotherapy benefit in hormone-receptor-positive (HR+) invasive breast cancer. To determine prospective breast-cancer-specific mortality (BCSM) outcomes by baseline Recurrence Score results and clinical covariates, the National Cancer Institute collaborated with Genomic Health and 14 population-based registries in the Surveillance, Epidemiology, and End Results (SEER) Program to electronically supplement cancer surveillance data with Recurrence Score results. The prespecified primary analysis cohort was 40–84 years of age, and had node-negative, HR+, HER2-negative, nonmetastatic disease diagnosed between January 2004 and December 2011 in the entire SEER population, and Recurrence Score results (N = 38,568). Unadjusted 5-year BCSM were 0.4% (n = 21,023; 95% confidence interval (CI), 0.3–0.6%), 1.4% (n = 14,494; 95% CI, 1.1–1.7%), and 4.4% (n = 3,051; 95% CI, 3.4–5.6%) for Recurrence Score < 18, 18–30, and ≥ 31 groups, respectively (P < 0.001). In multivariable analysis adjusted for age, tumor size, grade, and race, the Recurrence Score result predicted BCSM (P < 0.001). Among patients with node-positive disease (micrometastases and up to three positive nodes; N = 4,691), 5-year BCSM (unadjusted) was 1.0% (n = 2,694; 95% CI, 0.5–2.0%), 2.3% (n = 1,669; 95% CI, 1.3–4.1%), and 14.3% (n = 328; 95% CI, 8.4–23.8%) for Recurrence Score < 18, 18–30, and ≥ 31 groups, respectively (P < 0.001). Five-year BCSM by Recurrence Score group are reported for important patient subgroups, including age, race, tumor size, grade, and socioeconomic status. This SEER study represents the largest report of prospective BCSM outcomes based on Recurrence Score results for patients with HR+, HER2-negative, node-negative, or node-positive breast cancer, including subgroups often under-represented in clinical trials.

University of Kentucky

eScholarship - University of California