427 research outputs found

    A Machine Learning Trainable Model to Assess the Accuracy of Probabilistic Record Linkage

    Record linkage (RL) is the process of identifying and linking data that relates to the same physical entity across multiple heterogeneous data sources. Deterministic linkage methods rely on the presence of common, uniquely identifying attributes across all sources, while probabilistic approaches use non-unique attributes and calculate similarity indexes for pairwise comparisons. A key component of record linkage is accuracy assessment: the process of manually verifying and validating matched pairs to further refine linkage parameters and increase overall linkage effectiveness. This process, however, is time-consuming and impractical when applied to large administrative data sources where millions of records must be linked. Additionally, it is potentially biased, as the gold standard used is often the reviewer’s intuition. In this paper, we present an approach for assessing and refining the accuracy of probabilistic linkage based on different supervised machine learning methods (decision trees, naïve Bayes, logistic regression, random forest, linear support vector machines and gradient boosted trees). We used data sets extracted from large Brazilian socioeconomic and public health care data sources. The models were evaluated using receiver operating characteristic plots, sensitivity, specificity and positive predictive values collected from 10-fold cross-validation. Results show that logistic regression outperforms the other classifiers and enables the creation of a generalized, highly accurate model to validate linkage results.
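As a rough illustration of the evaluation metrics named in the abstract above, the sketch below computes sensitivity, specificity and positive predictive value from true and predicted match labels. The function name `linkage_metrics` and the toy labels are hypothetical and are not taken from the paper.

```python
def linkage_metrics(y_true, y_pred):
    """Return sensitivity, specificity and PPV for binary match labels (1 = match)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true matches accepted
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # non-matches rejected
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # non-matches accepted
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # true matches rejected

    def ratio(num, den):
        # Undefined metrics (empty denominator) are reported as NaN.
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
    }


# Toy labels (assumed data): 1 = the record pair is a true match.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
metrics = linkage_metrics(y_true, y_pred)
print(metrics)
```

In a cross-validated setting such as the 10-fold scheme the abstract mentions, these values would typically be computed per fold and then summarized.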

    Cancer Carepartners: Improving patients' symptom management by engaging informal caregivers

    Background: Previous studies have found that cancer patients undergoing chemotherapy can effectively manage their own symptoms when given tailored advice. This approach, however, may challenge patients with poor performance status and/or emotional distress. Our goal is to test an automated intervention that engages a friend or family member to support a patient through chemotherapy. Methods/Design: We describe the design and rationale of a randomized, controlled trial to assess the efficacy of 10 weeks of web-based caregiver alerts and tailored advice for helping a patient manage symptoms related to chemotherapy. The study aims to test the primary hypothesis that patients whose caregivers receive alerts and tailored advice will report less frequent and less severe symptoms at 10 and 14 weeks when compared to patients in the control arm; similarly, they will report better physical function, fewer outpatient visits and hospitalizations related to symptoms, and greater adherence to chemotherapy. A total of 300 patients with solid tumors who are undergoing chemotherapy at two Veteran Administration oncology clinics, report any symptom at a severity of ≥4, and have a willing informal caregiver will be assigned to either 10 weeks of automated telephonic symptom assessment (ATSA) alone, or 10 weeks of ATSA plus web-based notification of symptom severity and problem-solving advice to their chosen caregiver. Patients and caregivers will be surveyed at intake, 10 weeks and 14 weeks. Both groups will receive standard oncology, hospice, and palliative care. Discussion: Patients undergoing chemotherapy experience many symptoms that they may be able to manage with the support of an activated caregiver. This intervention uses readily available technology to improve patient-caregiver communication about symptoms and caregiver knowledge of symptom management. If successful, it could substantially improve the quality of life of veterans and their families during the stresses of chemotherapy without substantially increasing the cost of care. Trial Registration: NCT00983892 (http://www.clinicaltrials.gov/ct2/show/NCT00983892)

    Common Variants at 10 Genomic Loci Influence Hemoglobin A(1C) Levels via Glycemic and Nonglycemic Pathways

    OBJECTIVE: Glycated hemoglobin (HbA1c), used to monitor and diagnose diabetes, is influenced by average glycemia over a 2- to 3-month period. Genetic factors affecting expression, turnover, and abnormal glycation of hemoglobin could also be associated with increased levels of HbA1c. We aimed to identify such genetic factors and investigate the extent to which they influence diabetes classification based on HbA1c levels. RESEARCH DESIGN AND METHODS: We studied associations with HbA1c in up to 46,368 nondiabetic adults of European descent from 23 genome-wide association studies (GWAS) and 8 cohorts with de novo genotyped single nucleotide polymorphisms (SNPs). We combined studies using inverse-variance meta-analysis and tested mediation by glycemia using conditional analyses. We estimated the global effect of HbA1c loci using a multilocus risk score, and used net reclassification to estimate genetic effects on diabetes screening. RESULTS: Ten loci reached genome-wide significant association with HbA1c, including six new loci near FN3K (lead SNP/P value, rs1046896/P = 1.6 × 10^−26), HFE (rs1800562/P = 2.6 × 10^−20), TMPRSS6 (rs855791/P = 2.7 × 10^−14), ANK1 (rs4737009/P = 6.1 × 10^−12), SPTA1 (rs2779116/P = 2.8 × 10^−9) and ATP11A/TUBGCP3 (rs7998202/P = 5.2 × 10^−9), and four known HbA1c loci: HK1 (rs16926246/P = 3.1 × 10^−54), MTNR1B (rs1387153/P = 4.0 × 10^−11), GCK (rs1799884/P = 1.5 × 10^−20) and G6PC2/ABCB11 (rs552976/P = 8.2 × 10^−18). We show that associations with HbA1c are partly a function of hyperglycemia associated with 3 of the 10 loci (GCK, G6PC2 and MTNR1B). The seven nonglycemic loci accounted for a 0.19 (% HbA1c) difference between the extreme 10% tails of the risk score, and would reclassify ~2% of a general white population screened for diabetes with HbA1c. CONCLUSIONS: GWAS identified 10 genetic loci reproducibly associated with HbA1c. Six are novel and seven map to loci where rarer variants cause hereditary anemias and iron storage disorders. Common variants at these loci likely influence HbA1c levels via erythrocyte biology, and confer a small but detectable reclassification of diabetes diagnosis by HbA1c.
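The inverse-variance meta-analysis mentioned in the abstract above can be sketched as follows. This is a minimal example that assumes a fixed-effect model, which the abstract does not specify; the function name `inverse_variance_meta` and the per-study effect sizes and standard errors are made up for illustration.

```python
import math


def inverse_variance_meta(betas, ses):
    """Pool per-study effects with weights 1 / se^2 (fixed-effect assumption)."""
    weights = [1.0 / se ** 2 for se in ses]
    total_weight = sum(weights)
    # Pooled effect is the weighted mean of the study effects.
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    # Pooled standard error shrinks as the total weight grows.
    pooled_se = math.sqrt(1.0 / total_weight)
    z_score = pooled_beta / pooled_se
    return pooled_beta, pooled_se, z_score


# Hypothetical per-study estimates for a single SNP (not from the paper).
betas = [0.10, 0.20]
ses = [0.10, 0.10]
pooled_beta, pooled_se, z_score = inverse_variance_meta(betas, ses)
print(pooled_beta, pooled_se, z_score)
```

Studies with smaller standard errors receive larger weights, so more precise cohorts contribute more to the pooled estimate.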