22 research outputs found
Comparison of algorithm-based estimates of occupational diesel exhaust exposure to those of multiple independent raters in a population-based case-control study
Objectives: Algorithm-based exposure assessments based on patterns in questionnaire responses and professional judgment can readily apply transparent exposure decision rules to thousands of jobs quickly. However, we need to better understand how algorithms compare to a one-by-one job review by an exposure assessor. We compared algorithm-based estimates of diesel exhaust exposure to those of three independent raters within the New England Bladder Cancer Study, a population-based case-control study, and identified conditions under which disparities occurred in the assessments of the algorithm and the raters.Methods: Occupational diesel exhaust exposure was assessed previously using an algorithm and a single rater for all 14 983 jobs reported by 2631 study participants during personal interviews conducted from 2001 to 2004. Two additional raters independently assessed a random subset of 324 jobs that were selected based on strata defined by the cross-tabulations of the algorithm and the first rater's probability assessments for each job, oversampling their disagreements. The algorithm and each rater assessed the probability, intensity and frequency of occupational diesel exhaust exposure, as well as a confidence rating for each metric. Agreement among the raters, their aggregate rating (average of the three raters' ratings) and the algorithm were evaluated using proportion of agreement, kappa and weighted kappa (κw). Agreement analyses on the subset used inverse probability weighting to extrapolate the subset to estimate agreement for all jobs. Classification and Regression Tree (CART) models were used to identify patterns in questionnaire responses that predicted disparities in exposure status (i.e., unexposed versus exposed) between the first rater and the algorithm-based estimates.Results: For the probability, intensity and frequency exposure metrics, moderate to moderately high agreement was observed among raters (κw = 0.50-0.76) and between the algorithm and the individual raters (κw = 0.58-0.81). For these metrics, the algorithm estimates had consistently higher agreement with the aggregate rating (κw = 0.82) than with the individual raters. For all metrics, the agreement between the algorithm and the aggregate ratings was highest for the unexposed category (90-93%) and was poor to moderate for the exposed categories (9-64%). Lower agreement was observed for jobs with a start year <1965 versus ≥1965. For the confidence metrics, the agreement was poor to moderate among raters (κw = 0.17-0.45) and between the algorithm and the individual raters (κw = 0.24-0.61). CART models identified patterns in the questionnaire responses that predicted a fair-to-moderate (33-89%) proportion of the disagreements between the raters' and the algorithm estimates.Discussion: The agreement between any two raters was similar to the agreement between an algorithm-based approach and individual raters, providing additional support for using the more efficient and transparent algorithm-based approach. CART models identified some patterns in disagreements between the first rater and the algorithm. Given the absence of a gold standard for estimating exposure, these patterns can be reviewed by a team of exposure assessors to determine whether the algorithm should be revised for future studies. © 2012 The Author
Combining Decision Rules from Classification Tree Models and Expert Assessment to Estimate Occupational Exposure to Diesel Exhaust for a Case-Control Study
OBJECTIVES: To efficiently and reproducibly assess occupational diesel exhaust exposure in a Spanish case-control study, we examined the utility of applying decision rules that had been extracted from expert estimates and questionnaire response patterns using classification tree (CT) models from a similar US study. METHODS: First, previously extracted CT decision rules were used to obtain initial ordinal (0-3) estimates of the probability, intensity, and frequency of occupational exposure to diesel exhaust for the 10 182 jobs reported in a Spanish case-control study of bladder cancer. Second, two experts reviewed the CT estimates for 350 jobs randomly selected from strata based on each CT rule's agreement with the expert ratings in the original study [agreement rate, from 0 (no agreement) to 1 (perfect agreement)]. Their agreement with each other and with the CT estimates was calculated using weighted kappa (κ w) and guided our choice of jobs for subsequent expert review. Third, an expert review comprised all jobs with lower confidence (low-to-moderate agreement rates or discordant assignments, n = 931) and a subset of jobs with a moderate to high CT probability rating and with moderately high agreement rates (n = 511). Logistic regression was used to examine the likelihood that an expert provided a different estimate than the CT estimate based on the CT rule agreement rates, the CT ordinal rating, and the availability of a module with diesel-related questions. RESULTS: Agreement between estimates made by two experts and between estimates made by each of the experts and the CT estimates was very high for jobs with estimates that were determined by rules with high CT agreement rates (κ w: 0.81-0.90). For jobs with estimates based on rules with lower agreement rates, moderate agreement was observed between the two experts (κ w: 0.42-0.67) and poor-to-moderate agreement was observed between the experts and the CT estimates (κ w: 0.09-0.57). In total, the expert review of 1442 jobs changed 156 probability estimates, 128 intensity estimates, and 614 frequency estimates. The expert was more likely to provide a different estimate when the CT rule agreement rate was <0.8, when the CT ordinal ratings were low to moderate, or when a module with diesel questions was available. CONCLUSIONS: Our reliability assessment provided important insight into where to prioritize additional expert review; as a result, only 14% of the jobs underwent expert review, substantially reducing the exposure assessment burden. Overall, we found that we could efficiently, reproducibly, and reliably apply CT decision rules from one study to assess exposure in another study