39 research outputs found

    Physics-Informed Deep Learning to Reduce the Bias in Joint Prediction of Nitrogen Oxides

    Full text link
    Atmospheric nitrogen oxides (NOx), primarily from fuel combustion, have recognized acute and chronic health and environmental effects. Machine learning (ML) methods have significantly enhanced our capacity to predict NOx concentrations at ground level with high spatiotemporal resolution but may suffer from high estimation bias since they lack physical and chemical knowledge about air pollution dynamics. Chemical transport models (CTMs) leverage this knowledge; however, accurate predictions of ground-level concentrations typically necessitate extensive post-calibration. Here, we present a physics-informed deep learning framework that encodes advection-diffusion mechanisms and fluid dynamics constraints to jointly predict NO2 and NOx and reduce ML model bias by 21-42%. Our approach captures fine-scale transport of NO2 and NOx, generates robust spatial extrapolation, and provides explicit uncertainty estimation. The framework fuses knowledge-driven physicochemical principles of CTMs with the predictive power of ML for air quality exposure, health, and policy applications. Our approach offers significant improvements over purely data-driven ML methods and achieves unprecedented bias reduction in joint NO2 and NOx prediction.
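
To make the phrase "encodes advection-diffusion mechanisms" concrete, the following is a minimal sketch of a physics residual for the one-dimensional advection-diffusion equation, dc/dt + u dc/dx = D d2c/dx2. It is not the authors' framework: the function name, the 1D setting, the constant wind speed `u`, the constant diffusivity `D`, and the finite-difference step `h` are all assumptions made for illustration. A physics-informed model would typically add a penalty on such a residual to its training loss.

```python
import math

def advection_diffusion_residual(c, x, t, u, D, h=1e-3):
    """Residual of dc/dt + u*dc/dx - D*d2c/dx2 at (x, t), by central differences.

    c: callable c(x, t) giving a concentration field (hypothetical stand-in
       for a model's prediction).
    u, D: assumed constant advection speed and diffusivity.
    """
    dc_dt = (c(x, t + h) - c(x, t - h)) / (2 * h)
    dc_dx = (c(x + h, t) - c(x - h, t)) / (2 * h)
    d2c_dx2 = (c(x + h, t) - 2 * c(x, t) + c(x - h, t)) / (h * h)
    return dc_dt + u * dc_dx - D * d2c_dx2

u, D, k = 1.0, 0.1, 1.0

def travelling_wave(x, t):
    # Known analytic solution of the equation: a decaying, advected sine wave.
    return math.exp(-D * k * k * t) * math.sin(k * (x - u * t))

def not_a_solution(x, t):
    # A field that ignores the physics; its residual is far from zero.
    return x * x

physical = advection_diffusion_residual(travelling_wave, 0.5, 0.2, u, D)
unphysical = advection_diffusion_residual(not_a_solution, 1.0, 0.2, u, D)
```

The residual is close to zero for the field that obeys the equation and clearly nonzero for the one that does not, which is the signal a physics-based penalty term would use.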

    Exposure measurement error in air pollution studies: the impact of shared, multiplicative measurement error on epidemiological health risk estimates

    No full text
    Spatiotemporal air pollution models are increasingly being used to estimate health effects in epidemiological studies. Although such exposure prediction models typically result in improved spatial and temporal resolution of air pollution predictions, they remain subject to shared measurement error, a type of measurement error common in spatiotemporal exposure models which occurs when measurement error is not independent of exposures. A fundamental challenge of exposure measurement error in air pollution assessment is the strong correlation and sometimes identical (shared) error of exposure estimates across geographic space and time. When exposure estimates with shared measurement error are used to estimate health risk in epidemiological analyses, complex errors are potentially introduced, resulting in biased epidemiological conclusions. We demonstrate the influence of using a three-stage spatiotemporal exposure prediction model and introduce formal methods of shared, multiplicative measurement error (SMME) correction of epidemiological health risk estimates. Using our three-stage, ensemble learning based nitrogen oxides (NOx) exposure prediction model, we quantified SMME. We conducted an epidemiological analysis of wheeze risk in relation to NOx exposure among school-aged children. To demonstrate the incremental influence of exposure modeling stage, we iteratively estimated the health risk using assigned exposure predictions from each stage of the NOx model. We then determined the impact of SMME on the variance of the health risk estimates under various scenarios. Depending on the stage of the spatiotemporal exposure model used, we found that wheeze odds ratio ranged from 1.16 to 1.28 for an interquartile range increase in NOx. With each additional stage of exposure modeling, the health effect estimate moved further away from the null (OR=1). When corrected for observed SMME, the confidence intervals of the health effect estimates widened slightly, but our epidemiological conclusions were not altered. When the variance estimate was corrected for the potential "worst case scenario" of SMME, the standard error further increased, having a meaningful influence on epidemiological conclusions. Our framework can be expanded and used to understand the implications of using exposure predictions subject to shared measurement error in future health investigations.
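
As a rough illustration of why a larger standard error can change a conclusion even when the point estimate does not move, the sketch below computes an odds ratio per interquartile range (IQR) increase from a logistic regression coefficient, with an optional inflation factor on the standard error. This is not the SMME correction described in the abstract: the function name, the `se_inflation` parameter, and all numeric inputs are hypothetical and chosen only to show the arithmetic.

```python
import math

def odds_ratio_per_iqr(beta, se, iqr, se_inflation=1.0, z=1.96):
    """Odds ratio and approximate 95% CI for an IQR increase in exposure.

    beta, se: log-odds coefficient per unit of exposure and its standard error.
    iqr: interquartile range of the exposure, in the same units.
    se_inflation: hypothetical multiplicative factor (>= 1) standing in for a
        variance correction; it widens the interval, not the point estimate.
    """
    se_adj = se * se_inflation
    point = math.exp(beta * iqr)
    lower = math.exp((beta - z * se_adj) * iqr)
    upper = math.exp((beta + z * se_adj) * iqr)
    return point, lower, upper

# Invented inputs, not taken from the study.
naive = odds_ratio_per_iqr(beta=0.02, se=0.008, iqr=10.0)
inflated = odds_ratio_per_iqr(beta=0.02, se=0.008, iqr=10.0, se_inflation=1.5)
```

With these invented inputs the point estimate is the same in both calls, but the lower confidence bound sits above 1 in the first and below 1 in the second, so the interval only excludes the null before the inflation is applied.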

    W-TSS: A Wavelet-Based Algorithm for Discovering Time Series Shapelets

    No full text
    Many approaches to time series classification rely on machine learning methods. However, there is growing interest in going beyond black box prediction models to understand discriminatory features of the time series and their associations with outcomes. One promising method is time-series shapelets (TSS), which identifies maximally discriminative subsequences of time series. For example, in environmental health applications TSS could be used to identify short-term patterns in exposure time series (shapelets) associated with adverse health outcomes. Identification of candidate shapelets in TSS is computationally intensive. The original TSS algorithm used exhaustive search. Subsequent algorithms introduced efficiencies by trimming/aggregating the set of candidates or training candidates from initialized values, but these approaches have limitations. In this paper, we introduce Wavelet-TSS (W-TSS), a novel intelligent method for identifying candidate shapelets in TSS using wavelet transformation discovery. We tested W-TSS on two datasets: (1) a synthetic example used in previous TSS studies and (2) a panel study relating exposures from residential air pollution sensors to symptoms in participants with asthma. Compared to previous TSS algorithms, W-TSS was more computationally efficient, more accurate, and able to discover more discriminative shapelets. W-TSS does not require pre-specification of shapelet length.
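
For readers unfamiliar with shapelets, the sketch below shows the basic building block shared by shapelet methods in general: the distance from a candidate shapelet to a time series, taken as the smallest Euclidean distance over all equal-length windows. It does not implement W-TSS or its wavelet-based candidate discovery; the function name and the toy series are assumptions for illustration.

```python
import math

def shapelet_distance(series, shapelet):
    """Smallest Euclidean distance between `shapelet` and any window of `series`."""
    m = len(shapelet)
    if m == 0 or m > len(series):
        raise ValueError("shapelet must be non-empty and no longer than the series")
    best = math.inf
    # Slide the shapelet along the series and keep the best-matching window.
    for start in range(len(series) - m + 1):
        window = series[start:start + m]
        dist = math.sqrt(sum((w - s) ** 2 for w, s in zip(window, shapelet)))
        best = min(best, dist)
    return best

# Toy data: a short exposure "spike" pattern and two series to compare against.
spike = [0.0, 5.0, 0.0]
series_with_spike = [1.0, 1.0, 0.0, 5.0, 0.0, 1.0]
flat_series = [1.0] * 6

d_match = shapelet_distance(series_with_spike, spike)
d_flat = shapelet_distance(flat_series, spike)
```

A series that contains the pattern gets a small distance and one that lacks it gets a large distance, so thresholding this value yields a simple, interpretable feature for classification. An exhaustive search repeats the computation for every candidate subsequence, which is why candidate selection dominates the cost.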