A random forest approach to estimate daily particulate matter, nitrogen dioxide, and ozone at fine spatial resolution in Sweden

Abstract

Air pollution is one of the leading causes of mortality worldwide. An accurate assessment of its spatial and temporal distribution is essential for conducting epidemiological studies that can estimate long-term (e.g., annual) and short-term (e.g., daily) health effects. While spatiotemporal models for particulate matter (PM) have been developed in several countries, estimates of daily nitrogen dioxide (NO2) and ozone (O3) concentrations at high spatial resolution are lacking, and no such models have been developed in Sweden. We collected data on daily air pollutant concentrations from routine monitoring networks over the period 2005-2016 and matched them with satellite data, dispersion models, meteorological parameters, and land-use variables. We developed a machine-learning approach, the random forest (RF), to estimate daily concentrations of PM10 (PM < 10 microns), PM2.5 (PM < 2.5 microns), PM2.5-10 (PM between 2.5 and 10 microns), NO2, and O3 for each square kilometer of Sweden over the period 2005-2016. Our models were able to describe between 64% (PM10) and 78% (O3) of air pollutant variability in held-out observations, and between 37% (NO2) and 61% (O3) in held-out monitors, with no major differences across years and seasons and better performance in larger cities such as Stockholm. These estimates will make it possible to investigate air pollution effects across the whole of Sweden, including suburban and rural areas previously neglected by epidemiological investigation.
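The two validation figures quoted above differ in what is held out: individual daily observations, or entire monitors. The following is a minimal sketch of that distinction, not the authors' code. The function names, the toy monitor labels, and the choice of two folds are all illustrative assumptions, and R-squared is assumed to be the "share of variability described" metric, which the abstract does not state explicitly.

```python
import random


def observation_folds(n_rows, k, seed=0):
    """Assign each row to one of k folds at random (held-out observations)."""
    rng = random.Random(seed)
    order = list(range(n_rows))
    rng.shuffle(order)
    folds = [0] * n_rows
    for position, row in enumerate(order):
        folds[row] = position % k
    return folds


def monitor_folds(monitor_ids, k, seed=0):
    """Assign whole monitors to folds, so every row of a monitor shares a fold
    (held-out monitors)."""
    rng = random.Random(seed)
    monitors = sorted(set(monitor_ids))
    rng.shuffle(monitors)
    fold_of = {m: position % k for position, m in enumerate(monitors)}
    return [fold_of[m] for m in monitor_ids]


def r_squared(observed, predicted):
    """Share of the variance in `observed` that `predicted` accounts for."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Toy layout (hypothetical): 4 monitors with 5 daily rows each.
monitor_ids = [m for m in ["A", "B", "C", "D"] for _ in range(5)]
obs_folds = observation_folds(len(monitor_ids), k=2)
mon_folds = monitor_folds(monitor_ids, k=2)
```

With observation-level folds, a model is tested on days from monitors it has already seen on other days. With monitor-level folds, it is tested on locations it has never seen, which is closer to predicting at unmonitored grid cells and typically gives lower scores, consistent with the gap between the two ranges reported above.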
