74 research outputs found

    Calibration of low-cost air pollutant sensors using machine learning techniques

    Nowadays, concern about air pollution has risen due to the effects of climate change. The application of machine learning methods for the calibration of low-cost sensors is studied. The short-term, long-term, sensor fusion and training set size needed are analyzed. Thus, considering real scenarios

    Using covariates for improving the minimum redundancy maximum relevance feature selection method

    Maximizing the joint dependency with a minimum number of variables is generally the main task of feature selection. To obtain a minimal subset while maximizing the joint dependency with the target variable, the redundancy among selected variables must be reduced to a minimum. In this paper, we propose a method based on the recently popular minimum Redundancy-Maximum Relevance (mRMR) criterion. The experimental results show that, instead of feeding the features themselves into mRMR, feeding the covariates improves the feature selection capability and provides more expressive variable subsets.
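As a rough illustration of the mRMR criterion mentioned in this abstract, the sketch below implements the plain greedy "relevance minus mean redundancy" selection for discrete variables. It is not the covariate-based method the paper proposes; the function names `mutual_information` and `mrmr_select`, the use of mutual information in nats, and the toy data are all assumptions made for illustration.

```python
from collections import Counter
from math import log


def mutual_information(xs, ys):
    """Empirical mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum((c / n) * log(c * n / (px[a] * py[b])) for (a, b), c in pxy.items())


def mrmr_select(columns, target, k):
    """Greedy mRMR (difference form): return indices of k selected columns.

    Each step adds the column with the largest value of
    relevance(column, target) - mean redundancy(column, already selected).
    """
    k = min(k, len(columns))
    relevance = [mutual_information(col, target) for col in columns]
    # Start with the single most relevant column.
    selected = [max(range(len(columns)), key=relevance.__getitem__)]
    while len(selected) < k:
        candidates = [j for j in range(len(columns)) if j not in selected]

        def score(j):
            redundancy = sum(
                mutual_information(columns[j], columns[s]) for s in selected
            ) / len(selected)
            return relevance[j] - redundancy

        selected.append(max(candidates, key=score))
    return selected


# Toy data (hypothetical): column 1 duplicates column 0, column 2 is independent.
f0 = [0, 0, 1, 1] * 25
f1 = list(f0)
f2 = [0, 1, 0, 1] * 25
target = [a + 2 * b for a, b in zip(f0, f2)]
print(mrmr_select([f0, f1, f2], target, 2))
```

On this toy data the duplicated column is skipped, because its redundancy with the already selected column cancels its relevance.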

    Application of machine learning techniques to analyse the effects of physical exercise in ventricular fibrillation

    This work presents the application of machine learning techniques to analyze the influence of physical exercise on the heart's physiological properties during ventricular fibrillation. For that purpose, different kinds of classifiers (linear and neural models) were used to distinguish between trained and sedentary rabbit hearts. These classifiers were used to perform knowledge extraction through a wrapper feature selection algorithm. The results showed that the neural models outperformed the linear classifier (higher performance measures and higher dimensionality reduction). The most relevant features to describe the benefits of physical exercise are those related to myocardial heterogeneity, mean activation rate and activation complexity.

    Modelling atmospheric ozone concentration using machine learning algorithms

    Air quality monitoring is one of several important tasks carried out in the area of environmental science and engineering. Accordingly, the development of air quality predictive models can be very useful, as such models can provide early warnings of pollution rising to unsatisfactory levels. The literature review conducted within the research context of this thesis revealed that only a limited number of widely used machine learning algorithms have been employed for modelling the concentrations of atmospheric gases such as ozone, nitrogen oxides, etc. Despite this, the field of machine learning has recently advanced significantly with the introduction of ensemble learning techniques, convolutional and deep neural networks, etc. Given these observations, the research presented in this thesis aims to investigate the effective use of ensemble learning algorithms, with optimised algorithmic settings and an appropriate choice of base layer algorithms, to create effective and efficient models for the prediction and forecasting of ground level ozone (O3) specifically. Three main research contributions have been made by this thesis in the application area of modelling O3 concentrations. As the first contribution, the performance of several ensemble learning algorithms (homogeneous and heterogeneous) was investigated and compared with that of all popular and widely used single base learning algorithms. The results showed an impressive improvement in prediction performance obtainable by using meta learning (Bagging, Stacking, and Voting) algorithms. The three investigated meta learning algorithms performed similarly, giving an average correlation coefficient of 0.91.
As a second contribution, feature selection and parameter-based optimisation were carried out in conjunction with Multilayer Perceptron, Support Vector Machine, Random Forest and Bagging based learning techniques, providing significant improvements in prediction accuracy. The third contribution of the research presented in this thesis is the univariate and multivariate forecasting of ozone concentrations based on optimised ensemble learning algorithms. The results reported exceed the accuracy levels reported for forecasting ozone concentration variations with widely used single base learning algorithms. In summary, the research conducted within this thesis bridges an existing research gap in big data analytics related to environmental pollution modelling, prediction and forecasting, where present research is largely limited to standard learning algorithms such as Artificial Neural Networks and Support Vector Machines, often available within popular commercial software packages.

    Causal impacts of transport interventions on air quality

    The transport sector is one of the main sources of air pollution emissions, particularly for carbon monoxide, nitrogen oxides, and particulate matter. Evaluating the effectiveness of transport interventions on improving air quality is essential to informing future policy. However, a comparison of air quality observations before and after an intervention can be biased by various factors, such as weather conditions and seasonality effects. Causal inference methods generally have advantages in intervention evaluation in terms of data requirements, model building, and the interpretation of effect estimates. Causality goes beyond statistical association in the sense that it seeks to measure the net effect of an intervention on an outcome through all possible pathways directing from the intervention to the outcome. Causal inference methods have been applied to address the same question; however, the important confounders (such as weather conditions) are commonly controlled for by including variables in the causal inference model and assuming a parametric relationship. The thesis focuses on understanding the causal impacts of transport interventions on air quality. A novel ex-post policy evaluation framework, combining meteorological normalisation, change point detection, and causal inferencing, is proposed to overcome the limitations of previous approaches, and it is applied to three distinct transport interventions: improving public transport supply (Jubilee Line Extension), tightening road traffic emission standards (London Ultra Low Emission Zone), and restricting both transport activities and supply (COVID-19 lockdown). The Jubilee Line Extension led to only small (< 1%) or insignificant changes in air pollution on average in London. The Ultra Low Emission Zone showed an average reduction of less than 3% for NO2 concentrations and insignificant effects on O3 and PM2.5 concentrations.
The lockdown reduced the NO2 concentrations in London by less than 12% on average, and it had an insignificant effect on O3, PM10, and PM2.5. Therefore, the empirical results of the thesis consistently highlight the necessity of a multi-faceted set of policies that aim to reduce emissions across sectors with coordination among local, regional, and national government in order to achieve long-term improvements in air quality in cities.

    Development of a regional feature selection-based machine learning system (RFSML v1.0) for air pollution forecasting over China

    With the explosive growth of atmospheric data, machine learning models have achieved great success in air pollution forecasting because of their higher computational efficiency than the traditional chemical transport models. However, in previous studies, new prediction algorithms have only been tested at stations or in a small region; a large-scale air quality forecasting model remains lacking to date. Huge dimensionality also means that redundant input data may lead to increased complexity and therefore the over-fitting of machine learning models. Feature selection is a key topic in machine learning development, but it has not yet been explored in atmosphere-related applications. In this work, a regional feature selection-based machine learning (RFSML) system was developed, which is capable of predicting air quality in the short term with high accuracy at the national scale. Ensemble-Shapley additive global importance analysis is combined with the RFSML system to extract significant regional features and eliminate redundant variables at an affordable computational expense. The significance of the regional features is also explained physically. Compared with a standard machine learning system fed with relative features, the RFSML system driven by the selected key features results in superior interpretability, less training time, and more accurate predictions. This study also provides insights into the difference in interpretability among machine learning models (i.e., random forest, gradient boosting, and multi-layer perceptron models).

    Air pollution exposure assessment in sparsely monitored settings; applying machine-learning methods with remote sensing data in South Africa.

    Air pollution is one of the leading environmental risk factors to human health: both short- and long-term exposure to air pollution impact human health, accounting for over 4 million deaths. Although the risk of exposure to air pollution has been quantified in different settings and countries of the world, the majority of these studies are from high-income countries with historical air pollutant measurement data and corresponding health outcomes data to conduct such epidemiological studies. Air pollution exposure levels in these high-income settings are lower than the exposure levels in low-income countries. The exposure level in sub-Saharan Africa (SSA) countries has continued to increase due to rapid industrialization and urbanization. In addition, the underlying susceptibility profile of the SSA population is different from the profiles of the populations in high-income settings. However, a major limitation to conducting epidemiological studies to quantify the exposure-response relationship between air pollution and adverse health outcomes in SSA is the paucity of historical air pollution measurement data to inform such studies. South Africa, an SSA country with some air quality monitoring stations, especially in areas classified as air pollution priority areas, has historical measurement data for particulate matter less than or equal to 10 micrometres in aerodynamic diameter (PM10, in μg/m3). PM10 is one of the most monitored criteria air pollutants in South Africa. The availability of satellite-derived aerosol optical depth (AOD) at high spatial and temporal resolutions provides information about how particles in the atmosphere can prevent sunlight from reaching the ground. This satellite product has been used as a proxy variable to explain ground-level air pollution levels in different settings.
The main objective of this thesis was to use satellite-derived AOD to bridge the gap in ground-monitored PM10 across four provinces of South Africa (Gauteng, Mpumalanga, KwaZulu-Natal and Western Cape). We collected PM10 ground monitor measurement data from the South Africa Weather Services across the four provinces for the years 2010 – 2017. Because of gaps in the daily PM10 data across the sites and years, in Study I we compared methods for imputing daily ground-level PM10 data at sites across the four provinces for the years 2010 – 2017 using random forest (RF) models. The reliability of air pollution exposure models depends on how well the models capture the spatial and temporal variation of air pollution. Thus, Study II explored the spatial and temporal variations in ground monitor PM10 across the four provinces for the years 2010 – 2017. To explore the feasibility of using satellite-derived AOD and other spatial and temporal predictor variables, Study III used an ensemble machine-learning framework of RF, extreme gradient boosting (XGBoost) and support vector regression (SVR) to calibrate daily ground-level PM10 at 1 × 1 km spatial resolution across the four provinces for the year 2016. In conclusion, we developed a spatiotemporal model to predict daily PM10 concentrations across four provinces of South Africa at 1 × 1 km spatial resolution for 2016. This model is the first attempt to use a satellite-derived product to fill the gap in ground monitor air pollution data in SSA.

    Spatiotemporal and temporal forecasting of ambient air pollution levels through data-intensive hybrid artificial neural network models

    Outdoor air pollution (AP) is a serious public threat which has been linked to severe respiratory and cardiovascular illnesses and premature deaths, especially among those residing in highly urbanised cities. As such, there is a need to develop early-warning and risk management tools to alleviate its effects. The main objective of this research is to develop AP forecasting models based on Artificial Neural Networks (ANNs) according to a model-building protocol identified from existing related works. Plain, hybrid and ensemble ANN model architectures were developed to estimate the temporal and spatiotemporal variability of hourly NO2 levels in several locations in the Greater London area. Wavelet decomposition was integrated with Multilayer Perceptron (MLP) and Long Short-Term Memory (LSTM) models to address the issue of high variability of AP data and improve the estimation of peak AP levels. Block-splitting and cross-validation procedures have been adapted to validate the models based on Root Mean Squared Error (RMSE), Mean Absolute Error (MAE), and Willmott’s index of agreement (IA). The proposed models perform better than the benchmark models. For instance, the proposed wavelet-based hybrid approach reduced the RMSE and MAE by 39.15% and 28.58%, respectively, relative to the benchmark MLP model for the temporal forecasting of NO2 levels. The same approach reduced the RMSE and MAE of the benchmark LSTM model by 12.45% and 20.08%, respectively, for the spatiotemporal estimation of NO2 levels at one site in Central London. The proposed hybrid deep learning approach offers great potential to be operational in providing air pollution forecasts in areas without a reliable database.
The model-building protocol adapted in this thesis can also be applied to studies using measurements from other sites.
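For readers unfamiliar with the validation measures named in this abstract, the following minimal sketch computes RMSE, MAE and Willmott's index of agreement, plus the percentage reduction of an error score relative to a benchmark. It assumes the original (1981) form of the index, d = 1 - Σ(P - O)² / Σ(|P - Ō| + |O - Ō|)², since the abstract does not say which variant the thesis used; the function names and sample values are hypothetical.

```python
from math import sqrt


def rmse(obs, pred):
    """Root mean squared error between observations and predictions."""
    return sqrt(sum((p - o) ** 2 for o, p in zip(obs, pred)) / len(obs))


def mae(obs, pred):
    """Mean absolute error between observations and predictions."""
    return sum(abs(p - o) for o, p in zip(obs, pred)) / len(obs)


def index_of_agreement(obs, pred):
    """Willmott's index of agreement (assumed original 1981 form); 1 is a perfect match."""
    obs_mean = sum(obs) / len(obs)
    numerator = sum((p - o) ** 2 for o, p in zip(obs, pred))
    denominator = sum(
        (abs(p - obs_mean) + abs(o - obs_mean)) ** 2 for o, p in zip(obs, pred)
    )
    return 1.0 - numerator / denominator


def percent_reduction(benchmark_error, proposed_error):
    """Relative reduction of an error score with respect to a benchmark, in percent."""
    return 100.0 * (benchmark_error - proposed_error) / benchmark_error


# Hypothetical values, not data from the thesis.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [2.0, 3.0, 4.0, 5.0]
print(rmse(observed, predicted), mae(observed, predicted))
print(index_of_agreement(observed, predicted))
```

A reported figure such as a 39.15% reduction in RMSE corresponds to `percent_reduction(benchmark_rmse, proposed_rmse)` under this definition.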

    The FORUM end-to-end simulator project: architecture and results

    FORUM (Far-infrared Outgoing Radiation Understanding and Monitoring) will fly as ESA's ninth Earth Explorer mission, and an end-to-end simulator (E2ES) has been developed as a support tool for the mission selection process and the subsequent development phases. The current status of the FORUM E2ES project is presented together with the characterization of the capabilities of a full physics retrieval code applied to FORUM data. We show how the instrument characteristics and the observed scene conditions affect the spectrum measured by the instrument, accounting for the main sources of error related to the entire acquisition process, and the consequences for the retrieval algorithm. Both homogeneous and heterogeneous case studies are simulated in clear and cloudy conditions, validating the E2ES against appropriate well-established correlative codes. The tests show that the performance of the retrieval algorithm is compliant with the project requirements in both clear and cloudy conditions. The far-infrared (FIR) part of the FORUM spectrum is shown to be sensitive to surface emissivity in dry atmospheric conditions and to cirrus clouds, resulting in improved performance of the retrieval algorithm in these conditions. The retrieval errors increase with increasing scene heterogeneity, both in terms of surface characteristics and in terms of fractional cloud cover of the scene.