12 research outputs found

    The Paradox of Noise: An Empirical Study of Noise-Infusion Mechanisms to Improve Generalization, Stability, and Privacy in Federated Learning

    In a data-centric era, concerns regarding privacy and ethical data handling grow as machine learning relies more on personal information. This empirical study investigates the privacy, generalization, and stability of deep learning models in the presence of additive noise in federated learning frameworks. Our main objective is to provide strategies to measure the generalization, stability, and privacy-preserving capabilities of these models and further improve them. To this end, five noise infusion mechanisms at varying noise levels within centralized and federated learning settings are explored. As model complexity is a key component of the generalization and stability of deep learning models during training and evaluation, a comparative analysis of three Convolutional Neural Network (CNN) architectures is provided. The paper introduces Signal-to-Noise Ratio (SNR) as a quantitative measure of the trade-off between privacy and training accuracy of noise-infused models, aiming to find the noise level that yields optimal privacy and accuracy. Moreover, the Price of Stability and Price of Anarchy are defined in the context of privacy-preserving deep learning, contributing to the systematic investigation of the noise infusion strategies to enhance privacy without compromising performance. Our research sheds light on the delicate balance between these critical factors, fostering a deeper understanding of the implications of noise-based regularization in machine learning. By leveraging noise as a tool for regularization and privacy enhancement, we aim to contribute to the development of robust, privacy-aware algorithms, ensuring that AI-driven solutions prioritize both utility and privacy

    Imbalanced Learning with Parametric Linear Programming Support Vector Machine For Weather Data Application

    Learning from imbalanced data sets is an aspect of predictive modeling and machine learning that has received a lot of attention in the last decade. Multiple research projects have been carried out to adjust the existing algorithms for accurate predictions of both classes. The model proposed in this thesis is a linear Support Vector Machine model with an L1-norm objective function, with applications on weather data collected from the Bureau of Meteorology system in Australia. Apart from model selection and modifications, we have also introduced a parametric modeling algorithm based on a novel parametric simplex approach for parameter tuning of the Support Vector Machine. The combination of the two proposed approaches has yielded a significant improvement in predicting the minority class and decreased the model’s bias towards the majority class that is seen in most machine learning algorithms.

    Uncovering the Potential of Federated Learning: Addressing Algorithmic and Data-driven Challenges under Privacy Restrictions

    Federated learning is a groundbreaking distributed machine learning paradigm that allows for the collaborative training of models across various entities without directly sharing sensitive data, ensuring privacy and robustness. This Ph.D. dissertation delves into the intricacies of federated learning, investigating the algorithmic and data-driven challenges of deep learning models in the presence of additive noise in this framework. The main objective is to provide strategies to measure the generalization, stability, and privacy-preserving capabilities of these models and further improve them. To this end, five noise infusion mechanisms at varying noise levels within centralized and federated learning settings are explored. As model complexity is a key component of the generalization and stability of deep learning models during training and evaluation, a comparative analysis of three Convolutional Neural Network (CNN) architectures is provided. A key contribution of this study is introducing specific metrics for training with noise. Signal-to-Noise Ratio (SNR) is introduced as a quantitative measure of the trade-off between privacy and training accuracy of noise-infused models, aiming to find the noise level that yields optimal privacy and accuracy. Moreover, the Price of Stability and Price of Anarchy are defined in the context of privacy-preserving deep learning, contributing to the systematic investigation of the noise infusion mechanisms to enhance privacy without compromising performance. This research sheds light on the delicate balance between these critical factors, fostering a deeper understanding of the implications of noise-based regularization in machine learning. The present study also explores a real-world application of federated learning in weather prediction applications that suffer from the issue of imbalanced datasets. 
    Utilizing data from multiple sources combined with advanced data augmentation techniques improves the accuracy and generalization of weather prediction models, even when dealing with imbalanced datasets. Overall, federated learning is pivotal in harnessing decentralized datasets for real-world applications while safeguarding privacy. By leveraging noise as a tool for regularization and privacy enhancement, this research study aims to contribute to the development of robust, privacy-aware algorithms, ensuring that AI-driven solutions prioritize both utility and privacy.

    Exploring Machine Learning Models for Federated Learning: A Review of Approaches, Performance, and Limitations

    In the growing world of artificial intelligence, federated learning is a distributed learning framework enhanced to preserve the privacy of individuals' data. Federated learning lays the groundwork for collaborative research in areas where the data is sensitive. Federated learning has several implications for real-world problems. In times of crisis, when real-time decision-making is critical, federated learning allows multiple entities to work collectively without sharing sensitive data. This distributed approach enables us to leverage information from multiple sources and gain more diverse insights. This paper is a systematic review of the literature on privacy-preserving machine learning in the last few years based on the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. Specifically, we have presented an extensive review of supervised/unsupervised machine learning algorithms, ensemble methods, meta-heuristic approaches, blockchain technology, and reinforcement learning used in the framework of federated learning, in addition to an overview of federated learning applications. This paper reviews the literature on the components of federated learning and its applications in the last few years. The main purpose of this work is to provide researchers and practitioners with a comprehensive overview of federated learning from the machine learning point of view. A discussion of some open problems and future research directions in federated learning is also provided

    The Survey of NOX Distribution Using Dispersion Models AERMOD and CALPUFF at a Gas Refinery

    Background & Objectives: Air pollution is one of the major challenges in the world today. Given the importance of the fourth gas refinery as the largest gas refinery in the region, the present study first determined the amount of emissions from its stacks and then identified their distribution in the region. Methods: In this research, the AERMOD and CALPUFF models were used as tools for the analysis of NOX emissions from the stacks of the 4th South Pars gas refinery located in Assaluyeh. First, NOX emissions from the refinery stacks were obtained by field measurements. Then, the distribution of these emissions was examined using the dispersion models AERMOD and CALPUFF over an area of 50 × 50 km (in the x and y directions) for the one-year period of 2013, at averaging times of 1, 3, 8, and 24 hours, and the values produced by the models were compared with the results of field measurements at 9 receiving stations, each treated as a separate receptor in the model. Results: Review of the charts and statistical parameters showed that, according to the evaluation of the predictions made, the CALPUFF model performed better than the AERMOD model in the studied area. Conclusion: The performance of both models in predicting the concentration of pollutants in the region can generally be considered acceptable.

    Dispersion Modeling of CO with AERMOD in South Pars fourth Gas Refinery

    Background: Air quality modeling is a useful tool for predicting future air quality and determining control strategies for emissions abatement. In this study, the AERMOD dispersion model was applied as a tool for the analysis of CO emissions from the stacks and flares of the South Pars fourth Gas Refinery located in Asaluyeh. Methods: First, the CO emissions from the refinery's stacks and flares were determined by measurement and by using emission factors in the four seasons of 2013. Then, the dispersion of pollutants was predicted using the AERMOD model over an area of 10 × 10 km² (in the x and y directions), for averaging times of 1, 3, 8, and 24 hours and for the annual statistical period. The predicted values were then compared with field measurements at 9 receptors. Results: Statistical evaluation showed that the correlation coefficient values for CO were 0.85 in spring, 0.89 in summer, 0.96 in fall, and 0.95 in winter. The maximum concentration of CO occurred at the local scale of 10 × 10 km². Conclusion: Comparison of the maximum 1-hour and 8-hour predicted concentrations with the national and international standards showed that the CO concentration is higher than the standard values. Overall, according to the evaluation of the predictions made, the performance of the AERMOD model was acceptable in predicting CO concentrations in the study area.