1,015 research outputs found

    Global state, local decisions: Decentralized NFV for ISPs via enhanced SDN

    The network functions virtualization (NFV) paradigm is rapidly gaining interest among Internet service providers (ISPs). However, the transition to this paradigm on ISP networks comes with a unique set of challenges: legacy equipment already in place, heterogeneous traffic from multiple clients, and very large scalability requirements. In this article we thoroughly analyze these challenges and discuss NFV design guidelines that address them efficiently. In particular, we show that decentralizing NFV control while maintaining global state improves scalability, offers better per-flow decisions, and simplifies the implementation of virtual network functions. Building on these principles, we propose a partially decentralized NFV architecture enabled via an enhanced software-defined networking (SDN) infrastructure. We also perform a qualitative analysis of the architecture to identify advantages and challenges. Finally, based on the qualitative analysis, we determine the bottleneck component, which we implement and benchmark in order to assess the feasibility of the architecture.
    Peer Reviewed. Postprint (author's final draft).

    Collaborative Improvement of Smart Manufacturing using Privacy-Preserving Federated Learning

    Data sharing among different sources is currently very challenging in the manufacturing domain, mainly due to industry competition, complicated bureaucratic processes, and privacy and security concerns. Centralized Machine Learning (ML) plays an essential role in several industries, including smart manufacturing. However, this approach may lead to several issues regarding security and performance. In response to these problems, Federated Learning (FL) was created. FL is an innovative and decentralized approach to ML, focused on collaboration and data privacy. In this approach, data is kept at each source, where models are trained locally, and only model weights or gradients are shared to create a global model. Although several works have already addressed this problem, there are still many unresolved issues concerning the application of FL frameworks in smart manufacturing scenarios. Among the issues found in the analysed works, two stand out: the disregard for Industry 4.0 architectures and strategies, and the unavailability of those frameworks for further improvement. This work aims to build an FL framework for smart manufacturing with specific concerns for privacy and applicability in industrial scenarios. The main focus of this framework is to facilitate a collaborative approach to applying ML in manufacturing by enabling knowledge sharing for this purpose while treating privacy as a special concern. In addition, the work emphasizes the implementation and testing of privacy-preserving algorithms while improving the framework for industrial scenarios. A modular approach is chosen to create a framework adaptable to various industrial cases, implemented as several nodes that each focus on a specific aspect: data collection, data treatment, connection with the FL system, and ML model management.
    The results revealed competitive model performance for the framework compared to the centralized approach, while keeping data at each source and protecting its privacy. The implemented framework also proved to be compliant with the IEEE Std 3652.1-2020 standard guidelines, attaining the established requirement levels.
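The aggregation step described in this abstract, where only model weights are shared to form a global model, can be illustrated with a minimal sketch. It assumes a FedAvg-style weighted average over flat weight vectors; the function name `federated_average` and the use of local sample counts as weights are illustrative assumptions, not details taken from the thesis.

```python
from typing import List, Sequence


def federated_average(
    client_weights: Sequence[Sequence[float]],
    client_sizes: Sequence[int],
) -> List[float]:
    """Combine client weight vectors into one global vector.

    Hypothetical helper: each client contributes in proportion to the
    number of local samples it trained on. Raw data never leaves a client.
    """
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one sample count per client weight vector")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")

    dim = len(client_weights[0])
    global_weights = [0.0] * dim
    for weights, n_samples in zip(client_weights, client_sizes):
        if len(weights) != dim:
            raise ValueError("all weight vectors must have the same length")
        for i, w in enumerate(weights):
            global_weights[i] += w * n_samples / total
    return global_weights


if __name__ == "__main__":
    # Two clients; the second holds three times as much data as the first.
    print(federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3]))
```

In a real deployment the vectors would be full model parameter sets and the exchange would typically be protected by a privacy-preserving mechanism; this sketch covers only the averaging arithmetic.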

    The Family of MapReduce and Large Scale Data Processing Systems

    In the last two decades, the continuous increase of computational power has produced an overwhelming flow of data, which has called for a paradigm shift in computing architectures and large-scale data processing mechanisms. MapReduce is a simple and powerful programming model that enables easy development of scalable parallel applications to process vast amounts of data on large clusters of commodity machines. It isolates the application from the details of running a distributed program, such as data distribution, scheduling, and fault tolerance. However, the original implementation of the MapReduce framework had some limitations that have been tackled by many research efforts in follow-up works. This article provides a comprehensive survey of a family of approaches and mechanisms for large-scale data processing that have been implemented based on the original idea of the MapReduce framework and are currently gaining a lot of momentum in both research and industrial communities. We also cover a set of systems that provide declarative programming interfaces on top of the MapReduce framework. In addition, we review several large-scale data processing systems that resemble some of the ideas of the MapReduce framework for different purposes and application scenarios. Finally, we discuss some future research directions for implementing the next generation of MapReduce-like solutions.
    Comment: arXiv admin note: text overlap with arXiv:1105.4252 by other author
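To make the programming model concrete, here is a minimal single-process sketch of the classic word-count example. It only mimics the map, shuffle, and reduce phases in memory; the function names are illustrative assumptions, and the distribution, scheduling, and fault tolerance that a real MapReduce framework provides are not modelled.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(document: str) -> Iterator[Tuple[str, int]]:
    """Emit a (word, 1) pair for every word in one input document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group emitted values by key, as the framework would between phases."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Sum all counts emitted for one word."""
    return key, sum(values)


def word_count(documents: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle, and reduce sequentially over the documents."""
    pairs = (pair for doc in documents for pair in map_phase(doc))
    return dict(reduce_phase(k, vs) for k, vs in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["a b a", "b c"]))
```

The application author writes only `map_phase` and `reduce_phase`; in a real cluster deployment the grouping step and the parallel execution are the framework's responsibility.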

    FLIPS: Federated Learning using Intelligent Participant Selection

    This paper presents the design and implementation of FLIPS, a middleware system to manage data and participant heterogeneity in federated learning (FL) training workloads. In particular, we examine the benefits of label distribution clustering on participant selection in federated learning. FLIPS clusters the parties involved in an FL training job a priori, based on the label distribution of their data, and during FL training ensures that each cluster is equitably represented among the selected participants. FLIPS can support the most common FL algorithms, including FedAvg, FedProx, FedDyn, FedOpt and FedYogi. To manage platform heterogeneity and dynamic resource availability, FLIPS incorporates a straggler management mechanism to handle changing capacities in distributed, smart community applications. Privacy of label distributions, clustering and participant selection is ensured through a trusted execution environment (TEE). Our comprehensive empirical evaluation compares FLIPS with random participant selection, as well as two other "smart" selection mechanisms, Oort and gradient clustering, using two real-world datasets, two different non-IID distributions and three common FL algorithms (FedYogi, FedProx and FedAvg). We demonstrate that FLIPS significantly improves convergence, achieving 17-20% higher accuracy with 20-60% lower communication costs, and that these benefits endure in the presence of straggler participants.
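The idea of clustering parties by label distribution and then selecting participants so that every cluster is represented can be sketched as follows. This is a simplified illustration, not the FLIPS implementation: grouping parties by their most frequent label stands in for a proper clustering algorithm, and the function names, the round-robin selection, and the `seed` parameter are all assumptions made for the example.

```python
import random
from collections import defaultdict
from typing import Dict, List, Sequence


def cluster_by_dominant_label(
    label_counts: Dict[str, Sequence[int]],
) -> Dict[int, List[str]]:
    """Group parties by the index of their most frequent label.

    Simplified stand-in for clustering on the full label distribution.
    """
    clusters: Dict[int, List[str]] = defaultdict(list)
    for party, counts in label_counts.items():
        dominant = max(range(len(counts)), key=lambda i: counts[i])
        clusters[dominant].append(party)
    return dict(clusters)


def select_participants(
    clusters: Dict[int, List[str]], k: int, seed: int = 0
) -> List[str]:
    """Pick up to k parties, cycling over clusters so each is represented."""
    rng = random.Random(seed)
    # Shuffle a copy of each cluster so the choice within a cluster is random.
    pools = [rng.sample(members, len(members))
             for _, members in sorted(clusters.items())]
    selected: List[str] = []
    while len(selected) < k and any(pools):
        for pool in pools:
            if pool and len(selected) < k:
                selected.append(pool.pop())
    return selected


if __name__ == "__main__":
    counts = {"p1": [9, 1], "p2": [8, 2], "p3": [1, 9], "p4": [2, 8]}
    print(select_participants(cluster_by_dominant_label(counts), k=2))
```

With purely random selection, both chosen parties could come from the same cluster and skew a training round toward one label; cycling over clusters avoids that in this toy setting.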

    A review of machine learning for big data analysis

    Big data is key to the success of many large technology companies. As more and more companies use it to store, analyze, and extract value from their huge amounts of data, it becomes harder for them to use that data in the best way. Most systems have adopted machine learning in some form. In a real-time web system, data must be processed intelligently at each node, based on data that is distributed across nodes. As data privacy becomes a more important social issue, federated learning has become a popular area of research, making it possible for different organizations to train machine learning models together while preserving privacy. Researchers are increasingly interested in supporting more machine learning models that preserve privacy in different ways. There is a need to build systems and infrastructure that make it easier to develop different federated learning algorithms. In this research, we review and discuss the unified and distributed machine learning technology that is used to process large amounts of data. FedML is a Python package that lets machine learning be used at any scale. It is a unified, distributed machine learning package.