1,015 research outputs found
Global state, local decisions: Decentralized NFV for ISPs via enhanced SDN
The network functions virtualization paradigm is rapidly gaining interest among Internet service providers. However, the transition to this paradigm on ISP networks comes with a unique set of challenges: legacy equipment already in place, heterogeneous traffic from multiple clients, and very large scalability requirements. In this article we thoroughly analyze such challenges and discuss NFV design guidelines that address them efficiently. Particularly, we show that a decentralization of NFV control while maintaining global state improves scalability, offers better per-flow decisions and simplifies the implementation of virtual network functions. Building on top of such principles, we propose a partially decentralized NFV architecture enabled via an enhanced software-defined networking infrastructure. We also perform a qualitative analysis of the architecture to identify advantages and challenges. Finally, we determine the bottleneck component, based on the qualitative analysis, which we implement and benchmark in order to assess the feasibility of the architecture. Peer Reviewed. Postprint (author's final draft)
Collaborative Improvement of Smart Manufacturing using Privacy-Preserving Federated Learning
Nowadays, data sharing among different sources is very challenging in the manufacturing
domain, mainly due to industry competition, complicated bureaucratic processes,
and privacy and security concerns. Centralized Machine Learning (ML) is an essential
component in several industries, including smart manufacturing. However, this approach may
lead to several issues regarding security and performance.
In response to these problems, Federated Learning (FL) was created. FL is an innovative
and decentralized approach to ML, focused on collaboration and data privacy. In this
approach, data is kept at each source, where a model is trained locally, and only model weights
or gradients are shared to create a global model.
Although several works have already addressed this problem, there
are still many unresolved issues concerning the application of FL frameworks in smart
manufacturing scenarios. Among the issues found in the analysed works, it is
important to emphasize the disregard for Industry 4.0 architectures and strategies and
the inability to improve those frameworks further.
This work aims to build an FL framework for smart manufacturing with specific concerns
in privacy and applicability in industrial scenarios. The main focus of this framework
is to facilitate a collaborative approach in the application of ML to manufacturing by
enabling knowledge sharing for this purpose and taking privacy as a special concern.
In addition, the implementation and testing of privacy-preserving algorithms, while improving
the framework for industrial scenarios, are emphasized. A modular approach is
chosen to create a framework adapted to various industrial cases by implementing several
nodes that focus on specific aspects of data collection, data treatment, connection with
the FL system, and ML model management.
The results revealed a competitive model performance of the framework compared to
the centralized approach while keeping data at each source, protecting its privacy. The
implemented framework also proved to be compliant with the IEEE Std 3652.1-2020
standard guidelines, attaining the established requirement levels.
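The abstract above states that only model weights or gradients leave each source and are combined into a global model. A minimal sketch of that aggregation step follows, assuming a sample-weighted average in the style of FedAvg; the abstract does not say which aggregation rule the framework uses, and the function name, the dictionary layout and the two example sites are hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical layout: parameter name -> flat list of values.
Weights = Dict[str, List[float]]


def federated_average(updates: List[Tuple[Weights, int]]) -> Weights:
    """Combine client weights into a global model.

    Each update is (local_weights, local_sample_count). Clients with more
    local samples contribute proportionally more (assumed FedAvg-style rule).
    Raw training data never appears here, only the weights.
    """
    total = sum(count for _, count in updates)
    if total <= 0:
        raise ValueError("at least one client must report samples")

    first_weights = updates[0][0]
    global_weights: Weights = {}
    for name, values in first_weights.items():
        averaged = [0.0] * len(values)
        for weights, count in updates:
            for i, value in enumerate(weights[name]):
                averaged[i] += value * count / total
        global_weights[name] = averaged
    return global_weights


# Two hypothetical manufacturing sites that trained the same model locally.
site_a = ({"w": [1.0, 2.0], "b": [0.0]}, 100)
site_b = ({"w": [3.0, 4.0], "b": [1.0]}, 300)
global_model = federated_average([site_a, site_b])
```

With these inputs, site_b holds three quarters of the samples, so the global weights sit closer to its local weights than to those of site_a.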
The Family of MapReduce and Large Scale Data Processing Systems
In the last two decades, the continuous increase of computational power has
produced an overwhelming flow of data which has called for a paradigm shift in
the computing architecture and large scale data processing mechanisms.
MapReduce is a simple and powerful programming model that enables easy
development of scalable parallel applications to process vast amounts of data
on large clusters of commodity machines. It isolates the application from the
details of running a distributed program such as issues on data distribution,
scheduling and fault tolerance. However, the original implementation of the
MapReduce framework had some limitations that have been tackled by many
research efforts in several follow-up works after its introduction. This article
provides a comprehensive survey of a family of approaches and mechanisms for
large scale data processing that have been implemented based on the
original idea of the MapReduce framework and are currently gaining a lot of
momentum in both research and industrial communities. We also cover a set of
introduced systems that have been implemented to provide declarative
programming interfaces on top of the MapReduce framework. In addition, we
review several large scale data processing systems that resemble some of the
ideas of the MapReduce framework for different purposes and application
scenarios. Finally, we discuss some of the future research directions for
implementing the next generation of MapReduce-like solutions. Comment: arXiv admin note: text overlap with arXiv:1105.4252 by other author
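The programming model described in the abstract above can be illustrated with the classic word-count example. This is a single-process sketch of the map, group-by-key and reduce phases only; it assumes nothing about any particular MapReduce implementation, omits the data distribution, scheduling and fault tolerance that a real framework handles, and all function names are hypothetical.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, Iterator, List, Tuple


def map_words(_doc_id: str, text: str) -> Iterator[Tuple[str, int]]:
    """Map phase: emit (word, 1) for every word in one document."""
    for word in text.lower().split():
        yield word, 1


def reduce_counts(_word: str, counts: List[int]) -> int:
    """Reduce phase: sum all values emitted for one key."""
    return sum(counts)


def run_mapreduce(
    inputs: Iterable[Tuple[str, str]],
    mapper: Callable[[str, str], Iterator[Tuple[str, int]]],
    reducer: Callable[[str, List[int]], int],
) -> Dict[str, int]:
    """Run map, group intermediate pairs by key, then reduce each group."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in inputs:
        for out_key, out_value in mapper(key, value):
            groups[out_key].append(out_value)  # stands in for the shuffle step
    return {key: reducer(key, values) for key, values in groups.items()}


docs = [("d1", "map reduce map"), ("d2", "reduce shuffle")]
counts = run_mapreduce(docs, map_words, reduce_counts)
```

The application code is limited to `map_words` and `reduce_counts`; everything in `run_mapreduce` is what a framework would take over and run across a cluster.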
FLIPS: Federated Learning using Intelligent Participant Selection
This paper presents the design and implementation of FLIPS, a middleware
system to manage data and participant heterogeneity in federated learning (FL)
training workloads. In particular, we examine the benefits of label
distribution clustering on participant selection in federated learning. FLIPS
clusters parties involved in an FL training job based on the label distribution
of their data a priori and, during FL training, ensures that each cluster is
equitably represented in the participants selected. FLIPS can support the most
common FL algorithms, including FedAvg, FedProx, FedDyn, FedOpt and FedYogi. To
manage platform heterogeneity and dynamic resource availability, FLIPS
incorporates a straggler management mechanism to handle changing capacities in
distributed, smart community applications. Privacy of label distributions,
clustering and participant selection is ensured through a trusted execution
environment (TEE). Our comprehensive empirical evaluation compares FLIPS with
random participant selection, as well as two other "smart" selection mechanisms,
Oort and gradient clustering, using two real-world datasets, two different
non-IID distributions and three common FL algorithms (FedYogi, FedProx and
FedAvg). We demonstrate that FLIPS significantly improves convergence,
achieving 17-20% higher accuracy with 20-60% lower communication
costs, and these benefits endure in the presence of straggler participants.
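The idea of keeping each label-distribution cluster equitably represented among the selected participants can be sketched as follows. The sketch assumes cluster assignments were already computed beforehand and uses a simple round-robin draw across clusters; the abstract does not describe the actual selection procedure in FLIPS, so this rule, the function name and the example parties are all hypothetical, and straggler handling and the TEE are left out.

```python
import random
from collections import defaultdict
from typing import Dict, Hashable, List, Optional


def select_participants(
    cluster_of: Dict[str, Hashable], k: int, seed: Optional[int] = None
) -> List[str]:
    """Pick up to k parties, drawing from the clusters in round-robin order.

    cluster_of maps party id -> cluster id (assumed precomputed from the
    parties' label distributions). No cluster receives a second pick before
    every non-empty cluster has received its first.
    """
    rng = random.Random(seed)
    members: Dict[Hashable, List[str]] = defaultdict(list)
    for party, cluster in cluster_of.items():
        members[cluster].append(party)

    pools = list(members.values())
    for pool in pools:
        rng.shuffle(pool)  # random order within each cluster
    rng.shuffle(pools)  # leftover slots do not always favour the same clusters

    selected: List[str] = []
    while len(selected) < k and any(pools):
        for pool in pools:
            if pool and len(selected) < k:
                selected.append(pool.pop())
    return selected


# Hypothetical parties grouped into three label-distribution clusters.
cluster_of = {
    "p1": "A", "p2": "A", "p3": "A", "p4": "A",
    "p5": "B", "p6": "B",
    "p7": "C",
}
chosen = select_participants(cluster_of, k=3, seed=0)
```

With three clusters and `k=3`, every cluster contributes exactly one party, whereas uniform random selection would often draw several parties from the large cluster A and none from C.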
A review of machine learning for big data analysis
Big data is the key to the success of many large technology companies right now. As more and more companies use it to store, analyze, and get value from their huge amounts of data, it gets harder for them to use the data they get in the best way. Most systems have come up with ways to use machine learning. In a real-time web system, data must be processed in a smart way at each node based on data that is spread out. As data privacy becomes a more important social issue, federated learning has become a popular area of research, making it possible for different organizations to train machine learning models together while keeping privacy in mind. Researchers are becoming more interested in supporting more machine learning models that preserve privacy in different ways. There is a need to build systems and infrastructure that make it easier to create different federated learning algorithms. In this research, we examine and discuss the unified and distributed machine learning technology that is used to process large amounts of data. FedML is a Python package that lets machine learning be used at any scale. It is a unified, distributed machine learning package.
- …