Active-Code Replacement in the OODIDA Data Analytics Platform
OODIDA (On-board/Off-board Distributed Data Analytics) is a platform for
distributing and executing concurrent data analytics tasks. It targets fleets
of reference vehicles in the automotive industry and has a particular focus on
rapid prototyping. Its underlying message-passing infrastructure has been
implemented in Erlang/OTP. External Python applications perform data analytics
tasks. Most work is performed by clients (on-board). A central cloud server
performs supplementary tasks (off-board). OODIDA can be automatically packaged
and deployed, which necessitates restarting parts of the system, or all of it.
This is potentially disruptive. To address this issue, we added the ability to
execute user-defined Python modules on clients as well as the server. These
modules can be replaced without restarting any part of the system and they can
even be replaced between iterations of an ongoing assignment. This facilitates
use cases such as iterative A/B testing of machine learning algorithms or
modifying experimental algorithms on-the-fly.
Comment: 6 pages, 2 figures; Published in Euro-Par 2019: Parallel Processing
Workshops proceedings; DOI was added to the PDF. There is also an extended
version of this paper, cf. arXiv admin note: text overlap with
arXiv:1903.0947
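The idea of swapping a user-defined Python module between iterations of an ongoing assignment can be illustrated with a minimal sketch. This is not OODIDA's implementation: the helper `load_module`, the entry point `custom`, and the two module versions are hypothetical names chosen for illustration, and the sketch assumes the module arrives as source text.

```python
import types


def load_module(name, source):
    """Build a fresh module object from source text (hypothetical helper)."""
    module = types.ModuleType(name)
    # Compile and execute the source inside the new module's namespace.
    exec(compile(source, f"<{name}>", "exec"), module.__dict__)
    return module


# Two hypothetical versions of a user-defined analytics module.
VERSION_A = "def custom(values):\n    return sum(values) / len(values)\n"
VERSION_B = "def custom(values):\n    return max(values)\n"


def run_assignment(batches, sources):
    """Run one iteration per batch, loading the current module source first."""
    results = []
    for batch, source in zip(batches, sources):
        module = load_module("user_module", source)
        results.append(module.custom(batch))
    return results


# The module is replaced between the first and second iteration,
# while the surrounding process keeps running.
results = run_assignment([[1, 2, 3], [1, 2, 3]], [VERSION_A, VERSION_B])
```

Because the module is rebuilt from source at the start of each iteration, a new version takes effect at the next iteration without restarting the host process.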
S-RASTER: Contraction Clustering for Evolving Data Streams
Contraction Clustering (RASTER) is a single-pass algorithm for density-based
clustering of 2D data. It can process arbitrary amounts of data in linear time
and in constant memory, quickly identifying approximate clusters. It also
exhibits good scalability in the presence of multiple CPU cores. RASTER
exhibits very competitive performance compared to standard clustering
algorithms, but at the cost of decreased precision. Yet, RASTER is limited to
batch processing and unable to identify clusters that only exist temporarily.
In contrast, S-RASTER is an adaptation of RASTER to the stream processing
paradigm that is able to identify clusters in evolving data streams. This
algorithm retains the main benefits of its parent algorithm, i.e. single-pass
linear time cost and constant memory requirements for each discrete time step
within a sliding window. The sliding window is efficiently pruned, and
clustering is still performed in linear time. Like RASTER, S-RASTER trades off
an often negligible amount of precision for speed. Our evaluation shows that
competing algorithms are at least 50% slower. Furthermore, S-RASTER shows good
qualitative results, based on standard metrics. It is very well suited to
real-world scenarios where clustering does not happen continually but only
periodically.
Comment: 24 pages, 5 figures, 2 tables
Compiling Agda to System Fω in Theory
We develop a theoretical foundation for compiling the programming
language Agda to System Fω, which is a stepping stone towards
a compiler from Agda to Haskell. We highlight the practical relevance
for software engineering and for the problem of providing correctness
guarantees for programs. After describing the relevant λ-calculi, we
specify the semantics for compiling Agda to System Fω.
Finally, we illustrate those compilation rules by manually translating
several Agda code examples to System Fω.
Purely Functional Federated Learning in Erlang
Arguably the biggest strength of the functional programming language Erlang is how straightforward it is to implement concurrent and distributed programs with it. Numerical computing, on the other hand, is not necessarily seen as one of its strengths. The recent introduction of Federated Learning, a concept according to which edge devices are leveraged for decentralized machine learning tasks, while a central server only updates and distributes a global model, provided the motivation for exploring how well Erlang was suited to such a use case. We present a framework for Federated Learning in Erlang, written in a purely functional style, and compare two versions of it: one that has been exclusively written in Erlang, and one in which Erlang is relegated to coordinating client processes that rely on performing numerical computations in the programming language C. Initial results are promising, as we learnt that a real-world industrial use case of distributed data analytics can easily be tackled with a system purely written in Erlang. The novelty of our work is that we present the first implementation of a Federated Learning framework in a functional programming language, with the added benefit of being purely functional. In addition, we demonstrate that Erlang can not only be leveraged for message passing but also performs adequately for practical machine learning tasks.
Functional Federated Learning in Erlang
A modern connected car produces gigabytes to terabytes of data per day. Collecting data generated by an entire fleet of cars, and processing it centrally on a server farm, is thus not feasible. The problem is that the total amount of data generated by cars, i.e. on edge devices, is too large to be efficiently transmitted to a central server. However, CPUs used in edge devices, such as connected cars but also regular smartphones that connect to the cloud, have been getting more and more powerful in recent years. Tapping into this computational resource is one way of addressing the problem of processing big data that is generated by large numbers of edge devices. One such approach consists of distributed data processing. Using the example of training an Artificial Neural Network, we introduce a framework for distributed data processing. A particular focus is on the implementation language Erlang. Arguably the biggest strength of the functional programming language Erlang is how straightforward it is to implement concurrent and distributed programs with it. Numerical computing, on the other hand, is not necessarily seen as one of its strengths. The recent introduction of Federated Learning, a concept according to which edge devices are leveraged for decentralized machine learning tasks, while a central server only updates and distributes a global model, provides the motivation for exploring how well Erlang is suited to such a use case. We present a framework for Federated Learning in Erlang, written in a purely functional style. Erlang is used for coordinating data processing tasks but also for performing numerical computations. Initial results show that Erlang is well-suited for that kind of task. We provide an overview of the general framework and also discuss an existing and fully realized in-house prototypical implementation that performs distributed machine learning tasks according to the Federated Learning paradigm.
While we focus on Artificial Neural Networks, our Federated Learning framework is of a more general nature and could also be used with other machine learning algorithms. The novelty of our work is that we present the first publicly available implementation of a Federated Learning framework; our work is also the first implementation of Federated Learning in a functional programming language, with the added benefit of being purely functional. In addition, we demonstrate that Erlang can not only be leveraged for message passing but that it also performs adequately for practical machine learning tasks. Our presentation is based on our work-in-progress paper “Purely Functional Federated Learning in Erlang”, which we presented at IFL 2017. The context of this research is our ongoing involvement in the Vinnova-funded project “On-board/off-board distributed data analysis” (OODIDA), which is a joint project between the Fraunhofer-Chalmers Research Centre for Industrial Mathematics, Chalmers University of Technology, Volvo Car Corporation, Volvo Trucks, and Alkit Communications.
A Performance Evaluation of Federated Learning Algorithms
Federated learning is an approach to distributed machine learning where a global model is learned by aggregating models that have been trained locally on data-generating clients. Contrary to centralized optimization, clients can be very large in number and face challenges of data and network heterogeneity. Examples of clients include smartphones and connected vehicles, which highlights the practical relevance of federated learning. We benchmark three federated learning algorithms and compare their performance against a centralized approach where data resides on the server. The algorithms Federated Averaging (FedAvg), Federated Stochastic Variance Reduced Gradient, and CO-OP are evaluated on the MNIST dataset, using both i.i.d. and non-i.i.d. partitionings of the data. Our results show that FedAvg achieves the highest accuracy among the federated algorithms, regardless of how data was partitioned. Our comparison between FedAvg and centralized learning shows that they are practically equivalent when i.i.d. data is used. However, the centralized approach outperforms FedAvg with non-i.i.d. data.
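The aggregation step described above, in which a server combines locally trained models into a global one, can be sketched as follows. This is a minimal illustration and not the benchmarked code: it assumes each client model is a flat list of weights and that clients are weighted by their number of local training samples, which is the usual formulation of Federated Averaging. The function name `federated_average` and the example values are hypothetical.

```python
def federated_average(client_weights, client_sizes):
    """Average client weight vectors, weighted by local sample counts."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]


# Two hypothetical clients: the second holds three times as much data,
# so its weights contribute three times as much to the global model.
global_model = federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3])
```

In a full training loop, the server would send `global_model` back to the clients, which continue local training from it before the next aggregation round.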
Population Growth and Influencing Factors: Observations, Analyses and Perspectives; the World on Its Way into the 21st Century
"The topic of the 1st and 2nd Burse years, 'Environmental Protection and Crisis Management', was approached and worked up by the first Burse generation heterogeneously and with a broad strategy. The results have been published in a conference volume, which can be requested from the Burse secretariat. For the 2nd Burse year now under way, the focus was narrowed, under the same overarching topic, to the problem of population development, and considerably more homogeneous papers and contributions can now be expected. This 1st (internal) workshop of the 2nd Burse year is intended to present an interim status and to prepare the 2nd workshop. The latter will take place in summer 1998 and, like the 1997 workshop, will examine the topic of population development from various perspectives together with external experts. The contributions collected in this volume are devoted primarily to the four Burse fellows and to demonstrating the work they have done so far. These four talks are supplemented by three speakers from within the university, who approach the topic from other standpoints." (Text excerpt). Table of contents: Christopher Stehr: The 'Magic Pentagon': Relationships between World Population, Migration, Economic Growth, Sustainable Development and Human Rights. A Consideration within the Framework of a Constellation Analysis (11-39); Takako Yoshida: The Influencing Factors of Population Growth. An Investigation of the Relationship between Population and Factors with the Respective Correlation Coefficients (41-58); Gregor Black: Mutual Dependence of Population Growth and the Health System. An Exemplary Account Based on the Reproductive Health System (59-71); Toru Sasaki: An Ecological and Biological Consideration of Population Change (73-85); Ruediger Seydel: Dynamical Systems. Structural Changes - Precursors to Chaos (87-108); Eva Haas: Modelling, Simulation and Representation of the Health Status of a Region. An Integrative Approach (109-126); Juergen Strohm: International Measures for Sustainable Development. Ecological and Social Guard Rails for the Global Market (127-154); Christopher Stehr: Afterword (155-158).
SIGLE. Available from UuStB Koeln(38)-20000106563 / FIZ - Fachinformationszentrum Karlsruhe / TIB - Technische Informationsbibliothek. DE. German.