Enabling scalability by partitioning virtual environments using frontier sets
We present a class of partitioning schemes that we call frontier sets. Frontier sets build on the notion of a potentially visible set (PVS). In a PVS, a world is subdivided into cells, and for each cell the set of other cells visible from it is computed. In contrast, a frontier set considers pairs of cells, A and B. For each pair, it lists two sets of cells (two frontiers), FAB and FBA. By definition, no cell in FBA is visible from any cell in FAB, and vice versa.
Our initial use of frontier sets has been to enable scalability in distributed networking. This is possible because, for example, if at time t0 Player1 is in cell A and Player2 is in cell B, as long as they stay in their respective frontiers, they do not need to send update information to each other.
In this paper we describe two strategies for building frontier sets. Both strategies are dynamic and compute frontiers only as necessary at runtime. The first is distance-based frontiers. This strategy requires precomputation of an enhanced potentially visible set. The second is greedy frontiers. This strategy is more expensive to compute at runtime, however it leads to larger and thus more efficient frontiers.
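As a toy illustration of the greedy idea, the following sketch grows two mutually invisible cell sets from a precomputed PVS. The data layout (a dict mapping each cell to the set of cells it can see) and the single greedy pass are illustrative assumptions, not the paper's actual algorithm.

```python
# Toy sketch of greedy frontier construction (an assumption, not the paper's
# exact algorithm). pvs[c] is the set of cells visible from cell c, with
# visibility assumed symmetric.

def greedy_frontiers(pvs, a, b):
    """Return (F_AB, F_BA) such that no cell in F_AB sees any cell in F_BA."""
    if b in pvs[a]:
        return None  # the two cells see each other: no frontier exists
    fab, fba = {a}, {b}
    for cell in pvs:  # greedy, order-dependent single pass
        if cell in fab or cell in fba:
            continue
        if not (pvs[cell] & fba):    # sees nothing in F_BA -> safe for F_AB
            fab.add(cell)
        elif not (pvs[cell] & fab):  # sees nothing in F_AB -> safe for F_BA
            fba.add(cell)
    return fab, fba

# While Player1 stays inside F_AB and Player2 inside F_BA, neither can see
# the other, so they can skip sending update messages to each other.
```

In a linear corridor of five cells where each cell sees only its neighbors, the sketch assigns cells 0-2 to one frontier and cell 4 to the other, leaving cell 3 unassigned as the visibility gap between them.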
Network simulations using code based on the Quake II engine show that frontiers have significant promise and may allow a new class of scalable peer-to-peer game infrastructures to emerge.
Knowledge is at the Edge! How to Search in Distributed Machine Learning Models
With the advent of the Internet of Things and Industry 4.0 an enormous amount
of data is produced at the edge of the network. Due to a lack of computing
power, this data is currently sent to the cloud where centralized machine
learning models are trained to derive higher level knowledge. With the recent
development of specialized machine learning hardware for mobile devices, a new
era of distributed learning is about to begin that raises a new research
question: How can we search in distributed machine learning models? Machine
learning at the edge of the network has many benefits, such as low-latency
inference and increased privacy. Such distributed machine learning models can
also be personalized to a human user, a specific context, or an application
scenario. As training data stays on the devices, control over possibly
sensitive data is preserved as it is not shared with a third party. This new
form of distributed learning leads to the partitioning of knowledge between
many devices which makes access difficult. In this paper we tackle the problem
of finding specific knowledge by forwarding a search request (query) to a
device that can answer it best. To that end, we use an entropy-based quality
metric that takes the context of a query and the learning quality of a device
into account. We show that our forwarding strategy can achieve over 95%
accuracy in an urban mobility scenario where we use data from 30 000 people
commuting in the city of Trento, Italy.
Comment: Published in CoopIS 201
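The entropy-based routing idea can be sketched roughly as follows. The device interface and the decision rule (forward to the device whose model is least uncertain about the query) are illustrative assumptions; the paper's actual metric also weighs the query context and each device's learning quality.

```python
import math

# Rough sketch of entropy-based query forwarding (names and the exact scoring
# rule are assumptions for illustration). Each device exposes a hypothetical
# predict_proba(query) returning its model's class distribution for the query.

def entropy(dist):
    """Shannon entropy of a probability distribution (in nats)."""
    return -sum(p * math.log(p) for p in dist if p > 0)

def forward(devices, query):
    """Forward the query to the device whose prediction is most certain."""
    return min(devices, key=lambda d: entropy(d.predict_proba(query)))
```

A device that answers with a peaked distribution (low entropy) is treated as better trained for that query than one answering with a near-uniform one.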
High Energy Physics Forum for Computational Excellence: Working Group Reports (I. Applications Software II. Software Libraries and Tools III. Systems)
Computing plays an essential role in all aspects of high energy physics. As
computational technology evolves rapidly in new directions, and data throughput
and volume continue to follow a steep trend-line, it is important for the HEP
community to develop an effective response to a series of expected challenges.
In order to help shape the desired response, the HEP Forum for Computational
Excellence (HEP-FCE) initiated a roadmap planning activity with two key
overlapping drivers -- 1) software effectiveness, and 2) infrastructure and
expertise advancement. The HEP-FCE formed three working groups, 1) Applications
Software, 2) Software Libraries and Tools, and 3) Systems (including systems
software), to provide an overview of the current status of HEP computing and to
present findings and opportunities for the desired HEP computational roadmap.
The final versions of the reports are combined in this document, and are
presented along with introductory material.
Comment: 72 pages
Parallel programming paradigms and frameworks in big data era
With Cloud Computing emerging as a promising new approach for ad-hoc parallel data processing, major companies have started to integrate frameworks for parallel data processing into their product portfolios, making it easy for customers to access these services and to deploy their programs. We have entered the era of Big Data. The explosion and profusion of available data in a wide range of application domains raise new challenges and opportunities in a plethora of disciplines, ranging from science and engineering to biology and business. One major challenge is how to take advantage of the unprecedented scale of data, typically of heterogeneous nature, in order to acquire further insights and knowledge for improving the quality of the offered services. To exploit this new resource, we need to scale up and scale out both our infrastructures and our standard techniques. Our society is already data-rich, but the question remains whether or not we have the conceptual tools to handle it. In this paper we discuss and analyze opportunities and challenges for efficient parallel data processing. Big Data is the next frontier for innovation, competition, and productivity, and many solutions continue to appear, partly supported by the considerable enthusiasm around the MapReduce paradigm for large-scale data analysis. We review various parallel and distributed programming paradigms, analyzing how they fit into the Big Data era, and present modern emerging paradigms and frameworks. To better support practitioners interested in this domain, we end with an analysis of ongoing research challenges toward the fourth generation of data-intensive science.
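To make the MapReduce paradigm concrete, here is the canonical word-count example reduced to a single-process Python sketch. Real frameworks such as Hadoop distribute the map, shuffle, and reduce phases across a cluster; this toy version only mimics them in sequence.

```python
from collections import defaultdict
from itertools import chain

# Single-process sketch of MapReduce word count. In a real framework the
# three phases below run distributed over many machines.

def map_phase(doc):
    # Emit (key, value) pairs: one (word, 1) per occurrence.
    return [(word, 1) for word in doc.split()]

def shuffle(pairs):
    # Group values by key, as the framework's shuffle stage would.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    # Combine all values emitted for one key.
    return key, sum(values)

def word_count(docs):
    pairs = chain.from_iterable(map_phase(d) for d in docs)
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())
```

The appeal of the paradigm is that each phase is embarrassingly parallel: mappers never coordinate with each other, and each reducer owns a disjoint set of keys.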
Data analytics in IoT FaaS with DataFlasks
Master's dissertation in Computer Science.
The current exponential growth of data demands new strategies for processing and analyzing information.
Increased Internet usage, as well as the everyday appearance of new sources of data, is
generating data volumes to be processed by Cloud applications that are growing much faster than
available Cloud computing power.
These issues, combined with the appearance of new devices with relatively low computational
power (such as smartphones), have pushed for the development of new applications able to make
use of this power as a complement to the Cloud, pushing the frontier of computing applications,
data storage and services to the edge of the network.
However, the Edge computing environment is very unstable: it requires leveraging resources
that may not be continuously connected to a network, and device failure is a certainty. The system
has to be aware of the processing capabilities of each node to achieve proper task distribution,
since there may be a high level of heterogeneity among the system's devices.
A recent approach for developing applications in the Cloud, named Function as a Service (FaaS),
proposes a way to enable data processing in these environments. FaaS services adhere to the principles
of serverless architectures, providing stateless computing containers that allow users to run
code without provisioning or managing servers.
In this dissertation we present OpenFlasks, a new approach to the management and processing
of data in a decentralized manner across Cloud and Edge. We build upon these types of architectures
and other data storage tools and combine them in a novel way to create a flexible system
capable of balancing data storage and data analytics needs in both environments. In addition, we
call for a new approach to provide task execution both in Edge and Cloud environments that is able
to handle high churn and heterogeneity of the system.
Our evaluation shows an increase of up to 18% in the rate of successful task execution in
high-churn environments with OpenFlasks relative to other FaaS systems. In addition, it shows
improvements in load balancing and average resource usage in the system for the execution of
simple analytics at the Edge.
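A capability-aware placement policy of the kind the dissertation argues for could be sketched as follows. OpenFlasks' actual scheduler is not specified in the abstract, so the node representation and the "most spare capacity" rule are assumptions for illustration only.

```python
# Hypothetical sketch of capability-aware task placement under churn; the
# node layout and scoring rule are assumptions, not OpenFlasks' scheduler.

def place_task(nodes, task_cost):
    """Pick the reachable node with the most spare capacity, else None."""
    candidates = [n for n in nodes
                  if n["alive"] and n["capacity"] - n["load"] >= task_cost]
    if not candidates:
        return None  # no edge node fits: fall back to the Cloud
    best = max(candidates, key=lambda n: n["capacity"] - n["load"])
    best["load"] += task_cost  # reserve capacity on the chosen node
    return best["id"]
```

Treating disconnected nodes as simply absent from the candidate set is what lets such a policy tolerate churn: placement decisions never depend on a node that has already failed.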
Scalable propagation of continuous actions in peer-to-peer-based massively multiuser virtual environments : the continuous events approach
Peer-to-Peer-based Massively Multiuser Virtual Environments (P2P-MMVEs) provide a shared virtual environment for up to several thousand simultaneous users based on a peer-to-peer network. Users interact in the virtual environment by controlling virtual representations of themselves, so-called avatars. Their computers communicate with each other via a wide area network such as the Internet to provide the shared virtual environment. A crucial challenge for P2P-MMVEs is propagating state changes of objects in the virtual environment between a large number of user computers in a scalable way. Objects may change their state on one of the computers, e.g. their position. Information about a state change has to be propagated via the peer-to-peer network to computers of other users whose avatars are able to perceive the object. Optimization algorithms for a scalable propagation of state changes are needed because of the very large number of users and the typically limited bandwidth of their Internet connections. This thesis describes an approach that optimizes the propagation of state changes caused by continuous actions. Continuous actions lead to multiple subsequent state changes over a given period of time. Instead of propagating each subsequent state change caused by continuous actions via the network, the approach propagates descriptions of the actions included in so-called continuous events. Based on the descriptions, the subsequent state changes are calculated and applied over time on each user's computer. Continuous events contain information about (1) the timing of calculations, (2) the spatial extent of the influence of the continuous action in the virtual environment over time and (3) the effect of the continuous action on influenced objects over time. The propagation and management of continuous events is performed based on the spatial publish subscribe communication model. Each user computer declares interest in a certain space in the virtual environment. 
If the space intersects with the spatial extent of the influence of a continuous event, the particular computer is provided with the continuous event. This thesis describes the basic concept of continuous events, presents a system architecture supporting continuous events in the context of a given target system model for P2P-MMVEs, and evaluates the continuous events approach based on a prototypical implementation of the system architecture.
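The core idea can be sketched with a minimal continuous event that peers evaluate locally instead of receiving per-tick position updates. The field names below are illustrative, not the thesis's actual event format.

```python
from dataclasses import dataclass

# Minimal sketch of a continuous event (field names are illustrative): one
# event describes a whole motion, and every interested peer derives the
# intermediate positions locally instead of receiving per-tick updates.

@dataclass
class ContinuousEvent:
    t0: float        # time the action starts
    duration: float  # how long the action lasts
    origin: tuple    # starting position
    velocity: tuple  # displacement per second

    def position(self, t):
        # Clamp to the action's lifetime, then apply the motion.
        dt = min(max(t - self.t0, 0.0), self.duration)
        return tuple(o + v * dt for o, v in zip(self.origin, self.velocity))

# One such event replaces every per-tick state update for the motion it
# describes; receivers sample position(t) at whatever rate they render.
ev = ContinuousEvent(t0=10.0, duration=2.0, origin=(0.0, 0.0), velocity=(3.0, 4.0))
```

Propagating one event instead of a stream of updates is what makes the approach bandwidth-friendly: the network cost is independent of how finely receivers sample the motion.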