Ad hoc cloud computing
Commercial and private cloud providers offer virtualized resources via a set of co-located
and dedicated hosts that are exclusively reserved for the purpose of offering
a cloud service. While both cloud models appeal to the mass market, there are many
cases where outsourcing to a remote platform or procuring an in-house infrastructure
may not be ideal or even possible.
To offer an attractive alternative, we introduce and develop an ad hoc cloud computing
platform to transform spare resource capacity from an infrastructure owner’s
locally available, but non-exclusive and unreliable infrastructure, into an overlay cloud
platform. The ad hoc cloud is founded on transferring and instantiating
lightweight virtual machines on demand on near-optimal hosts, while virtual machine
checkpoints are distributed in a P2P fashion to other members of the ad hoc
cloud. Virtual machines found to be non-operational are restored elsewhere, ensuring
the continuity of cloud jobs.
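The mechanism described above (place a virtual machine on a good host, copy its checkpoints to peers, restore it elsewhere on failure) can be illustrated with a minimal sketch. This is not the thesis's implementation: the `Host` class, the `reliability` score, and the choice to restore directly on a checkpoint-holding peer are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    name: str
    reliability: float          # hypothetical score in [0, 1]; higher is better
    alive: bool = True
    checkpoints: dict = field(default_factory=dict)  # vm_id -> saved state


def pick_host(hosts, exclude=()):
    """Return the most reliable live host not listed in `exclude`."""
    candidates = [h for h in hosts if h.alive and h.name not in exclude]
    if not candidates:
        raise RuntimeError("no live host available")
    return max(candidates, key=lambda h: h.reliability)


def distribute_checkpoint(vm_id, state, hosts, running_on, copies=2):
    """Store a copy of the VM state on up to `copies` other live hosts."""
    peers = [h for h in hosts if h.alive and h.name != running_on]
    peers.sort(key=lambda h: h.reliability, reverse=True)
    for peer in peers[:copies]:
        peer.checkpoints[vm_id] = state
    return [p.name for p in peers[:copies]]


def restore(vm_id, hosts, failed_host):
    """Restart the VM from a surviving checkpoint on another live host."""
    holders = [h for h in hosts if h.alive and vm_id in h.checkpoints]
    if not holders:
        raise RuntimeError("no surviving checkpoint for " + vm_id)
    # Simplification (assumption): restore on a peer that already holds the checkpoint.
    target = pick_host(holders, exclude=(failed_host,))
    return target.name, target.checkpoints[vm_id]


hosts = [Host("a", 0.9), Host("b", 0.7), Host("c", 0.5)]
first = pick_host(hosts)                       # the VM starts on the best host
holders = distribute_checkpoint("vm1", {"step": 42}, hosts, first.name)
first.alive = False                            # the host running the VM fails
new_host, state = restore("vm1", hosts, first.name)
```

In this toy run the job continues on host `b` from the last distributed checkpoint; a real scheduler would also have to detect the failure and transfer the virtual machine image.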
In this thesis we investigate the feasibility, reliability and performance of ad hoc
cloud computing infrastructures. We first show that the combination of volunteer
computing and virtualization is the backbone of the ad hoc cloud. We outline the
process of virtualizing the volunteer system BOINC to create V-BOINC. V-BOINC
distributes virtual machines to volunteer hosts, allowing volunteer applications to be
executed in a sandbox environment, which addresses many of the shortcomings of BOINC; it
also provides the basis on which an ad hoc cloud computing platform can be developed.
We detail the challenges of transforming V-BOINC into an ad hoc cloud and outline
the transformational process and integrated extensions. These include a BOINC job
submission system, cloud job and virtual machine restoration schedulers and a periodic
P2P checkpoint distribution component. Furthermore, as current monitoring tools are
unable to cope with the dynamic nature of ad hoc clouds, a dynamic infrastructure
monitoring and management tool called the Cloudlet Control Monitoring System is
developed and presented.
We evaluate each of our individual contributions as well as the reliability, performance
and overheads associated with an ad hoc cloud deployed on a realistically
simulated unreliable infrastructure. We conclude that the ad hoc cloud is not only a
feasible concept but also a viable computational alternative that offers high levels of
reliability and can at least offer reasonable performance, which at times may exceed
the performance of a commercial cloud infrastructure.
Dimensioning and workload distribution algorithms for lambda grids (original title: Dimensionerings- en werkverdelingsalgoritmen voor lambda grids)
Grids consist of a collection of compute and storage elements that may be geographically distributed but whose combined capacity one wishes to exploit. To that end, these elements must be connected by a network. Since many scientific applications use a Grid, and these applications typically process large amounts of data, it is necessary to provide a network that can transport such large data streams reliably. Optical transport networks are excellently suited to this. Grids that use such a network are called lambda Grids. This thesis describes a framework in which the design and dimensioning of optical networks for lambda Grids can be described. It also discusses how workload can be distributed over a Grid once it has been dimensioned. A large part of the results was obtained by simulation, using a purpose-built Grid simulation package that focuses specifically on network and Grid elements. The design of this simulator and the associated implementation choices are therefore explained in detail in this work.
A Process Model for the Integrated Reasoning about Quantitative IT Infrastructure Attributes
IT infrastructures can be quantitatively described by attributes such as performance or energy efficiency. Ever-changing user demands and economic objectives require varying short-term and long-term decisions on aligning an IT infrastructure, and particularly its attributes, with this dynamic environment. Potentially conflicting attribute goals and the central role of IT infrastructures call for decision making based on reasoning, the process of forming inferences from facts or premises. Existing reasoning approaches focus on specific parts of an IT infrastructure or on a fixed (small) attribute set, which disqualifies them for this purpose: they neither cover the (complex) interplay of all IT infrastructure components simultaneously, nor sufficiently address inter- and intra-attribute correlations.
This thesis presents a process model for the integrated reasoning about quantitative IT infrastructure attributes. The process model’s main idea is to formalize the compilation of an individual reasoning function, a mathematical mapping of parametric influencing factors and modifications onto an attribute vector. Compilation is based on model integration, so as to benefit from the multitude of existing specialized, elaborate, and well-established attribute models. The resulting reasoning function consumes an individual tuple of IT infrastructure components, attributes, and external influencing factors, which gives it broad applicability. The process model formalizes a reasoning intent in three phases. First, reasoning goals and parameters are collected in a reasoning suite and formalized in a reasoning function skeleton. Second, the skeleton is iteratively refined, guided by the reasoning suite. Third, the resulting reasoning function is employed for what-if analyses, optimization, or descriptive statistics to conduct the concrete reasoning. The process model provides five template classes that collectively formalize all phases in order to foster reproducibility and reduce error-proneness.
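To make the notion of a reasoning function more concrete, the sketch below composes two toy attribute models into one function that maps parameters onto an attribute vector and then runs a what-if analysis. The models, their formulas, and the parameter names (`nodes`, `clock_ghz`) are invented for illustration and are not taken from the thesis.

```python
def performance_model(nodes, clock_ghz):
    """Toy attribute model (assumption): throughput scales with nodes and clock."""
    return nodes * clock_ghz  # arbitrary units


def power_model(nodes, clock_ghz):
    """Toy attribute model (assumption): power grows with the square of the clock."""
    return nodes * (1.0 + clock_ghz ** 2)  # arbitrary units


def reasoning_function(nodes, clock_ghz):
    """Integrate both models and map the parameters onto an attribute vector."""
    perf = performance_model(nodes, clock_ghz)
    power = power_model(nodes, clock_ghz)
    return {"performance": perf, "energy_efficiency": perf / power}


def what_if(baseline, **changes):
    """Return the change in each attribute when some parameters are modified."""
    before = reasoning_function(**baseline)
    after = reasoning_function(**{**baseline, **changes})
    return {name: after[name] - before[name] for name in before}


baseline = {"nodes": 4, "clock_ghz": 1.0}
delta = what_if(baseline, clock_ghz=2.0)
```

With these toy models, doubling the clock raises performance but lowers energy efficiency, which is the kind of conflicting attribute goal the abstract refers to.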
Process model validation is threefold. A controlled experiment reasons about a Raspberry Pi cluster’s performance and energy efficiency to illustrate feasibility. In addition, a requirements analysis on a world-class supercomputer and on the Europe-wide execution of hydrometeorology simulations, together with an examination of related work, establishes the process model’s level of innovation. Potential future work employs the prepared automation capabilities, integrates human factors, and uses reasoning results for the automatic generation of modification recommendations.
Proceedings of the Second International Workshop on Sustainable Ultrascale Computing Systems (NESUS 2015) Krakow, Poland
Proceedings of: Second International Workshop on Sustainable Ultrascale Computing Systems (NESUS 2015). Krakow (Poland), September 10-11, 2015
FAT-DBT engine (framework for application-tailored, co-designed dynamic binary translation engine)
Doctoral thesis in Electronics and Computer Engineering (PDEEC).
Dynamic binary translation (DBT) has emerged as an execution engine that monitors,
modifies and possibly optimizes running applications for specific purposes.
DBT is deployed as an execution layer between the application binary and the operating
system or host-machine, which creates opportunities for collecting runtime
information. Initially, DBT supported binary-level compatibility, but based on the
collected runtime information, it also became popular for code instrumentation,
ISA-virtualization and dynamic-optimization purposes.
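The general shape of a DBT execution layer can be sketched as a dispatch loop with a translation cache. The example below is a toy under stated assumptions, not the FAT-DBT engine: the guest instruction set is invented, and the "translated code" is a Python closure standing in for generated host machine code.

```python
# Hypothetical guest program: (opcode, operand) pairs.
GUEST = [
    ("set_acc", 0),     # 0: acc = 0
    ("set_cnt", 3),     # 1: cnt = 3
    ("add", 5),         # 2: acc += 5
    ("dec_jnz", 2),     # 3: cnt -= 1; jump to 2 while cnt != 0
    ("halt", None),     # 4: stop
]


def translate_block(program, start):
    """Translate one basic block, ending at the first branch or halt."""
    ops = []
    pc = start
    while True:
        op = program[pc]
        ops.append(op)
        pc += 1
        if op[0] in ("dec_jnz", "halt"):
            break
    fallthrough = pc

    def run(state):
        """Execute the block; return the next guest PC, or None to stop."""
        for name, arg in ops:
            if name == "set_acc":
                state["acc"] = arg
            elif name == "set_cnt":
                state["cnt"] = arg
            elif name == "add":
                state["acc"] += arg
            elif name == "dec_jnz":
                state["cnt"] -= 1
                return arg if state["cnt"] != 0 else fallthrough
            elif name == "halt":
                return None
        return fallthrough

    return run


def execute(program):
    """Dispatch loop: translate on a cache miss, reuse the block on a hit."""
    cache = {}
    stats = {"translations": 0, "hits": 0}   # runtime information collected
    state = {"acc": 0, "cnt": 0}
    pc = 0
    while pc is not None:
        block = cache.get(pc)
        if block is None:
            block = translate_block(program, pc)
            cache[pc] = block
            stats["translations"] += 1
        else:
            stats["hits"] += 1
        pc = block(state)
    return state, stats


state, stats = execute(GUEST)
```

Because the loop body at guest address 2 is translated once and then reused, the translation cost is paid only on the first visit. The `stats` counters hint at how the layer between application and host can collect runtime information.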
Building a DBT system brings many challenges, as it involves the integration of complex
components and requires deep architecture-level knowledge. Moreover, DBT incurs
significant overheads, mainly due to code decoding and translation, as well as
execution and the emulation of general functionalities. Because DBT was initially conceived
with high-end architectures and performance-demanding applications in mind,
such challenges become even more evident when directing DBT to embedded systems.
Effective deployment there is very challenging because of the complexity of DBT and the
tight constraints on memory, performance and power. Legacy
support and binary compatibility are topics of particular interest in such systems,
due to their broad dissemination in industrial environments and their long-standing use
in sensing and monitoring processes, with considerable
maintenance and replacement costs.
To address such issues, this thesis intends to contribute a solution that leverages
an optimized and accelerated dynamic binary translator targeting resource-constrained
embedded systems while supporting legacy systems.
The developed work makes it possible to: (1) evaluate the potential of DBT for legacy support
purposes on resource-constrained embedded systems; (2) achieve a configurable
DBT architecture specialized for resource-constrained embedded systems;
(3) address DBT translation, execution and emulation overheads through the combination
of software and hardware; and (4) promote DBT utilization as a legacy
support tool for the industry as an end product.
This thesis was supported by a PhD scholarship from Fundação para a Ciência e
Tecnologia, SFRH/BD/81681/201
High-Performance Modelling and Simulation for Big Data Applications
This open access book was prepared as a Final Publication of the COST Action IC1406 “High-Performance Modelling and Simulation for Big Data Applications (cHiPSet)” project. Long considered important pillars of the scientific method, Modelling and Simulation have evolved from traditional discrete numerical methods to complex data-intensive continuous analytical optimisations. Resolution, scale, and accuracy have become essential to predict and analyse natural and complex systems in science and engineering. As their level of abstraction rises to give a better discernment of the domain at hand, their representation becomes increasingly demanding of computational and data resources. On the other hand, High Performance Computing typically entails the effective use of parallel and distributed processing units coupled with efficient storage, communication and visualisation systems to underpin complex data-intensive applications in distinct scientific and technical domains. A seamless interaction of High Performance Computing with Modelling and Simulation is therefore arguably required in order to store, compute, analyse, and visualise large data sets in science and engineering. Funded by the European Commission, cHiPSet has provided a dynamic trans-European forum for its members and distinguished guests to openly discuss novel perspectives and topics of interest for these two communities. This cHiPSet compendium presents a set of selected case studies related to healthcare, biological data, computational advertising, multimedia, finance, bioinformatics, and telecommunications.
Cyber-Physical Threat Intelligence for Critical Infrastructures Security
Modern critical infrastructures comprise many interconnected cyber and physical assets, and as such are large-scale cyber-physical systems. Hence, the conventional approach of securing these infrastructures by addressing cyber security and physical security separately is no longer effective. Rather, more integrated approaches that address the security of cyber and physical assets at the same time are required. This book presents integrated (i.e. cyber and physical) security approaches and technologies for the critical infrastructures that underpin our societies. Specifically, it introduces advanced techniques for threat detection, risk assessment and security information sharing, based on leading-edge technologies like machine learning, security knowledge modelling, IoT security and distributed ledger infrastructures. Likewise, it presents how established security technologies like Security Information and Event Management (SIEM), pen-testing, vulnerability assessment and security data analytics can be used in the context of integrated Critical Infrastructure Protection. The novel methods and techniques of the book are exemplified in case studies involving critical infrastructures in four industrial sectors, namely finance, healthcare, energy and communications. The peculiarities of critical infrastructure protection in each of these sectors are discussed and addressed based on sector-specific solutions. The advent of the fourth industrial revolution (Industry 4.0) is expected to increase the cyber-physical nature of critical infrastructures as well as their interconnection in the scope of sectorial and cross-sector value chains. Therefore, the demand for solutions that foster the interplay between cyber and physical security, and enable Cyber-Physical Threat Intelligence, is likely to explode. In this book, we have shed light on the structure of such integrated security systems, as well as on the technologies that will underpin their operation.
We hope that Security and Critical Infrastructure Protection stakeholders will find the book useful when planning their future security strategies.
Automatic Generation of Distributed Runtime Infrastructure for Internet of Things
Ph.D. Thesis.
The Internet of Things (IoT) represents a network of connected devices that are able to
cooperate and interact with each other in order to reach a particular goal. To attain this,
the devices are equipped with identifying, sensing, networking and processing capabilities.
Cloud computing, on the other hand, is the delivery of on-demand computing services –
from applications, to storage, to processing power – typically over the internet. Clouds
bring a number of advantages to distributed computing because they offer a highly available pool of
virtualized computing resources. Due to the large number of connected devices, real-world
IoT use cases may generate overwhelmingly large amounts of data. This prompts the use
of cloud resources for processing, storage and analysis of the data. Therefore, a typical IoT
system comprises a front-end (devices that collect and transmit data) and a back-end,
typically distributed Data Stream Management Systems (DSMSs) deployed on cloud
infrastructure, for data processing and analysis.
Increasingly, new IoT devices are being manufactured to provide a limited execution
environment on top of their data sensing and transmitting capabilities. This consequently
demands a change in the way data is being processed in a typical IoT-cloud setup. The
traditional, centralised cloud-based data processing model – where IoT devices are used
only for data collection – does not provide an efficient utilisation of all available resources.
In addition, the fundamental requirements of real-time data processing such as short
response time may not always be met. This prompts a new processing model which is
based on decentralising the data processing tasks. The new decentralised architectural
pattern allows some parts of data streaming computation to be executed directly on edge
devices – closer to where the data is collected. Extending the processing capabilities to the
IoT devices increases the robustness of applications as well as reduces the communication
overhead between different components of an IoT system. However, this new pattern poses
new challenges in the development, deployment and management of IoT applications.
Firstly, there exists a large resource gap between the two parts of a typical IoT system (i.e.
clouds and IoT devices), which calls for a new approach to IoT application deployment
and management. Secondly, the new decentralised approach necessitates the deployment
of DSMS on distributed clusters of heterogeneous nodes resulting in unpredictable runtime
performance and complex fault characteristics. Lastly, the environment where DSMSs are
deployed is very dynamic due to user or device mobility, workload variation, and resource
availability.
In this thesis we present solutions to address the aforementioned challenges. We
investigate how a high-level description of a data streaming computation can be used
to automatically generate a distributed runtime infrastructure for Internet of Things.
Subsequently, we develop a deployment and management system capable of distributing
different operators of a data streaming computation onto different IoT gateway devices
and cloud infrastructure.
To address the other challenges, we propose a non-intrusive approach for performance
evaluation of DSMSs and present a protocol and a set of algorithms for dynamic migration
of stateful data stream operators. To improve our migration approach, we provide an
optimisation technique that minimises application downtime and improves the
accuracy of a data stream computation.
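As an illustration of what migrating a stateful stream operator involves, the following sketch uses a generic pause, snapshot, restore and replay scheme. It is not the protocol proposed in the thesis; the `CountOperator` and `Router` classes and their methods are hypothetical.

```python
from collections import deque


class CountOperator:
    """Stateful stream operator: counts occurrences of each key."""

    def __init__(self, state=None):
        self.state = dict(state or {})

    def process(self, key):
        self.state[key] = self.state.get(key, 0) + 1

    def snapshot(self):
        return dict(self.state)


class Router:
    """Delivers tuples to the active operator and buffers them during migration."""

    def __init__(self, operator):
        self.operator = operator
        self.buffer = deque()
        self.migrating = False

    def send(self, key):
        if self.migrating:
            self.buffer.append(key)      # hold tuples until the target is ready
        else:
            self.operator.process(key)

    def begin_migration(self):
        self.migrating = True
        return self.operator.snapshot()

    def finish_migration(self, new_operator):
        self.operator = new_operator
        while self.buffer:               # replay tuples that arrived meanwhile
            self.operator.process(self.buffer.popleft())
        self.migrating = False


edge = CountOperator()                   # operator running on an edge device
router = Router(edge)
for key in ["a", "b", "a"]:
    router.send(key)
snapshot = router.begin_migration()
router.send("b")                         # arrives mid-migration and is buffered
cloud = CountOperator(snapshot)          # state restored on the target node
router.finish_migration(cloud)
router.send("a")
```

No tuple is lost or counted twice in this toy run, but the stream is paused for the whole transfer; reducing that pause is the kind of downtime optimisation the abstract mentions.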