19 research outputs found
Partitioning workflow applications over federated clouds to meet non-functional requirements
PhD ThesisWith cloud computing, users can acquire computer resources when they need them
on a pay-as-you-go business model. Because of this, many applications are now being
deployed in the cloud, and there are many di erent cloud providers worldwide. Importantly,
all these various infrastructure providers o er services with di erent levels
of quality. For example, cloud data centres are governed by the privacy and security
policies of the country where the centre is located, while many organisations have
created their own internal \private cloud" to meet security needs.
With all this varieties and uncertainties, application developers who decide to host their
system in the cloud face the issue of which cloud to choose to get the best operational
conditions in terms of price, reliability and security. And the decision becomes even
more complicated if their application consists of a number of distributed components,
each with slightly di erent requirements.
Rather than trying to identify the single best cloud for an application, this thesis
considers an alternative approach, that is, combining di erent clouds to meet users'
non-functional requirements. Cloud federation o ers the ability to distribute a single
application across two or more clouds, so that the application can bene t from the
advantages of each one of them. The key challenge for this approach is how to nd the
distribution (or deployment) of application components, which can yield the greatest
bene ts. In this thesis, we tackle this problem and propose a set of algorithms, and a
framework, to partition a work
ow-based application over federated clouds in order to
exploit the strengths of each cloud. The speci c goal is to split a distributed application
structured as a work
ow such that the security and reliability requirements of each
component are met, whilst the overall cost of execution is minimised.
To achieve this, we propose and evaluate a cloud broker for partitioning a work
ow
application over federated clouds. The broker integrates with the e-Science Central
cloud platform to automatically deploy a work
ow over public and private clouds.
We developed a deployment planning algorithm to partition a large work
ow appli-
- i -
cation across federated clouds so as to meet security requirements and minimise the
monetary cost.
A more generic framework is then proposed to model, quantify and guide the partitioning
and deployment of work
ows over federated clouds. This framework considers
the situation where changes in cloud availability (including cloud failure) arise during
work
ow execution
Integrating Data Science and Earth Science
This open access book presents the results of three years collaboration between earth scientists and data scientists, in developing and applying data science methods for scientific discovery. The book will be highly beneficial for other researchers at senior and graduate level, interested in applying visual data exploration, computational approaches and scientifc workflows
Integrating Data Science and Earth Science
This open access book presents the results of three years collaboration between earth scientists and data scientist, in developing and applying data science methods for scientific discovery. The book will be highly beneficial for other researchers at senior and graduate level, interested in applying visual data exploration, computational approaches and scientifc workflows
Semantic Systems. The Power of AI and Knowledge Graphs
This open access book constitutes the refereed proceedings of the 15th International Conference on Semantic Systems, SEMANTiCS 2019, held in Karlsruhe, Germany, in September 2019. The 20 full papers and 8 short papers presented in this volume were carefully reviewed and selected from 88 submissions. They cover topics such as: web semantics and linked (open) data; machine learning and deep learning techniques; semantic information management and knowledge integration; terminology, thesaurus and ontology management; data mining and knowledge discovery; semantics in blockchain and distributed ledger technologies
Design and Evaluation of Low-Latency Communication Middleware on High Performance Computing Systems
[Resumen]El inter茅s en Java para computaci贸n paralela est谩 motivado por sus interesantes
caracter铆sticas, tales como su soporte multithread, portabilidad, facilidad de aprendizaje,alta productividad y el aumento significativo en su rendimiento omputacional.
No obstante, las aplicaciones paralelas en Java carecen generalmente de mecanismos
de comunicaci贸n eficientes, los cuales utilizan a menudo protocolos basados
en sockets incapaces de obtener el m谩ximo provecho de las redes de baja latencia,
obstaculizando la adopci贸n de Java en computaci贸n de altas prestaciones (High Per-
formance Computing, HPC). Esta Tesis Doctoral presenta el dise帽o, implementaci贸n
y evaluaci贸n de soluciones de comunicaci贸n en Java que superan esta limitaci贸n. En
consecuencia, se desarrollaron m煤ltiples dispositivos de comunicaci贸n a bajo nivel
para paso de mensajes en Java (Message-Passing in Java, MPJ) que aprovechan al
m谩ximo el hardware de red subyacente mediante operaciones de acceso directo a memoria remota que proporcionan comunicaciones de baja latencia. Tambi茅n se incluye una biblioteca de paso de mensajes en Java totalmente funcional, FastMPJ, en la
cual se integraron los dispositivos de comunicaci贸n. La evaluaci贸n experimental ha
mostrado que las primitivas de comunicaci贸n de FastMPJ son competitivas en comparaci贸n con bibliotecas nativas, aumentando significativamente la escalabilidad de
aplicaciones MPJ. Por otro lado, esta Tesis analiza el potencial de la computaci贸n en
la nube (cloud computing) para HPC, donde el modelo de distribuci贸n de infraestructura
como servicio (Infrastructure as a Service, IaaS) emerge como una alternativa
viable a los sistemas HPC tradicionales. La evaluaci贸n del rendimiento de recursos
cloud espec铆ficos para HPC del proveedor l铆der, Amazon EC2, ha puesto de manifiesto el impacto significativo que la virtualizaci贸n impone en la red, impidiendo
mover las aplicaciones intensivas en comunicaciones a la nube. La clave reside en un soporte de virtualizaci贸n apropiado, como el acceso directo al hardware de red, junto
con las directrices para la optimizaci贸n del rendimiento sugeridas en esta Tesis.[Resumo]O interese en Java para computaci贸n paralela est谩 motivado polas s煤as interesantes caracter铆sticas, tales como o seu apoio multithread, portabilidade, facilidade de aprendizaxe, alta produtividade e o aumento signi cativo no seu rendemento computacional. No entanto, as aplicaci贸ns paralelas en Java carecen xeralmente de mecanismos de comunicaci贸n e cientes, os cales adoitan usar protocolos baseados en sockets que son incapaces de obter o m谩ximo proveito das redes de baixa latencia, obstaculizando a adopci贸n de Java na computaci贸n de altas prestaci贸ns (High
Performance Computing, HPC). Esta Tese de Doutoramento presenta o dese帽o, implementaci
贸n e avaliaci贸n de soluci贸ns de comunicaci贸n en Java que superan esta limitaci贸n. En consecuencia, desenvolv茅ronse m煤ltiples dispositivos de comunicaci贸n a baixo nivel para paso de mensaxes en Java (Message-Passing in Java, MPJ) que aproveitan ao m谩aximo o hardware de rede subxacente mediante operaci贸ns de acceso
directo a memoria remota que proporcionan comunicaci贸ns de baixa latencia.
Tam茅n se incl煤e unha biblioteca de paso de mensaxes en Java totalmente funcional,
FastMPJ, na cal foron integrados os dispositivos de comunicaci贸n. A avaliaci贸n experimental amosou que as primitivas de comunicaci贸n de FastMPJ son competitivas
en comparaci贸n con bibliotecas nativas, aumentando signi cativamente a escalabilidade
de aplicaci贸ns MPJ. Por outra banda, esta Tese analiza o potencial da computaci贸n na nube (cloud computing) para HPC, onde o modelo de distribuci贸n de infraestrutura como servizo (Infrastructure as a Service, IaaS) xorde como unha alternativa viable aos sistemas HPC tradicionais. A ampla avaliaci贸n do rendemento de recursos cloud espec铆fi cos para HPC do proveedor l铆der, Amazon EC2, puxo de manifesto o impacto signi ficativo que a virtualizaci贸n imp贸n na rede, impedindo mover as aplicaci贸ns intensivas en comunicaci贸ns 谩 nube. A clave at贸pase no soporte de virtualizaci贸n apropiado, como o acceso directo ao hardware de rede, xunto coas directrices para a optimizaci贸n do rendemento suxeridas nesta Tese.[Abstract]The use of Java for parallel computing is becoming more promising owing to
its appealing features, particularly its multithreading support, portability, easy-tolearn properties, high programming productivity and the noticeable improvement in its computational performance. However, parallel Java applications generally su er
from inefficient communication middleware, most of which use socket-based protocols
that are unable to take full advantage of high-speed networks, hindering the
adoption of Java in the High Performance Computing (HPC) area. This PhD Thesis
presents the design, development and evaluation of scalable Java communication
solutions that overcome these constraints. Hence, we have implemented several lowlevel
message-passing devices that fully exploit the underlying network hardware while taking advantage of Remote Direct Memory Access (RDMA) operations to provide low-latency communications. Moreover, we have developed a productionquality Java message-passing middleware, FastMPJ, in which the devices have been integrated seamlessly, thus allowing the productive development of Message-Passing in Java (MPJ) applications. The performance evaluation has shown that FastMPJ communication primitives are competitive with native message-passing libraries, improving signi cantly the scalability of MPJ applications. Furthermore, this Thesis
has analyzed the potential of cloud computing towards spreading the outreach of
HPC, where Infrastructure as a Service (IaaS) o erings have emerged as a feasible
alternative to traditional HPC systems. Several cloud resources from the leading
IaaS provider, Amazon EC2, which speci cally target HPC workloads, have been
thoroughly assessed. The experimental results have shown the signi cant impact
that virtualized environments still have on network performance, which hampers
porting communication-intensive codes to the cloud. The key is the availability of
the proper virtualization support, such as the direct access to the network hardware,
along with the guidelines for performance optimization suggested in this Thesis