Developing High Performance Computing Resources for Teaching Cluster and Grid Computing courses
High-Performance Computing (HPC) and the ability to process large amounts of data are of
paramount importance for UK business and economy as outlined by Rt Hon David Willetts
MP at the HPC and Big Data conference in February 2014. However, there is a shortage of
skills and available training in HPC to prepare and expand the workforce for HPC and
Big Data research and development. Currently, HPC skills are acquired mainly by students
and staff taking part in HPC-related research projects, MSc courses, and dedicated
training centres such as Edinburgh University's EPCC. Few UK universities teach
HPC, cluster and grid computing courses at the undergraduate level. To address the
skills shortage in HPC, it is essential to provide teaching and training as part of
both postgraduate and undergraduate courses. The design and development of such courses is
challenging, since the technologies and software in the fields of large-scale distributed systems
such as cluster, cloud and grid computing are undergoing continuous change. Students
completing HPC courses should be proficient in these evolving technologies and equipped
with practical and theoretical skills for future jobs in this fast-developing area.
In this paper we present our experience in developing the HPC, cluster and grid modules,
including a review of existing HPC courses offered at UK universities. The topics covered in
the modules are described, as well as the coursework projects based on practical laboratory work.
We conclude with an evaluation based on our experience over the last ten years in developing
and delivering the HPC modules in undergraduate courses, with suggestions for future work.
BoscoR: Extending R from the desktop to the Grid
In this paper, we describe a framework for executing R functions on remote resources from the desktop using Bosco. The R language is attractive to researchers because its high-level programming constructs lower the barrier to entry. As the use of R in HPC and High Throughput Computing (HTC) has grown, so too has the need for parallel libraries that can utilize these computing resources.
Bosco is middleware that uses common protocols to manage job submissions to a variety of remote computational platforms and resources. The researcher is able to control and monitor remote submission from their interactive R IDE, such as RStudio. Bosco is capable of managing many concurrent tasks submitted to remote resources while providing feedback to the interactive R environment. We will also show how this framework can be used to access national infrastructure such as the Open Science Grid.
Through interviews with R users and their feedback after using BoscoR, we learned how R users work and designed BoscoR to fit their needs. We incorporated their feedback to improve BoscoR by adding much-needed features, such as remote package management. A key design goal was a flat learning curve for any R user adopting BoscoR.
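The submit-and-monitor pattern the abstract describes — dispatching many independent function evaluations from an interactive session and receiving results as they complete — can be sketched as follows. This is a hypothetical illustration, not BoscoR's actual API: `simulate` stands in for an R function, and a local thread pool stands in for Bosco's remote job submission.

```python
# Hypothetical sketch of the submit-and-monitor workflow: many independent
# tasks dispatched at once, with results collected as each one finishes.
# A ThreadPoolExecutor stands in for Bosco's remote submission layer.
from concurrent.futures import ThreadPoolExecutor, as_completed

def simulate(seed):
    """Stand-in for an R function a researcher would run remotely."""
    x = seed
    for _ in range(10):
        x = (x * 1103515245 + 12345) % 2**31  # toy deterministic workload
    return seed, x % 100

def run_sweep(seeds):
    results = {}
    with ThreadPoolExecutor(max_workers=4) as pool:
        futures = [pool.submit(simulate, s) for s in seeds]
        for fut in as_completed(futures):  # progress feedback to the session
            seed, value = fut.result()
            results[seed] = value
    return results

results = run_sweep(range(8))
```

In the real system the pool is replaced by remote HTC resources such as the Open Science Grid, and monitoring happens inside the R IDE.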
GNuggies: A proposal for hosting resilient stateless services using untrusted nodes
This thesis outlines a proposal for a serverless cloud compute system hosted on untrusted nodes. We call this proposed system “GNuggies”. It is designed to feel instantly familiar to users of existing serverless offerings such as AWS Lambda or Azure Cloud Functions. The key difference between GNuggies and existing offerings is that GNuggies proposes leveraging spare compute resources by allowing anyone to contribute nodes to the system. These contributed nodes must be treated as untrusted, and this is where the bulk of this thesis's contributions arises:
1. A proposed architecture that adapts well-understood distributed-systems concepts to situations involving untrusted nodes and to operation in the absence of central authorities.
2. A proposed system wherein actors choose to contribute spare compute and are effectively incentivized to do so.
3. An incentive structure that makes actors less willing to behave in a malicious manner.
This thesis discusses the methods to be used and evaluates their strengths and weaknesses. It also argues that decentralized serverless computing is a direction the internet may benefit from moving towards.
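The third contribution — an incentive structure that makes malicious behaviour costly — can be sketched with a majority-vote reputation ledger. This is entirely illustrative; the abstract does not specify GNuggies' actual mechanism, and all names and parameters here are invented.

```python
# Illustrative sketch (not GNuggies' real design): replicate each stateless
# task across several untrusted nodes, accept the majority result, and
# penalize the reputation of any node that disagrees with it.
from collections import Counter

class ReputationLedger:
    def __init__(self, reward=1.0, penalty=5.0):
        self.scores = {}                     # node_id -> reputation score
        self.reward, self.penalty = reward, penalty

    def settle(self, results):
        """results: {node_id: task_output}. Returns the accepted output."""
        accepted, _ = Counter(results.values()).most_common(1)[0]
        for node, output in results.items():
            delta = self.reward if output == accepted else -self.penalty
            self.scores[node] = self.scores.get(node, 0.0) + delta
        return accepted

ledger = ReputationLedger()
out = ledger.settle({"n1": 42, "n2": 42, "n3": 99})  # n3 returns a bad result
```

Making the penalty much larger than the reward means a node must behave honestly for many rounds to recover from one detected deviation, which is the kind of asymmetry that discourages malicious actors.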
Sofie: Smart Operating System For Internet Of Everything
The proliferation of the Internet of Things and the success of rich cloud services have pushed the
horizon of a new computing paradigm, Edge computing, which calls for processing data at
the edge of the network. Applications such as cloud offloading, the smart home, and the smart city
are ideal areas for Edge computing to achieve better performance than cloud computing. Edge
computing has the potential to address concerns about response-time requirements, battery-life
constraints, bandwidth cost savings, and data safety and privacy.
However, there are still challenges in applying Edge computing in daily life. The
lack of a specialized operating system for Edge computing is holding back the flourishing of
Edge computing applications. Service management, device management, and component selection,
as well as data privacy and security, are also not yet well supported in current computing
structures.
To address these challenges for Edge computing systems and applications, we
have planned a series of empirical and theoretical studies. We propose SOFIE: Smart Operating
System For Internet Of Everything. SOFIE is an operating system specialized for Edge
computing that runs on the Edge gateway. SOFIE establishes and maintains a reliable connection
between the cloud and Edge devices to handle data transport between the gateway and
Edge devices; provides service management and data management for Edge applications;
protects data privacy and security for Edge users; and monitors the health of Edge devices.
Moreover, SOFIE provides a naming mechanism to connect Edge devices more efficiently.
To solve the component-selection problem in the Edge computing paradigm, SOFIE also includes
our previous work, SURF, as a model to optimize the performance of the system. Finally,
we deployed the design of SOFIE on an IoT/M2M system and supported semantics with access
control.
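As a purely illustrative sketch of the gateway-side naming and service management the abstract describes (SOFIE's real interfaces are not given, and all names below are invented), a minimal registry might look like this:

```python
# Hypothetical sketch of an Edge-gateway registry: devices register under
# human-readable names with their capabilities, and applications look up
# the addresses of healthy devices offering a given capability.
class EdgeGateway:
    def __init__(self):
        self.devices = {}    # name -> device metadata
        self.services = {}   # capability -> list of device names

    def register(self, name, address, capabilities):
        self.devices[name] = {"addr": address, "alive": True}
        for cap in capabilities:
            self.services.setdefault(cap, []).append(name)

    def lookup(self, capability):
        """Return addresses of healthy devices offering a capability."""
        return [self.devices[n]["addr"]
                for n in self.services.get(capability, [])
                if self.devices[n]["alive"]]

gw = EdgeGateway()
gw.register("kitchen-cam", "10.0.0.7", ["video", "motion"])
gw.register("hall-sensor", "10.0.0.9", ["motion"])
```

A real Edge OS would add liveness checks, access control and data management on top of such a naming layer.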
Implementation and evaluation of a distributed network simulation environment based on ns-3 and cloud computing
Parameter sweep simulations offer great potential for the study of communication networks, in both research and education, but they require long execution times because of the large number of individual simulations that result from the combinatorial explosion of assigning different values to their parameters. This Master's Thesis presents DNSE3, a simulation environment that uses cloud computing to process the simulations in a distributed fashion and reduce their execution times. Its design, covering both the architecture and the internal behaviour, has been prepared to integrate with different cloud systems and to exploit their scaling, offering a job-assignment mechanism that is independent of the number of deployed resources, together with a set of rules that provide efficient auto-scaling by provisioning and removing instances as needed. With the final system in operation, its performance and usability have been contrasted, through synthetic tests and with real users respectively, against those offered by local use of the simulator.
Departamento de Teoría de la Señal y Comunicaciones e Ingeniería Telemática. Máster en Ingeniería de Telecomunicación
Distributed network simulation environment based on ns-3 and cloud computing
Simulators are of special interest in educational settings, where students can use them to reinforce the knowledge acquired in theory sessions, at low cost and with considerable flexibility. However, their execution time is long, above all in parameter sweep simulations. This work proposes the design, development and evaluation of a distributed simulation environment for TCP/IP networks that uses cloud computing for the distributed execution of simulations based on the ns-3 simulator. The application builds on an earlier version, DNSE (Distributed Network Simulation Environment), which offered a similar environment based on grid computing, and on theoretical work translating that solution to cloud computing, called DNSE3. This Bachelor's Thesis carries out a redesign, taking advantage of infrastructure services of popular clouds (AWS and OpenStack) and including mechanisms for automatic scalability. After developing a fully functional prototype, its performance is evaluated in synthetic tests and its possible use in an academic setting is discussed. Grado en Ingeniería de Tecnologías de Telecomunicación
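The combinatorial explosion that motivates both DNSE3 works can be made concrete: every combination of parameter values is one independent ns-3 run, so even modest value ranges produce a large batch of jobs that is trivially parallelizable across cloud instances. The parameter names below are illustrative, not taken from DNSE3.

```python
# One independent simulation job per combination of parameter values:
# the workload a distributed sweep environment farms out to the cloud.
from itertools import product

sweep = {
    "dataRate":  ["1Mbps", "5Mbps", "10Mbps"],
    "delay":     ["2ms", "10ms", "50ms"],
    "queueSize": [10, 50, 100],
    "seed":      list(range(5)),
}

# Cross-product of all value lists -> 3 * 3 * 3 * 5 = 135 job descriptions.
jobs = [dict(zip(sweep, combo)) for combo in product(*sweep.values())]
print(len(jobs))
```

Because each job is independent, response time shrinks roughly linearly with the number of provisioned instances, which is what makes auto-scaling rules worthwhile.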
Scalable and Distributed Resource Management for Many-Core Systems
Many-core systems present researchers with important new challenges, including the handling of very dynamic and hardly predictable computational loads. The large number of applications and cores causes scalability issues for centrally acting
heuristics, which must always retain a global view of the entire system. Resource management itself can become a bottleneck that limits the achievable performance of the system. The focus of this work is to achieve scalability of resource management.
Bandwidth-aware distributed ad-hoc grids in deployed wireless sensor networks
Nowadays, cost effective sensor networks can be deployed as a result of a plethora of recent engineering
advances in wireless technology, storage miniaturisation, consolidated microprocessor design, and
sensing technologies.
Whilst sensor systems are becoming relatively cheap to deploy, two issues arise in their typical
realisations: (i) the types of low-cost sensors often employed are capable of limited resolution and tend
to produce noisy data; (ii) network bandwidths are relatively low and the energetic costs of using the
radio to communicate are relatively high. To reduce the transmission of unnecessary data, there is a
strong argument for performing local computation. However, this can require greater computational
capacity than is available on a single low-power processor. Traditionally, such a problem has been
addressed by using load balancing: fragmenting processes into tasks and distributing them amongst the
least loaded nodes. However, the act of distributing tasks, and any subsequent communication between
them, imposes a geographically defined load on the network. Because of the shared broadcast nature of
the radio channels and MAC layers in common use, any communication within an area will be slowed by
additional traffic, delaying the computation and reporting that relied on the availability of the network.
In this dissertation, we explore the tradeoff between the distribution of computation, needed to enhance
the computational abilities of networks of resource-constrained nodes, and the creation of network
traffic that results from that distribution. We devise an application-independent distribution paradigm and
a set of load distribution algorithms to allow computationally intensive applications to be collaboratively
computed on resource-constrained devices. Then, we empirically investigate the effects of network
traffic information on distribution performance. We then devise bandwidth-aware task-offload mechanisms
that combine nodes' computational capabilities with local network conditions, and we investigate
the impact of making informed offload decisions on system performance.
The highly deployment-specific nature of radio communication means that simulations that are
capable of producing validated, high-quality, results are extremely hard to construct. Consequently, to
produce meaningful results, our experiments have used empirical analysis based on a network of motes
located at UCL, running a variety of I/O-bound, CPU-bound and mixed tasks. Using this setup, we have
established that even relatively simple load-sharing algorithms can improve performance over a range of
different artificially generated scenarios, with more or less timely contextual information. In addition,
we have taken a realistic application, based on location estimation, and implemented it across the same
network, with results that support the conclusions drawn from the artificially generated traffic.
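The trade-off at the heart of the dissertation — spare remote CPU capacity versus the cost of shipping a task over a shared, possibly congested radio channel — can be sketched as a simple offload-gain estimate. This is a hypothetical illustration; the dissertation's actual algorithms and all weights here are not specified by the abstract.

```python
# Illustrative bandwidth-aware offload decision: offload a task only when
# a neighbour's faster CPU outweighs the time spent transmitting the task
# over the (low-bandwidth) radio link.
def offload_gain(task_cycles, task_bytes, local_cps, remote_cps, bandwidth_bps):
    """Estimated seconds saved by offloading (negative: run locally)."""
    local_time = task_cycles / local_cps
    remote_time = task_cycles / remote_cps + (task_bytes * 8) / bandwidth_bps
    return local_time - remote_time

def choose_node(task, local_cps, neighbours):
    """neighbours: {name: (cycles_per_sec, usable_bandwidth_bps)}."""
    cycles, size = task
    best, best_gain = None, 0.0
    for name, (cps, bw) in neighbours.items():
        gain = offload_gain(cycles, size, local_cps, cps, bw)
        if gain > best_gain:
            best, best_gain = name, gain
    return best  # None means: keep the task local

task = (8e6, 200)  # 8M CPU cycles, 200-byte payload
best = choose_node(task, local_cps=1e6, neighbours={
    "mote-a": (4e6, 250_000),  # faster CPU, typical low-bandwidth link
    "mote-b": (2e6, 50_000),   # slower CPU, congested link
})
```

A bandwidth-aware scheme replaces the fixed `bandwidth_bps` with a measured estimate of current channel conditions, so that congestion near the node steers tasks away from otherwise attractive neighbours.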
Design and analysis of DNA microarrays in parallel and distributed environments
Microorganisms represent the largest diversity of living beings. They play a crucial role in all biological processes thanks to their vast metabolic potential and their capacity to adapt to different ecological niches. The development of new genomic approaches allows better knowledge of the microbial communities involved in the functioning of complex environments. In this context, DNA microarrays are high-throughput tools able to study the presence or the expression levels of several thousand genes, combining qualitative and quantitative aspects in a single experiment. However, the design and analysis of DNA microarrays, with their current high-density formats and the huge amount of data to process, are complex but crucial steps. To improve the quality and performance of these two steps, we have proposed new bioinformatics approaches for the design and analysis of DNA microarrays in parallel and distributed environments. These multipurpose approaches use high-performance computing (HPC) and new software-engineering approaches, especially model-driven engineering (MDE), to overcome current limitations. We first developed PhylGrid 2.0, a new distributed approach for the large-scale selection of explorative probes for phylogenetic DNA microarrays using computing grids. This software was used to build PhylOPDb, a comprehensive 16S rRNA oligonucleotide probe database for prokaryotic identification. MetaExploArrays, a parallel software package for oligonucleotide probe selection on different computing architectures (a PC, a multiprocessor, a cluster or a computing grid) using meta-programming and a model-driven engineering approach, was then developed to improve flexibility according to the user's computing resources. We then developed PhylInterpret, new software for the analysis of DNA microarray hybridization results.
PhylInterpret uses the concepts of propositional logic to determine the prokaryotic composition of metagenomic samples. Finally, a new parallelization method based on model-driven engineering (MDE) has been proposed to compute a complete backtranslation of short peptides in order to select probes for functional microarrays.
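The complete backtranslation the final contribution parallelizes can be illustrated in a few lines: a peptide maps to the cross-product of the codons of each residue, so the number of candidate nucleotide sequences grows exponentially with peptide length. The codon table below is deliberately abbreviated; the real standard genetic code covers all 20 amino acids.

```python
# Minimal illustration of complete peptide backtranslation: enumerate
# every DNA sequence that codes for a given peptide. Exponential growth
# in peptide length is what motivates parallelizing this step.
from itertools import product

CODONS = {  # standard genetic code, abbreviated for this sketch
    "M": ["ATG"],
    "W": ["TGG"],
    "F": ["TTT", "TTC"],
    "L": ["TTA", "TTG", "CTT", "CTC", "CTA", "CTG"],
}

def backtranslate(peptide):
    """Yield every DNA sequence coding for the peptide."""
    for combo in product(*(CODONS[aa] for aa in peptide)):
        yield "".join(combo)

seqs = list(backtranslate("MFL"))  # 1 * 2 * 6 = 12 candidate sequences
```

Even a short peptide rich in six-codon residues such as leucine quickly yields millions of candidates, which is why distributing the enumeration across parallel resources pays off for probe design.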