211 research outputs found

    Towards Modern, Accessible and Dynamic HPC Using Container-based Virtual Clusters

    In this thesis, a novel Virtual Container Cluster (VCC) framework is presented. Despite the growing popularity of container virtualisation as a way to increase the flexibility of the software stack, run time environment virtualisation still poses significant portability challenges; by depending on the underlying cluster execution paradigm, a niche class of HPC-only containers has emerged. This trend is detrimental to reusability, reproducibility, and to attracting new communities to HPC. Traditional virtualisation techniques have a rich history within HPC, and have been demonstrated to offer much more than software flexibility. A Virtual Machine by nature requires an OS and full stack environment akin to a physical machine, and this allows it to be instantiated regardless of the underlying machine and what services it provides. This capability is essential in order to implement job forwarding and spanning, where the burden of an entire job can be transferred or shared between heterogeneous cluster systems, with a high level of confidence that the environments will be compatible. In turn, this brings improvements to global resource performance, reducing the job turnaround time and increasing cluster utilization. The VCC is an innovative solution that combines the full stack and container virtualisation approaches. It therefore offers both the flexibility of containers and the improved portability, performance and scalability of the full stack approach. In order to maintain the same accessibility and low barrier to entry as the run time environment approach, the design incorporates an autonomous configuration and contextualisation mechanism, along with a Software Defined Networking technology, to ensure the full stack container does not place an additional burden on the user. The usefulness and performance are validated through benchmarking and two case studies: virtual clusters in the classroom and inter-institutional spanning.

    Programming tools for intelligent systems

    Programming tools are computer programs which help humans program computers. Tools come in all shapes and forms, from editors and compilers to debuggers and profilers. Each of these tools facilitates a core task in the programming workflow which consumes cognitive resources when performed manually. In this thesis, we explore several tools that facilitate the process of building intelligent systems, and which reduce the cognitive effort required to design, develop, test and deploy intelligent software systems. First, we introduce an integrated development environment (IDE) for programming Robot Operating System (ROS) applications, called Hatchery (Chapter 2). Second, we describe Kotlin∇, a language and type system for differentiable programming, an emerging paradigm in machine learning (Chapter 3). Third, we propose a new algorithm for automatically testing differentiable programs, drawing inspiration from techniques in adversarial and metamorphic testing (Chapter 4), and demonstrate its empirical efficiency in the regression setting. Fourth, we explore a container infrastructure based on Docker, which enables reproducible deployment of ROS applications on the Duckietown platform (Chapter 5). Finally, we reflect on the current state of programming tools for these applications and speculate what intelligent systems programming might look like in the future (Chapter 6).

    Doctor of Philosophy

    A modern software system is a composition of parts that are themselves highly complex: operating systems, middleware, libraries, servers, and so on. In principle, compositionality of interfaces means that we can understand any given module independently of the internal workings of other parts. In practice, however, abstractions are leaky, and with every generation, modern software systems grow in complexity. Traditional ways of understanding failures, explaining anomalous executions, and analyzing performance are reaching their limits in the face of emergent behavior, unrepeatability, cross-component execution, software aging, and adversarial changes to the system at run time. Deterministic systems analysis has the potential to change the way we analyze and debug software systems. Recorded once, the execution of the system becomes an independent artifact, which can be analyzed offline. The availability of the complete system state, the guaranteed behavior of re-execution, and the absence of limitations on the run-time complexity of analysis collectively enable the deep, iterative, and automatic exploration of the dynamic properties of the system. This work creates a foundation for making deterministic replay a ubiquitous system analysis tool. It defines design and engineering principles for building fast and practical replay machines capable of capturing the complete execution of the entire operating system with an overhead of a few percent, on a realistic workload, and with minimal installation costs. To provide an intuitive interface for constructing replay analysis tools, this work implements a powerful virtual machine introspection layer that enables an analysis algorithm to be programmed against the state of the recorded system in the familiar terms of source-level variable and type names. To support performance analysis, the replay engine provides a faithful performance model of the original execution during replay.

    funcX: A Federated Function Serving Fabric for Science

    Exploding data volumes and velocities, new computational methods and platforms, and ubiquitous connectivity demand new approaches to computation in the sciences. These new approaches must enable computation to be mobile, so that, for example, it can occur near data, be triggered by events (e.g., arrival of new data), be offloaded to specialized accelerators, or run remotely where resources are available. They also require new design approaches in which monolithic applications can be decomposed into smaller components that may in turn be executed separately and on the most suitable resources. To address these needs we present funcX, a distributed function as a service (FaaS) platform that enables flexible, scalable, and high performance remote function execution. funcX's endpoint software can transform existing clouds, clusters, and supercomputers into function serving systems, while funcX's cloud-hosted service provides transparent, secure, and reliable function execution across a federated ecosystem of endpoints. We motivate the need for funcX with several scientific case studies, present our prototype design and implementation, show optimizations that deliver throughput in excess of 1 million functions per second, and demonstrate, via experiments on two supercomputers, that funcX can scale to more than 130000 concurrent workers. Comment: Accepted to ACM Symposium on High-Performance Parallel and Distributed Computing (HPDC 2020). arXiv admin note: substantial text overlap with arXiv:1908.0490

    Contributions to Desktop Grid Computing : From High Throughput Computing to Data-Intensive Sciences on Hybrid Distributed Computing Infrastructures

    Since the mid 90’s, Desktop Grid Computing - i.e., the idea of using a large number of remote PCs distributed on the Internet to execute large parallel applications - has proved to be an efficient paradigm to provide a large computational power at a fraction of the cost of a dedicated computing infrastructure. This document presents my contributions over the last decade to broaden the scope of Desktop Grid Computing. My research has followed three different directions. The first direction has established new methods to observe and characterize Desktop Grid resources and developed experimental platforms to test and validate our approach in conditions close to reality. The second line of research has focused on integrating Desktop Grids in e-science Grid infrastructure (e.g. EGI), which requires addressing many challenges such as security, scheduling, quality of service, and more. The third direction has investigated how to support large-scale data management and data intensive applications on such infrastructures, including support for the new and emerging data-oriented programming models. This manuscript not only reports on the scientific achievements and the technologies developed to support our objectives, but also on the international collaborations and projects I have been involved in, as well as the scientific mentoring which motivates my candidature for the Habilitation à Diriger les Recherches.

    Generating mock skeletons for lightweight Web service testing : a thesis presented in partial fulfilment of the requirements for the degree of Doctor of Philosophy in Computer Science at Massey University, Manawatū New Zealand

    Modern application development allows applications to be composed using lightweight HTTP services. Testing such an application requires the availability of services that the application makes requests to. However, continued access to dependent services during testing may be restrained, making adequate testing a significant and non-trivial engineering challenge. The concept of Service Virtualisation is gaining popularity for testing such applications in isolation. It is the practice of simulating the behaviour of dependent services by synthesising responses using semantic models inferred from recorded traffic. Replacing services with their respective mocks is, therefore, useful to address their absence and move on with application testing. In reality, however, it is unlikely that fully automated service virtualisation solutions can produce highly accurate proxies. Therefore, we recommend using service virtualisation to infer some attributes of HTTP service responses. We further acknowledge that engineers often want to fine-tune this. This requires algorithms to produce readily interpretable and customisable results. We assume that if service virtualisation is based on simple logical rules, engineers would have the potential to understand and customise rules. In this regard, Symbolic Machine Learning approaches can be investigated because of the high provenance of their results. Accordingly, this thesis examines the appropriateness of symbolic machine learning algorithms to automatically synthesise HTTP services' mock skeletons from network traffic recordings. We consider four commonly used symbolic techniques: the C4.5 decision tree algorithm, the RIPPER and PART rule learners, and the OCEL description logic learning algorithm. The experiments are performed employing network traffic datasets extracted from a few different successful, large-scale HTTP services. The experimental design further focuses on the generation of reproducible results. The chosen algorithms demonstrate the suitability of training highly accurate and human-readable semantic models for predicting the key aspects of HTTP service responses, such as the status and response headers. Having human-readable logics would make interpretation of the response properties simpler. These mock skeletons can then be easily customised to create mocks that can generate service responses suitable for testing.

    Architecture of the cloud, virtualization takes command : learning from black boxes, data centers and an architecture of the conditioned environment

    Thesis (S.M. in History, Theory and Criticism of Art and Architecture)--Massachusetts Institute of Technology, Dept. of Architecture, 2013. This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections. Cataloged from student-submitted PDF version of thesis. Includes bibliographical references (p. 127-128). A single manageable architecture of the Cloud has been one of the most important social and technical changes of the 21st century. Cloud computing, our newest public utility, is an attempt to confront and control cultural risk; it has rendered the environment of our exchanges calculable, manageable, seemingly predictable, and, most importantly, a new form of capital. Cloud computing in its most basic terms is the system of virtualization of data storage and program access into an instantaneous service utility. The transformation of computing into a service industry is one of the key changes of the Information Age, and its logic is tied to the highly guarded mechanisms of a black box, an architecture machine, more commonly known as the data center. In 2008, on a day without the usual fanfare or barrage of academic manifestoes and grand claims of paradigm shifts, virtualization quietly took command. It was a seemingly simple moment in which a cloud, the Cloud, emerged as a new form of managerial space that tied a large system of users to the hidden mechanisms of large scaled factories of information, a network of data centers. The project positions the Cloud and the data center into the architectural discourse, both historically and materially, through an analysis of its relationship to an emergent digital sublime and how it is managed, controlled and propelled through the obscure typologies of its architecture and images. Through the study of the Cloud and the data center via the notion of the sublime and the organizational structures of typology, we can more critically assess architecture's relationship to this new phase of the Information Age. By Antonio Furgiuele. S.M. in History, Theory and Criticism of Art and Architecture.