47 research outputs found

    MOLAR: Modular Linux and Adaptive Runtime Support for HEC OS/R Research


    Automatic Expansion and Use of a Computer Cluster Based on the Dual-Boot Principle

    The paper presents an innovative and simple way of creating computer clusters, and thus obtaining HPC clusters, using the computers in a classroom, Ethernet facilities, and open-source software. Automatic enlargement of a computer cluster is a cost-effective way to increase the available computing power; it is achieved by forming a cluster from the computers already present in the classroom. The main aim of this paper is to present a solution that uses the existing resources of a computer classroom to run complex computing services/jobs under the Linux operating system. These services/jobs execute at times when the computing resources are not being used for teaching under the Windows operating system. A real-life example of operating computers in dual-boot Windows/Linux mode, along with the software support developed for it, is presented in a teaching environment. The implementation also includes the logistics and support for automatic computer clustering and for the execution of service/job programs. The primary goal is to use existing resources for useful applications in education, chiefly in image programming, simulation, and volume rendering. The results of applying this principle are presented.
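
    The overnight hand-off this abstract describes can be largely automated. Below is a minimal Python sketch of the wake-up side of such a scheme, assuming each classroom PC has Wake-on-LAN enabled and its boot loader's default entry already set to the Linux cluster image; the MAC addresses are illustrative placeholders, not anything from the paper.

```python
# Minimal sketch: wake powered-off classroom PCs overnight so they boot
# into the Linux cluster image. Assumes Wake-on-LAN is enabled and the
# boot loader's default entry is the cluster OS; MACs are placeholders.
import socket

NODE_MACS = ["00:11:22:33:44:01", "00:11:22:33:44:02"]  # assumed MACs

def wake(mac: str, broadcast: str = "255.255.255.255", port: int = 9) -> None:
    """Send a standard Wake-on-LAN magic packet: 6 x 0xFF, then 16 x the MAC."""
    payload = bytes.fromhex("FF" * 6 + mac.replace(":", "") * 16)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        s.sendto(payload, (broadcast, port))

for mac in NODE_MACS:
    wake(mac)
```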

    TransCom: a virtual disk-based cloud computing platform for heterogeneous services

    This paper presents the design, implementation, and evaluation of TransCom, a virtual disk (Vdisk) based cloud computing platform that supports heterogeneous services of operating systems (OSes) and their applications in enterprise environments. In TransCom, clients store all data and software, including the OS and application software, on Vdisks that correspond to disk images located on centralized servers, while computing tasks are carried out by the clients. Users can boot any client into the desired OS, including Windows, and access software and data services from Vdisks as usual, without having to deal with tasks such as installation, maintenance, and management. By centralizing storage yet distributing computing tasks, TransCom can greatly reduce system maintenance and management costs. We have implemented a multi-platform TransCom prototype that supports both Windows and Linux services. Extensive evaluation based on both test-bed and real-usage experiments has demonstrated that TransCom is a feasible, scalable, and efficient solution for real-world use.
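
    To illustrate the storage/compute split a Vdisk design implies, here is a minimal, self-contained sketch of a block-fetch service: the server holds the disk image, and clients pull fixed-size blocks on demand while doing all computation locally. The one-integer request format is an illustrative stand-in, not TransCom's actual Vdisk protocol.

```python
# Minimal sketch: a read-only "Vdisk" served as fixed-size blocks over
# TCP. Clients keep all computation local and fetch blocks on demand.
# The one-integer request format is an illustrative assumption.
import socket
import struct

BLOCK = 4096

def recv_exact(conn: socket.socket, n: int) -> bytes:
    """Read exactly n bytes from a stream socket."""
    buf = b""
    while len(buf) < n:
        chunk = conn.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed early")
        buf += chunk
    return buf

def serve(image_path: str, port: int = 9100) -> None:
    """Answer each connection with the one block it asks for."""
    with open(image_path, "rb") as img, socket.create_server(("", port)) as srv:
        while True:
            conn, _ = srv.accept()
            with conn:
                (idx,) = struct.unpack("!Q", recv_exact(conn, 8))
                img.seek(idx * BLOCK)
                conn.sendall(img.read(BLOCK).ljust(BLOCK, b"\0"))

def read_block(host: str, idx: int, port: int = 9100) -> bytes:
    """Client side: fetch block idx of the remote image."""
    with socket.create_connection((host, port)) as conn:
        conn.sendall(struct.pack("!Q", idx))
        return recv_exact(conn, BLOCK)
```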

    A Sorting Hat For Clusters. Dynamic Provisioning of Compute Nodes for Colocated Large Scale Computational Research Infrastructures

    Current large-scale computational research infrastructures are composed of multitudes of compute nodes fitted with similar or identical hardware. For practical purposes, the deployment of the software operating environment to each compute node is done in an automated fashion. If a data centre hosts more than one of these systems – for example, cloud and HPC clusters – it is beneficial to use the same provisioning method for all of them. The uniform provisioning approach unifies administration of the various systems and allows flexible dedication and reconfiguration of computational resources. In particular, we highlight the requirements on the underlying network infrastructure for unified remote boot but segregated service operations. Building upon this, we present the Boot Selection Service, which allows a node to be added to, removed from, or rededicated to a given research infrastructure with a simple reconfiguration.
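
    The node-to-infrastructure mapping can be expressed as a small network service. The sketch below is in the spirit of the paper but is not its actual implementation: it assumes nodes chainload iPXE and fetch their boot script over HTTP, so rededicating a node is a one-line change in a MAC-to-role table. The table entries and image URLs are invented placeholders.

```python
# Minimal sketch: a boot-selection service. Nodes chainload iPXE and
# request http://<server>:8080/boot?mac=...; the reply is an iPXE script
# for whichever infrastructure the node is currently dedicated to.
# MAC addresses and image URLs below are invented placeholders.
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlparse

ROLE_OF_NODE = {                  # rededicate a node by editing this table
    "aa:bb:cc:00:00:01": "hpc",
    "aa:bb:cc:00:00:02": "cloud",
}
IPXE_SCRIPT = {
    "hpc":   "#!ipxe\nchain http://images.example/hpc/boot.ipxe\n",
    "cloud": "#!ipxe\nchain http://images.example/cloud/boot.ipxe\n",
}

class BootSelector(BaseHTTPRequestHandler):
    def do_GET(self):
        mac = parse_qs(urlparse(self.path).query).get("mac", [""])[0].lower()
        script = IPXE_SCRIPT.get(ROLE_OF_NODE.get(mac, ""), "#!ipxe\nshell\n")
        body = script.encode()
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    HTTPServer(("", 8080), BootSelector).serve_forever()
```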

    On a course on computer cluster configuration and administration

    Computer clusters are today a cost-effective way of providing high performance and/or high availability. The flexibility of their configuration aims to fit the needs of multiple environments, from small servers to SMEs and large Internet servers. For these reasons, their usage has expanded not only in academia but also in many companies. However, each environment needs a different “cluster flavour”. High-performance and high-throughput computing are required in universities and research centres, while high-performance service and high availability are usually reserved for use in companies. Despite this fact, most university cluster-computing courses continue to cover only high-performance computing, usually ignoring the other possibilities. In this paper, a master-level course that attempts to fill this gap is discussed. It explores the different types of cluster computing as well as their functional basis, from a very practical point of view. As part of the teaching methodology, each student builds a computer cluster from scratch using a virtualization tool. The entire process is designed to be scalable, with the goal of applying it to an actual computer cluster with a larger number of nodes, such as those the students may subsequently encounter in their professional life. This work was supported in part by the Spanish Ministerio de Economia y Competitividad (MINECO) and by FEDER funds under Grant TIN2015-66972-C5-1-R.
    López Rodríguez, PJ.; Baydal Cardona, ME. (2017). On a course on computer cluster configuration and administration. Journal of Parallel and Distributed Computing. 105:127-137. https://doi.org/10.1016/j.jpdc.2017.01.009
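
    A hedged sketch of how the build-a-virtual-cluster exercise could be scripted, assuming VirtualBox and a pre-installed template VM named cluster-base; both are assumptions, since the paper does not name its virtualization tool.

```python
# Minimal sketch: clone and boot N nodes for a virtual training cluster.
# Assumes VirtualBox and a pre-installed template VM named "cluster-base";
# the paper does not name its virtualization tool, so this is illustrative.
import subprocess

BASE_VM = "cluster-base"   # assumed template with OS and SSH preconfigured
N_NODES = 4

def sh(*args: str) -> None:
    subprocess.run(args, check=True)

for i in range(1, N_NODES + 1):
    node = f"node{i:02d}"
    sh("VBoxManage", "clonevm", BASE_VM, "--name", node, "--register")
    sh("VBoxManage", "startvm", node, "--type", "headless")
```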

    Replication and Caching Systems for the support of VMs stored in File Systems with Snapshots

    Recently, in a relatively short timeframe, there have been fundamental changes in the way computing power is used. Virtualisation technology has changed both the model of a data centre’s infrastructure and the way physical computers are managed. This shift is a consequence of today’s fast deployment rate of Virtual Machines (VMs) in high-consolidation environments with minimal need for human management. New approaches to virtualisation techniques are being developed at a surprisingly fast rate, leading to an exciting and vibrant new ecosystem of platforms and services. We see the big industry players tackling problems such as desktop virtualisation with moderate success, but completely ignoring the computing power already present in their clients’ infrastructures and, instead, opting for costly solutions based on powerful new machines. There is still room for improvement in Virtual Desktop Infrastructure (VDI) and for the development of new architectures that take advantage of the computing power available at the user’s desk, with minimal management effort; Infrastructure for Client-Based Desktops (iCBD) is one of these projects. This thesis focuses on the development of mechanisms for the replication and caching of VM images stored in a local filesystem, albeit one with the ability to perform snapshots. This work addresses several challenges: the proposed architecture must be entirely distributed and completely integrated with the already existing client-based VDI platform, and it must efficiently cope with very large, read-only files (some of them snapshots) and handle their multiple versions. The work also explores the challenges and advantages of deploying such a system on a high-throughput network, with both high availability and scalability, while efficiently supporting a large number of users (and their workstations).
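
    Because snapshots are immutable, a cache of VM-image versions never has to invalidate a verified entry; it only fills misses and checks integrity. A minimal sketch of that idea follows, with the cache path and the fetch callback as assumptions rather than details from the thesis.

```python
# Minimal sketch: a local cache for immutable VM-image versions. A verified
# entry never needs invalidation, so the cache only fills misses and checks
# integrity. The cache path and fetch callback are assumptions.
import hashlib
import pathlib

CACHE = pathlib.Path("/var/cache/vmimages")  # assumed cache location

def cached_image(name: str, version: str, sha256: str, fetch) -> pathlib.Path:
    """Return a local path for image <name>@<version>, fetching on a miss."""
    CACHE.mkdir(parents=True, exist_ok=True)
    entry = CACHE / f"{name}@{version}"
    if not entry.exists():
        tmp = entry.parent / (entry.name + ".part")
        with open(tmp, "wb") as out:
            fetch(name, version, out)        # e.g. streams from a replica
        tmp.rename(entry)                    # atomic publish on POSIX
    h = hashlib.sha256()
    with open(entry, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    if h.hexdigest() != sha256:
        entry.unlink()
        raise IOError(f"corrupt cache entry: {entry}")
    return entry
```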

    Red Storm usage model: Version 1.12


    Algorithm-based Fault Tolerance for Dense Matrix Factorizations, Multiple Failures and Accuracy

    Dense matrix factorizations, such as LU, Cholesky, and QR, are widely used in scientific applications that require solving systems of linear equations, eigenvalue problems, and linear least-squares problems. Such computations are normally carried out on supercomputers, whose ever-growing scale induces a fast decline of the Mean Time To Failure (MTTF). This article proposes a new hybrid approach, based on Algorithm-Based Fault Tolerance (ABFT), to help matrix factorization algorithms survive fail-stop failures. We consider extreme conditions, such as the absence of any reliable node and the possibility of losing both data and checksum in a single failure. We present a generic solution for protecting the right factor, where the updates are applied, for all of the above-mentioned factorizations. For the left factor, where the panel has been applied, we propose a scalable checkpointing algorithm. This algorithm features a high degree of checkpointing parallelism and cooperatively utilizes the checksum storage left over from the right-factor protection. The fault-tolerant algorithms derived from this hybrid solution are applicable to a wide range of dense matrix factorizations, with minor modifications. Theoretical analysis shows that the fault-tolerance overhead decreases inversely with the number of computing units and the problem size. Experimental results of LU and QR factorization on the Kraken (Cray XT5) supercomputer validate the theoretical evaluation and confirm negligible overhead, both with and without errors. Applicability to tolerating multiple failures, and accuracy after multiple recoveries, are also considered.
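
    The core checksum invariant behind ABFT is easy to demonstrate. In the classic Huang-Abraham scheme that ABFT builds on, a column of row sums is appended before factorization; the row operations of LU preserve it, so any single lost column can be rebuilt from the checksum. The NumPy sketch below shows the invariant and a recovery; it is the textbook scheme, not the paper's full hybrid protocol.

```python
# Minimal sketch (NumPy): the checksum invariant behind ABFT. Append a
# column of row sums before elimination; LU's row operations preserve it,
# so a lost column is rebuilt as checksum minus the surviving columns.
import numpy as np

rng = np.random.default_rng(0)
n = 5
A = rng.standard_normal((n, n))
Ac = np.hstack([A, A @ np.ones((n, 1))])   # encode: last column = row sums

# Gaussian elimination (no pivoting, for clarity) on the encoded matrix.
for k in range(n - 1):
    m = Ac[k + 1:, k] / Ac[k, k]
    Ac[k + 1:] -= np.outer(m, Ac[k])       # row ops act on the checksum too

lost = 2                                    # simulate losing column 2
saved = Ac[:, lost].copy()
Ac[:, lost] = 0.0
others = [j for j in range(n) if j != lost]
Ac[:, lost] = Ac[:, n] - Ac[:, others].sum(axis=1)  # recover from checksum
assert np.allclose(Ac[:, lost], saved)
print(f"column {lost} recovered from the checksum")
```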

    Evolving an efficient and effective off-the-shelf computing infrastructure for schools in rural areas of South Africa

    Upliftment of rural areas and poverty alleviation are priorities for development in South Africa. Information and knowledge are key strategic resources for social and economic development, and ICTs act as tools to support them, enabling innovative and more cost-effective approaches. In order for ICT interventions to be possible, infrastructure has to be deployed. For the deployment to be effective and sustainable, the local community needs to be involved in shaping and supporting it. This study describes the technical work done in the Siyakhula Living Lab (SLL), a long-term ICT4D experiment in the Mbashe Municipality, with a focus on the deployment of ICT infrastructure in schools, for teaching and learning but also for use by the communities surrounding the schools. As a result of this work, computing infrastructure was deployed, in various phases, in 17 schools in the area, and a “broadband island” connecting them was created. The dissertation reports on the initial deployment phases, discussing theoretical underpinnings and policies for using technology in education, as well as the various computing and networking technologies and associated policies available and appropriate for use in rural South African schools. This information forms the backdrop of a survey conducted with teachers from six schools in the SLL, together with experimental work towards the provision of an evolved, efficient, and effective off-the-shelf computing infrastructure in selected schools, in an attempt to address the shortcomings of the computing infrastructure initially deployed in the SLL. The result of the study is the proposal of an evolved computing-infrastructure model for use in rural South African schools.