
    High availability using virtualization

    High availability has always been one of the main problems for a data center. Until now, high availability has been achieved through per-host redundancy, a highly expensive approach in terms of hardware and human costs. Virtualization offers a new approach to the problem: it makes it possible to build a redundancy system for all the services running in a data center. This approach shares the running virtual machines across the physical servers that are up and running, exploiting the features of the virtualization layer: starting, stopping and moving virtual machines between physical hosts. The system (3RC) is based on a finite state machine with hysteresis, providing the ability to restart each virtual machine on any physical host, or to reinstall it from scratch. A complete infrastructure has been developed to install the operating system and middleware in a few minutes. To virtualize the main servers of a data center, a new procedure has been developed to migrate physical hosts to virtual ones. The whole SNS-PISA Grid data center currently runs in a virtual environment under this high availability system. As an extension of the 3RC architecture, several storage solutions, from NAS to SAN, have been tested to store and centralize all the virtual disks, guaranteeing data safety and access from anywhere. By exploiting virtualization and the ability to automatically reinstall a host, we provide a sort of host on-demand, where action on a virtual machine is taken only when a disaster occurs. Comment: PhD Thesis in Information Technology Engineering: Electronics, Computer Science, Telecommunications, pp. 94, University of Pisa [Italy]
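    To make the recovery logic concrete, the following sketch shows a finite state machine with hysteresis of the kind described above: a virtual machine is restarted on another host only after several consecutive failed health checks, and reinstalled from scratch only after repeated failed restarts. This is a minimal illustration in Python; the state names, thresholds and return values are assumptions, not the 3RC implementation.

```python
# Illustrative sketch only: a hysteresis-based recovery state machine in the
# spirit of 3RC. States, thresholds and actions are assumptions, not the
# thesis implementation.
RUNNING, RESTARTING, REINSTALLING = "running", "restarting", "reinstalling"

class VmRecoveryFsm:
    def __init__(self, restart_after=3, reinstall_after=2):
        self.state = RUNNING
        self.failed_checks = 0                  # consecutive failed health checks
        self.failed_restarts = 0                # consecutive unsuccessful restarts
        self.restart_after = restart_after      # hysteresis threshold before a restart
        self.reinstall_after = reinstall_after  # failed restarts tolerated before reinstall

    def on_health_check(self, healthy: bool) -> str:
        """Return the action to take on the VM: 'none', 'restart' or 'reinstall'."""
        if healthy:
            # Hysteresis: only a healthy check resets the counters, so
            # transient flapping never escalates to a reinstall.
            self.state = RUNNING
            self.failed_checks = 0
            self.failed_restarts = 0
            return "none"
        self.failed_checks += 1
        if self.failed_checks < self.restart_after:
            return "none"                       # tolerate transient failures
        # Threshold reached: escalate and start counting restart attempts.
        self.failed_checks = 0
        self.failed_restarts += 1
        if self.failed_restarts <= self.reinstall_after:
            self.state = RESTARTING
            return "restart"                    # restart the VM on any available host
        self.state = REINSTALLING
        return "reinstall"                      # last resort: reinstall from scratch
```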

    Consumer side resource accounting in cloud computing

    PhD Thesis. Cloud computing services made available to consumers range from providing basic computational resources, such as storage and compute power, to sophisticated enterprise application services. A common business model is to charge consumers on a pay-per-use basis, where they periodically pay for the resources they have consumed. The provider is responsible for measuring and collecting the resource usage data. This approach is termed provider-side accounting. A serious limitation of this approach is that consumers have no choice but to accept whatever usage data the provider makes available as trustworthy. This thesis investigates whether it is possible to perform consumer-side resource accounting, where a consumer independently collects, for a given cloud service, all the data required for calculating billing charges. If this were possible, consumers would be able to perform reasonableness checks on the resource usage data available from service providers, as well as raise alarms when apparent discrepancies are suspected in consumption figures. Two fundamental resources of cloud computing, namely storage and computing, are evaluated. The evaluation reveals that the resource accounting models of popular cloud service providers, such as Amazon, are not entirely suited to consumer-side resource accounting, in that discrepancies between the data collected by the provider and by the consumer can occur. The thesis precisely identifies the causes that could lead to such discrepancies and points out how they can be resolved. The results can be used by service providers to improve their resource accounting models. In particular, the thesis shows how an accounting model can be made strongly consumer-centric, so that all the data the model requires for calculating billing charges can be collected independently by the consumer. Strongly consumer-centric accounting models have the desirable properties of openness and transparency, since service users are in a position to verify the charges billed to them. Cultural Affairs Department, Libyan Embassy, London
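    As an illustration of the kind of reasonableness check a consumer-side accounting model enables, the sketch below compares a provider's bill against charges computed purely from consumer-collected usage records. The record format, prices and tolerance are assumptions for illustration, not the accounting model defined in the thesis.

```python
# Illustrative sketch of a consumer-side reasonableness check. The record
# format, prices and tolerance are assumptions, not the thesis's model.
from dataclasses import dataclass

@dataclass
class UsageRecord:
    resource: str        # e.g. "storage-GB-hours" or "compute-hours"
    quantity: float      # amount the consumer measured independently
    unit_price: float    # advertised price per unit

def consumer_side_charge(records: list[UsageRecord]) -> float:
    """Charge the consumer expects, computed only from consumer-collected data."""
    return sum(r.quantity * r.unit_price for r in records)

def bill_is_reasonable(provider_charge: float, records: list[UsageRecord],
                       tolerance: float = 0.05) -> bool:
    """Flag the bill if it deviates from the consumer's own figure by more
    than the given relative tolerance."""
    expected = consumer_side_charge(records)
    return abs(provider_charge - expected) / max(expected, 1e-9) <= tolerance

# Example: 100 GB stored for 720 hours plus 50 compute hours.
records = [UsageRecord("storage-GB-hours", 100 * 720, 0.0001),
           UsageRecord("compute-hours", 50, 0.10)]
assert bill_is_reasonable(provider_charge=12.20, records=records)
```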

    A Taxonomy of Data Grids for Distributed Data Sharing, Management and Processing

    Data Grids have been adopted as the platform for scientific communities that need to share, access, transport, process and manage large data collections distributed worldwide. They combine high-end computing technologies with high-performance networking and wide-area storage management techniques. In this paper, we discuss the key concepts behind Data Grids and compare them with other data sharing and distribution paradigms such as content delivery networks, peer-to-peer networks and distributed databases. We then provide comprehensive taxonomies that cover various aspects of architecture, data transportation, data replication, and resource allocation and scheduling. Finally, we map the proposed taxonomy to various Data Grid systems, not only to validate the taxonomy but also to identify areas for future exploration. Through this taxonomy, we aim to categorise existing systems in order to better understand their goals and methodology, which helps in evaluating their applicability to similar problems. The taxonomy also provides a "gap analysis" of the area, through which researchers can identify new issues for investigation. We hope that the proposed taxonomy and mapping also provide an easy way for new practitioners to understand this complex area of research. Comment: 46 pages, 16 figures, Technical Report
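    As a rough illustration of what mapping systems onto such a taxonomy can look like, the sketch below encodes a few taxonomy dimensions as data and performs a trivial "gap analysis". The dimensions echo those named in the abstract, but the category values and the example system are assumptions, not the paper's actual classification.

```python
# Illustrative sketch: representing a Data Grid taxonomy mapping as data.
# Categories and the example classification are assumptions for illustration.
taxonomy = {
    "organization":     ["hierarchical", "federated", "hybrid"],
    "data_transport":   ["bulk-transfer", "streaming"],
    "data_replication": ["static", "dynamic"],
    "scheduling":       ["data-aware", "compute-aware"],
}

# A hypothetical system classified against each taxonomy dimension.
example_system = {
    "name": "ExampleGrid",   # hypothetical name, for illustration only
    "organization": "hierarchical",
    "data_transport": "bulk-transfer",
    "data_replication": "dynamic",
    "scheduling": "data-aware",
}

def uncovered(taxonomy, systems):
    """Simple gap analysis: taxonomy categories not covered by any surveyed system."""
    return {dim: [c for c in cats if not any(s.get(dim) == c for s in systems)]
            for dim, cats in taxonomy.items()}

print(uncovered(taxonomy, [example_system]))
```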

    Grid Computing, E-Science and Applications in Industry


    Network services for large scale distributed science instruments

    This research examined the requirements placed on computer networks by large scale scientific instrument collaborations and how those requirements are met. Specifically, this study examined and characterized the networking requirements of the Large Hadron Collider (LHC) at CERN, an example of a large scientific instrument. A number of ongoing research projects, including network measurement projects and circuit-based services projects, were surveyed to identify how the LHC requirements could be met. Requirements not previously addressed, namely measuring and managing network outages and maintenance events, were identified. A database schema that supports outage and maintenance management across multiple domains was developed, and a methodology for incorporating this capability into an existing network measurement framework was explored.
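    As a hint of what outage and maintenance management across multiple domains might involve, the sketch below models per-domain events and a simple cross-domain correlation query. The field names and logic are assumptions for illustration; they do not reproduce the database schema developed in this study.

```python
# Illustrative sketch of outage/maintenance records spanning network domains.
# Field names and correlation logic are assumptions, not the study's schema.
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

@dataclass
class NetworkEvent:
    domain: str               # administrative domain reporting the event
    link_id: str              # affected link or circuit identifier
    kind: str                 # "outage" or "maintenance"
    start: datetime
    end: Optional[datetime]   # None while the event is still open
    description: str = ""

def active_events(events: list[NetworkEvent], t: datetime) -> list[NetworkEvent]:
    """Events from any domain that were active at time t, useful when
    explaining a gap seen in end-to-end measurements."""
    return [e for e in events
            if e.start <= t and (e.end is None or t <= e.end)]
```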

    EMMON - EMbedded MONitoring

    Despite the steady increase in experimental deployments, most research work on WSNs has focused only on communication protocols and algorithms, with a clear lack of effective, feasible and usable system architectures integrated into a modular platform able to address both functional and non-functional requirements. In this paper, we outline EMMON [1], a full WSN-based system architecture for large-scale, dense and real-time embedded monitoring [3] applications. EMMON provides a hierarchical communication architecture together with integrated middleware and command-and-control software. We then present EM-Set, the EMMON engineering toolset, which includes tools for network deployment planning, worst-case analysis and dimensioning, protocol simulation, and automatic remote programming and hardware testing. This toolset was crucial for the development of EMMON, which was designed to use standard commercially available technologies while maintaining as much flexibility as possible to meet specific application requirements. Finally, the EMMON architecture has been validated through extensive simulation and experimental evaluation, including a testbed of more than 300 nodes.
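    To give a flavour of the hierarchical communication architecture described above, the sketch below aggregates sensor readings at cluster heads before forwarding per-cluster summaries to a sink. The roles, message format and aggregation rule are illustrative assumptions, not the EMMON design.

```python
# Illustrative sketch of hierarchical aggregation in a WSN. Roles, message
# format and the aggregation rule (mean per cluster) are assumptions, not
# the EMMON protocol stack.
from collections import defaultdict
from statistics import mean

def cluster_head_aggregate(readings):
    """readings: list of (cluster_id, node_id, value). Each cluster head
    forwards one summary value per reporting period instead of every sample,
    which is what keeps a large, dense deployment scalable."""
    per_cluster = defaultdict(list)
    for cluster_id, _node_id, value in readings:
        per_cluster[cluster_id].append(value)
    return {cid: mean(vals) for cid, vals in per_cluster.items()}

def sink_receive(cluster_summaries):
    """The command-and-control layer sees only per-cluster summaries."""
    return sorted(cluster_summaries.items())

# Example: six readings from two clusters collapse to two upstream messages.
summaries = cluster_head_aggregate([
    ("c1", "n1", 21.5), ("c1", "n2", 22.0), ("c1", "n3", 21.0),
    ("c2", "n4", 30.2), ("c2", "n5", 29.8), ("c2", "n6", 30.0),
])
print(sink_receive(summaries))
```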

    Databases in High Energy Physics: a critical review

    The year 2000 is marked by a plethora of significant milestones in the history of High Energy Physics. Not only the true numerical end of the second millennium, this watershed year saw the final run of CERN's Large Electron-Positron collider (LEP) - the world-class machine that had been the focus of the lives of many of us for such a long time. It is also closely related to the subject of this chapter in the following respects:
    - Classified as a nuclear installation, information on the LEP machine must be retained indefinitely. This represents a challenge to the database community that is almost beyond discussion - archiving data for a relatively small number of years is indeed feasible, but retaining it for centuries, millennia or more is a very different issue;
    - There are strong scientific arguments as to why the data from the LEP machine should be retained for a short period. However, the complexity of the data itself, the associated metadata and the programs that manipulate it make even this a huge challenge;
    - The story of databases in HEP is closely linked to that of LEP itself: what were the basic requirements that were identified in the early years of LEP preparation? How well have these been satisfied? What are the remaining issues and key messages?
    - Finally, the year 2000 also marked the entry of Grid architectures onto the central stage of HEP computing. How has the Grid affected the requirements on databases or the manner in which they are deployed? Furthermore, as the LEP tunnel and even parts of the detectors that it housed are readied for re-use for the Large Hadron Collider (LHC), how have our requirements on databases evolved at this new scale of computing?
    A number of the key players in the field of databases - as can be seen from the author lists of the various publications - have since retired from the field, or else from this world. Given the fallibility of human memory, a record of the use of databases for physics data processing is clearly needed before memories fade completely and the story is lost forever. This account is necessarily somewhat CERN-centric, although an effort has been made to cover important developments and events elsewhere. Frequent reference is made to the Computing in High Energy Physics (CHEP) conference series - the most accessible and consistent record of this field.