
    A Holistic Approach to Log Data Analysis in High-Performance Computing Systems: The Case of IBM Blue Gene/Q

    The complexity and cost of managing high-performance computing infrastructures are on the rise. Automating management and repair through predictive models, so as to minimize human intervention, is one attempt to increase system availability and contain these costs. Predictive models accurate enough to be useful in automatic management cannot be built on restricted log data from individual subsystems; they require a holistic approach to data analysis from disparate sources. Here we provide a detailed multi-scale characterization study based on four datasets reporting power consumption, temperature, workload, and hardware/software events for an IBM Blue Gene/Q installation. We show that the system runs a rich parallel workload, with low correlation among its components in terms of temperature and power, but higher correlation in terms of events. As expected, power and temperature correlate strongly, while events display negative correlations with load and power. Power and workload show moderate correlations, and only at the scale of individual components. The aim of the study is a systematic, integrated characterization of the computing infrastructure and the discovery of correlation sources and levels, to serve as a basis for future predictive modeling efforts.
    Comment: 12 pages, 7 figures
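
    The correlation analysis the abstract describes can be sketched in a few lines. The sketch below assumes the four datasets have already been parsed into datetime-indexed per-component pandas series; the column names, resampling windows, and use of Pearson correlation are illustrative assumptions, not details taken from the paper.

        # Pairwise correlation of power, temperature, workload, and event
        # signals at a chosen time scale (hypothetical preprocessing is
        # assumed to have produced datetime-indexed pandas Series).
        import pandas as pd

        def correlate_signals(power, temp, load, events, window="1h"):
            """Resample all signals to one time scale and return their
            pairwise Pearson correlation matrix."""
            frame = pd.DataFrame({
                "power": power.resample(window).mean(),
                "temperature": temp.resample(window).mean(),
                "workload": load.resample(window).mean(),
                "events": events.resample(window).sum(),  # counts per window
            }).dropna()
            return frame.corr(method="pearson")

        # Repeating the analysis at several windows gives the multi-scale view:
        # for window in ("1min", "1h", "1d"):
        #     print(window, correlate_signals(power, temp, load, events, window))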

    InterCloud: Utility-Oriented Federation of Cloud Computing Environments for Scaling of Application Services

    Cloud computing providers have set up several data centers at different geographical locations over the Internet in order to optimally serve the needs of their customers around the world. However, existing systems do not support mechanisms and policies for dynamically coordinating load distribution among different Cloud-based data centers in order to determine the optimal location for hosting application services and achieve reasonable QoS levels. Further, Cloud computing providers are unable to predict the geographic distribution of the users consuming their services; hence, load coordination must happen automatically, and the distribution of services must change in response to changes in the load. To counter this problem, we advocate the creation of a federated Cloud computing environment (InterCloud) that facilitates just-in-time, opportunistic, and scalable provisioning of application services, consistently achieving QoS targets under variable workload, resource, and network conditions. The overall goal is to create a computing environment that supports dynamic expansion or contraction of capabilities (VMs, services, storage, and databases) for handling sudden variations in service demands. This paper presents the vision, challenges, and architectural elements of InterCloud for utility-oriented federation of Cloud computing environments. The proposed InterCloud environment supports scaling of applications across multiple vendor clouds. We have validated our approach by conducting a set of rigorous performance evaluation studies using the CloudSim toolkit. The results demonstrate that the federated Cloud computing model has immense potential, offering significant performance gains in response time and cost savings under dynamic workload scenarios.
    Comment: 20 pages, 4 figures, 3 tables, conference paper
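
    The core brokering decision in such a federation, routing each request to the data center that currently offers the best trade-off between load and user proximity, can be illustrated with a small sketch. This is a conceptual analogue, not the paper's CloudSim code; the data center names, latency table, and scoring weights are hypothetical.

        # Pick the federated data center minimizing a weighted load/latency score.
        from dataclasses import dataclass, field

        @dataclass
        class DataCenter:
            name: str
            capacity: int                  # VM slots
            active_vms: int = 0
            latency_ms: dict = field(default_factory=dict)  # per user region

            def utilization(self):
                return self.active_vms / self.capacity

        def choose_datacenter(centers, user_region, w_load=0.6, w_latency=0.4):
            """Return the best-scoring data center that still has free capacity."""
            def score(dc):
                return (w_load * dc.utilization()
                        + w_latency * dc.latency_ms.get(user_region, 500) / 500)
            eligible = [dc for dc in centers if dc.active_vms < dc.capacity]
            return min(eligible, key=score) if eligible else None

        centers = [
            DataCenter("eu-west", capacity=100, active_vms=80,
                       latency_ms={"EU": 20, "US": 120}),
            DataCenter("us-east", capacity=100, active_vms=30,
                       latency_ms={"EU": 110, "US": 25}),
        ]
        dc = choose_datacenter(centers, "EU")
        if dc:
            dc.active_vms += 1             # provision a VM at the chosen site
            print("provisioned at", dc.name)   # us-east: the lower load wins here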

    RELEASE: A High-level Paradigm for Reliable Large-scale Server Software

    Erlang is a functional language with a much-emulated model for building reliable distributed systems. This paper outlines the RELEASE project and describes its progress in the first six months. The project aim is to scale Erlang's radical concurrency-oriented programming paradigm to build reliable general-purpose software, such as server-based systems, on massively parallel machines. Erlang currently has inherently scalable computation and reliability models, but in practice scalability is constrained by aspects of the language and virtual machine. We are working at three levels to address these challenges: evolving the Erlang virtual machine so that it can work effectively on large-scale multicore systems; evolving the language to Scalable Distributed (SD) Erlang; and developing a scalable Erlang infrastructure to integrate multiple, heterogeneous clusters. We are also developing state-of-the-art tools that allow programmers to understand the behaviour of massively parallel SD Erlang programs. We will demonstrate the effectiveness of the RELEASE approach using demonstrators and two large case studies on a Blue Gene.
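
    The reliability model referred to here rests on supervision: isolated processes whose failures are detected and handled by restarting them. RELEASE itself works in Erlang/SD Erlang; the sketch below is only a minimal Python analogue of the one-for-one restart idea, with hypothetical worker logic.

        # Erlang-style one-for-one supervision, approximated with OS processes.
        import multiprocessing as mp

        def worker(task_queue):
            while True:
                task = task_queue.get()
                if task is None:               # sentinel: clean shutdown
                    return
                if task == "crash":            # simulate a fault
                    raise RuntimeError("worker failed")
                print("handled", task)

        def supervise(task_queue, max_restarts=3):
            """Restart the worker each time it dies, up to a limit."""
            restarts = 0
            while True:
                proc = mp.Process(target=worker, args=(task_queue,))
                proc.start()
                proc.join()                    # returns when the worker exits
                if proc.exitcode == 0:
                    return                     # clean shutdown, nothing to do
                restarts += 1
                if restarts > max_restarts:
                    print("restart limit reached; escalating")
                    return
                print(f"worker died (exit {proc.exitcode}); restarting")

        if __name__ == "__main__":
            q = mp.Queue()
            for t in ("a", "crash", "b", None):
                q.put(t)
            supervise(q)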

    Securing critical utility systems & network infrastructures

    Master's thesis, Information Security (Segurança Informática), Universidade de Lisboa, Faculdade de Ciências, 2009. Modern critical utility IT infrastructures are supported by numerous complex systems. These systems allow real-time management and information collection, which is the basis of efficient service management operations. The use of commercial off-the-shelf (COTS) hardware and software in SCADA systems brought major financial advantages in purchasing and developing the technical solutions that support public utilities. On the other hand, the generalized use of COTS hardware and software in SCADA systems exposed critical infrastructures to the security problems of a corporate IT infrastructure. A significant challenge for IT teams is the effective management of the systems and networks that comprise critical utility infrastructures, even as these organizations increasingly adopt standards and best practices for management, operations, and configuration. This research project proposes to survey integrated management software that can address the specific security constraints of a SCADA infrastructure supported by COTS software. Additionally, it proposes to investigate techniques for creating operational profiles of the Operating Systems supporting critical utility IT infrastructures, and it will describe how strategic management decisions impact the tactical operations management of an IT environment. We will investigate desirable technical management elements in support of operational management.
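
    One plausible reading of the proposed operational profiles, offered here only as an illustrative sketch, is a baseline of OS-level metrics against which later samples are compared. The metric choice, sampling scheme, and 3-sigma threshold below are assumptions, not the thesis's method; the example uses the psutil library.

        # Build a baseline operational profile of the host OS and flag
        # samples that drift from it (hypothetical metrics and threshold).
        import statistics
        import psutil

        def sample():
            return {
                "cpu_percent": psutil.cpu_percent(interval=1),
                "mem_percent": psutil.virtual_memory().percent,
                "process_count": len(psutil.pids()),
            }

        def deviations(baseline, current, k=3.0):
            """Report metrics more than k standard deviations from baseline."""
            flagged = []
            for metric, value in current.items():
                values = [s[metric] for s in baseline]
                mu, sigma = statistics.mean(values), statistics.pstdev(values)
                if sigma and abs(value - mu) > k * sigma:
                    flagged.append(metric)
            return flagged

        baseline = [sample() for _ in range(30)]   # learn normal behaviour
        print(deviations(baseline, sample()))      # [] unless something drifted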

    Technology infrastructure in information technology industries

    Get PDF
    Abstract not available.
    Keywords: economics of technology; business administration and economics

    Technical alignment

    Get PDF
    This essay discusses the importance of the areas of infrastructure and testing in helping digital preservation services demonstrate reliability, transparency, and accountability. It encourages practitioners to build a strong culture in which transparency and collaboration across technical frameworks are valued highly. It also argues for devising and applying agreed-upon metrics that will enable the systematic analysis of preservation infrastructure. The essay begins by defining technical infrastructure and testing in the digital preservation context, provides case studies that exemplify both progress and challenges for technical alignment in both areas, and concludes with suggestions for achieving greater degrees of technical alignment going forward.

    Data center virtualization and its economic implications for the companies

    Get PDF
    In the current economic crisis, when companies target budget cuts in a context of explosive data growth, the IT community must evaluate potential technology developments not only on their technical advantages but also on their economic effects.
    Keywords: data centre; virtualization; tiered storage; provisioning software; unified computing
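
    The kind of economic evaluation the abstract calls for can be made concrete with a back-of-the-envelope consolidation model. All prices and ratios below are hypothetical, purely to show the shape of the calculation.

        # Capital cost saved by consolidating workloads onto fewer virtualized
        # hosts (illustrative figures, not data from the paper).
        def consolidation_savings(physical_servers, consolidation_ratio,
                                  cost_per_server, hypervisor_cost_per_host):
            hosts_needed = -(-physical_servers // consolidation_ratio)  # ceiling
            before = physical_servers * cost_per_server
            after = hosts_needed * (cost_per_server + hypervisor_cost_per_host)
            return before - after

        # e.g. 100 servers, a 10:1 ratio, $5,000/server, $2,000 hypervisor licence
        print(consolidation_savings(100, 10, 5_000, 2_000))   # 430000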