76,927 research outputs found

    Toward scalable management of multiple service levels in IP networks

    This paper analyzes and discusses the role of a distributed and simple admission control (AC) model in achieving scalable management of multiple network service levels. The model design, covering explicit and implicit AC, exhibits relevant properties which allow managing QoS and SLSs in multiservice IP networks in a flexible and scalable manner. These properties stem from the way service-dependent AC and on-line service performance monitoring are proposed and articulated in the model's architecture and operation. The scalability debate, carried out at these two levels, highlights key steps toward performing self-adaptive service-oriented AC and low-overhead multiservice monitoring. The performance evaluation results, illustrating the role and relevance of the defined AC rules, show that QoS and SLS requirements can be efficiently satisfied or bounded, proving that the simplicity, flexibility and self-adaptability of the model can be explored to manage multiple service guarantees successfully.

    MonALISA : A Distributed Monitoring Service Architecture

    The MonALISA (Monitoring Agents in A Large Integrated Services Architecture) system provides a distributed monitoring service. MonALISA is based on a scalable Dynamic Distributed Services Architecture which is designed to meet the needs of physics collaborations for monitoring global Grid systems, and is implemented using JINI/JAVA and WSDL/SOAP technologies. The scalability of the system derives from the use of multithreaded Station Servers to host a variety of loosely coupled self-describing dynamic services, the ability of each service to register itself and then to be discovered and used by any other services, or clients that require such information, and the ability of all services and clients subscribing to a set of events (state changes) in the system to be notified automatically. The framework integrates several existing monitoring tools and procedures to collect parameters describing computational nodes, applications and network performance. It has built-in SNMP support and network-performance monitoring algorithms that enable it to monitor end-to-end network performance as well as the performance and state of site facilities in a Grid. MonALISA is currently running around the clock on the US CMS test Grid as well as an increasing number of other sites. It is also being used to monitor the performance and optimize the interconnections among the reflectors in the VRVS system. Comment: Talk from the 2003 Computing in High Energy and Nuclear Physics (CHEP03), La Jolla, Ca, USA, March 2003, 8 pages, pdf. PSN MOET00

    C2MS: Dynamic Monitoring and Management of Cloud Infrastructures

    Server clustering is a common design principle employed by many organisations that require high availability, scalability and easier management of their infrastructure. Servers are typically clustered according to the service they provide, for example by the application(s) installed, the role of the server or server accessibility. In order to optimize performance, manage load and maintain availability, servers may migrate from one cluster group to another, making it difficult for server monitoring tools to continuously monitor these dynamically changing groups. Server monitoring tools are usually statically configured, and any change of group membership requires manual reconfiguration; an unreasonable task to undertake on large-scale cloud infrastructures. In this paper we present the Cloudlet Control and Management System (C2MS); a system for monitoring and controlling dynamic groups of physical or virtual servers within cloud infrastructures. The C2MS extends Ganglia - an open source scalable system performance monitoring tool - by allowing system administrators to define, monitor and modify server groups without the need for server reconfiguration. In turn, administrators can easily monitor group and individual server metrics on large-scale dynamic cloud infrastructures where roles of servers may change frequently. Furthermore, we complement group monitoring with a control element allowing administrator-specified actions to be performed over servers within service groups, and we introduce further customized monitoring metrics. This paper outlines the design, implementation and evaluation of the C2MS. Comment: Proceedings of the 5th IEEE International Conference on Cloud Computing Technology and Science (CloudCom 2013), 8 pages

    Monitoring Large-Scale Cloud Systems with Layered Gossip Protocols

    Monitoring is an essential aspect of maintaining and developing computer systems, and its difficulty increases in proportion to the size of the system. The need for robust monitoring tools has become more evident with the advent of cloud computing. Infrastructure as a Service (IaaS) clouds allow end users to deploy vast numbers of virtual machines as part of dynamic and transient architectures. Current monitoring solutions, including many of those in the open-source domain, rely on outdated concepts, including manual deployment and configuration and centralised data collection, and adapt poorly to membership churn. In this paper we propose the development of a cloud monitoring suite to provide scalable and robust lookup, data collection and analysis services for large-scale cloud systems. In lieu of centrally managed monitoring we propose a multi-tier architecture using a layered gossip protocol to aggregate monitoring information and facilitate lookup, information collection and the identification of redundant capacity. This allows for a resource-aware data collection and storage architecture that operates over the system being monitored. This in turn enables monitoring to be done in-situ without the need for significant additional infrastructure to facilitate monitoring services. We evaluate this approach against alternative monitoring paradigms and demonstrate how our solution is well adapted to usage in a cloud-computing context. Comment: Extended Abstract for the ACM International Symposium on High-Performance Parallel and Distributed Computing (HPDC 2013) Poster Track
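
    The abstract above does not specify the gossip protocol itself, so the following is only a minimal sketch of the general idea of gossip-based aggregation, not the authors' layered design. It assumes a single flat layer of nodes and a generic push-pull averaging scheme; the function name `gossip_average` and its parameters are hypothetical.

```python
import random


def gossip_average(values, rounds=50, seed=0):
    """Generic push-pull averaging gossip (illustrative, single layer only).

    Each round, every node picks one random peer and both replace their
    estimate with the mean of the two. The sum of all estimates is
    preserved, so every estimate drifts toward the global average without
    any central collector.
    """
    rng = random.Random(seed)  # seeded so runs are repeatable
    estimates = [float(v) for v in values]
    n = len(estimates)
    if n < 2:
        return estimates
    for _ in range(rounds):
        for i in range(n):
            # Pick a peer uniformly among the other n - 1 nodes.
            j = rng.randrange(n - 1)
            if j >= i:
                j += 1
            mean = (estimates[i] + estimates[j]) / 2.0
            estimates[i] = estimates[j] = mean
    return estimates


if __name__ == "__main__":
    # Hypothetical per-node CPU load readings (percent).
    loads = [12.0, 30.0, 45.0, 8.0, 55.0]
    print(gossip_average(loads))
```

    A layered variant would presumably run an exchange like this within groups of nodes and again between group representatives, but that structure is an assumption here and is not taken from the abstract.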

    Results readiness in social protection and labor operations

    The main focus of the social protection and labor portfolio is on strengthening clients' institutional capacity in the design and implementation of programs, but projects are not well equipped to track progress in this area. Correspondingly, there is a need to strengthen approaches to measuring and monitoring a 'missing middle' of service delivery, precisely those areas for which counterpart institutions are responsible during the course of a project. In particular, better measures of the primary functions of social protection and labor agencies are needed, such as identifying and enrolling beneficiaries, targeting, payment systems, fraud and error control, performance monitoring of service delivery providers, responsiveness to citizens, transparency, efficiency, management information systems and monitoring and evaluation systems. New World Bank initiatives, particularly standard core indicators by sector and the introduction of results-based investment lending, call for substantial improvements in the use of monitoring and evaluation (M&E). Impact evaluations are included in about half of projects and should continue to be used selectively and strategically, particularly when the program is innovative, replicable and/or scalable to reach a broader set of beneficiaries, addresses a knowledge gap and is likely to have a substantial policy impact. Structuring evaluations around core themes with common outcome measures is fundamental to building a global knowledge base on development effectiveness. Poverty Monitoring & Analysis, Poverty and Social Impact Analysis, E-Business, Safety Nets and Transfers, Housing & Human Habitats

    A service-oriented admission control strategy for class-based IP networks

    The clear trend toward the integration of current and emerging applications and services in the Internet places new demands on service deployment and management. Distributed service-oriented traffic control mechanisms, operating with minimum impact on network performance, assume a crucial role in controlling service quality and network resources transparently and efficiently. In this paper, we describe and specify a lightweight distributed admission control (AC) model based on per-class monitoring feedback for ensuring the quality of distinct service levels in multiclass and multidomain environments. The model design, covering explicit and implicit AC, exhibits relevant properties that allow managing quality of service (QoS) and service-level specifications (SLSs) in multiservice IP networks in a flexible and scalable manner. These properties, stemming from the way service-dependent AC and on-line service performance monitoring are proposed and articulated in the model’s architecture and operation, allow self-adaptive service and resource management, while abstracting from network core complexity and heterogeneity. A proof of concept is provided to illustrate the ability of the AC criteria to satisfy multiple service class commitments efficiently. The obtained results show that the self-adaptive behavior inherent to on-line measurement-based service management, combined with the established AC rules, is effective in controlling each class's QoS and SLS commitments consistently.
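
    The abstract above names per-class monitoring feedback and explicit and implicit AC but does not give the AC rules. As a rough illustration only, a measurement-based admission check for one service class could look like the sketch below. The two conditions (a rate check against a negotiated share and a QoS threshold check), the `ClassState` fields and the `admit` function are all assumptions made for this example, not the rules defined in the paper.

```python
from dataclasses import dataclass


@dataclass
class ClassState:
    """Limits and latest monitoring feedback for one service class.

    All fields are hypothetical and chosen for illustration.
    """
    rate_limit_mbps: float      # rate share negotiated for the class
    safety_margin: float        # usable fraction of that share, e.g. 0.9
    max_loss_ratio: float       # QoS threshold for the class
    measured_rate_mbps: float   # reported by on-line monitoring
    measured_loss_ratio: float  # reported by on-line monitoring


def admit(state: ClassState, requested_rate_mbps: float = 0.0) -> bool:
    """Decide whether a new flow may join the class.

    Explicit AC: the flow declares its rate, which is added to the
    measured class rate. Implicit AC: no rate is declared (0.0), so the
    decision rests on the measurements alone.
    """
    qos_ok = state.measured_loss_ratio <= state.max_loss_ratio
    rate_ok = (state.measured_rate_mbps + requested_rate_mbps
               <= state.safety_margin * state.rate_limit_mbps)
    return qos_ok and rate_ok


if __name__ == "__main__":
    gold = ClassState(rate_limit_mbps=100.0, safety_margin=0.9,
                      max_loss_ratio=0.01, measured_rate_mbps=80.0,
                      measured_loss_ratio=0.001)
    print(admit(gold, requested_rate_mbps=5.0))   # explicit request
    print(admit(gold))                            # implicit decision
```

    Because the decision uses only the most recent per-class measurements, it adapts as monitoring feedback changes and needs no per-flow state in the network core, which is in the spirit of the self-adaptive behavior the abstract describes.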

    Assessing the overhead and scalability of system monitors for large data centers

    Current data centers are shifting towards cloud-based architectures as a means to obtain a scalable, cost-effective, robust service platform. In spite of this, the underlying management infrastructure has grown in terms of hardware resources and software complexity, making automated resource monitoring a necessity. There are several infrastructure monitoring tools designed to scale to a very high number of physical nodes. However, these tools either collect performance measures at a low frequency (missing the chance to capture the dynamics of a short-term management task) or are simply not equipped with instrumentation specific to cloud computing and virtualization. In this scenario, monitoring the correctness and efficiency of live migrations can become a nightmare. This situation will only worsen in the future, with the increased service demand due to the spreading of the user base. In this paper, we assess the scalability of a prototype monitoring subsystem for different user scenarios. We also identify all the major bottlenecks and give insight on how to remove them.

    RELEASE: A High-level Paradigm for Reliable Large-scale Server Software

    Erlang is a functional language with a much-emulated model for building reliable distributed systems. This paper outlines the RELEASE project, and describes the progress in the first six months. The project aim is to scale Erlang's radical concurrency-oriented programming paradigm to build reliable general-purpose software, such as server-based systems, on massively parallel machines. Currently Erlang has inherently scalable computation and reliability models, but in practice scalability is constrained by aspects of the language and virtual machine. We are working at three levels to address these challenges: evolving the Erlang virtual machine so that it can work effectively on large-scale multicore systems; evolving the language to Scalable Distributed (SD) Erlang; developing a scalable Erlang infrastructure to integrate multiple, heterogeneous clusters. We are also developing state-of-the-art tools that allow programmers to understand the behaviour of massively parallel SD Erlang programs. We will demonstrate the effectiveness of the RELEASE approach using demonstrators and two large case studies on a Blue Gene