5,319 research outputs found

    InterCloud: Utility-Oriented Federation of Cloud Computing Environments for Scaling of Application Services

    Full text link
    Cloud computing providers have setup several data centers at different geographical locations over the Internet in order to optimally serve needs of their customers around the world. However, existing systems do not support mechanisms and policies for dynamically coordinating load distribution among different Cloud-based data centers in order to determine optimal location for hosting application services to achieve reasonable QoS levels. Further, the Cloud computing providers are unable to predict geographic distribution of users consuming their services, hence the load coordination must happen automatically, and distribution of services must change in response to changes in the load. To counter this problem, we advocate creation of federated Cloud computing environment (InterCloud) that facilitates just-in-time, opportunistic, and scalable provisioning of application services, consistently achieving QoS targets under variable workload, resource and network conditions. The overall goal is to create a computing environment that supports dynamic expansion or contraction of capabilities (VMs, services, storage, and database) for handling sudden variations in service demands. This paper presents vision, challenges, and architectural elements of InterCloud for utility-oriented federation of Cloud computing environments. The proposed InterCloud environment supports scaling of applications across multiple vendor clouds. We have validated our approach by conducting a set of rigorous performance evaluation study using the CloudSim toolkit. The results demonstrate that federated Cloud computing model has immense potential as it offers significant performance gains as regards to response time and cost saving under dynamic workload scenarios.Comment: 20 pages, 4 figures, 3 tables, conference pape

    Improving Energy Efficiency of MapReduce Framework using Dynamic Scheduling of Work

    Get PDF
    Most common huge volume data processing programs do counting, sorting, merging etc. Such programs require to perform first a computation on each record that is it requires to map an operation to each record. Then combine the output of these operations in appropriate way to get the answer that is apply a reduce operation to groups of records. MapReduce runtime environment takes care of parallelizing their execution and coordinating their inputs/outputs. Here we are concern about energy efficiency in MapReduce framework so we are proposing dynamic scheduling of workload which offers dynamic load balancing method. Load balancing is the methodology of distributing the load among different node of a distributed framework to enhance both resource usage and reaction time while likewise keeping away from a circumstance where a percentage of the node are intensely stacked while different node are sit out of gear or doing next to no work. An answer for unbalance circumstance is to utilize parallelization approaches yet at the same time node will stay overwhelming. In this paper, we propose an integrated. We are proposing a methodology where the MapReduce concept introduced into the MongoDB with NoSQL as a back end to implement the MapReduce

    ROOT - A C++ Framework for Petabyte Data Storage, Statistical Analysis and Visualization

    Full text link
    ROOT is an object-oriented C++ framework conceived in the high-energy physics (HEP) community, designed for storing and analyzing petabytes of data in an efficient way. Any instance of a C++ class can be stored into a ROOT file in a machine-independent compressed binary format. In ROOT the TTree object container is optimized for statistical data analysis over very large data sets by using vertical data storage techniques. These containers can span a large number of files on local disks, the web, or a number of different shared file systems. In order to analyze this data, the user can chose out of a wide set of mathematical and statistical functions, including linear algebra classes, numerical algorithms such as integration and minimization, and various methods for performing regression analysis (fitting). In particular, ROOT offers packages for complex data modeling and fitting, as well as multivariate classification based on machine learning techniques. A central piece in these analysis tools are the histogram classes which provide binning of one- and multi-dimensional data. Results can be saved in high-quality graphical formats like Postscript and PDF or in bitmap formats like JPG or GIF. The result can also be stored into ROOT macros that allow a full recreation and rework of the graphics. Users typically create their analysis macros step by step, making use of the interactive C++ interpreter CINT, while running over small data samples. Once the development is finished, they can run these macros at full compiled speed over large data sets, using on-the-fly compilation, or by creating a stand-alone batch program. Finally, if processing farms are available, the user can reduce the execution time of intrinsically parallel tasks - e.g. data mining in HEP - by using PROOF, which will take care of optimally distributing the work over the available resources in a transparent way

    Resource Management in Grids: Overview and a discussion of a possible approach for an Agent-Based Middleware

    Get PDF
    14 pagesInternational audienceResource management and job scheduling are important research issues in computational grids. When software agents are used as resource managers and brokers in the Grid a number of additional issues and possible approaches materialize. The aim of this chapter is twofold. First, we discuss traditional job scheduling in grids, and when agents are utilized as grid middleware. Second, we use this as a context for discussion of how job scheduling can be done in the agent-based system under development

    The role of the host in a cooperating mainframe and workstation environment, volumes 1 and 2

    Get PDF
    In recent years, advancements made in computer systems have prompted a move from centralized computing based on timesharing a large mainframe computer to distributed computing based on a connected set of engineering workstations. A major factor in this advancement is the increased performance and lower cost of engineering workstations. The shift to distributed computing from centralized computing has led to challenges associated with the residency of application programs within the system. In a combined system of multiple engineering workstations attached to a mainframe host, the question arises as to how does a system designer assign applications between the larger mainframe host and the smaller, yet powerful, workstation. The concepts related to real time data processing are analyzed and systems are displayed which use a host mainframe and a number of engineering workstations interconnected by a local area network. In most cases, distributed systems can be classified as having a single function or multiple functions and as executing programs in real time or nonreal time. In a system of multiple computers, the degree of autonomy of the computers is important; a system with one master control computer generally differs in reliability, performance, and complexity from a system in which all computers share the control. This research is concerned with generating general criteria principles for software residency decisions (host or workstation) for a diverse yet coupled group of users (the clustered workstations) which may need the use of a shared resource (the mainframe) to perform their functions
    corecore