243 research outputs found

    Resource Management and Scheduling for Big Data Applications in Cloud Computing Environments

    This chapter presents the software architectures of big data processing platforms. It provides in-depth knowledge of the resource management techniques involved in deploying big data processing systems in cloud environments. It starts from the basics and gradually introduces the core components of resource management, which we have divided into multiple layers. It covers state-of-the-art practices and research in SLA-based resource management, with a specific focus on job scheduling mechanisms. Comment: 27 pages, 9 figures

    Managing Mysql Cluster Data Using Cloudera Impala

    Abstract: MySQL Cluster is a widely used clustered database for storing and manipulating data. It provides shared-nothing clustering for the MySQL database management system, delivering high availability and high throughput with low latency. The problem with MySQL Cluster is that as the data grows larger, the time required to process it increases and additional resources may be needed. With Hadoop and Impala, data processing can be faster than with MySQL Cluster, and probably faster than with Hive and Pig. This paper provides preliminary results. The evaluation results indicate that Impala achieves acceptable performance for some data analysis and processing tasks, even compared with Hive, Pig, and MySQL Cluster.