Resource Management and Scheduling for Big Data Applications in Cloud Computing Environments
This chapter presents the software architectures of big data processing
platforms. It provides in-depth knowledge of the resource management
techniques involved in deploying big data processing systems in cloud
environments. It starts from the very basics and gradually introduces the core
components of resource management, which we have divided into multiple layers. It
covers state-of-the-art practices and research in SLA-based resource
management, with a specific focus on job scheduling mechanisms.
Comment: 27 pages, 9 figures
Managing Mysql Cluster Data Using Cloudera Impala
Abstract: MySQL Cluster is a widely used clustered database for storing and manipulating data. It provides shared-nothing clustering for the MySQL database management system, offering high availability and high throughput with low latency. The problem with MySQL Cluster is that as the data grows larger, the time required to process it increases and additional resources may be needed. With Hadoop and Impala, data processing can be faster than with MySQL Cluster, and probably faster than with Hive and Pig. This paper provides preliminary results. The evaluation results indicate that Impala achieves acceptable performance for some data analysis and processing tasks, even compared with Hive, Pig, and MySQL Cluster.