Model-driven Scheduling for Distributed Stream Processing Systems

Author: Shukla Anshu
Simmhan Yogesh
Publication venue: Elsevier BV
Publication date: 06/02/2017

Distributed Stream Processing frameworks are increasingly used as the Internet of Things (IoT) evolves. These frameworks are designed to adapt to a dynamic input message rate by scaling in and out. Apache Storm, originally developed by Twitter, is a widely used stream processing engine; others include Flink and Spark Streaming. Running a streaming application successfully requires knowing its optimal resource requirement, since over-estimating resources adds extra cost. We therefore need a strategy for determining the optimal resource requirement of a given streaming application. In this article, we propose a model-driven approach for scheduling streaming applications that effectively utilizes a priori knowledge of the applications to provide predictable scheduling behavior. Specifically, we use application performance models to offer reliable estimates of the resource allocation required. Further, this intuition also drives resource mapping, and helps narrow the gap between estimated and actual dataflow performance and resource utilization. Together, this model-driven scheduling approach gives predictable application performance and resource utilization behavior for executing a given DSPS application at a target input stream rate on distributed resources. Comment: 54 page
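The idea of using a performance model to estimate resource allocation can be illustrated with a minimal sketch. It is not the authors' model: it assumes each task has a profiled peak rate per instance and a selectivity (output messages per input message), and the names `TaskModel`, `peak_rate`, `selectivity`, and `estimate_instances`, as well as all numbers, are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class TaskModel:
    """Hypothetical per-task performance model obtained by profiling."""
    name: str
    peak_rate: float    # msgs/sec one instance can sustain (assumed value)
    selectivity: float  # output msgs emitted per input msg (assumed value)


def estimate_instances(pipeline, input_rate):
    """Estimate instances per task for a linear pipeline at a target input rate."""
    plan = {}
    rate = input_rate
    for task in pipeline:
        # Enough instances so that no instance exceeds its profiled peak rate.
        plan[task.name] = math.ceil(rate / task.peak_rate)
        # The rate seen by the next task depends on this task's selectivity.
        rate *= task.selectivity
    return plan


# Illustrative three-task pipeline; the figures are made up for the example.
pipeline = [
    TaskModel("parse", peak_rate=500.0, selectivity=1.0),
    TaskModel("filter", peak_rate=200.0, selectivity=0.5),
    TaskModel("aggregate", peak_rate=100.0, selectivity=0.1),
]
plan = estimate_instances(pipeline, input_rate=1000.0)
print(plan)
```

Under these assumptions the estimate is fixed before deployment, which is the kind of predictability the abstract describes; a real model would also have to account for resource mapping and non-linear dataflows.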

arXiv.org e-Print Archive

Open Access Repository of IISc Research Publications

A New Efficient Cloud Model for Data Intensive Application

Author: Dr. N P Kavya
Rama Satish K V
Publication venue: Global Journals Inc. (US)
Publication date: 15/01/2015

Cloud computing plays an important role in data-intensive applications, since it provides consistent performance over time along with scalability and good fault-tolerance mechanisms. Hadoop provides a scalable, data-intensive MapReduce architecture. Hadoop map tasks are executed on large clusters and consume a lot of energy and resources, both of which are expensive, so minimizing cost and resource usage is critical for a MapReduce application. In this paper we propose a novel, efficient cloud structure algorithm for data processing or computation on the Azure cloud. Specifically, we propose an efficient BSP-based dynamic scheduling algorithm for iterative MapReduce for data-intensive applications on the Microsoft Azure cloud platform. Our framework can be used for applications in different domains, such as data analysis, medical research, and data mining. We analyze the performance of our system using co-located caching on the worker role, and how it improves the performance of data-intensive applications over Hadoop MapReduce data-intensive applications. The experimental results show that our proposed framework properly utilizes cloud infrastructure services, management overheads, and bandwidth bottlenecks, and that it is highly scalable, fault tolerant, and efficient.
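The combination of iterative MapReduce, BSP-style barriers, and worker-side caching can be sketched in a few lines. This is only an illustration under stated assumptions, not the paper's algorithm: it runs sequentially in one process, uses no Azure APIs, omits dynamic scheduling, and uses 1-D k-means as a stand-in iterative workload. All names (`CachingWorker`, `map_partition`, `reduce_partials`, `run_supersteps`) are hypothetical.

```python
from collections import defaultdict


class CachingWorker:
    """Hypothetical worker that keeps static input partitions in a local cache."""

    def __init__(self, storage):
        self.storage = storage  # stands in for remote blob storage
        self.cache = {}
        self.loads = 0          # number of (simulated) remote fetches

    def partition(self, pid):
        if pid not in self.cache:
            self.cache[pid] = self.storage[pid]
            self.loads += 1
        return self.cache[pid]


def map_partition(points, centroids):
    """Map: per-centroid partial [sum, count] for one partition."""
    partial = defaultdict(lambda: [0.0, 0])
    for p in points:
        nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
        partial[nearest][0] += p
        partial[nearest][1] += 1
    return partial


def reduce_partials(partials, centroids):
    """Reduce: merge partials and recompute centroids (empty ones keep their value)."""
    totals = defaultdict(lambda: [0.0, 0])
    for partial in partials:
        for idx, (total, count) in partial.items():
            totals[idx][0] += total
            totals[idx][1] += count
    return [totals[i][0] / totals[i][1] if totals[i][1] else c
            for i, c in enumerate(centroids)]


def run_supersteps(worker, centroids, max_supersteps=10):
    for _ in range(max_supersteps):
        # Superstep: every map finishes before the reduce runs (the barrier).
        partials = [map_partition(worker.partition(pid), centroids)
                    for pid in sorted(worker.storage)]
        updated = reduce_partials(partials, centroids)
        if updated == centroids:
            break
        centroids = updated
    return centroids


storage = {0: [1.0, 2.0], 1: [9.0, 10.0]}
worker = CachingWorker(storage)
result = run_supersteps(worker, [0.0, 5.0])
print(result, worker.loads)
```

In this sketch each partition is fetched once and reused in every later superstep, which is the benefit the abstract attributes to caching on the worker role; only the small centroid list changes between iterations.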

Global Journal of Computer Science and Technology (GJCST)