Performance Prediction of Cloud-Based Big Data Applications
Big data analytics have become widespread as a means to extract knowledge from large datasets. Yet, the heterogeneity and irregularity usually associated with big data applications often overwhelm the existing software and hardware infrastructures. In such context, the flexibility and elasticity provided by the cloud computing paradigm offer a natural approach to cost-effectively adapting the allocated resources to the application’s current needs. However, these same characteristics impose extra challenges to predicting the performance of cloud-based big data applications, a key step to proper management and planning. This paper explores three modeling approaches for performance prediction of cloud-based big data applications. We evaluate two queuing-based analytical models and a novel fast ad hoc simulator in various scenarios based on different applications and infrastructure setups. The three approaches are compared in terms of prediction accuracy, finding that our best approaches can predict average application execution times with 26% relative error in the very worst case and about 7% on average.
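The abstract does not specify the queuing models it evaluates; as a hedged illustration of the general idea, a single-server M/M/1 approximation predicts mean response time from arrival and service rates (a sketch under simplifying assumptions, not the paper's actual model):

```python
def mm1_response_time(arrival_rate: float, service_rate: float) -> float:
    """Mean response time of an M/M/1 queue: T = 1 / (mu - lambda).
    Valid only while the server is not saturated (lambda < mu)."""
    if arrival_rate >= service_rate:
        raise ValueError("unstable system: arrival rate >= service rate")
    return 1.0 / (service_rate - arrival_rate)

# Example: jobs arrive at 8/s, a node serves 10/s -> mean time 0.5 s
t = mm1_response_time(8.0, 10.0)
```

Analytical models like this trade accuracy for speed; the paper's ad hoc simulator presumably captures effects such a closed-form model cannot.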
Layered performance modelling and evaluation for cloud topic detection and tracking based big data applications
“Big Data”, best characterized by its three features, namely
“Variety”, “Volume”, and “Velocity”, is revolutionizing
nearly every aspect of our lives, ranging from enterprises to
consumers, from science to government. A fourth characteristic,
namely “Value”, is delivered via the use of smart data
analytics over Big Data. One such Big Data analytics application
considered in this thesis is Topic Detection and Tracking (TDT).
The characteristics of Big Data bring with them unprecedented
challenges: data too large for traditional devices to process
and store (volume), arriving too fast for traditional methods to
scale (velocity), and heterogeneous in form (variety). In recent
times, cloud computing has emerged as a practical and technical
solution for processing Big Data. However, when deploying Big Data
analytics applications such as TDT in the cloud (called cloud-based
TDT), the challenge is to cost-effectively orchestrate and
provision cloud resources to meet performance Service Level
Agreements (SLAs). Although limited work exists on
performance modeling of cloud-based TDT applications, none of
these methods can be directly applied to guarantee the
performance SLAs of cloud-based TDT applications. For instance,
the current literature lacks a systematic, reliable, and accurate
methodology to measure, predict, and ultimately guarantee the
performance of TDT applications. Furthermore, existing
performance models fail to consider the end-to-end complexity of
TDT applications and focus only on individual processing
components (e.g., MapReduce).
To tackle this challenge, in this thesis we develop a layered
performance model of cloud-based TDT applications that takes into
account Big Data characteristics, the data and event flow across
myriad cloud software and hardware resources, and diverse SLA
considerations. In particular, we propose and develop models that
capture in detail, and with great accuracy, the factors playing a
pivotal role in the performance of cloud-based TDT applications,
identify the ways in which these factors affect performance, and
determine the dependencies between the factors. Further, we have
developed models to predict the performance of cloud-based TDT
applications under the uncertainty imposed by Big Data
characteristics. The model developed in this thesis is intended to
be generic, allowing its application to other cloud-based data
analytics applications. We have demonstrated the feasibility,
efficiency, validity, and prediction accuracy of the proposed
models via experimental evaluations using a real-world flu
detection use case on the Apache Hadoop MapReduce, HDFS, and Mahout
frameworks.
Big Data Application and System Co-optimization in Cloud and HPC Environment
The emergence of big data requires powerful computational resources and memory subsystems that can be scaled efficiently to accommodate its demands. The cloud is a well-established computing paradigm that can offer customized computing and memory resources to meet the scalable demands of big data applications. In addition, the flexible pay-as-you-go pricing model offers opportunities for using resources at large scale with low cost and no infrastructure maintenance burden. High performance computing (HPC), on the other hand, also has powerful infrastructure with the potential to support big data applications. In this dissertation, we explore application and system co-optimization opportunities to support big data in both cloud and HPC environments.
Specifically, we explore the unique features of both application and system to seek overlooked optimization opportunities, or to tackle challenges that are difficult to address by looking at the application or the system individually. Based on the characteristics of the workloads and their underlying systems, and to derive optimized deployment and runtime schemes, we divide the workloads into four categories: 1) memory intensive applications; 2) compute intensive applications; 3) both memory and compute intensive applications; and 4) I/O intensive applications. When deploying memory intensive big data applications to the public clouds, one important yet challenging problem is selecting a specific instance type whose memory capacity is large enough to prevent out-of-memory errors while the cost is minimized without violating performance requirements. In this dissertation, we propose two techniques for efficient deployment of big data applications with dynamic and intensive memory footprints in the cloud. The first approach builds a performance-cost model that can accurately predict how, and by how much, virtual memory size would slow down the application and, consequently, impact the overall monetary cost. The second approach employs a lightweight memory usage prediction methodology based on dynamic meta-models adjusted by the application's own traits. The key idea is to eliminate periodic checkpointing and migrate the application only when the predicted memory usage exceeds the physical allocation. When moving compute intensive applications to the clouds, it is critical to make the applications scalable so that they can benefit from the massive cloud resources. In this dissertation, we first use Kirchhoff's law, one of the most widely used physical laws across engineering disciplines, as an example workload for our study.
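The checkpoint-free migration idea can be sketched in a few lines; the headroom value and the linear-trend meta-model below are illustrative assumptions, not the dissertation's actual predictor:

```python
def should_migrate(predicted_mb: float, physical_mb: float,
                   headroom: float = 0.1) -> bool:
    """Migrate only when the predicted footprint would exceed the
    physical allocation (minus a safety headroom), instead of
    paying for periodic checkpoints."""
    return predicted_mb > physical_mb * (1.0 - headroom)

def predict_next(usage_history_mb: list[float]) -> float:
    """Toy meta-model: extrapolate the most recent linear trend of
    observed memory usage one step ahead."""
    if len(usage_history_mb) < 2:
        return usage_history_mb[-1]
    slope = usage_history_mb[-1] - usage_history_mb[-2]
    return usage_history_mb[-1] + slope

# Predicted 1040 MB on a 1024 MB instance with 10% headroom -> migrate.
decision = should_migrate(predict_next([1000.0, 1020.0]), 1024.0)
```

The design point is that prediction replaces continuous checkpointing: the runtime acts only when the forecast crosses the allocation boundary.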
The key challenge of applying Kirchhoff's law to real-world applications at scale lies in the high, if not prohibitive, computational cost of solving a large number of nonlinear equations. In this dissertation, we propose a high-performance deep-learning-based approach for Kirchhoff analysis, namely HDK. HDK employs two techniques to improve performance: (i) early pruning of unqualified input candidates, which simplifies the equations and selects a meaningful input data range; and (ii) parallelization of forward labelling, which executes steps of the problem in parallel. When it comes to applications in clouds that are both memory and compute intensive, we use a blockchain system as a benchmark. Existing blockchain frameworks present a technical barrier for many users who want to modify or test out new research ideas in blockchains. To make matters worse, many advantages of blockchain systems can be demonstrated only at large scales, which are not always available to researchers. In this dissertation, we develop an accurate and efficient emulation system to replay the execution of large-scale blockchain systems on tens of thousands of nodes in the cloud. For I/O intensive applications, we observe one important yet often neglected side effect of lossy scientific data compression. Lossy compression techniques have demonstrated promising results in significantly reducing scientific data size while guaranteeing compression error bounds, but the compressed data sizes are often highly skewed, which degrades the performance of parallel I/O. Therefore, we believe it is critical to pay more attention to the unbalanced parallel I/O caused by lossy scientific data compression.
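The skew effect on parallel I/O is easy to quantify: when each rank writes one compressed block, the largest write gates the collective operation. A minimal sketch (the metric and the example sizes are illustrative, not from the dissertation):

```python
def io_imbalance(compressed_sizes: list[int]) -> float:
    """Imbalance factor of a parallel write where each rank writes one
    compressed block: max size / mean size. A value of 1.0 means
    perfectly balanced; larger values mean the slowest (largest)
    writer dominates the end-to-end I/O time."""
    mean = sum(compressed_sizes) / len(compressed_sizes)
    return max(compressed_sizes) / mean

# Uniform raw blocks can compress to highly skewed sizes, e.g. the
# blocks [10, 10, 10, 70] (MB) have imbalance 70 / 25 = 2.8.
factor = io_imbalance([10, 10, 10, 70])
```

Under this metric, equal-sized raw blocks give a factor near 1.0 before compression, which is why the post-compression skew is easy to overlook.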
HealthFog: An ensemble deep learning based Smart Healthcare System for Automatic Diagnosis of Heart Diseases in integrated IoT and fog computing environments
Cloud computing provides resources over the Internet and allows a plethora of
applications to be deployed to provide services for different industries. The
major bottleneck currently faced by these cloud frameworks is their
limited scalability and hence their inability to cater to the requirements of
centralized Internet of Things (IoT) based compute environments. The main
reason for this is that latency-sensitive applications like health monitoring
and surveillance systems now require computation over large amounts of data
(Big Data) transferred to a centralized database, and from the database to
cloud data centers, which leads to a drop in the performance of such systems.
The new paradigms of fog and edge computing provide innovative solutions by
bringing resources closer to the user, providing low-latency and
energy-efficient solutions for data processing compared to cloud domains.
Still, current fog models have many limitations and focus, from a limited
perspective, on either accuracy of results or reduced response time, but not
both. We propose a novel framework called HealthFog for integrating ensemble
deep learning into edge computing devices and deploy it for a real-life
application of automatic heart disease analysis. HealthFog delivers healthcare
as a fog service using IoT devices and efficiently manages the data of heart
patients, which arrives as user requests. The fog-enabled cloud framework
FogBus is used to deploy and test the performance of the proposed model in
terms of power consumption, network bandwidth, latency, jitter, accuracy, and
execution time. HealthFog is configurable to various operation modes that
provide the best Quality of Service or prediction accuracy, as required, in
diverse fog computation scenarios and for different user requirements.
End-to-End Trust Fulfillment of Big Data Workflow Provisioning over Competing Clouds
Cloud Computing has emerged as a promising and powerful paradigm for delivering data-intensive, high-performance computation, applications, and services over the Internet. Cloud Computing has enabled the implementation and success of Big Data, a relatively recent phenomenon consisting of the generation and analysis of abundant data from various sources. Accordingly, to satisfy the growing demands of Big Data storage, processing, and analytics, a large market of Cloud Service Providers has emerged, offering a myriad of resources, platforms, and infrastructures. The proliferation of these services often makes it difficult for consumers to select the most suitable and trustworthy provider to fulfill the requirements of building complex workflows and applications in a relatively short time.
In this thesis, we first propose a quality specification model to support dual pre- and post-cloud workflow provisioning, consisting of service provider selection and workflow quality enforcement and adaptation. This model captures key properties of the quality of work at different stages of the Big Data value chain, enabling standardized quality specification, monitoring, and adaptation.
Subsequently, we propose a two-dimensional trust-enabled framework to facilitate end-to-end Quality of Service (QoS) enforcement that: 1) automates cloud service provider selection for Big Data workflow processing, and 2) maintains the required QoS levels of Big Data workflows during runtime through dynamic orchestration using multi-model architecture-driven workflow monitoring, prediction, and adaptation.
The trust-based automatic service provider selection scheme we propose in this thesis is comprehensive and adaptive, as it relies on a dynamic trust model to evaluate the QoS of a cloud provider prior to taking any selection decisions. It is a multi-dimensional trust model for Big Data workflows over competing clouds that assesses the trustworthiness of cloud providers based on three trust levels: (1) presence of the most up-to-date cloud resource verified capabilities, (2) reputational evidence measured by neighboring users and (3) a recorded personal history of experiences with the cloud provider.
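A weighted aggregation over the three trust levels gives the flavor of such a multi-dimensional model; the weights and provider scores below are hypothetical illustrations, not the thesis's calibrated values:

```python
def trust_score(capability: float, reputation: float, history: float,
                weights: tuple[float, float, float] = (0.4, 0.3, 0.3)) -> float:
    """Aggregate the three trust levels (each normalized to [0, 1]):
    verified resource capabilities, neighbor-measured reputation, and
    the consumer's own recorded history with the provider."""
    w_cap, w_rep, w_hist = weights
    return w_cap * capability + w_rep * reputation + w_hist * history

# Rank competing providers by score and pick the most trustworthy.
providers = {"A": (0.9, 0.7, 0.8), "B": (0.6, 0.9, 0.9)}
best = max(providers, key=lambda p: trust_score(*providers[p]))
```

A dynamic model would re-weight or refresh these inputs over time, so a provider's score decays as its reputation evidence or verified capabilities go stale.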
The trust-based workflow orchestration scheme we propose aims to avoid performance degradation or cloud service interruption. Our workflow orchestration approach is not only based on automatic adaptation and reconfiguration supported by monitoring, but also on predicting cloud resource shortages, thus preventing performance degradation. We formalize the cloud resource orchestration process using a state machine that efficiently captures different dynamic properties of the cloud execution environment. In addition, we use a model checker to validate our monitoring model in terms of reachability, liveness, and safety properties.
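The orchestration state machine can be sketched as a transition table; the state and event names here are hypothetical, and the model-checking step (reachability, liveness, safety) is outside the scope of this sketch:

```python
# Hypothetical states/events for a cloud workflow orchestration
# state machine: running workflows scale out when a resource
# shortage is predicted, and reconfigure when a failure is detected.
TRANSITIONS = {
    ("RUNNING", "shortage_predicted"): "SCALING",
    ("RUNNING", "failure_detected"): "RECONFIGURING",
    ("SCALING", "resources_added"): "RUNNING",
    ("RECONFIGURING", "recovered"): "RUNNING",
}

def step(state: str, event: str) -> str:
    """Advance the orchestration state machine; events with no
    defined transition leave the state unchanged."""
    return TRANSITIONS.get((state, event), state)
```

Encoding the dynamics this way is what makes the model amenable to a model checker: the property "every SCALING state eventually returns to RUNNING" becomes a question about paths through this table.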
We evaluate both our automated service provider selection scheme and our cloud workflow orchestration, monitoring, and adaptation schemes on a workflow-enabled Big Data application. A set of scenarios was carefully chosen to evaluate the performance of the service provider selection, workflow monitoring, and adaptation schemes we have implemented. The results demonstrate that our service selection scheme outperforms other selection strategies and ensures trustworthy service provider selection. The results of evaluating automated workflow orchestration further show that our model is self-adapting and self-configuring, reacts efficiently to changes, and adapts accordingly while enforcing the QoS of workflows.
Simulation of the performance of complex data-intensive workflows
PhD Thesis
Recently, cloud computing has been used for analytical and data-intensive processes
as it offers many attractive features, including resource pooling, on-demand capability
and rapid elasticity. Scientific workflows use these features to tackle the problems of
complex data-intensive applications. Data-intensive workflows are composed of many
tasks that may involve large input data sets and produce large amounts of data as
output, and they typically run in highly dynamic environments. However, resources
should be allocated dynamically depending on the changing demands of the
workflow, as over-provisioning increases cost and under-provisioning causes Service
Level Agreement (SLA) violations and poor Quality of Service (QoS). Performance
prediction of complex workflows is a necessary step prior to the deployment of a
workflow. Performance analysis of complex data-intensive workflows is a challenging
task due to the complexity of their structure, the diversity of big data, and data
dependencies, in addition to the examination required of the performance and
challenges associated with running these workflows in a real cloud.
In this thesis, a solution is explored to address these challenges, using a Next
Generation Sequencing (NGS) workflow pipeline as a case study, which may require
hundreds or thousands of CPU hours to process a terabyte of data. We propose a
methodology to model, simulate, and predict the runtime and the number of resources
used by complex data-intensive workflows. One contribution of our simulation
methodology is that it provides the ability to extract the simulation parameters
(e.g., MIPS and BW values) required for constructing a training set, and a fairly
accurate prediction of the runtime for cluster sizes much larger than those used in
training the prediction model. The proposed methodology permits the derivation of
runtime predictions based on historical data from the provenance files. We present
the runtime prediction of the complex workflow considering different cases of its
running in the cloud, such as execution failure and library deployment time. In the
case of failure, the framework can apply the prediction partially, considering only
the successful parts of the pipeline; in the other case, the framework can predict
with or without considering the time to deploy libraries. To further improve the
accuracy of prediction, we propose a simulation model that handles I/O contention.
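As one hedged illustration of runtime prediction that extrapolates beyond the training cluster sizes, a least-squares fit of runtime = a/nodes + b (parallelizable work plus fixed overhead) can be trained on small clusters; this is a simplification, not the thesis's provenance-driven simulation model:

```python
def fit_runtime_model(nodes: list[int], runtimes: list[float]) -> tuple[float, float]:
    """Least-squares fit of runtime = a / nodes + b: `a` is the
    parallelizable work, `b` the fixed (serial + deployment) overhead."""
    xs = [1.0 / n for n in nodes]
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(runtimes) / count
    a = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, runtimes))
         / sum((x - mean_x) ** 2 for x in xs))
    b = mean_y - a * mean_x
    return a, b

def predict_runtime(a: float, b: float, nodes: int) -> float:
    """Extrapolate to a cluster size larger than any in training."""
    return a / nodes + b

# Train on 2/4/8-node runs, then predict a 16-node run.
a, b = fit_runtime_model([2, 4, 8], [520.0, 270.0, 145.0])
```

The example data follows runtime = 1000/n + 20 exactly, so the fit recovers a = 1000 and b = 20 and predicts 82.5 for 16 nodes; real provenance data would of course be noisier.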
Real-time performance diagnosis and evaluation of big data systems in cloud datacenters
PhD Thesis
Modern big data processing systems are becoming very complex in terms of large
scale, high concurrency, and multi-tenancy. Thus, many failures and performance
degradations only happen at run-time and are very difficult to capture. Moreover,
some issues may only be triggered when certain components are executed. To analyze
the root cause of these types of issues, we have to capture the dependencies of
each component in real-time.
Big data processing systems, such as Hadoop and Spark, usually work in large-scale,
highly-concurrent, and multi-tenant environments that can easily cause hardware and
software malfunctions or failures, thereby leading to performance degradation. Several systems and methods exist to detect big data processing systems’ performance
degradation, perform root-cause analysis, and even overcome the issues causing such
degradation. However, these solutions focus on specific problems such as stragglers and
inefficient resource utilization. There is a lack of a generic and extensible framework
to support the real-time diagnosis of big data systems.
Performance diagnosis and prediction of big data systems are highly complex, as
these frameworks are typically deployed in cloud data centers that are large-scale,
highly concurrent, and follow a multi-tenant model. Several factors, including
hardware heterogeneity, stochastic networks, and application workloads, may impact
the performance of big data systems. The current state of the art does not
sufficiently address the challenge of determining the complex, usually stochastic
and hidden, relationships between these factors.
To handle performance diagnosis and evaluation of big data systems in cloud
environments, this thesis proposes multilateral research towards monitoring,
performance diagnosis, and prediction in cloud-based large-scale distributed
systems, combined with an effective and efficient deployment pipeline.
The key contributions of this dissertation are listed below:
• Designing a real-time big data monitoring system called SmartMonit that
efficiently collects runtime system information, including computing resource
utilization and job execution information, and then integrates the collected
information with the Execution Graph, modeled as directed acyclic graphs (DAGs).
• Developing AutoDiagn, an automated real-time diagnosis framework for big data
systems that automatically detects performance degradation and inefficient
resource utilization problems, while providing online detection and semi-online
root-cause analysis for a big data system.
• Designing a novel root-cause analysis technique and system called BigPerf for
big data systems that analyzes and characterizes the performance of big data
applications by incorporating Bayesian networks to determine the uncertain and
complex relationships between performance-related factors.
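The Execution Graph idea, attaching runtime metrics to tasks in a DAG so that diagnosis can follow dependencies, can be sketched as follows (class and field names are hypothetical, not SmartMonit's actual API):

```python
from collections import defaultdict

class ExecutionGraph:
    """Toy DAG of tasks with runtime metrics attached, so a monitor
    can correlate resource usage with job structure."""

    def __init__(self):
        self.edges = defaultdict(list)   # task -> directly downstream tasks
        self.metrics = {}                # task -> collected runtime metrics

    def add_task(self, task: str, deps: list[str], **metrics):
        """Register a task, its upstream dependencies, and its metrics."""
        for dep in deps:
            self.edges[dep].append(task)
        self.metrics[task] = metrics

    def downstream(self, task: str) -> list[str]:
        """Tasks whose inputs depend directly on `task` -- the set a
        diagnosis engine would inspect when `task` degrades."""
        return self.edges[task]

g = ExecutionGraph()
g.add_task("map", deps=[], cpu=0.8)
g.add_task("reduce", deps=["map"], cpu=0.4)
```

Representing execution as a DAG is what lets a root-cause analyzer walk from a degraded task to the components it affects, rather than treating metrics as an unordered pool.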