BetterLife 2.0: large-scale social intelligence reasoning on cloud
This paper presents the design of the BetterLife 2.0 framework, which facilitates the implementation of large-scale social intelligence applications in a cloud environment. We argued that more and more mobile social applications in pervasive computing need to be implemented this way, given the large volume of user-generated activity on social networking websites. We adopted the Case-based Reasoning (CBR) technique to provide logical reasoning and outlined design considerations for porting a typical CBR framework, jCOLIBRI2, to the cloud using Hadoop's various services (HDFS, HBase). These services allow efficient case base management (e.g. case insertion) and distribution of computationally intensive jobs, speeding up the reasoning process more than 5 times. With the scalability of MapReduce, we can improve the recommendation service with social network analysis that needs to handle millions of users' social activities. © 2010 IEEE. The 2nd IEEE International Conference on Cloud Computing Technology and Science (CloudCom 2010), Indianapolis, IN., 30 November-3 December 2010. In Proceedings of the 2nd CloudCom, 2010, p. 529-53
BetterLife 2.0: Large-scale Social Intelligence Reasoning on Cloud
Abstract: This paper presents the design of the BetterLife 2.0 framework, which facilitates the implementation of large-scale social intelligence applications in a cloud environment. We argued that more and more mobile social applications in pervasive computing need to be implemented this way, given the large volume of user-generated activity on social networking websites. We adopted the Case-based Reasoning (CBR) technique to provide logical reasoning and outlined design considerations for porting a typical CBR framework, jCOLIBRI2, to the cloud using Hadoop’s various services (HDFS, HBase). These services allow efficient case base management (e.g. case insertion) and distribution of computationally intensive jobs, speeding up the reasoning process more than 5 times. With the scalability of MapReduce, we can improve the recommendation service with social network analysis that needs to handle millions of users’ social activities.
Simulation of the performance of complex data-intensive workflows
PhD Thesis. Recently, cloud computing has been used for analytical and data-intensive processes
as it offers many attractive features, including resource pooling, on-demand capability
and rapid elasticity. Scientific workflows use these features to tackle the problems of
complex data-intensive applications. Data-intensive workflows are composed of many
tasks that may involve large input data sets and produce large amounts of data as
output, and they typically run in highly dynamic environments. However, resources
should be allocated dynamically in response to changes in workflow demand,
as over-provisioning increases cost and under-provisioning causes Service Level
Agreement (SLA) violations and poor Quality of Service (QoS). Performance prediction
of complex workflows is a necessary step prior to the deployment of the workflow.
Performance analysis of complex data-intensive workflows is a challenging task due
to the complexity of their structure, the diversity of big data, and data dependencies, as
well as the need to examine the performance of, and the challenges associated
with, running these workflows in a real cloud.
In this thesis, a solution is explored to address these challenges, using a Next Generation
Sequencing (NGS) workflow pipeline as a case study, which may require hundreds or
thousands of CPU hours to process a terabyte of data. We propose a methodology to
model, simulate and predict runtime and the number of resources used by the complex
data-intensive workflows. One contribution of our simulation methodology is that it
can extract the simulation parameters (e.g., MIPS and BW values)
required to construct a training set, and it gives a fairly accurate prediction of
the run time for a given input on cluster sizes much larger than those used to train the
prediction model. The proposed methodology permits the derivation of run time predictions
based on historical data from the provenance files. We present the run time
prediction of the complex workflow under different conditions of its execution in the
cloud, such as execution failure and library deployment time. In case of failure, the
framework can apply the prediction partially, considering only the successful parts of
the pipeline; in the other case, the framework can predict with or without considering
the time to deploy libraries. To further improve the accuracy of prediction, we propose
a simulation model that handles I/O contention