A Time-driven Data Placement Strategy for a Scientific Workflow Combining Edge Computing and Cloud Computing
Compared to traditional distributed computing environments such as grids,
cloud computing provides a more cost-effective way to deploy scientific
workflows. Each task of a scientific workflow requires several large datasets
that are located in different datacenters of the cloud computing environment,
resulting in serious data transmission delays. Edge computing reduces these
delays and supports fixed storage of a scientific workflow's private
datasets, but its storage capacity is a bottleneck.
It is a challenge to combine the advantages of both edge computing and cloud
computing to rationalize the data placement of a scientific workflow and to
optimize the data transmission time across different datacenters. Traditional
data placement strategies maintain load balancing with a given number of
datacenters, which results in a large data transmission time. In this study, a
self-adaptive discrete particle swarm optimization algorithm with genetic
algorithm operators (GA-DPSO) was proposed to optimize the data transmission
time when placing data for a scientific workflow. This approach considered the
characteristics of data placement combining edge computing and cloud computing.
In addition, it considered the factors affecting transmission delay,
such as the bandwidth between datacenters, the number of edge datacenters, and
the storage capacity of edge datacenters. The crossover operator and mutation
operator of the genetic algorithm were adopted to avoid the premature
convergence of the traditional particle swarm optimization algorithm, which
enhanced the diversity of population evolution and effectively reduced the data
transmission time. The experimental results show that the data placement
strategy based on GA-DPSO can effectively reduce the data transmission time
during workflow execution in a combined edge and cloud computing environment.
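
The idea of a discrete particle swarm whose update step uses genetic operators can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the encoding, the cost model, or the update rule, so everything below is an assumption made for illustration. A placement is assumed to be a list mapping each dataset index to a datacenter index, the cost is assumed to be dataset size divided by link bandwidth for every dataset a task must fetch from another datacenter, and all function and parameter names (`transmission_time`, `feasible`, `mutate`, `crossover`, `ga_dpso`) are hypothetical. The self-adaptive parameter tuning and the fixed placement of private datasets are omitted.

```python
import random


def transmission_time(placement, tasks, sizes, bandwidth, task_site):
    """Assumed cost model: sum of size / bandwidth over every dataset that a
    task has to fetch from a datacenter other than the one it runs in."""
    total = 0.0
    for task, inputs in tasks.items():
        dst = task_site[task]
        for d in inputs:
            src = placement[d]
            if src != dst:
                total += sizes[d] / bandwidth[src][dst]
    return total


def feasible(placement, sizes, capacity):
    """True if no datacenter stores more than its capacity (edge limit)."""
    used = [0.0] * len(capacity)
    for d, dc in enumerate(placement):
        used[dc] += sizes[d]
    return all(u <= c for u, c in zip(used, capacity))


def mutate(particle, n_dc, rng):
    """Mutation: reassign one randomly chosen dataset to a random datacenter."""
    child = list(particle)
    child[rng.randrange(len(child))] = rng.randrange(n_dc)
    return child


def crossover(particle, guide, rng):
    """Two-point crossover: copy one segment of the guide into the particle."""
    i, j = sorted(rng.sample(range(len(particle) + 1), 2))
    return particle[:i] + guide[i:j] + particle[j:]


def ga_dpso(tasks, sizes, bandwidth, task_site, capacity,
            swarm=20, iters=100, seed=0):
    """Toy discrete PSO: mutation stands in for the inertia term, crossover
    with the personal and global bests for the cognitive and social terms."""
    rng = random.Random(seed)
    n_dc, n_data = len(capacity), len(sizes)

    def cost(p):
        # Infeasible placements get infinite cost so they are never kept.
        if not feasible(p, sizes, capacity):
            return float("inf")
        return transmission_time(p, tasks, sizes, bandwidth, task_site)

    particles = [[rng.randrange(n_dc) for _ in range(n_data)]
                 for _ in range(swarm)]
    pbest = [list(p) for p in particles]
    gbest = list(min(pbest, key=cost))
    for _ in range(iters):
        for k, p in enumerate(particles):
            p = mutate(p, n_dc, rng)
            p = crossover(p, pbest[k], rng)
            p = crossover(p, gbest, rng)
            particles[k] = p
            if cost(p) < cost(pbest[k]):
                pbest[k] = list(p)
        gbest = list(min(pbest + [gbest], key=cost))
    return gbest, cost(gbest)
```

In this sketch a cloud datacenter can be modeled by giving it a capacity of `float("inf")`, while an edge datacenter gets a finite capacity. Replacing the velocity update with mutation and crossover keeps every particle a valid discrete assignment, which is one common way to adapt PSO to placement problems.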