Expedited Data Transfers for Serverless Clouds
Serverless computing has emerged as a popular cloud deployment paradigm. In
serverless, the developers implement their application as a set of chained
functions that form a workflow in which functions invoke each other. The cloud
providers are responsible for automatically scaling the number of instances for
each function on demand and forwarding the requests in a workflow to the
appropriate function instance. Problematically, today's serverless clouds lack
efficient support for cross-function data transfers in a workflow, preventing
the efficient execution of data-intensive serverless applications. In
production clouds, functions transmit intermediate, i.e., ephemeral, data to
other functions either as part of invocation HTTP requests (i.e., inline) or
via third-party services, such as AWS S3 storage or AWS ElastiCache in-memory
cache. The former approach is restricted to small transfer sizes, while the
latter supports arbitrary transfers but suffers from performance and cost
overheads. This work introduces Expedited Data Transfers (XDT), an
API-preserving high-performance data communication method for serverless that
enables direct function-to-function transfers. With XDT, a trusted component of
the sender function buffers the payload in its memory and sends a secure
reference to the receiver, which is picked by the load balancer and autoscaler
based on the current load. Using the reference, the receiver instance pulls the
transmitted data directly from the sender's memory. XDT is natively compatible
with existing autoscaling infrastructure, preserves function invocation
semantics, is secure, and avoids the cost and performance overheads of using an
intermediate service for data transfers. We prototype our system in
vHive/Knative deployed on a cluster of AWS EC2 nodes, showing that XDT improves
latency, bandwidth, and cost over AWS S3 and ElastiCache.
DFlow: Efficient Dataflow-based Invocation Workflow Execution for Function-as-a-Service
Serverless computing is becoming increasingly popular due to its ease of
use and fine-grained billing. These features make it appealing for stateful
applications and serverless workflows. However, current serverless workflow
systems use a controlflow-based invocation pattern to invoke functions. In
this execution pattern, a function invocation depends on the execution state
of its predecessors: a function can only begin executing once all its
precursor functions have completed. As a result, this pattern can lead to
longer end-to-end execution time. We design and implement DFlow, a novel
dataflow-based serverless workflow system that achieves high performance for
serverless workflows. DFlow introduces a distributed scheduler (DScheduler)
that uses a dataflow-based invocation pattern to invoke functions. In this
pattern, the function invocation depends on the data dependency between
functions. A function can start executing even if its precursor functions are
still running. DFlow further features a distributed store (DStore) that
utilizes effective fine-grained optimization techniques to eliminate function
interaction, thereby enabling efficient data exchange. With the support of
DScheduler and DStore, DFlow achieves average improvements in 99th-percentile
latency of 60% over CFlow, 40% over FaaSFlow, 25% over FaaSFlowRedis, and 40%
over KNIX. Further, it improves network bandwidth utilization by 2x-4x over
CFlow and 1.5x-3x over FaaSFlow, FaaSFlowRedis, and KNIX, respectively. DFlow
also effectively reduces cold-start latency, achieving an average improvement
of 5.6x over CFlow and 1.1x over FaaSFlow.
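The difference between control-flow and dataflow invocation described above can be sketched with a toy scheduler: a node fires as soon as every one of its declared inputs has a value, rather than waiting for an explicit completion signal from each upstream function. The `DataflowGraph` class and node names below are illustrative assumptions, not DFlow's actual DScheduler API.

```python
# Toy dataflow scheduler: each node declares which named values it
# consumes; a node fires the moment all of those values exist, with no
# separate control-flow "predecessor finished" signalling step.
class DataflowGraph:
    def __init__(self):
        self.nodes = {}    # name -> (fn, tuple of input names)
        self.values = {}   # produced data, keyed by name

    def add(self, name, fn, inputs=()):
        self.nodes[name] = (fn, tuple(inputs))

    def put(self, name, value):
        """Record a value, then fire any node whose inputs are now complete."""
        self.values[name] = value
        for node, (fn, inputs) in list(self.nodes.items()):
            if node not in self.values and all(i in self.values for i in inputs):
                self.put(node, fn(*(self.values[i] for i in inputs)))

g = DataflowGraph()
g.add("double", lambda x: 2 * x, inputs=["src"])
g.add("square", lambda x: x * x, inputs=["src"])
g.add("total", lambda a, b: a + b, inputs=["double", "square"])
g.put("src", 3)                    # one data arrival drives the whole graph
assert g.values["total"] == 15     # 2*3 + 3*3
```

In a real deployment the same firing rule lets a downstream function begin as soon as the specific data it needs is available, even while other upstream functions in the workflow are still running.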
GeoGauss: Strongly Consistent and Light-Coordinated OLTP for Geo-Replicated SQL Database
Multinational enterprises conduct global business that has a demand for
geo-distributed transactional databases. Existing state-of-the-art databases
adopt a sharded master-follower replication architecture. However, the
single-master serving mode incurs massive cross-region writes from clients, and
the sharded architecture requires multiple round-trip acknowledgments (e.g.,
2PC) to ensure atomicity for cross-shard transactions. These limitations drive
us to seek a different design choice. In this paper, we propose GeoGauss, a
strongly consistent OLTP database with a full-replica multi-master
architecture. To efficiently merge the updates from different master nodes, we
propose a multi-master OCC that unifies data replication and concurrent
transaction processing. By leveraging an epoch-based delta state merge rule
and optimistic asynchronous execution, GeoGauss ensures strong consistency
with a lightly coordinated protocol and allows more concurrency under weak
isolation, which is sufficient for the targeted workloads. Our geo-distributed
experimental
results show that GeoGauss achieves 7.06X higher throughput and 17.41X lower
latency than the state-of-the-art geo-distributed database CockroachDB on the
TPC-C benchmark.
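The epoch-based delta merge mentioned above can be sketched as follows. This is an assumed reading of the general technique, not the GeoGauss protocol itself: each master batches the writes it committed during an epoch, the batches are exchanged, and every full replica applies the union with a deterministic last-writer-wins rule, so replicas converge without per-transaction cross-region coordination. The function name, tuple layout, and region ids are all illustrative.

```python
# Merge one epoch's deltas from all masters into the local replica state.
# Each delta is (key, commit_ts, value); ties on commit_ts are broken by
# master id so every replica picks the same winner deterministically.
def merge_epoch(state, deltas_per_master):
    winners = {}
    for master_id, deltas in deltas_per_master.items():
        for key, commit_ts, value in deltas:
            candidate = (commit_ts, master_id, value)
            if key not in winners or candidate[:2] > winners[key][:2]:
                winners[key] = candidate
    for key, (_, _, value) in winners.items():
        state[key] = value
    return state

state = {"x": 0}
epoch_deltas = {
    "us-east": [("x", 10, 1), ("y", 11, 7)],
    "eu-west": [("x", 12, 2)],   # later commit timestamp wins for "x"
}
assert merge_epoch(state, epoch_deltas) == {"x": 2, "y": 7}
```

Because the merge rule is a pure function of the epoch's delta set, masters can execute optimistically and asynchronously during the epoch and still agree on the post-epoch state.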
Cost and Latency Optimized Edge Computing Platform
Latency-critical applications, e.g., automated and assisted driving services, can now be deployed in fog or edge computing environments, offloading energy-consuming tasks from end devices. Besides the proximity, though, the edge computing platform must provide the necessary operation techniques in order to avoid added delays by all means. In this paper, we propose an integrated edge platform that comprises orchestration methods with such objectives, in terms of handling the deployment of both functions and data. We show how the integration of the function orchestration solution with the adaptive data placement of a distributed key–value store can lead to decreased end-to-end latency even when the mobility of end devices creates a dynamic set of requirements. Along with the necessary monitoring features, the proposed edge platform is capable
of serving the nomad users of novel applications with low latency requirements. We showcase this capability in several scenarios, in which we articulate the end-to-end latency performance of our platform by comparing delay measurements with the benchmark of a Redis-based setup lacking the
adaptive nature of data orchestration. Our results show that the stringent delay requirements necessitate
the close integration that we present in this paper: functions and data must be orchestrated in sync in
order to fully exploit the potential that the proximity of edge resources enables.
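The "functions and data orchestrated in sync" argument above can be made concrete with a toy placement rule: choose the node that minimizes the user's total path latency, counting both the hop to the function and the hop from the function to wherever the key-value store currently holds the data. Node names and latency figures below are invented for illustration and are not the paper's orchestration algorithm.

```python
# Toy co-placement: pick the node minimizing
#   user->function latency + function->data latency.
def place(user_latency, data_node, inter_node_latency):
    """Return the best node to run the function on, given data location."""
    return min(
        user_latency,
        key=lambda n: user_latency[n] + inter_node_latency[n][data_node],
    )

user_latency = {"edge-a": 5, "edge-b": 6, "cloud": 40}   # ms from the user
inter = {
    "edge-a": {"edge-a": 0, "edge-b": 4, "cloud": 30},
    "edge-b": {"edge-a": 4, "edge-b": 0, "cloud": 30},
    "cloud":  {"edge-a": 30, "edge-b": 30, "cloud": 0},
}

# With the data on edge-b, running there wins (6+0 < 5+4); if the adaptive
# store migrates the data to edge-a as the user moves, the choice flips.
assert place(user_latency, "edge-b", inter) == "edge-b"
assert place(user_latency, "edge-a", inter) == "edge-a"
```

The flip in the last two lines is the whole point: a function orchestrator that ignores data placement (or a data store that ignores function placement) would leave one of the two hops paying a cross-node penalty.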
FSRM-STS: Cross-dataset pedestrian retrieval based on a four-stage retrieval model with Selection–Translation–Selection
Pedestrian retrieval is widely used in intelligent video surveillance and is closely related to people’s lives. Although pedestrian retrieval from a single dataset has improved in recent years, obstacles such as a lack of sample data and domain gaps within and between datasets (arising from factors such as variation in lighting conditions, resolution, season, and background) reduce the generalizability of existing models. Factors such as these can act as barriers to the practical use of this technology. Cross-dataset learning is a way to obtain high-quality images from source datasets that can assist the learning of target datasets, thus helping to address the above problem. Existing studies of cross-dataset learning directly apply translated images from source datasets to target datasets, and seldom consider systematic strategies for further improving the quality of the translated images. There is therefore room for improvement in cross-dataset learning. This paper proposes a four-stage retrieval model based on Selection–Translation–Selection (FSRM-STS) to help address this problem. In the first stage of the model, images in pedestrian retrieval datasets are semantically segmented to provide information for image translation. In the second stage, STS is proposed, based on four strategies, to obtain high-quality translation results from all source datasets and to generate auxiliary datasets. In the third stage, a pedestrian feature extraction model is proposed, based on both the auxiliary and target datasets. This converts each image in the target datasets into an n-dimensional vector. In the final stage, the extracted image vectors are used for cross-dataset pedestrian retrieval. As the translation quality is improved, FSRM-STS achieves promising results for cross-dataset pedestrian retrieval. Experimental results on four benchmark datasets, Market-1501, DukeMTMC-reID, CUHK03 and VIPeR, show the effectiveness of the proposed model.
Finally, the use of parallel computing to accelerate training and to enable online applications is also discussed. A preliminary demo based on cloud computing is designed to verify the engineering solution in future work.
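The final retrieval stage described above, matching a query against gallery images once each image has been converted to an n-dimensional feature vector, reduces to nearest-neighbour ranking by similarity. The sketch below uses cosine similarity on made-up toy vectors; the gallery ids and vector values are illustrative assumptions, not data or code from the paper.

```python
import math

# Rank gallery identities by cosine similarity to a query feature vector,
# i.e. the last stage of a feature-based retrieval pipeline.
def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def retrieve(query, gallery):
    """Return gallery ids sorted by descending similarity to the query."""
    return sorted(gallery, key=lambda gid: cosine(query, gallery[gid]),
                  reverse=True)

gallery = {
    "person_a": [0.9, 0.1, 0.0],
    "person_b": [0.1, 0.9, 0.1],
    "person_c": [0.8, 0.2, 0.1],
}
ranking = retrieve([1.0, 0.1, 0.0], gallery)
assert ranking[0] == "person_a"
```

Everything upstream of this stage (segmentation, selection, translation, feature extraction) exists to make these vectors comparable across datasets; the ranking step itself stays this simple.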