4 research outputs found
ComFlux: External Composition and Adaptation of Pervasive Applications
Technology is becoming increasingly pervasive. At present, the system
components working together to provide functionality, be they purely software
or with a physical element, tend to operate within silos, bound to a particular
application or usage.
This is counter to the wider vision of pervasive computing, where a
potentially limitless number of applications can be realised through the
dynamic and seamless interactions of system components. We believe this
application composition should be externally controlled, driven by policy and
subject to access control. We present ComFlux, our open source middleware, and
show through a number of designs and implementations, how it supports this
functionality with acceptable overhead.
Enabling efficient application monitoring in cloud data centers using SDN
Software Defined Networking (SDN) not only enables agility through the
realization of part of the network functionality in software but also
facilitates offering advanced features at the network layer. Hence, SDN can
support a wide range of middleware services; network performance monitoring is
an example of these services that are already deployed in practice. In this
paper, we exploit the use of SDNs to efficiently provide application monitoring
functionality. The recent rise of complex cloud applications has made
performance monitoring a major issue. We show that many performance indicators
can be inferred from messages exchanged among application components. By
analyzing these messages, we argue that the overhead of performance monitoring
could be effectively moved from the end hosts into the SDN middleware of the
cloud infrastructure which enables more flexible placement of logging
functionality. This paper explores several approaches for supporting
application monitoring through SDN. In particular, we combine selective
forwarding in SDN to enable message filtering and reformatting, and propose a
customized port sniffing technique. We describe the implementation of the
approach within the standard SDN software, namely Open vSwitch (OVS). We further provide a
comprehensive performance evaluation to analyze advantages and disadvantages of
our approach, and highlight the trade-offs.
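As an illustration of how a performance indicator can be inferred from messages exchanged among application components, the following minimal Python sketch derives per-request latency from a stream of observed messages. The `Message` record and its fields (`timestamp`, `request_id`, `kind`) are hypothetical and are not taken from the paper or from OVS; the sketch only assumes that requests and responses can be matched by an identifier.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class Message:
    """Hypothetical record for one observed application message."""
    timestamp: float   # observation time in seconds
    request_id: str    # identifier shared by a request and its response
    kind: str          # "request" or "response"


def request_latencies(messages: Iterable[Message]) -> Dict[str, float]:
    """Match each response to its request and return latency per request_id."""
    pending: Dict[str, float] = {}
    latencies: Dict[str, float] = {}
    for m in sorted(messages, key=lambda msg: msg.timestamp):
        if m.kind == "request":
            # Keep the first time a request with this id was seen.
            pending.setdefault(m.request_id, m.timestamp)
        elif m.kind == "response" and m.request_id in pending:
            latencies[m.request_id] = m.timestamp - pending.pop(m.request_id)
    return latencies


observed = [
    Message(1.0, "a", "request"),
    Message(1.5, "b", "request"),
    Message(1.25, "a", "response"),
    Message(2.0, "b", "response"),
    Message(3.0, "c", "response"),  # no matching request: ignored
]
latency_by_request = request_latencies(observed)
```

Because the computation needs only the messages themselves, it could in principle run wherever the messages are visible, which is the property the abstract relies on when moving monitoring off the end hosts.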
Effective Elastic Scaling of Deep Learning Workloads
The increased use of deep learning (DL) in academia, government and industry
has, in turn, led to the popularity of on-premise and cloud-hosted deep
learning platforms, whose goals are to enable organizations to utilize expensive
resources effectively, and to share said resources among multiple teams in a
fair and effective manner.
In this paper, we examine the elastic scaling of Deep Learning (DL) jobs over
large-scale training platforms and propose a novel resource allocation strategy
for DL training jobs, resulting in improved job run time performance as well as
increased cluster utilization. We begin by analyzing DL workloads and exploit
the fact that DL jobs can be run with a range of batch sizes without affecting
their final accuracy. We formulate an optimization problem that explores a
dynamic batch size allocation to individual DL jobs based on their scaling
efficiency, when running on multiple nodes. We design a fast dynamic
programming based optimizer to solve this problem in real-time to determine
jobs that can be scaled up/down, and use this optimizer in an autoscaler to
dynamically change the allocated resources and batch sizes of individual DL
jobs.
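The allocation step can be illustrated with a simplified sketch. It assumes each job comes with an estimated throughput for every GPU count it may receive (a stand-in for the scaling-efficiency measurements mentioned above) and uses a grouped-knapsack dynamic program to split a fixed number of GPUs among jobs. The function name `allocate_gpus`, the throughput table, and the objective (total throughput) are assumptions made for illustration; the paper's actual formulation, which also selects batch sizes, is not reproduced here.

```python
from typing import List, Tuple


def allocate_gpus(throughput: List[List[float]],
                  total_gpus: int) -> Tuple[float, List[int]]:
    """Split total_gpus among jobs to maximize summed throughput.

    throughput[j][g] is the (assumed) throughput of job j on g GPUs,
    for g = 0 .. len(throughput[j]) - 1.
    """
    n = len(throughput)
    neg_inf = float("-inf")
    # best[j][c]: best total throughput of the first j jobs using at most c GPUs.
    best = [[0.0] * (total_gpus + 1)]
    best += [[neg_inf] * (total_gpus + 1) for _ in range(n)]
    # choice[j][c]: GPUs given to job j in the optimum for capacity c.
    choice = [[0] * (total_gpus + 1) for _ in range(n)]

    for j in range(1, n + 1):
        options = throughput[j - 1]
        for c in range(total_gpus + 1):
            for g in range(min(c, len(options) - 1) + 1):
                value = best[j - 1][c - g] + options[g]
                if value > best[j][c]:
                    best[j][c] = value
                    choice[j - 1][c] = g

    # Walk back through the table to recover the per-job allocation.
    allocation = [0] * n
    c = total_gpus
    for j in range(n - 1, -1, -1):
        allocation[j] = choice[j][c]
        c -= allocation[j]
    return best[n][total_gpus], allocation


# Two jobs with diminishing returns, four GPUs to share.
table = [
    [0.0, 10.0, 18.0, 24.0],  # job 0 on 0..3 GPUs
    [0.0, 8.0, 15.0, 16.0],   # job 1 on 0..3 GPUs
]
total, allocation = allocate_gpus(table, 4)
```

The table has one row per job and one column per GPU budget, so the running time grows with the number of jobs times the square of the GPU count, which is small enough to re-run whenever jobs arrive or finish.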
We demonstrate empirically that our elastic scaling algorithm can complete up
to as many jobs as compared to a strong baseline algorithm
that also scales the number of GPUs but does not change the batch size. We also
demonstrate that the average completion time with our algorithm is up to
faster than that of the baseline.
Authenticated Key-Value Stores with Hardware Enclaves
Authenticated data storage on an untrusted platform is an important computing
paradigm for cloud applications ranging from big-data outsourcing to
cryptocurrencies and certificate transparency logs. These modern applications
increasingly feature update-intensive workloads, whereas existing authenticated
data structures (ADSs) designed with in-place updates handle such workloads
inefficiently. In this paper, we address this issue and propose a novel
authenticated log-structured merge tree (eLSM) based key-value store by
leveraging Intel SGX enclaves.
We present a system design that runs the code of the eLSM store inside an enclave.
To circumvent the limited enclave memory (128 MB with the latest Intel CPUs),
we propose to place the memory buffer of the eLSM store outside the enclave and
protect the buffer using a new authenticated data structure by digesting
individual LSM-tree levels. We design protocols to support query authentication
for data integrity, completeness (under range queries), and freshness. The proof
in our protocol is kept small by including Merkle proofs only at selected
levels.
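The idea of digesting one level and proving membership against that digest can be sketched with a generic Merkle tree. This is a minimal illustration built on SHA-256 from Python's standard library, not the eLSM construction itself: the record encoding, the helper names (`build_levels`, `prove`, `verify`), and the padding rule (duplicating the last node of an odd-sized level) are all assumptions, and the input is assumed to be non-empty.

```python
import hashlib
from typing import List, Tuple


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_levels(records: List[bytes]) -> List[List[bytes]]:
    """Return all tree levels, leaf hashes first and the root level last."""
    level = [_h(b"\x00" + r) for r in records]  # 0x00 prefix marks leaves
    levels = [level]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level = level + [level[-1]]  # pad odd levels by duplicating the last node
            levels[-1] = level
        level = [_h(b"\x01" + level[i] + level[i + 1])  # 0x01 prefix marks inner nodes
                 for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def prove(levels: List[List[bytes]], index: int) -> List[Tuple[bytes, bool]]:
    """Collect (sibling_hash, sibling_is_left) pairs from leaf to root."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1
        proof.append((level[sibling], sibling < index))
        index //= 2
    return proof


def verify(record: bytes, proof: List[Tuple[bytes, bool]], root: bytes) -> bool:
    """Recompute the root from a record and its proof, then compare."""
    node = _h(b"\x00" + record)
    for sibling, sibling_is_left in proof:
        pair = sibling + node if sibling_is_left else node + sibling
        node = _h(b"\x01" + pair)
    return node == root


# Sorted key-value records of one (hypothetical) LSM-tree level.
records = [b"k1=v1", b"k2=v2", b"k3=v3"]
levels = build_levels(records)
root = levels[-1][0]
proof_k3 = prove(levels, 2)
```

Only the root needs to be kept in trusted memory; a verifier holding it can check any record returned from untrusted storage with a proof whose length is logarithmic in the size of the level.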
We implement eLSM on top of Google LevelDB and Facebook RocksDB with minimal
code change and performance interference. We evaluate the performance of eLSM
under the YCSB workload benchmark and show a performance advantage of up to
4.5X speedup.
Comment: eLSM, Enclave, key-value store, ADS, 18 pages