372,966 research outputs found
Towards An Efficient Cloud Computing System: Data Management, Resource Allocation and Job Scheduling
Cloud computing is an emerging technology in distributed computing, and it has proved to be an effective infrastructure to provide services to users. Cloud is developing day by day and faces many challenges. One of challenges is to build cost-effective data management system that can ensure high data availability while maintaining consistency. Another challenge in cloud is efficient resource allocation which ensures high resource utilization and high SLO availability. Scheduling, referring to a set of policies to control the order of the work to be performed by a computer system, for high throughput is another challenge. In this dissertation, we study how to manage data and improve data availability while reducing cost (i.e., consistency maintenance cost and storage cost); how to efficiently manage the resource for processing jobs and increase the resource utilization with high SLO availability; how to design an efficient scheduling algorithm which provides high throughput, low overhead while satisfying the demands on completion time of jobs.
Replication is a common approach to enhance data availability in cloud storage systems. Previously proposed replication schemes cannot effectively handle both correlated and non-correlated machine failures while increasing the data availability with the limited resource. The schemes for correlated machine failures must create a constant number of replicas for each data object, which neglects diverse data popularities and cannot utilize the resource to maximize the expected data availability. Also, the previous schemes neglect the consistency maintenance cost and the storage cost caused by replication. It is critical for cloud providers to maximize data availability hence minimize SLA (Service Level Agreement) violations while minimize cost caused by replication in order to maximize the revenue. In this dissertation, we build a nonlinear programming model to maximize data availability in both types of failures and minimize the cost caused by replication. Based on the model\u27s solution for the replication degree of each data object, we propose a low-cost multi-failure resilient replication scheme (MRR). MRR can effectively handle both correlated and non-correlated machine failures, considers data popularities to enhance data availability, and also tries to minimize consistency maintenance and storage cost.
In current cloud, providers still need to reserve resources to allow users to scale on demand. The capacity offered by cloud offerings is in the form of pre-defined virtual machine (VM) configurations. This incurs resource wastage and results in low resource utilization when the users actually consume much less resource than the VM capacity. Existing works either reallocate the unused resources with no Service Level Objectives (SLOs) for availability\footnote{Availability refers to the probability of an allocated resource being remain operational and accessible during the validity of the contract~\cite{CarvalhoCirne14}.} or consider SLOs to reallocate the unused resources for long-running service jobs. This approach increases the allocated resource whenever it detects that SLO is violated in order to achieve SLO in the long term, neglecting the frequent fluctuations of jobs\u27 resource requirements in real-time application especially for short-term jobs that require fast responses and decision making for resource allocation. Thus, this approach cannot fully utilize the resources to process data because they cannot quickly adjust the resource allocation strategy dealing with the fluctuations of jobs\u27 resource requirements. What\u27s more, the previous opportunistic based resource allocation approach aims at providing long-term availability SLOs with good QoS for long-running jobs, which ensures that the jobs can be finished within weeks or months by providing slighted degraded resources with moderate availability guarantees, but it ignores deadline constraints in defining Quality of Service (QoS) for short-lived jobs requiring online responses in real-time application, thus it cannot truly guarantee the QoS and long-term availability SLOs. To overcome the drawbacks of previous works, we adequately consider the fluctuations of unused resource caused by bursts of jobs\u27 resource demands, and present a cooperative opportunistic resource provisioning (CORP) scheme to dynamically allocate the resource to jobs. CORP leverages complementarity of jobs\u27 requirements on different resource types and utilizes the job packing to reduce the resource wastage and increase the resource utilization.
An increasing number of large-scale data analytics frameworks move towards larger degrees of parallelism aiming at high throughput. Scheduling that assigns tasks to workers and preemption that suspends low-priority tasks and runs high-priority tasks are two important functions in such frameworks. There are many existing works on scheduling and preemption in literature to provide high throughput. However, previous works do not substantially consider dependency in increasing throughput in scheduling or preemption. Considering dependency is crucial to increase the overall throughput. Besides, extensive task evictions for preemption increase context switches, which may decrease the throughput. To address the above problems, we propose an efficient scheduling system Dependency-aware Scheduling and Preemption (DSP) to achieve high throughput in scheduling and preemption. First, we build a mathematical model to minimize the makespan with the consideration of task dependency, and derive the target workers for tasks which can minimize the makespan; second, we utilize task dependency information to determine tasks\u27 priorities for preemption; finally, we present a probabilistic based preemption to reduce the numerous preemptions, while satisfying the demands on completion time of jobs.
We conduct trace driven simulations on a real-cluster and real-world experiments on Amazon S3/EC2 to demonstrate the efficiency and effectiveness of our proposed system in comparison with other systems. The experimental results show the superior performance of our proposed system. In the future, we will further consider data update frequency to reduce consistency maintenance cost, and we will consider the effects of node joining and node leaving. Also we will consider energy consumption of machines and design an optimal replication scheme to improve data availability while saving power. For resource allocation, we will consider using the greedy approach for deep learning to reduce the computation overhead caused by the deep neural network. Also, we will additionally consider the heterogeneity of jobs (i.e., short jobs and long jobs), and use a hybrid resource allocation strategy to provide SLO availability customization for different job types while increasing the resource utilization. For scheduling, we will aim to handle scheduling tasks with partial dependency, worker failures in scheduling and make our DSP fully distributed to increase its scalability. Finally, we plan to use different workloads and real-world experiment to fully test the performance of our methods and make our preliminary system design more mature
Trade-off between end-to-end reliable and cost-effective TDMA/WDM passive optical networks
Hybrid TDMA/VVDM (TWDM) Passive Optical Network (PON) is a promising candidate for Next-Generation PON (NG-PON) solutions. We propose end-to end reliable architectures for business users and a cost-effective network for residential users. We evaluate the proposed reliable architectures in terms of protection coverage, connection availability, impact of failure (i.e. to avoid a huge number of end users being affected by any single failure) and cost in different populated scenarios
When the Hammer Meets the Nail: Multi-Server PIR for Database-Driven CRN with Location Privacy Assurance
We show that it is possible to achieve information theoretic location privacy
for secondary users (SUs) in database-driven cognitive radio networks (CRNs)
with an end-to-end delay less than a second, which is significantly better than
that of the existing alternatives offering only a computational privacy. This
is achieved based on a keen observation that, by the requirement of Federal
Communications Commission (FCC), all certified spectrum databases synchronize
their records. Hence, the same copy of spectrum database is available through
multiple (distinct) providers. We harness the synergy between multi-server
private information retrieval (PIR) and database- driven CRN architecture to
offer an optimal level of privacy with high efficiency by exploiting this
observation. We demonstrated, analytically and experimentally with deployments
on actual cloud systems that, our adaptations of multi-server PIR outperform
that of the (currently) fastest single-server PIR by a magnitude of times with
information theoretic security, collusion resiliency, and fault-tolerance
features. Our analysis indicates that multi-server PIR is an ideal
cryptographic tool to provide location privacy in database-driven CRNs, in
which the requirement of replicated databases is a natural part of the system
architecture, and therefore SUs can enjoy all advantages of multi-server PIR
without any additional architectural and deployment costs.Comment: 10 pages, double colum
Meta Learning MPC using Finite-Dimensional Gaussian Process Approximations
Data availability has dramatically increased in recent years, driving
model-based control methods to exploit learning techniques for improving the
system description, and thus control performance. Two key factors that hinder
the practical applicability of learning methods in control are their high
computational complexity and limited generalization capabilities to unseen
conditions. Meta-learning is a powerful tool that enables efficient learning
across a finite set of related tasks, easing adaptation to new unseen tasks.
This paper makes use of a meta-learning approach for adaptive model predictive
control, by learning a system model that leverages data from previous related
tasks, while enabling fast fine-tuning to the current task during closed-loop
operation. The dynamics is modeled via Gaussian process regression and,
building on the Karhunen-Lo{\`e}ve expansion, can be approximately reformulated
as a finite linear combination of kernel eigenfunctions. Using data collected
over a set of tasks, the eigenfunction hyperparameters are optimized in a
meta-training phase by maximizing a variational bound for the log-marginal
likelihood. During meta-testing, the eigenfunctions are fixed, so that only the
linear parameters are adapted to the new unseen task in an online adaptive
fashion via Bayesian linear regression, providing a simple and efficient
inference scheme. Simulation results are provided for autonomous racing with
miniature race cars adapting to unseen road conditions
Protection strategies for next generation passive optical networks -2
Next Generation Passive Optical Networks-2 (NGPON2) are being considered to upgrade the current PON technology to meet the ever increasing bandwidth requirements of the end users while optimizing the network operators' investment. Reliability performance of NG-PON2 is very important due to the extended reach and, consequently, large number of served customers per PON segment. On the other hand, the use of more complex and hence more failure prone components than in the current PON systems may degrade reliability performance of the network. Thus designing reliable NG-PON2 architectures is of a paramount importance. Moreover, for appropriately evaluating network reliability performance, new models are required. For example, the commonly used reliability parameter, i.e., connection availability, defined as the percentage of time for which a connection remains operable, doesn't reflect the network wide reliability performance. The network operators are often more concerned about a single failure affecting a large number of customers than many uncorrelated failures disconnecting fewer customers while leading to the same average failure time. With this view, we introduce a new parameter for reliability performance evaluation, referred to as the failure impact. In this paper, we propose several reliable architectures for two important NGPON2 candidates: wavelength division multiplexed (WDM) PON and time and wavelength division multiplexed (TWDM) PON. Furthermore, we evaluate protection coverage, availability, failure impact and cost of the proposed schemes in order to identify the most efficient protection architecture
A secure data outsourcing scheme based on Asmuth – Bloom secret sharing
The file attached to this record is the author's final peer reviewed version. The Publisher's final version can be found by following the DOI link.Data outsourcing is an emerging paradigm for data management in which a database is provided as a service by third-party service providers. One of the major benefits of offering database as a service is to provide organisations, which are unable to purchase expensive hardware and software to host their databases, with efficient data storage accessible online at a cheap rate. Despite that, several issues of data confidentiality, integrity, availability and efficient indexing of users’ queries at the server side have to be addressed in the data outsourcing paradigm. Service providers have to guarantee that their clients’ data are secured against internal (insider) and external attacks. This paper briefly analyses the existing indexing schemes in data outsourcing and highlights their advantages and disadvantages. Then, this paper proposes a secure data outsourcing scheme based on Asmuth–Bloom secret sharing which tries to address the issues in data outsourcing such as data confidentiality, availability and order preservation for efficient indexing
- …