
    Optimizing Data Placement for Cost Effective and High Available Multi-Cloud Storage

    With the advent of the big data age, data volumes have grown from terabytes to petabytes at incredible speed. Because cloud storage offers the vision of a virtually infinite pool of storage resources, data can be stored and accessed with high scalability and availability. But a single cloud-based data storage service carries risks such as vendor lock-in, privacy leakage, and unavailability. Multi-cloud storage can mitigate these risks by spreading data across geographically distributed cloud storage providers. In this storage scheme, one important challenge is how to place a user's data cost-effectively while maintaining high availability. In this paper, an architecture for multi-cloud storage is presented. Next, a multi-objective optimization problem is defined to simultaneously minimize total cost and maximize data availability; it is solved with an approach based on the non-dominated sorting genetic algorithm II (NSGA-II), which yields a set of non-dominated solutions called the Pareto-optimal set. Then, an entropy-based method is proposed to determine the most suitable solution for users who cannot choose one from the Pareto-optimal set directly. Finally, the performance of the proposed algorithm is validated by extensive experiments based on real-world multi-cloud storage scenarios.
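
    As a hedged illustration of the selection step, the sketch below applies the entropy-weight method to a toy Pareto-optimal set of placements scored on total cost and availability. The Pareto set itself would come from an NSGA-II run (e.g., via an off-the-shelf multi-objective library); the placements, prices, and availabilities here are invented for illustration.

        import math

        # Hypothetical non-dominated placements, as an NSGA-II run might return.
        pareto_set = [
            {"placement": "A", "cost": 120.0, "availability": 0.9990},
            {"placement": "B", "cost": 150.0, "availability": 0.9999},
            {"placement": "C", "cost": 100.0, "availability": 0.9950},
        ]

        def entropy_select(solutions, criteria):
            """criteria: list of (key, sense), sense = +1 benefit or -1 cost."""
            n = len(solutions)
            # Min-max normalize each criterion into [0, 1], flipping cost
            # criteria so that larger always means better.
            norm = []
            for key, sense in criteria:
                vals = [s[key] for s in solutions]
                lo, hi = min(vals), max(vals)
                span = (hi - lo) or 1.0
                norm.append([(v - lo) / span if sense > 0 else (hi - v) / span
                             for v in vals])
            # Entropy of each column; low entropy means the criterion
            # discriminates strongly between solutions, so it gets more weight.
            weights = []
            for col in norm:
                total = sum(col) or 1.0
                p = [v / total for v in col]
                e = -sum(x * math.log(x) for x in p if x > 0) / math.log(n)
                weights.append(1.0 - e)
            wsum = sum(weights) or 1.0
            weights = [w / wsum for w in weights]
            # Composite score: weighted sum of normalized criteria.
            scores = [sum(w * col[i] for w, col in zip(weights, norm))
                      for i in range(n)]
            return solutions[max(range(n), key=scores.__getitem__)]

        best = entropy_select(pareto_set, [("cost", -1), ("availability", +1)])
        print("recommended placement:", best["placement"])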

    A survey and classification of software-defined storage systems

    The exponential growth of digital information is imposing increasing scale and efficiency demands on modern storage infrastructures. As infrastructure complexity increases, so does the difficulty in ensuring quality of service, maintainability, and resource fairness, raising unprecedented performance, scalability, and programmability challenges. Software-Defined Storage (SDS) addresses these challenges by cleanly disentangling control and data flows, easing management, and improving the control functionality of conventional storage systems. Despite its momentum in the research community, many aspects of the paradigm are still unclear, undefined, and unexplored, leading to misunderstandings that hamper the research and development of novel SDS technologies. In this article, we present an in-depth study of SDS systems, providing a thorough description and categorization of each plane of functionality. Further, we propose a taxonomy and classification of existing SDS solutions according to different criteria. Finally, we provide key insights about the paradigm and discuss potential future research directions for the field. This work was financed by the Portuguese funding agency FCT (Fundação para a Ciência e a Tecnologia) through national funds, the PhD grant SFRH/BD/146059/2019, the project ThreatAdapt (FCT-FNR/0002/2018), the LASIGE Research Unit (UIDB/00408/2020), and co-funded by FEDER, where applicable.
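
    As a hedged, illustrative sketch of the control/data-plane disentanglement that defines SDS (not any particular system from the survey), the snippet below separates a control plane that installs per-tenant policies from a thin data-plane stage that merely enforces them on the I/O path; the class names, tenant names, and token-bucket policy are all assumptions.

        import time

        class ControlPlane:
            """Global view: decides policy; runs off the critical I/O path."""
            def __init__(self):
                self.policies = {}          # tenant -> max bytes per second

            def set_rate_limit(self, tenant, bytes_per_sec):
                self.policies[tenant] = bytes_per_sec

        class DataPlaneStage:
            """Local enforcement: a token-bucket rate limiter per tenant."""
            def __init__(self, control):
                self.control = control
                self.tokens = {}            # tenant -> (available, last refill)

            def admit(self, tenant, nbytes):
                limit = self.control.policies.get(tenant)
                if limit is None:
                    return True             # no policy installed: pass through
                avail, last = self.tokens.get(tenant, (limit, time.monotonic()))
                now = time.monotonic()
                avail = min(limit, avail + (now - last) * limit)  # refill bucket
                if nbytes <= avail:
                    self.tokens[tenant] = (avail - nbytes, now)
                    return True
                self.tokens[tenant] = (avail, now)
                return False                # caller should queue or throttle

        control = ControlPlane()
        control.set_rate_limit("tenant-a", 10 * 1024 * 1024)    # 10 MiB/s
        stage = DataPlaneStage(control)
        print(stage.admit("tenant-a", 4 * 1024 * 1024))         # True: within budget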

    An optimal VM Placement, Energy Efficient and SLA at Cloud Environment - A Comparative Analysis

    In the cloud computing framework, computing resources can be increased or decreased in response to users' varying application loads. Data is stored and applications run on servers in the clouds, so users do not have to worry about lost or corrupt data. The clouds can distribute computing resources according to users' needs or preferences to provide flexible management, and users do not have to buy expensive computing devices; they only pay for the computing services the clouds provide. Cloud computing thus offers a platform for computational experiments with abundant computing and storage resources. The system can be considered as a whole, and control and management decisions are sent as services to agents. The challenge in the present study is to reduce energy consumption while guaranteeing the Service Level Agreement (SLA) at its highest level.
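
    A common baseline in this line of work is power-aware best-fit VM placement; the sketch below is a minimal, assumption-laden version (linear host power model, a fixed utilization cap as a crude SLA safeguard, invented host parameters), not the specific method compared in the paper.

        def power(host, util):
            """Assumed linear model: idle draw plus utilization-proportional part."""
            return host["idle"] + (host["peak"] - host["idle"]) * util

        def place(vms, hosts, cap=0.8):
            """Assign each VM (CPU demand in [0,1]) to the feasible host whose
            power increase is smallest; the cap keeps headroom against SLA
            violations."""
            plan = []
            for vm in sorted(vms, reverse=True):          # biggest demands first
                best, best_delta = None, float("inf")
                for h in hosts:
                    if h["util"] + vm > cap:
                        continue                          # would risk SLA breach
                    delta = power(h, h["util"] + vm) - power(h, h["util"])
                    if delta < best_delta:
                        best, best_delta = h, delta
                if best is None:
                    raise RuntimeError("no host fits this VM under the SLA cap")
                best["util"] += vm
                plan.append((vm, best["name"]))
            return plan

        hosts = [
            {"name": "h1", "util": 0.10, "idle": 100.0, "peak": 250.0},
            {"name": "h2", "util": 0.40, "idle": 80.0, "peak": 300.0},
        ]
        print(place([0.3, 0.5, 0.2], hosts))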

    CloudJet4BigData: Streamlining Big Data via an Accelerated Socket Interface

    Big data applications need to feed users fresh processing results, and cloud platforms can be used to speed them up. This paper describes a new data communication protocol (CloudJet) for long-distance, large-volume big data access operations, which alleviates the large latencies encountered when sharing big data resources in the clouds. It encapsulates a dynamic multi-stream/multi-path engine at the socket level, which conforms to the Portable Operating System Interface (POSIX) and can therefore accelerate any POSIX-compatible application across IP-based networks. CloudJet was demonstrated to accelerate typical big data applications such as very large databases (VLDB), data mining, media streaming, and office applications by up to tenfold in real-world tests.
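
    CloudJet's engine itself is not reproduced here; the sketch below only illustrates the underlying multi-stream idea: fanning a large payload out over several parallel TCP sockets so a single connection's congestion window does not bound long-distance throughput. The host, port, and framing are assumptions.

        import socket
        import struct
        from concurrent.futures import ThreadPoolExecutor

        def send_chunk(host, port, stream_id, offset, chunk):
            with socket.create_connection((host, port)) as s:
                # Simple frame: stream id, byte offset, length, then payload,
                # so the receiver can reassemble chunks arriving out of order.
                s.sendall(struct.pack("!IQI", stream_id, offset, len(chunk))
                          + chunk)

        def multi_stream_send(host, port, payload, n_streams=4):
            size = len(payload)
            step = -(-size // n_streams)          # ceiling division
            with ThreadPoolExecutor(max_workers=n_streams) as pool:
                futures = [
                    pool.submit(send_chunk, host, port, i, off,
                                payload[off:off + step])
                    for i, off in enumerate(range(0, size, step))
                ]
                for f in futures:
                    f.result()                    # propagate any socket errors

        # Usage against a hypothetical reassembly server:
        # multi_stream_send("example.org", 9000, b"x" * (64 << 20), n_streams=8)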

    From security to assurance in the cloud: a survey

    The cloud computing paradigm has become a mainstream solution for the deployment of business processes and applications. In the public cloud vision, infrastructure, platform, and software services are provisioned to tenants (i.e., customers and service providers) on a pay-as-you-go basis. Cloud tenants can use cloud resources at lower prices, and with higher performance and flexibility, than traditional on-premises resources, without having to manage the underlying infrastructure. Still, cloud tenants remain concerned about the cloud's level of service and the nonfunctional properties their applications can count on. In the last few years, the research community has been focusing on the nonfunctional aspects of the cloud paradigm, among which cloud security stands out. Several approaches to security have been described and summarized in general surveys on cloud security techniques. The survey in this article focuses on the interface between cloud security and cloud security assurance. First, we provide an overview of the state of the art on cloud security. Then, we introduce the notion of cloud security assurance and analyze its growing impact on cloud security approaches. Finally, we present some recommendations for the development of next-generation cloud security and assurance solutions.

    A systematic review on cloud storage mechanisms concerning e-healthcare systems

    As the costs of medical care services rise and healthcare professionals become scarce, it falls to healthcare organizations and institutes to consider adopting medical Health Information Technology (HIT) frameworks. HIT allows health organizations to streamline many of their essential processes and deliver services in a more efficient and cost-effective manner. With the rise of Cloud Storage Computing (CSC), a large number of organizations and enterprises have moved their healthcare data sources to distributed storage. As the information can be requested anywhere at any time, the availability of information becomes an urgent need. Nonetheless, outages in cloud storage significantly affect the availability level. Like the other critical factors of cloud storage (e.g., reliability, performance, security, and privacy), availability also directly impacts the data in cloud storage for e-Healthcare systems. In this paper, we systematically review cloud storage mechanisms concerning the healthcare environment. Additionally, the state-of-the-art cloud storage mechanisms are critically reviewed for e-Healthcare systems based on their characteristics. In short, this paper summarizes the existing literature on cloud storage and its impact on healthcare, and it provides researchers, medical specialists, and organizations with a solid foundation for future studies in the healthcare environment. Qatar University [IRCC-2020-009].
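
    As a small worked example of why availability is treated as a first-class factor (this is standard reliability arithmetic, not code from the paper): under an independence assumption, a record replicated across several providers is unavailable only if every replica is down at once.

        def replicated_availability(provider_availabilities):
            """P(at least one replica up) = 1 - prod(1 - a_i)."""
            p_all_down = 1.0
            for a in provider_availabilities:
                p_all_down *= (1.0 - a)
            return 1.0 - p_all_down

        # Three providers, each offering "two nines" on their own, jointly
        # give six nines for the replicated record:
        print(replicated_availability([0.99, 0.99, 0.99]))   # -> 0.999999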

    What broke where for distributed and parallel applications — a whodunit story

    Detection, diagnosis, and mitigation of performance problems in today's large-scale distributed and parallel systems is a difficult task. These systems are composed of various complex software and hardware components, and when a performance or correctness problem occurs, developers struggle to understand its root cause and fix it in a timely manner. In my thesis, I address these three components of performance problems in computer systems.

    First, we focus on diagnosing performance problems in large-scale parallel applications running on supercomputers. Parallel applications, most of which are complex scientific simulations, can create up to millions of parallel tasks that run on different machines and communicate using the message-passing paradigm. We developed a highly scalable and accurate automated debugging tool called PRODOMETER, which first creates a logical progress dependency graph of the tasks to highlight how the problem spread through the system and manifested as a system-wide performance issue, then uses this graph to identify the task where the problem originated, and finally pinpoints the code region corresponding to the origin of the bug.

    Second, we developed a tool-chain that detects performance anomalies using machine-learning techniques while achieving a very low false-positive rate. Our input-aware performance anomaly detection system consists of a scalable data collection framework that gathers performance-related metrics from code regions at different granularities, an offline model creation and prediction-error characterization technique, and a threshold-based anomaly-detection engine for production runs. Our system requires few training runs and can handle unknown inputs and parameter combinations by dynamically calibrating the anomaly detection threshold according to the characteristics of the input data and of the models' prediction error.

    Third, we developed a performance problem mitigation scheme for erasure-coded distributed storage systems. Repairing failed blocks in such systems takes a very long time in network-constrained data centers because, during a repair, data from multiple nodes is gathered at a single node where a mathematical operation reconstructs the missing part; this severely congests the links toward the destination that will host the newly recreated data. We proposed a novel distributed repair technique, called Partial-Parallel-Repair (PPR), which performs the reconstruction in parallel on multiple nodes, eliminates the network bottleneck, and as a result greatly speeds up the repair process (see the sketch after this abstract).

    Fourth, we study how, for a class of applications, performance can be improved (or performance problems mitigated) by selectively approximating some of the computations. For many applications, the main computation happens inside a loop that can be logically divided into a few temporal segments, which we call phases. We found that while approximating the initial phases might severely degrade the quality of the results, approximating the later phases has very small impact on the final quality. Based on this observation, we developed an optimization framework that, for a given quality-loss budget, finds the best approximation settings for each phase of the execution.
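
    The sketch below is a hedged stand-in for the Partial-Parallel-Repair idea described above, not the thesis implementation: instead of shipping every surviving block to one reconstructor, nodes combine blocks pairwise along a tree so partial results travel over different links in parallel. XOR parity stands in for the Galois-field arithmetic of real erasure codes, since it is associative in the same way the real partial operations are.

        def xor_blocks(a: bytes, b: bytes) -> bytes:
            return bytes(x ^ y for x, y in zip(a, b))

        def centralized_repair(survivors):
            """Baseline: one node receives all k blocks, so its ingress link
            carries k transfers back to back."""
            acc = survivors[0]
            for blk in survivors[1:]:
                acc = xor_blocks(acc, blk)
            return acc

        def partial_parallel_repair(survivors):
            """PPR-style: pairwise combine in rounds; each round's transfers
            traverse different links, so any one node receives at most one
            block per round and ~log2(k) rounds replace k-1 serial receives."""
            layer = list(survivors)
            while len(layer) > 1:
                nxt = []
                for i in range(0, len(layer) - 1, 2):
                    nxt.append(xor_blocks(layer[i], layer[i + 1]))  # on node i
                if len(layer) % 2:
                    nxt.append(layer[-1])          # odd block rides along
                layer = nxt
            return layer[0]

        blocks = [bytes([i] * 8) for i in range(1, 5)]   # 4 surviving blocks
        lost = centralized_repair(blocks)                # the "missing" block
        assert partial_parallel_repair(blocks) == lost   # same result, parallel shape
        print(lost.hex())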

    Stealth databases: ensuring user-controlled queries in untrusted cloud environments

    Sensitive data is increasingly being hosted online in ubiquitous cloud storage services. Recent advances in multi-cloud service integration through provider multiplexing and data dispersion have alleviated most of the associated risks for hosting files that users retrieve for further processing. However, for structured data managed in databases, many issues remain, including the need to perform operations directly on the remote data to avoid costly transfers. In this paper, we motivate the need for distributed stealth databases, which combine properties from structure-preserving dispersed file storage, for capacity-saving increased availability, with emerging work on structure-preserving encryption, for on-demand increased confidentiality with controllable performance degradation. We contribute an analysis of operators executing in map-reduce or map-carry-reduce phases and derive performance statistics. Our prototype, StealthDB, demonstrates that for typical amounts of personal structured data, stealth databases are a convincing concept for taming untrusted and unsafe cloud environments.
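
    As a hedged sketch of the data-dispersion building block the abstract refers to (not the StealthDB prototype itself), the snippet below splits a record into XOR secret shares so that no single cloud provider holds readable data, while all shares together reconstruct it exactly.

        import os

        def disperse(record: bytes, n_clouds: int):
            """Return n shares whose XOR equals the record."""
            shares = [os.urandom(len(record)) for _ in range(n_clouds - 1)]
            last = record
            for s in shares:
                last = bytes(a ^ b for a, b in zip(last, s))
            return shares + [last]

        def reassemble(shares):
            out = shares[0]
            for s in shares[1:]:
                out = bytes(a ^ b for a, b in zip(out, s))
            return out

        row = b"patient=42;diagnosis=ok"
        shares = disperse(row, 3)            # one share per cloud provider
        # Any subset of n-1 shares is uniformly random and reveals nothing.
        assert reassemble(shares) == row
        print([s.hex()[:16] for s in shares])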