Search CORE

24,841 research outputs found

Storage Solutions for Big Data Systems: A Qualitative Study and Comparison

Author: Alam Mansaf
Ali Syed Arshad
Khan Samiya
Liu Xiufeng
Publication venue
Publication date: 01/01/2019
Field of study

Big data systems development is full of challenges in view of the variety of application areas and domains that this technology promises to serve. Typically, fundamental design decisions involved in big data systems design include choosing appropriate storage and computing infrastructures. In this age of heterogeneous systems that integrate different technologies for optimized solution to a specific real world problem, big data system are not an exception to any such rule. As far as the storage aspect of any big data system is concerned, the primary facet in this regard is a storage infrastructure and NoSQL seems to be the right technology that fulfills its requirements. However, every big data application has variable data characteristics and thus, the corresponding data fits into a different data model. This paper presents feature and use case analysis and comparison of the four main data models namely document oriented, key value, graph and wide column. Moreover, a feature analysis of 80 NoSQL solutions has been provided, elaborating on the criteria and points that a developer must consider while making a possible choice. Typically, big data storage needs to communicate with the execution engine and other processing and visualization technologies to create a comprehensive solution. This brings forth second facet of big data storage, big data file formats, into picture. The second half of the research paper compares the advantages, shortcomings and possible use cases of available big data file formats for Hadoop, which is the foundation for most big data computing technologies. Decentralized storage and blockchain are seen as the next generation of big data storage and its challenges and future prospects have also been discussed

arXiv.org e-Print Archive

Online Research Database In Technology

A Framework for Developing Real-Time OLAP algorithm using Multi-core processing and GPU: Heterogeneous Computing

Author: Alzeini H I
Habaebi M H
Hameed Sh A
Publication venue
Publication date: 01/12/2013
Field of study

The overwhelmingly increasing amount of stored data has spurred researchers seeking different methods in order to optimally take advantage of it which mostly have faced a response time problem as a result of this enormous size of data. Most of solutions have suggested materialization as a favourite solution. However, such a solution cannot attain Real- Time answers anyhow. In this paper we propose a framework illustrating the barriers and suggested solutions in the way of achieving Real-Time OLAP answers that are significantly used in decision support systems and data warehouses

arXiv.org e-Print Archive

Crossref

The International Islamic University Malaysia Repository

Cloud Storage and Bioinformatics in a private cloud deployment: Lessons for Data Intensive research

Author: Chang Victor
Walters Robert John
Wills Gary
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2013
Field of study

This paper describes service portability for a private cloud deployment, including a detailed case study about Cloud Storage and bioinformatics services developed as part of the Cloud Computing Adoption Framework (CCAF). Our Cloud Storage design and deployment is based on Storage Area Network (SAN) technologies, details of which include functionalities, technical implementation, architecture and user support. Experiments for data services (backup automation, data recovery and data migration) are performed and results confirm backup automation is completed swiftly and is reliable for data-intensive research. The data recovery result confirms that execution time is in proportion to quantity of recovered data, but the failure rate increases in an exponential manner. The data migration result confirms execution time is in proportion to disk volume of migrated data, but again the failure rate increases in an exponential manner. In addition, benefits of CCAF are illustrated using several bioinformatics examples such as tumour modelling, brain imaging, insulin molecules and simulations for medical training. Our Cloud Storage solution described here offers cost reduction, time-saving and user friendliness

Southampton (e-Prints Soton)

Crossref

Teeside University's Research Repository

Cloud Bioinformatics in a private cloud deployment

Author: Chang Victor
Publication venue: 'IGI Global'
Publication date: 15/11/2013
Field of study

This chapter describes service portability for a private cloud deployment, including a detailed case study about Cloud Bioinformatics services developed as part of the Cloud Computing Adoption Framework (CCAF). The Cloud Bioinformatics design and deployment is based on Storage Area Network (SAN) technologies, details of which include functionalities, technical implementation, architecture, and user support. Bioinformatics applications are written on the SAN-based private cloud, which can simulate complex biological sciences and present them in a way that anyone without prior knowledge can understand. Several bioinformatics results are discussed, particularly brain segmentation, which demonstrates different parts of the brain simulated by the private cloud. In addition, benefits of CCAF are illustrated using several bioinformatics examples such as tumour modelling, brain imaging, insulin molecules, and simulations for medical training. The Cloud Bioinformatics solution offers cost reduction, time-saving, and user friendliness. </jats:p

Southampton (e-Prints Soton)

Crossref