24,525 research outputs found

Storage Solutions for Big Data Systems: A Qualitative Study and Comparison

Author: Alam Mansaf
Ali Syed Arshad
Khan Samiya
Liu Xiufeng
Publication date: 01/01/2019

Big data systems development is full of challenges, given the variety of application areas and domains that this technology promises to serve. Fundamental design decisions in big data systems typically include choosing appropriate storage and computing infrastructures. In this age of heterogeneous systems that integrate different technologies into an optimized solution for a specific real-world problem, big data systems are no exception. As far as the storage aspect of a big data system is concerned, the primary facet is the storage infrastructure, and NoSQL seems to be the technology that fulfills its requirements. However, every big data application has different data characteristics, and thus the corresponding data fits a different data model. This paper presents a feature and use case analysis and comparison of the four main data models, namely document-oriented, key-value, graph and wide-column. Moreover, a feature analysis of 80 NoSQL solutions is provided, elaborating on the criteria a developer must consider when making a choice. Typically, big data storage needs to communicate with the execution engine and with other processing and visualization technologies to create a comprehensive solution. This brings the second facet of big data storage, big data file formats, into the picture. The second half of the paper compares the advantages, shortcomings and possible use cases of the available big data file formats for Hadoop, which is the foundation for most big data computing technologies. Decentralized storage and blockchain are seen as the next generation of big data storage, and their challenges and future prospects are also discussed.

arXiv.org e-Print Archive

Online Research Database In Technology

The Family of MapReduce and Large Scale Data Processing Systems

Author: Anna Liu
Ayman G. Fayoumi
Sherif Sakr
Publication date: 12/02/2013

In the last two decades, the continuous increase of computational power has produced an overwhelming flow of data, which has called for a paradigm shift in computing architecture and large scale data processing mechanisms. MapReduce is a simple and powerful programming model that enables easy development of scalable parallel applications to process vast amounts of data on large clusters of commodity machines. It isolates the application from the details of running a distributed program, such as data distribution, scheduling and fault tolerance. However, the original implementation of the MapReduce framework had some limitations that have been tackled by many research efforts in follow-up works since its introduction. This article provides a comprehensive survey of a family of approaches and mechanisms for large scale data processing that have been implemented based on the original idea of the MapReduce framework and are currently gaining a lot of momentum in both research and industrial communities. We also cover a set of systems that have been implemented to provide declarative programming interfaces on top of the MapReduce framework. In addition, we review several large scale data processing systems that resemble some of the ideas of the MapReduce framework for different purposes and application scenarios. Finally, we discuss some of the future research directions for implementing the next generation of MapReduce-like solutions.

Comment: arXiv admin note: text overlap with arXiv:1105.4252 by other author

arXiv.org e-Print Archive

CiteSeerX

WiseEye: next generation expandable and programmable camera trap platform for wildlife research

Author: A Ahmadi
AC Burton
AF O'Connell
AS Glen
C Rutz
CL Tan
DE Swann
DJ Welbourne
EH Fegraus
F Rovero
Fabio Verdicchio
G Harris
Gorry Fairhurst
Houbing Song
J Porter
J Xiong
J Yang
JA Ahumada
JD Nichols
JLP Tack
JM Rowcliffe
JR Willmott
KRR Swinnen
KU Karanth
M Abi-Said
M Zárybnická
MW Tobler
Paul Davidson
PD Meek
R Kays
R Van der Wal
R. Justin Irvine
René van der Wal
RW Kays
S Hamel
S Nazir
S Newey
Sajid Nazir
Scott Newey
SR Sundarasen
TE Kucera
X Yu
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2017

Funding: The work was supported by the RCUK Digital Economy programme to the dot.rural Digital Economy Hub; award reference: EP/G066051/1. The work of S. Newey and RJI was part funded by the Scottish Government's Rural and Environment Science and Analytical Services (RESAS). Details published as an Open Source Toolkit, PLOS Journals at: http://dx.doi.org/10.1371/journal.pone.0169758
Peer reviewed
Publisher PD

Aberdeen University Research

Public Library of Science (PLOS)

Crossref

Brage INN

Directory of Open Access Journals

PubMed Central

NORA - Norwegian Open Research Archives

ResearchOnline@GCU

FigShare

Review of the mathematical foundations of data fusion techniques in surface metrology

Author: Ai C
Barker R M
Boudjemaa R
De Boor C
Díez D C
Fabio R
GmbH W
Golub G H
Hansen C D
Hao C
Hemsley R
Inhull
ISO 25178-6
ISO 25178-604
ISO/DIS 25178-606
JDL
Jiang X
Kjer H M
Leica Microsystems
Low K-L
Mathworks
Raid I
Ramasamy S K
Rasmussen C E
Schabenberger O
Schmit J
Seeger S
Shekhar S
Strutz T
Sun W
Wang J
Weisstein E W
Yamany S M
Yan-Mei C
Publication venue: 'IOP Publishing'
Publication date: 01/04/2015

The recent proliferation of engineered surfaces, including freeform and structured surfaces, is challenging current metrology techniques. Measurement using multiple sensors has been proposed to achieve enhanced benefits, mainly in terms of spatial frequency bandwidth, which a single sensor cannot provide. When using data from different sensors, a process of data fusion is required, and there is much active research in this area. In this paper, current data fusion methods and applications are reviewed, with a focus on the mathematical foundations of the subject. Common research questions in the fusion of surface metrology data are raised and potential fusion algorithms are discussed.

Crossref

University of Huddersfield Repository

Huddersfield Research Portal

State of The Art and Hot Aspects in Cloud Data Storage Security

Author: Gehrmann Christian
Morenius Fredric
Paladi Nicolae
Publication venue: Swedish Institute of Computer Science
Publication date: 01/01/2013

Along with the evolution of cloud computing and cloud storage towards maturity, researchers have analyzed an increasing range of cloud computing security aspects, data security being an important topic in this area. In this paper, we examine the state of the art in cloud storage security through an overview of selected peer-reviewed publications. We address the question of defining cloud storage security and its different aspects, and enumerate the main vectors of attack on cloud storage. The reviewed papers present techniques for key management and controlled disclosure of encrypted data in cloud storage, while novel ideas regarding secure operations on encrypted data and methods for protection of data in fully virtualized environments provide a glimpse of the toolbox available for securing cloud storage. Finally, new challenges such as emergent government regulation call for solutions to problems that did not receive enough attention in earlier stages of cloud computing, such as the geographical location of data. The methods presented in the papers selected for this review represent only a small fraction of the wide research effort within cloud storage security. Nevertheless, they serve as an indication of the diversity of problems that are being addressed.

CiteSeerX

RISE – Research Institutes of Sweden

Digitala Vetenskapliga Arkivet - Academic Archive On-line

Swedish Institute of Computer Science Publications Database

Software institutes' Online Digital Archive

Peer-To-Peer Backup for Personal Area Networks

Author: Borriello Gaetano
LaMarca Anthony
Loo Boon Thau
Publication venue: ScholarlyCommons
Publication date: 01/05/2003

FlashBack is a peer-to-peer backup algorithm designed for power-constrained devices running in a personal area network (PAN). Backups are performed transparently as local updates initiate the spread of backup data among a subset of the currently available peers. FlashBack limits power usage by avoiding flooding and keeping small neighbor sets. FlashBack has also been designed to utilize powered infrastructure when possible to further extend device lifetime. We propose our architecture and algorithms, and present initial experimental results that illustrate FlashBack’s performance characteristics.

ScholarlyCommons@Penn

NSSDC Conference on Mass Storage Systems and Technologies for Space and Earth Science Applications, volume 2

Author: Blasso L. G.
Hariharan P. C.
Kobler Ben

This report contains copies of nearly all of the technical papers and viewgraphs presented at the NSSDC Conference on Mass Storage Systems and Technologies for Space and Earth Science Applications. This conference served as a broad forum for the discussion of a number of important issues in the field of mass storage systems. Topics include the following: magnetic disk and tape technologies; optical disk and tape; software storage and file management systems; and experiences with the use of a large, distributed storage system. The technical presentations describe, among other things, integrated mass storage systems that are expected to be available commercially. Also included is a series of presentations from Federal Government organizations and research institutions covering their mass storage requirements for the 1990s.

NASA Technical Reports Server