Storage Solutions for Big Data Systems: A Qualitative Study and Comparison
Big data systems development poses many challenges, given the variety of
application areas and domains that the technology promises to serve.
Fundamental design decisions in big data systems typically include choosing
appropriate storage and computing infrastructures. In this age of
heterogeneous systems that integrate different technologies into an optimized
solution for a specific real-world problem, big data systems are no exception.
As far as the storage aspect of any big data system is concerned, the primary
facet is the storage infrastructure, and NoSQL appears to be the technology
best suited to its requirements. However, every big data application has
different data characteristics, and the corresponding data therefore fits a
different data model. This paper presents a feature and use-case analysis and
comparison of the four main data models, namely document-oriented, key-value,
graph, and wide-column. Moreover, a feature analysis of 80 NoSQL solutions is
provided, elaborating on the criteria and points that a developer must
consider when making a choice.
Typically, big data storage needs to communicate with the execution engine and
other processing and visualization technologies to create a comprehensive
solution. This brings the second facet of big data storage, big data file
formats, into the picture. The second half of the paper compares the
advantages, shortcomings, and possible use cases of the available big data
file formats for Hadoop, which is the foundation for most big data computing
technologies. Decentralized storage and blockchain are seen as the next
generation of big data storage, and their challenges and future prospects are
also discussed.
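To make the four data models concrete, here is an illustrative sketch of how one record might be shaped under each model. All field names, keys, and values are invented for illustration; real stores (e.g. MongoDB, Redis, Cassandra, Neo4j) each have their own APIs.

```python
# Illustrative only: the same "user" record shaped for each of the
# four NoSQL data models. All names and values are invented.

# Key-value: an opaque value addressed by a single key.
kv_store = {"user:42": '{"name": "Ada", "city": "London"}'}

# Document-oriented: a nested, self-describing structure queryable by field.
document = {"_id": 42, "name": "Ada", "address": {"city": "London"}}

# Wide-column: rows grouped into column families, with sparse columns allowed.
wide_column = {
    "row:42": {
        "profile": {"name": "Ada"},        # column family "profile"
        "location": {"city": "London"},    # column family "location"
    }
}

# Graph: entities as nodes, relationships as first-class edges.
nodes = {42: {"label": "User", "name": "Ada"}}
edges = [(42, "LIVES_IN", "city:london")]

print(document["address"]["city"])
```

The choice between them follows the access pattern: key-value for opaque lookups, document for queries over nested fields, wide-column for sparse rows at scale, and graph when relationships themselves are queried.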
Waveform Signal Entropy and Compression Study of Whole-Building Energy Datasets
Electrical energy consumption has been an ongoing research area since the
advent of smart homes and Internet of Things devices. Consumption
characteristics and usage profiles are directly influenced by building
occupants and their interaction with electrical appliances. Information
extracted from these data can be used to conserve energy and increase user
comfort levels. Data analysis together with machine learning models can be
utilized to extract valuable information for the benefit of the occupants
themselves, power plants, and grid operators. Public energy datasets provide a
scientific foundation to develop and benchmark these algorithms and techniques.
With datasets exceeding tens of terabytes, we present a novel study of five
whole-building energy datasets with high sampling rates, their signal entropy,
and how a well-calibrated measurement can have a significant effect on the
overall storage requirements. We show that some datasets do not fully utilize
the available measurement precision, therefore leaving potential accuracy and
space savings untapped. We benchmark a comprehensive list of 365 file formats,
transparent data transformations, and lossless compression algorithms. The
primary goal is to reduce the overall dataset size while maintaining an
easy-to-use file format and access API. We show that with careful selection of
file format and encoding scheme, we can reduce the size of some datasets by up
to 73%.
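The entropy analysis the abstract describes can be approximated with a short sketch: the empirical Shannon entropy of a quantized waveform bounds the bits per sample any lossless compressor can achieve, so a large gap between entropy and the stored word size signals untapped headroom. The signal values and the 16-bit word size below are invented for illustration.

```python
import math
from collections import Counter

def shannon_entropy_bits(samples):
    """Empirical Shannon entropy (bits per sample) of a discrete signal."""
    counts = Counter(samples)
    n = len(samples)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

# Invented example: a 16-bit ADC whose signal only exercises a few levels,
# mimicking a measurement that does not use its full precision.
samples = [100, 100, 101, 100, 102, 101, 100, 100] * 1000
h = shannon_entropy_bits(samples)
print(f"entropy: {h:.2f} bits/sample vs 16 bits of storage")
```

Here the signal carries roughly 1.3 bits of information per 16-bit sample, which is exactly the kind of gap the study exploits with transparent transformations and lossless compression.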
A Parallel Data Compression Framework for Large Scale 3D Scientific Data
Large-scale simulations of complex systems, ranging from climate and
astrophysics to crowd dynamics, routinely produce petabytes of data and are
projected to reach the zettabyte level in the coming decade. These simulations
enable unprecedented insights, but at the same time their effectiveness is
hindered by the enormous data sizes associated with the computational elements
and respective output quantities of interest, which impose severe constraints
on storage and I/O time. In this work, we address these challenges through a novel
software framework for scientific data compression. The software (CubismZ)
incorporates efficient wavelet based techniques and the state-of-the-art ZFP,
SZ and FPZIP floating point compressors. The framework relies on a
block-structured data layout, benefits from OpenMP and MPI and targets
supercomputers based on multicores. CubismZ can be used as a tool for ex situ
(offline) compression of scientific datasets and supports conventional
Computational Fluid Dynamics (CFD) file formats. Moreover, it provides a
testbed of comparison, in terms of compression factor and peak signal-to-noise
ratio, for a number of available data compression methods. The software yields
in situ compression ratios of 100x or higher for fluid dynamics data produced
by petascale simulations of cloud cavitation collapse using grid cells, with
negligible impact on the total simulation time.
Comment: 26 pages, 12 figures, open-source software
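The comparison metric the abstract names, peak signal-to-noise ratio between an original field and its decompressed reconstruction, can be sketched in a few lines. The array values below are invented stand-ins for one block of a CFD field; CubismZ's actual pipeline operates on block-structured 3D data with ZFP/SZ/FPZIP back ends.

```python
import math

def psnr(original, reconstructed):
    """Peak signal-to-noise ratio (dB) between two equal-length fields."""
    n = len(original)
    mse = sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / n
    if mse == 0:
        return float("inf")  # identical fields: lossless reconstruction
    peak = max(abs(v) for v in original)
    return 10 * math.log10(peak ** 2 / mse)

# Invented data: a field before and after lossy floating-point compression.
field = [1.0, 2.0, 3.0, 4.0]
decoded = [1.01, 1.99, 3.02, 3.98]
print(f"PSNR = {psnr(field, decoded):.1f} dB")
```

Plotting PSNR against compression factor for each compressor is how a testbed like the one described lets users pick an operating point for their accuracy budget.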
Improving Quality of Service and Reducing Power Consumption with WAN accelerator in Cloud Computing Environments
The widespread use of cloud computing services is expected to degrade Quality
of Service and to increase the power consumption of ICT devices, since the
distance to a server becomes longer than before. Migration of virtual machines
over a wide area can solve many problems, such as load balancing and power
saving, in cloud computing environments.
This paper proposes dynamically applying a WAN accelerator within the network
when a virtual machine is moved to a distant center, in order to prevent
performance degradation after live migration of virtual machines over a wide
area. mSCTP-based data transfer using different TCP connections before and
after migration is proposed so that a currently available WAN accelerator can
be used. This paper does not consider the performance degradation of live
migration itself. The paper then proposes reducing the power consumption of
ICT devices by actively installing WAN accelerators as part of the cloud
resources and temporarily increasing the packet transfer rate of the
communication link. It is demonstrated that the power consumption with a WAN
accelerator can be reduced to one-tenth of that without one.
Comment: 12 pages, International Journal of Computer Networks & Communications
(IJCNC) Vol.5, No.1, January 201
Towards Media Intercloud Standardization: Evaluating Impact of Cloud Storage Heterogeneity
Digital media has been growing very rapidly, contributing to the popularity of
cloud computing. Cloud computing provides ease of management of large amounts
of data and resources. With many devices communicating over the Internet and
with rapidly increasing user demands, solitary clouds have to communicate with
other clouds to fulfill the demands and discover services elsewhere. This
scenario is called intercloud computing or cloud federation.
Intercloud computing still lacks a standard architecture. Prior works discuss
some architectural blueprints, but none of them highlight the key issues
involved and their impact, so that a valid and reliable architecture could be
envisioned. In this paper, we discuss the importance of intercloud computing
and present its architectural components in detail. Intercloud computing also
involves a number of issues; we discuss the key ones and present the impact of
storage heterogeneity.
storage heterogeneity. We have evaluated some of the most noteworthy cloud
storage services, namely Dropbox, Amazon CloudDrive, GoogleDrive, Microsoft
OneDrive (formerly SkyDrive), Box, and SugarSync in terms of Quality of
Experience (QoE), Quality of Service (QoS), and storage space efficiency.
Discussion of the results shows the acceptability level of these storage
services and the shortcomings in their design.
Comment: 13 pages, 14 figures, Springer Journal of Grid Computing, 201
Analysis of Cloud Storage Information Security and Its Various Methods
Cloud computing is the latest promising paradigm in the IT field. It provides resources such as ready accessibility of data at minimal cost, along with several other benefits. But the major issue for the cloud is the security of the information stored in it. In this paper, various methods and specialized techniques are combined to provide information security for data stored in the cloud. The aim of this paper is to analyze various cryptographic techniques and to discuss the security techniques and user authentication mechanisms that are most helpful and useful for information security over the cloud.
DOI: 10.17762/ijritcc2321-8169.15028
TrIMS: Transparent and Isolated Model Sharing for Low Latency Deep Learning Inference in Function as a Service Environments
Deep neural networks (DNNs) have become core computation components within
low latency Function as a Service (FaaS) prediction pipelines: including image
recognition, object detection, natural language processing, speech synthesis,
and personalized recommendation pipelines. Cloud computing, as the de-facto
backbone of modern computing infrastructure for both enterprise and consumer
applications, has to be able to handle user-defined pipelines of diverse DNN
inference workloads while maintaining isolation and latency guarantees, and
minimizing resource waste. The current solution for guaranteeing isolation
within FaaS is suboptimal -- suffering from "cold start" latency. A major cause
of such inefficiency is the need to move large amounts of model data within and
across servers. We propose TrIMS as a novel solution to address these issues.
Our proposed solution consists of a persistent model store across the GPU, CPU,
local storage, and cloud storage hierarchy, an efficient resource management
layer that provides isolation, and a succinct set of application APIs and
container technologies for easy and transparent integration with FaaS, Deep
Learning (DL) frameworks, and user code. We demonstrate our solution by
interfacing TrIMS with the Apache MXNet framework and demonstrate up to 24x
speedup in latency for image classification models and up to 210x speedup for
large models. We achieve up to 8x system throughput improvement.
Comment: In Proceedings CLOUD 201
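The core idea behind avoiding "cold starts", keeping loaded models resident in a shared store so later invocations skip the load, can be sketched with a minimal in-process LRU cache. This is a simplification under invented names (`ModelStore`, `slow_loader`): the real TrIMS store spans a GPU/CPU/disk/cloud hierarchy and enforces isolation between users.

```python
import time
from collections import OrderedDict

class ModelStore:
    """Minimal LRU model cache: a stand-in for a persistent model store
    that lets repeated FaaS invocations reuse already-loaded weights."""

    def __init__(self, capacity=2):
        self.capacity = capacity
        self._cache = OrderedDict()

    def get(self, name, loader):
        if name in self._cache:              # warm hit: no reload cost
            self._cache.move_to_end(name)
            return self._cache[name]
        model = loader(name)                 # cold miss: pay the load once
        self._cache[name] = model
        if len(self._cache) > self.capacity:
            self._cache.popitem(last=False)  # evict least recently used
        return model

def slow_loader(name):
    time.sleep(0.05)  # stand-in for reading large weights from storage
    return {"name": name, "weights": [0.0] * 4}

store = ModelStore()
t0 = time.perf_counter(); store.get("resnet", slow_loader)
t1 = time.perf_counter(); store.get("resnet", slow_loader)
t2 = time.perf_counter()
print(f"cold: {t1 - t0:.3f}s, warm: {t2 - t1:.3f}s")
```

The warm path skips the loader entirely, which is the source of the latency speedups the abstract reports; the hard parts TrIMS adds on top are cross-process sharing and isolation guarantees.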
Vignette: Perceptual Compression for Video Storage and Processing Systems
Compressed videos constitute 70% of Internet traffic, and video upload growth
rates far outpace compute and storage improvement trends. Past work in
leveraging perceptual cues like saliency, i.e., regions where viewers focus
their perceptual attention, reduces compressed video size while maintaining
perceptual quality, but requires significant changes to video codecs and
ignores the data management of this perceptual information.
In this paper, we propose Vignette, a compression technique and storage
manager for perception-based video compression. Vignette complements
off-the-shelf compression software and hardware codec implementations.
Vignette's compression technique uses a neural network to predict saliency
information used during transcoding, and its storage manager integrates
perceptual information into the video storage system to support a perceptual
compression feedback loop. Vignette's saliency-based optimizations reduce
storage by up to 95% with minimal quality loss, and Vignette videos lead to
power savings of 50% on mobile phones during video playback. Our results
demonstrate the benefit of embedding information about the human visual system
into the architecture of video storage systems.
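The saliency-guided bit allocation described above can be sketched as a mapping from per-tile saliency scores to codec quantization parameters (lower QP means higher quality, more bits). The function name, the saliency values, and the QP range are invented for illustration; Vignette's actual pipeline feeds a neural saliency predictor into off-the-shelf codecs.

```python
def saliency_to_qp(saliency, qp_min=22, qp_max=40):
    """Map a per-tile saliency score in [0, 1] to a quantization
    parameter: salient tiles get low QP (high quality), the rest
    get coarser quantization to save bits."""
    saliency = min(max(saliency, 0.0), 1.0)  # clamp to the valid range
    return round(qp_max - saliency * (qp_max - qp_min))

# Invented saliency map for one row of tiles (e.g. from a predictor).
tile_saliency = [0.05, 0.10, 0.90, 0.95, 0.20]
qps = [saliency_to_qp(s) for s in tile_saliency]
print(qps)
```

Because most tiles in typical footage are non-salient, shifting their QP upward is where the bulk of the reported storage savings would come from.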
Recent Developments in Cloud Based Systems: State of Art
Cloud computing is the new buzzword among technologists these days. Its
importance and its many applications make it a topic of great significance. It
provides several compelling features, such as multitenancy, on-demand service,
and pay-per-use pricing. This manuscript presents an exhaustive survey of
cloud computing technology and of potential research issues in cloud computing
that need to be addressed.
Power quality and electromagnetic compatibility: special report, session 2
The scope of Session 2 (S2) has been defined as follows by the Session Advisory Group and the Technical Committee: Power Quality (PQ), with the more general concept of electromagnetic compatibility (EMC) and with some related safety problems in electricity distribution systems.
Special focus is put on voltage continuity (supply reliability, problem of outages) and voltage quality (voltage level, flicker, unbalance, harmonics). This session will also look at electromagnetic compatibility (mains frequency to 150 kHz), electromagnetic interferences and electric and magnetic fields issues. Also addressed in this session are electrical safety and immunity concerns (lightning issues, step, touch and transferred voltages).
The aim of this special report is to present a synthesis of the present concerns in PQ&EMC, based on all selected papers of Session 2 and related papers from other sessions (152 papers in total). The report is divided into the following four blocks:
Block 1: Electric and Magnetic Fields, EMC, Earthing systems
Block 2: Harmonics
Block 3: Voltage Variation
Block 4: Power Quality Monitoring
Two Round Tables will be organised:
- Power quality and EMC in the Future Grid (CIGRE/CIRED WG C4.24, RT 13)
- Reliability Benchmarking - why we should do it, and what should be done in future? (RT 15)