A Comparative Study of Association Rule Mining Algorithms on Grid and Cloud Platform
Association rule mining is a time-consuming process because it is both
data-intensive and computation-intensive. To mine large volumes of data and to
enhance the scalability and performance of existing sequential association rule
mining algorithms, parallel and distributed algorithms have been developed.
These traditional parallel and distributed algorithms assume a homogeneous
platform and are not well suited to heterogeneous platforms such as the grid
and the cloud. This calls for new algorithms that address data set partitioning
and distribution, load balancing, and the optimization of communication and
synchronization among processors in such heterogeneous systems. The grid and
the cloud are emerging platforms for distributed data processing, and various
association rule mining algorithms have been proposed for them. This survey
article combines a brief architectural overview of distributed systems with a
comparative view of recent grid-based and cloud-based association rule mining
algorithms. We differentiate between the approaches developed on these
architectures on the basis of data locality, programming paradigm, fault
tolerance, communication cost, and the partitioning and distribution of data
sets. Although the survey does not cover all algorithms, it can be very useful
for new researchers working on distributed association rule mining algorithms.
Comment: 8 pages, preprint
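To make the parallel setting concrete, here is a minimal hypothetical sketch of
count distribution, a classic strategy for parallelizing an Apriori pass: each
partition of the transaction database is counted locally, and a reduce step
merges the counts before the global support threshold is applied. The function
names and toy data are illustrative, not taken from any surveyed algorithm.

```python
from collections import Counter
from itertools import combinations

def local_counts(transactions, k=1):
    """Count k-item combinations within one partition of the transactions."""
    counts = Counter()
    for t in transactions:
        for combo in combinations(sorted(t), k):
            counts[combo] += 1
    return counts

def distributed_frequent_items(partitions, min_support, k=1):
    """Merge per-partition counts (the reduce step) and apply the
    global support threshold."""
    total = Counter()
    for part in partitions:
        total.update(local_counts(part, k))
    return {items: n for items, n in total.items() if n >= min_support}

# Two "nodes", each holding one partition of the transaction database.
partitions = [
    [{"a", "b"}, {"a", "c"}],
    [{"a", "b", "c"}, {"b", "c"}],
]
freq = distributed_frequent_items(partitions, min_support=3)
```

On a heterogeneous platform the interesting questions, as the survey notes, are
how to size the partitions per node and how often to synchronize the merged
counts; this sketch deliberately ignores both.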
IBBE-SGX: Cryptographic Group Access Control using Trusted Execution Environments
While many cloud storage systems allow users to protect their data by making
use of encryption, only few support collaborative editing on that data. A major
challenge for enabling such collaboration is the need to enforce cryptographic
access control policies in a secure and efficient manner. In this paper, we
introduce IBBE-SGX, a new cryptographic access control extension that is
efficient both in terms of computation and storage even when processing large
and dynamic workloads of membership operations, while at the same time offering
zero knowledge guarantees. IBBE-SGX builds upon Identity-Based Broadcasting
Encryption (IBBE). We address IBBE's impracticality for cloud deployments by
exploiting Intel Software Guard Extensions (SGX) to derive cuts in the
computational complexity. Moreover, we propose a group partitioning mechanism
such that the computational cost of membership update is bound to a fixed
constant partition size rather than the size of the whole group. We have
implemented and evaluated our new access control extension. Results highlight
that IBBE-SGX performs membership changes 1.2 orders of magnitude faster than
the traditional approach of Hybrid Encryption (HE), producing group metadata
that are 6 orders of magnitude smaller than HE, while at the same time offering
zero knowledge guarantees.
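The group partitioning idea can be illustrated with a small hypothetical
sketch: the group is split into fixed-size partitions, so a membership change
only requires re-keying the one affected partition rather than the whole
group. This is not the paper's SGX implementation; the names and the partition
size are illustrative.

```python
PARTITION_SIZE = 3  # illustrative constant bound on re-keying cost

def partition_group(members, size=PARTITION_SIZE):
    """Split a group into fixed-size partitions; a membership change then
    touches only the one partition containing the affected member."""
    members = sorted(members)
    return [members[i:i + size] for i in range(0, len(members), size)]

def affected_partition(partitions, member):
    """Return the index of the single partition a removal would re-key."""
    for i, part in enumerate(partitions):
        if member in part:
            return i
    raise KeyError(member)

parts = partition_group(["u%d" % i for i in range(8)])
idx = affected_partition(parts, "u4")  # only parts[idx] must be re-keyed
```

The point of the constant bound is that the update cost stays flat as the
group grows, which is what the abstract contrasts with whole-group re-keying.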
Consistency models in distributed systems: A survey on definitions, disciplines, challenges and applications
The replication mechanism resolves some challenges with big data, such as data
durability, data access, and fault tolerance. Yet replication itself gives
birth to another challenge, known as consistency in distributed systems.
Scalability and availability are the criteria on which replication is based in
distributed systems, and both in turn require consistency. Consistency in
distributed computing systems has been employed in three fields: system
architecture, distributed databases, and distributed systems. Consistency
models can be ordered from strong to weak based on their applicability. Our
goal is to propose a novel viewpoint on the different consistency models used
in distributed systems. This research proposes a two-level categorization.
First, consistency models are categorized into three groups: data-centric,
client-centric, and hybrid models. Each group is then divided into three
subcategories: traditional, extended, and novel consistency models. The
concepts and procedures are then expressed in mathematical terms, introduced
to present the models' behavior without requiring an implementation. Moreover,
we survey different aspects of the challenges related to consistency, i.e.,
availability, scalability, security, fault tolerance, latency, violation, and
staleness, of which the latter two, violation and staleness, play the most
pivotal roles in consistency and trade-off balancing. Finally, we investigate
the extent of the contribution of each consistency model and the growing need
for them in distributed systems.
Comment: 52 pages, 13 figures
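As a toy illustration of one weak consistency discipline in the space this
survey covers, a bounded-staleness read can be sketched as follows. The class
and its behavior are hypothetical, not code from the survey.

```python
class Replica:
    """Toy replica storing (value, write_timestamp) pairs per key."""

    def __init__(self):
        self.store = {}

    def write(self, key, value, ts):
        self.store[key] = (value, ts)

    def read(self, key, now, max_staleness):
        """Bounded-staleness read: refuse to serve a copy older than
        `max_staleness` seconds, forcing a fetch from a fresher replica."""
        value, ts = self.store[key]
        if now - ts > max_staleness:
            raise TimeoutError("stale copy; contact a fresher replica")
        return value

r = Replica()
r.write("x", 1, ts=10.0)
fresh = r.read("x", now=12.0, max_staleness=5.0)  # within the bound
```

The staleness bound is exactly the kind of quantity the survey's trade-off
discussion (violation vs. staleness) is about: loosening it improves
availability and latency at the cost of serving older values.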
Internet of Things: An Overview
As technology advances and the number of smart devices continues to grow
substantially, the need for ubiquitous, context-aware platforms that support an
interconnected, heterogeneous, and distributed network of devices has given
rise to what is referred to today as the Internet of Things. However, paving
the path toward these objectives and making the IoT paradigm more tangible
requires the integration and convergence of different knowledge and research
domains, covering aspects from identification and communication to resource
discovery and service integration. In this chapter, we aim to highlight
research on topics including proposed architectures, security and privacy, and
network communication means and protocols, and we conclude by providing future
directions and open challenges facing IoT development.
Comment: Keywords: Internet of Things; IoT; Web of Things; Cloud of Things
FogStore: Toward a Distributed Data Store for Fog Computing
Stateful applications and virtualized network functions (VNFs) can benefit
from state externalization to increase their reliability, scalability, and
inter-operability. To keep and share the externalized state, distributed data
stores (DDSs) are a powerful tool allowing for the management of classical
trade-offs between consistency, availability, and partition tolerance. With the
advent of Fog and Edge Computing, stateful applications and VNFs are pushed
from the data centers toward the network edge. This poses new challenges on
DDSs that are tailored to a deployment in Cloud data centers. In this paper, we
propose two novel design goals for DDSs that are tailored to Fog Computing: (1)
Fog-aware replica placement, and (2) context-sensitive differential
consistency. To realize those design goals on top of existing DDSs, we propose
the FogStore system. FogStore manages the needed adaptations in replica
placement and consistency management transparently, so that existing DDSs can
be plugged into the system. To show the benefits of FogStore, we perform a set
of evaluations using the Yahoo Cloud Serving Benchmark.
Comment: To appear in Proceedings of the 2017 IEEE Fog World Congress (FWC '17)
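Fog-aware replica placement can be illustrated with a small hypothetical
sketch: choose the replica holders nearest the client under some distance
metric. The coordinates, node names, and metric are illustrative and not
FogStore's actual placement algorithm.

```python
import math

def fog_aware_placement(client, nodes, k=2):
    """Pick the k fog nodes nearest the client (Euclidean distance over
    toy 2-D coordinates) as replica holders."""
    def dist(item):
        name, (x, y) = item
        return math.hypot(x - client[0], y - client[1])
    return [name for name, _ in sorted(nodes.items(), key=dist)[:k]]

# Two edge nodes close to the client and one distant cloud data center.
nodes = {"edge-a": (0, 1), "edge-b": (5, 5), "cloud": (50, 50)}
replicas = fog_aware_placement(client=(0, 0), nodes=nodes)
```

A context-sensitive twist, in the spirit of the paper's second design goal,
would be to demand strong consistency only from the replicas near the client
and tolerate laxer guarantees at the distant ones.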
A Survey on Large Scale Metadata Server for Big Data Storage
Big Data is defined as a high volume and variety of data with an exponential
growth rate. Data are amalgamated to generate revenue, which results in large
data silos. Data are the oil of modern IT industries, and they are growing at
an exponential pace. The access mechanism of these data silos is defined by
metadata, and the metadata are decoupled from the data servers for various
reasons, for instance, ease of maintenance. The metadata are stored in a
metadata server (MDS); therefore, the study of the MDS is essential in the
design of a large-scale storage system. The MDS architecture must accommodate
many parameters and depends on the requirements of the storage system. Thus,
MDSs are categorized in various ways depending on the underlying architecture
and design methodology. This article surveys the various kinds of MDS
architectures, designs, and methodologies. It emphasizes clustered MDS (cMDS),
and the reports are prepared based on a) Bloom filter-based MDS, b)
Client-funded MDS, c) Geo-aware MDS, d) Cache-aware MDS, e) Load-aware MDS, f)
Hash-based MDS, and g) Tree-based MDS. Additionally, the article presents the
issues and challenges of MDSs for mammoth-sized data.
Comment: Submitted to ACM for possible publication
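As a hypothetical illustration of the hash-based MDS category, the metadata
server responsible for a file can be derived from a stable hash of its path,
spreading metadata uniformly without a central lookup table. This is a sketch,
not any surveyed system's implementation.

```python
import hashlib

def mds_for_path(path, num_servers):
    """Hash-based metadata placement: map a file path to one of
    `num_servers` metadata servers via a stable cryptographic hash."""
    digest = hashlib.sha256(path.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_servers

# Any client can compute the owner locally; no directory lookup needed.
server = mds_for_path("/data/logs/2020/app.log", num_servers=4)
```

The flip side, which motivates the other categories in the survey, is that a
plain modulo mapping reshuffles almost all metadata when servers are added or
removed, and it ignores locality and load; geo-aware, cache-aware, and
load-aware designs each address one of those gaps.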
A Security Reference Architecture for Blockchains
Due to their interesting features, blockchains have become popular in recent
years. They are full-stack systems where security is a critical factor for
their success. The main focus of this work is to systematize knowledge about
security and privacy issues of blockchains. To this end, we propose a security
reference architecture based on models that demonstrate the stacked hierarchy
of various threats (similar to the ISO/OSI hierarchy) as well as threat-risk
assessment using ISO/IEC 15408. In contrast to previous surveys, we focus
on the categorization of security incidents based on their origins and using
the proposed architecture we present existing prevention and mitigation
techniques. The scope of our work mainly covers aspects related to the
decentralized nature of blockchains, while we mention common operational
security issues and countermeasures only tangentially.
The CTTC 5G end-to-end experimental platform: Integrating heterogeneous wireless/optical networks, distributed cloud, and IoT devices
The Internet of Things (IoT) will facilitate a wide variety of applications
in different domains, such as smart cities, smart grids, industrial automation
(Industry 4.0), smart driving, assistance of the elderly, and home automation.
Billions of heterogeneous smart devices with different application requirements
will be connected to the networks and will generate huge aggregated volumes of
data that will be processed in distributed cloud infrastructures. On the other
hand, there is also a general trend to deploy functions as software (SW)
instances in cloud infrastructures [e.g., network function virtualization (NFV)
or mobile edge computing (MEC)]. Thus, the next generation of mobile networks,
the fifth-generation (5G), will need not only to develop new radio interfaces
or waveforms to cope with the expected traffic growth but also to integrate
heterogeneous networks from end to end (E2E) with distributed cloud resources
to deliver E2E IoT and mobile services. This article presents the E2E 5G
platform that is being developed by the Centre Tecnol\`ogic de
Telecomunicacions de Catalunya (CTTC), the first known platform capable of
reproducing such an ambitious scenario.
Pilot-Data: An Abstraction for Distributed Data
Scientific problems that depend on processing large amounts of data require
overcoming challenges in multiple areas: managing large-scale data
distribution, controlling co-placement and scheduling of data with compute
resources, and storing, transferring, and managing large volumes of data.
Although there exist multiple approaches to addressing each of these
challenges, an integrative approach is missing; furthermore, extending existing
functionality or enabling interoperable capabilities remains difficult at best.
We propose the concept of Pilot-Data to address the fundamental challenges of
co-placement and scheduling of data and compute in heterogeneous and
distributed environments with interoperability and extensibility as first-order
concerns. Pilot-Data is an extension of the Pilot-Job abstraction for
supporting the management of data in conjunction with compute tasks. Pilot-Data
separates logical data units from physical storage, thereby providing the basis
for efficient compute/data placement and scheduling. In this paper, we discuss
the design and implementation of the Pilot-Data prototype, demonstrate its use
by data-intensive applications on multiple production distributed
cyberinfrastructures, and illustrate the advantages arising from flexible
execution modes enabled by Pilot-Data. Our experiments utilize an
implementation of Pilot-Data in conjunction with a scalable Pilot-Job (BigJob)
to establish the application performance that can be enabled by the use of
Pilot-Data. We demonstrate how the concept of Pilot-Data also provides the
basis upon which to build tools and support capabilities such as affinity,
which in turn can be used for advanced data-compute co-placement and
scheduling.
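The separation of logical data units from physical replicas, and the
affinity-based placement it enables, can be sketched hypothetically. The class
and scheduler below are illustrative, not the Pilot-Data API.

```python
class DataUnit:
    """A logical data unit decoupled from its physical replicas."""

    def __init__(self, name, replicas):
        self.name = name
        self.replicas = set(replicas)  # resource names holding a copy

def schedule_task(task_inputs, resources):
    """Affinity-aware placement: run the task on the resource that
    already holds the most replicas of its input data units."""
    return max(resources,
               key=lambda r: sum(r in du.replicas for du in task_inputs))

# "siteB" holds copies of both inputs, so the task should land there.
units = [DataUnit("genome", {"siteA", "siteB"}),
         DataUnit("index", {"siteB"})]
best = schedule_task(units, ["siteA", "siteB", "siteC"])
```

Because the scheduler reasons over logical units, the same task description
works unchanged when replicas move between storage backends, which is the
interoperability property the abstract emphasizes.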
New Trends in Parallel and Distributed Simulation: from Many-Cores to Cloud Computing
Recent advances in computing architectures and networking are bringing
parallel computing systems to the masses, thereby increasing the number of potential
users of these kinds of systems. In particular, two important technological
evolutions are happening at the ends of the computing spectrum: at the "small"
scale, processors now include an increasing number of independent execution
units (cores), to the point that a single CPU can be considered a parallel
shared-memory computer; at the "large" scale, the Cloud Computing paradigm
allows applications to scale by offering resources from a large pool on a
pay-as-you-go model. Multi-core processors and Clouds both require applications
to be suitably modified to take advantage of the features they provide. In this
paper, we analyze the state of the art of parallel and distributed simulation
techniques, and assess their applicability to multi-core architectures or
Clouds. It turns out that most of the current approaches exhibit limitations in
terms of usability and adaptivity which may hinder their application to these
new computing architectures. We propose an adaptive simulation mechanism, based
on the multi-agent system paradigm, to partially address some of those
limitations. While it is unlikely that a single approach will work well on both
settings above, we argue that the proposed adaptive mechanism has useful
features which make it attractive both in a multi-core processor and in a Cloud
system. These features include the ability to reduce communication costs by
migrating simulation components, and the support for adding (or removing) nodes
to the execution architecture at runtime. We will also show that, with the help
of an additional support layer, parallel and distributed simulations can be
executed on top of unreliable resources.
Comment: Simulation Modelling Practice and Theory (SIMPAT), Elsevier, vol. 49 (December 2014)
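The component-migration idea, reducing communication cost by co-locating
simulation components that interact frequently, can be sketched hypothetically.
The greedy pass below ignores load balancing, which a real adaptive mechanism
would also have to weigh; all names are illustrative.

```python
def comm_cost(placement, interactions):
    """Count weighted messages crossing node boundaries under a
    component -> node placement."""
    return sum(w for (a, b), w in interactions.items()
               if placement[a] != placement[b])

def migrate_greedily(placement, interactions, nodes):
    """One adaptive pass: move each component to the node that
    minimizes total cross-node traffic."""
    for comp in list(placement):
        placement[comp] = min(
            nodes,
            key=lambda n: comm_cost({**placement, comp: n}, interactions))
    return placement

# Components a and b exchange far more messages than b and c.
interactions = {("a", "b"): 10, ("b", "c"): 1}
placement = {"a": "n1", "b": "n2", "c": "n2"}
placement = migrate_greedily(placement, interactions, ["n1", "n2"])
```

In this toy instance the pass co-locates everything on one node, driving the
communication cost to zero; with per-node capacity limits the same greedy step
would instead trade residual traffic against balance.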