13,815 research outputs found
Storage Solutions for Big Data Systems: A Qualitative Study and Comparison
Big data systems development is full of challenges in view of the variety of
application areas and domains that this technology promises to serve.
Typically, fundamental design decisions involved in big data systems design
include choosing appropriate storage and computing infrastructures. In this age
of heterogeneous systems that integrate different technologies for optimized
solution to a specific real world problem, big data system are not an exception
to any such rule. As far as the storage aspect of any big data system is
concerned, the primary facet in this regard is a storage infrastructure and
NoSQL seems to be the right technology that fulfills its requirements. However,
every big data application has variable data characteristics and thus, the
corresponding data fits into a different data model. This paper presents
feature and use case analysis and comparison of the four main data models
namely document oriented, key value, graph and wide column. Moreover, a feature
analysis of 80 NoSQL solutions has been provided, elaborating on the criteria
and points that a developer must consider while making a possible choice.
Typically, big data storage needs to communicate with the execution engine and
other processing and visualization technologies to create a comprehensive
solution. This brings forth second facet of big data storage, big data file
formats, into picture. The second half of the research paper compares the
advantages, shortcomings and possible use cases of available big data file
formats for Hadoop, which is the foundation for most big data computing
technologies. Decentralized storage and blockchain are seen as the next
generation of big data storage and its challenges and future prospects have
also been discussed
Data as a Service (DaaS) for sharing and processing of large data collections in the cloud
Data as a Service (DaaS) is among the latest kind of services being investigated in the Cloud computing community. The main aim of DaaS is to overcome limitations of state-of-the-art approaches in data technologies, according to which data is stored and accessed from repositories whose location is known and is relevant for sharing and processing. Besides limitations for the data sharing, current approaches also do not achieve to fully separate/decouple software services from data and thus impose limitations in inter-operability. In this paper we propose a DaaS approach for intelligent sharing and processing of large data collections with the aim of abstracting the data location (by making it relevant to the needs of sharing and accessing) and to fully decouple the data and its processing. The aim of our approach is to build a Cloud computing platform, offering DaaS to support large communities of users that need to share, access, and process the data for collectively building knowledge from data. We exemplify the approach from large data collections from health and biology domains.Peer ReviewedPostprint (author's final draft
Semantic reasoning on the edge of internet of things
Abstract. The Internet of Things (IoT) is a paradigm where physical objects are connected with each other with identifying, sensing, networking and processing capabilities over the Internet. Millions of new devices will be added into IoT network thus generating huge amount of data. How to represent, store, interconnect, search, and organize information generated by IoT devices become a challenge. Semantic technologies could play an important role by encoding meaning into data to enable a computer system to possess knowledge and reasoning. The vast amount of devices and data are also challenges. Edge Computing reduces both network latency and resource consumptions by deploying services and distributing computing tasks from the core network to the edge.
We recognize four challenges from IoT systems. First the centralized server may generate long latency because of physical distances. Second concern is that the resource-constrained IoT devices have limited computing ability in processing heavy tasks. Third, the data generated by heterogeneous devices can hardly be understood and utilized by other devices or systems. Our research focuses on these challenges and provide a solution based on Edge computing and semantic technologies.
We utilize Edge computing and semantic reasoning into IoT. Edge computing distributes tasks to the reasoning devices, which we call the Edge nodes. They are close to the terminal devices and provide services. The newly added resources could balance the workload of the systems and improve the computing capability. We annotate meaning into the data with Resource Description Framework thus providing an approach for heterogeneous machines to understand and utilize the data. We use semantic reasoning as a general purpose intelligent processing method.
The thesis work focuses on studying semantic reasoning performance in IoT system with Edge computing paradigm. We develop an Edge based IoT system with semantic technologies. The system deploys semantic reasoning services on Edge nodes. Based on IoT system, we design five experiments to evaluate the performance of the integrated IoT system. We demonstrate how could the Edge computing paradigm facilitate IoT in terms of data transforming, semantic reasoning and service experience. We analyze how to improve the performance by properly distributing the task for Cloud and Edge nodes. The thesis work result shows that the Edge computing could improve the performance of the semantic reasoning in IoT
MOSDEN: A Scalable Mobile Collaborative Platform for Opportunistic Sensing Applications
Mobile smartphones along with embedded sensors have become an efficient
enabler for various mobile applications including opportunistic sensing. The
hi-tech advances in smartphones are opening up a world of possibilities. This
paper proposes a mobile collaborative platform called MOSDEN that enables and
supports opportunistic sensing at run time. MOSDEN captures and shares sensor
data across multiple apps, smartphones and users. MOSDEN supports the emerging
trend of separating sensors from application-specific processing, storing and
sharing. MOSDEN promotes reuse and re-purposing of sensor data hence reducing
the efforts in developing novel opportunistic sensing applications. MOSDEN has
been implemented on Android-based smartphones and tablets. Experimental
evaluations validate the scalability and energy efficiency of MOSDEN and its
suitability towards real world applications. The results of evaluation and
lessons learned are presented and discussed in this paper.Comment: Accepted to be published in Transactions on Collaborative Computing,
2014. arXiv admin note: substantial text overlap with arXiv:1310.405
Active data-centric framework for data protection in cloud environment
Cloud computing is an emerging evolutionary computing model that provides highly scalable services over highspeed Internet on a pay-as-usage model. However, cloud-based solutions still have not been widely deployed in some sensitive areas, such as banking and healthcare. The lack of widespread development is related to users’ concern that their confidential data or privacy would leak out in the cloud’s outsourced environment. To address this problem, we propose a novel active data-centric framework to ultimately improve the transparency and accountability of actual usage of the users’ data in cloud. Our data-centric framework emphasizes “active” feature which packages the raw data with active properties that enforce data usage with active defending and protection capability. To achieve the active scheme, we devise the Triggerable Data File Structure (TDFS). Moreover, we employ the zero-knowledge proof scheme to verify the request’s identification without revealing any vital information. Our experimental outcomes demonstrate the efficiency, dependability, and scalability of our framework.<br /
On Evaluating Commercial Cloud Services: A Systematic Review
Background: Cloud Computing is increasingly booming in industry with many
competing providers and services. Accordingly, evaluation of commercial Cloud
services is necessary. However, the existing evaluation studies are relatively
chaotic. There exists tremendous confusion and gap between practices and theory
about Cloud services evaluation. Aim: To facilitate relieving the
aforementioned chaos, this work aims to synthesize the existing evaluation
implementations to outline the state-of-the-practice and also identify research
opportunities in Cloud services evaluation. Method: Based on a conceptual
evaluation model comprising six steps, the Systematic Literature Review (SLR)
method was employed to collect relevant evidence to investigate the Cloud
services evaluation step by step. Results: This SLR identified 82 relevant
evaluation studies. The overall data collected from these studies essentially
represent the current practical landscape of implementing Cloud services
evaluation, and in turn can be reused to facilitate future evaluation work.
Conclusions: Evaluation of commercial Cloud services has become a world-wide
research topic. Some of the findings of this SLR identify several research gaps
in the area of Cloud services evaluation (e.g., the Elasticity and Security
evaluation of commercial Cloud services could be a long-term challenge), while
some other findings suggest the trend of applying commercial Cloud services
(e.g., compared with PaaS, IaaS seems more suitable for customers and is
particularly important in industry). This SLR study itself also confirms some
previous experiences and reveals new Evidence-Based Software Engineering (EBSE)
lessons
Recommended from our members
A review paper on preserving privacy in mobile environments
Technology is improving day-by-day and so is the usage of mobile devices. Every activity that would involve manual and paper transactions can now be completed in seconds using your fingertips. On one hand, life has become fairly convenient with the help of mobile devices, whereas on the other hand security of the data and the transactions occurring in the process have been under continuous threat. This paper, re-evaluates the different policies and procedures used for preserving the privacy of sensitive data and device location.. Policy languages have been very vital in the mobile environments as they can be extended/used significantly for sending/receiving any data. In the mobile environment users always go to service providers to access various services. Hence, communications between the service providers and mobile handsets needs to be secured. Also, the data access control needs to be in place. A section of this paper will review the communication paths and channels and their related access criteria. This paper is a contribution to the mobile domain, showing the possible attacks related to privacy and the various mechanisms used to preserve the end-user privacy. In addition, it also gives acomparison of the different privacy preserving methods in mobile environments to provide guidance to the readers. Finally, the paper summarises future research challenges in the area of privacy preservation. This paper examines the ‘where’ problem and in particular, examines tradeoffs between enforcing location security at a device vs. enforcing location security at an edge location server. This paper also sketches an implementation of location security solution at both the device and the edge location server and presents detailed experiments using real mobility and user profile data sets collected from multiple data sources (taxicabs, Smartphones)
- …