2,858 research outputs found

    Storage Solutions for Big Data Systems: A Qualitative Study and Comparison

    Full text link
    Big data systems development is full of challenges in view of the variety of application areas and domains that this technology promises to serve. Typically, fundamental design decisions involved in big data systems design include choosing appropriate storage and computing infrastructures. In this age of heterogeneous systems that integrate different technologies for optimized solution to a specific real world problem, big data system are not an exception to any such rule. As far as the storage aspect of any big data system is concerned, the primary facet in this regard is a storage infrastructure and NoSQL seems to be the right technology that fulfills its requirements. However, every big data application has variable data characteristics and thus, the corresponding data fits into a different data model. This paper presents feature and use case analysis and comparison of the four main data models namely document oriented, key value, graph and wide column. Moreover, a feature analysis of 80 NoSQL solutions has been provided, elaborating on the criteria and points that a developer must consider while making a possible choice. Typically, big data storage needs to communicate with the execution engine and other processing and visualization technologies to create a comprehensive solution. This brings forth second facet of big data storage, big data file formats, into picture. The second half of the research paper compares the advantages, shortcomings and possible use cases of available big data file formats for Hadoop, which is the foundation for most big data computing technologies. Decentralized storage and blockchain are seen as the next generation of big data storage and its challenges and future prospects have also been discussed

    Simurgh: a fully decentralized and secure NVMM user space file system

    Get PDF
    The availability of non-volatile main memory (NVMM) has started a new era for storage systems and NVMM specific file systems can support extremely high data and metadata rates, which are required by many HPC and data-intensive applications. Scaling metadata performance within NVMM file systems is nevertheless often restricted by the Linux kernel storage stack, while simply moving metadata management to the user space can compromise security or flexibility. This paper introduces Simurgh, a hardware-assisted user space file system with decentralized metadata management that allows secure metadata updates from within user space. Simurgh guarantees consistency, durability, and ordering of updates without sacrificing scalability. Security is enforced by only allowing NVMM access from protected user space functions, which can be implemented through two proposed instructions. Comparisons with other NVMM file systems show that Simurgh improves metadata performance up to 18x and application performance up to 89% compared to the second-fastest file system.This work has been supported by the European Comission’s BigStorage project H2020-MSCA-ITN2014-642963. It is also supported by the Big Data in Atmospheric Physics (BINARY) project, funded by the Carl Zeiss Foundation under Grant No.: P2018-02-003.Peer ReviewedPostprint (author's final draft

    A survey and classification of software-defined storage systems

    Get PDF
    The exponential growth of digital information is imposing increasing scale and efficiency demands on modern storage infrastructures. As infrastructure complexity increases, so does the difficulty in ensuring quality of service, maintainability, and resource fairness, raising unprecedented performance, scalability, and programmability challenges. Software-Defined Storage (SDS) addresses these challenges by cleanly disentangling control and data flows, easing management, and improving control functionality of conventional storage systems. Despite its momentum in the research community, many aspects of the paradigm are still unclear, undefined, and unexplored, leading to misunderstandings that hamper the research and development of novel SDS technologies. In this article, we present an in-depth study of SDS systems, providing a thorough description and categorization of each plane of functionality. Further, we propose a taxonomy and classification of existing SDS solutions according to different criteria. Finally, we provide key insights about the paradigm and discuss potential future research directions for the field.This work was financed by the Portuguese funding agency FCT-Fundacao para a Ciencia e a Tecnologia through national funds, the PhD grant SFRH/BD/146059/2019, the project ThreatAdapt (FCT-FNR/0002/2018), the LASIGE Research Unit (UIDB/00408/2020), and cofunded by the FEDER, where applicable

    Blockchain for secured IoT and D2D applications over 5G cellular networks : a thesis by publications presented in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Computer and Electronics Engineering, Massey University, Albany, New Zealand

    Get PDF
    Author's Declaration: "In accordance with Sensors, SpringerOpen, and IEEE’s copyright policy, this thesis contains the accepted and published version of each manuscript as the final version. Consequently, the content is identical to the published versions."The Internet of things (IoT) is in continuous development with ever-growing popularity. It brings significant benefits through enabling humans and the physical world to interact using various technologies from small sensors to cloud computing. IoT devices and networks are appealing targets of various cyber attacks and can be hampered by malicious intervening attackers if the IoT is not appropriately protected. However, IoT security and privacy remain a major challenge due to characteristics of the IoT, such as heterogeneity, scalability, nature of the data, and operation in open environments. Moreover, many existing cloud-based solutions for IoT security rely on central remote servers over vulnerable Internet connections. The decentralized and distributed nature of blockchain technology has attracted significant attention as a suitable solution to tackle the security and privacy concerns of the IoT and device-to-device (D2D) communication. This thesis explores the possible adoption of blockchain technology to address the security and privacy challenges of the IoT under the 5G cellular system. This thesis makes four novel contributions. First, a Multi-layer Blockchain Security (MBS) model is proposed to protect IoT networks while simplifying the implementation of blockchain technology. The concept of clustering is utilized to facilitate multi-layer architecture deployment and increase scalability. The K-unknown clusters are formed within the IoT network by applying a hybrid Evolutionary Computation Algorithm using Simulated Annealing (SA) and Genetic Algorithms (GA) to structure the overlay nodes. The open-source Hyperledger Fabric (HLF) Blockchain platform is deployed for the proposed model development. Base stations adopt a global blockchain approach to communicate with each other securely. The quantitative arguments demonstrate that the proposed clustering algorithm performs well when compared to the earlier reported methods. The proposed lightweight blockchain model is also better suited to balance network latency and throughput compared to a traditional global blockchain. Next, a model is proposed to integrate IoT systems and blockchain by implementing the permissioned blockchain Hyperledger Fabric. The security of the edge computing devices is provided by employing a local authentication process. A lightweight mutual authentication and authorization solution is proposed to ensure the security of tiny IoT devices within the ecosystem. In addition, the proposed model provides traceability for the data generated by the IoT devices. The performance of the proposed model is validated with practical implementation by measuring performance metrics such as transaction throughput and latency, resource consumption, and network use. The results indicate that the proposed platform with the HLF implementation is promising for the security of resource-constrained IoT devices and is scalable for deployment in various IoT scenarios. Despite the increasing development of blockchain platforms, there is still no comprehensive method for adopting blockchain technology on IoT systems due to the blockchain's limited capability to process substantial transaction requests from a massive number of IoT devices. The Fabric comprises various components such as smart contracts, peers, endorsers, validators, committers, and Orderers. A comprehensive empirical model is proposed that measures HLF's performance and identifies potential performance bottlenecks to better meet blockchain-based IoT applications' requirements. The implementation of HLF on distributed large-scale IoT systems is proposed. The performance of the HLF is evaluated in terms of throughput, latency, network sizes, scalability, and the number of peers serviceable by the platform. The experimental results demonstrate that the proposed framework can provide a detailed and real-time performance evaluation of blockchain systems for large-scale IoT applications. The diversity and the sheer increase in the number of connected IoT devices have brought significant concerns about storing and protecting the large IoT data volume. Dependencies of the centralized server solution impose significant trust issues and make it vulnerable to security risks. A layer-based distributed data storage design and implementation of a blockchain-enabled large-scale IoT system is proposed to mitigate these challenges by using the HLF platform for distributed ledger solutions. The need for a centralized server and third-party auditor is eliminated by leveraging HLF peers who perform transaction verification and records audits in a big data system with the help of blockchain technology. The HLF blockchain facilitates storing the lightweight verification tags on the blockchain ledger. In contrast, the actual metadata is stored in the off-chain big data system to reduce the communication overheads and enhance data integrity. Finally, experiments are conducted to evaluate the performance of the proposed scheme in terms of throughput, latency, communication, and computation costs. The results indicate the feasibility of the proposed solution to retrieve and store the provenance of large-scale IoT data within the big data ecosystem using the HLF blockchain

    TACTICAL BLOCKCHAIN TO PROVIDE DATA PROVENANCE IN SUPPORT OF INTERNET OF BATTLEFIELD THINGS AND BIG DATA ANALYTICS

    Get PDF
    This capstone project evaluated the use of blockchain technology to address a number of challenges with increasing amounts of disparate sensor data and an information-rich landscape that can quickly overwhelm effective decision-making processes. The team explored how blockchain can be used in a variety of defense applications to verify users, validate sensor data fed into artificial intelligence models, limit access to data, and provide an audit trail across the data life cycle. The team developed a conceptual design for implementing blockchain for tactical data, artificial intelligence, and machine learning applications; identified challenges and limitations involved in implementing blockchain for the tactical domain; described the benefits of blockchain for these various applications; and evaluated this project’s findings to propose future research into a wider set of blockchain applications. The team did this through the development of three use cases. One use case demonstrated the use of blockchain at the tactical edge in a “data light” information environment. The second use case explored the use of blockchain in securing medical information in the electronic health record. The third use case studied blockchain’s application in the use of multiple sensors collecting data for chemical weapons defense to support measurement and signature intelligence analysis using artificial intelligence and machine learning.Civilian, Department of the ArmyCivilian, Department of the ArmyCivilian, Department of the ArmyCivilian, Department of the ArmyCivilian, Department of the ArmyApproved for public release. Distribution is unlimited
    • …
    corecore