24 research outputs found

    Access control technologies for Big Data management systems: literature review and future trends

    Get PDF
    Abstract Data security and privacy issues are magnified by the volume, the variety, and the velocity of Big Data and by the lack, up to now, of a reference data model and related data manipulation languages. In this paper, we focus on one of the key data security services, that is, access control, by highlighting the differences with traditional data management systems and describing a set of requirements that any access control solution for Big Data platforms may fulfill. We then describe the state of the art and discuss open research issues

    Big Data Processing Attribute Based Access Control Security

    Get PDF
    The purpose of this research is to analyze the security of next-generation big data processing (BDP) and examine the feasibility of applying advanced security features to meet the needs of modern multi-tenant, multi-level data analysis. The research methodology was to survey of the status of security mechanisms in BDP systems and identify areas that require further improvement. Access control (AC) security services were identified as priority area, specifically Attribute Based Access Control (ABAC). The exemplar BDP system analyzed is the Apache Hadoop ecosystem. We created data generation software, analysis programs, and posted the detailed the experiment configuration on GitHub. Overall, our research indicates that before a BDP system, such as Hadoop, can be used in operational environment significant security configurations are required. We believe that the tools are available to achieve a secure system, with ABAC, using Apache Ranger and Apache Atlas. However, these systems are immature and require verification by an independent third party. We identified the following specific actions for overall improvement: consistent provisioning of security services through a data analyst workstation, a common backplane of security services, and a management console. These areas are partially satisfied in the current Hadoop ecosystem, continued AC improvements through the open source community, and rigorous independent testing should further address remaining security challenges. Robust security will enable further use of distributed, cluster BDP, such as Apache Hadoop and Hadoop-like systems, to meet future government and business requirements

    Big Data and Large-scale Data Analytics: Efficiency of Sustainable Scalability and Security of Centralized Clouds and Edge Deployment Architectures

    Get PDF
    One of the significant shifts of the next-generation computing technologies will certainly be in the development of Big Data (BD) deployment architectures. Apache Hadoop, the BD landmark, evolved as a widely deployed BD operating system. Its new features include federation structure and many associated frameworks, which provide Hadoop 3.x with the maturity to serve different markets. This dissertation addresses two leading issues involved in exploiting BD and large-scale data analytics realm using the Hadoop platform. Namely, (i)Scalability that directly affects the system performance and overall throughput using portable Docker containers. (ii) Security that spread the adoption of data protection practices among practitioners using access controls. An Enhanced Mapreduce Environment (EME), OPportunistic and Elastic Resource Allocation (OPERA) scheduler, BD Federation Access Broker (BDFAB), and a Secure Intelligent Transportation System (SITS) of multi-tiers architecture for data streaming to the cloud computing are the main contribution of this thesis study

    Hierarchical Group and Attribute-Based Access Control: Incorporating Hierarchical Groups and Delegation into Attribute-Based Access Control

    Get PDF
    Attribute-Based Access Control (ABAC) is a promising alternative to traditional models of access control (i.e. Discretionary Access Control (DAC), Mandatory Access Control (MAC) and Role-Based Access control (RBAC)) that has drawn attention in both recent academic literature and industry application. However, formalization of a foundational model of ABAC and large-scale adoption is still in its infancy. The relatively recent popularity of ABAC still leaves a number of problems unexplored. Issues like delegation, administration, auditability, scalability, hierarchical representations, etc. have been largely ignored or left to future work. This thesis seeks to aid in the adoption of ABAC by filling in several of these gaps. The core contribution of this work is the Hierarchical Group and Attribute-Based Access Control (HGABAC) model, a novel formal model of ABAC which introduces the concept of hierarchical user and object attribute groups to ABAC. It is shown that HGABAC is capable of representing the traditional models of access control (MAC, DAC and RBAC) using this group hierarchy and that in many cases it’s use simplifies both attribute and policy administration. HGABAC serves as the basis upon which extensions are built to incorporate delegation into ABAC. Several potential strategies for introducing delegation into ABAC are proposed, categorized into families and the trade-offs of each are examined. One such strategy is formalized into a new User-to-User Attribute Delegation model, built as an extension to the HGABAC model. Attribute Delegation enables users to delegate a subset of their attributes to other users in an off-line manner (not requiring connecting to a third party). Finally, a supporting architecture for HGABAC is detailed including descriptions of services, high-level communication protocols and a new low-level attribute certificate format for exchanging user and connection attributes between independent services. Particular emphasis is placed on ensuring support for federated and distributed systems. Critical components of the architecture are implemented and evaluated with promising preliminary results. It is hoped that the contributions in this research will further the acceptance of ABAC in both academia and industry by solving the problem of delegation as well as simplifying administration and policy authoring through the introduction of hierarchical user groups

    Data Spaces

    Get PDF
    This open access book aims to educate data space designers to understand what is required to create a successful data space. It explores cutting-edge theory, technologies, methodologies, and best practices for data spaces for both industrial and personal data and provides the reader with a basis for understanding the design, deployment, and future directions of data spaces. The book captures the early lessons and experience in creating data spaces. It arranges these contributions into three parts covering design, deployment, and future directions respectively. The first part explores the design space of data spaces. The single chapters detail the organisational design for data spaces, data platforms, data governance federated learning, personal data sharing, data marketplaces, and hybrid artificial intelligence for data spaces. The second part describes the use of data spaces within real-world deployments. Its chapters are co-authored with industry experts and include case studies of data spaces in sectors including industry 4.0, food safety, FinTech, health care, and energy. The third and final part details future directions for data spaces, including challenges and opportunities for common European data spaces and privacy-preserving techniques for trustworthy data sharing. The book is of interest to two primary audiences: first, researchers interested in data management and data sharing, and second, practitioners and industry experts engaged in data-driven systems where the sharing and exchange of data within an ecosystem are critical

    Data Spaces

    Get PDF
    This open access book aims to educate data space designers to understand what is required to create a successful data space. It explores cutting-edge theory, technologies, methodologies, and best practices for data spaces for both industrial and personal data and provides the reader with a basis for understanding the design, deployment, and future directions of data spaces. The book captures the early lessons and experience in creating data spaces. It arranges these contributions into three parts covering design, deployment, and future directions respectively. The first part explores the design space of data spaces. The single chapters detail the organisational design for data spaces, data platforms, data governance federated learning, personal data sharing, data marketplaces, and hybrid artificial intelligence for data spaces. The second part describes the use of data spaces within real-world deployments. Its chapters are co-authored with industry experts and include case studies of data spaces in sectors including industry 4.0, food safety, FinTech, health care, and energy. The third and final part details future directions for data spaces, including challenges and opportunities for common European data spaces and privacy-preserving techniques for trustworthy data sharing. The book is of interest to two primary audiences: first, researchers interested in data management and data sharing, and second, practitioners and industry experts engaged in data-driven systems where the sharing and exchange of data within an ecosystem are critical

    Design of a reference architecture for an IoT sensor network

    Get PDF

    High-Performance Modelling and Simulation for Big Data Applications

    Get PDF
    This open access book was prepared as a Final Publication of the COST Action IC1406 “High-Performance Modelling and Simulation for Big Data Applications (cHiPSet)“ project. Long considered important pillars of the scientific method, Modelling and Simulation have evolved from traditional discrete numerical methods to complex data-intensive continuous analytical optimisations. Resolution, scale, and accuracy have become essential to predict and analyse natural and complex systems in science and engineering. When their level of abstraction raises to have a better discernment of the domain at hand, their representation gets increasingly demanding for computational and data resources. On the other hand, High Performance Computing typically entails the effective use of parallel and distributed processing units coupled with efficient storage, communication and visualisation systems to underpin complex data-intensive applications in distinct scientific and technical domains. It is then arguably required to have a seamless interaction of High Performance Computing with Modelling and Simulation in order to store, compute, analyse, and visualise large data sets in science and engineering. Funded by the European Commission, cHiPSet has provided a dynamic trans-European forum for their members and distinguished guests to openly discuss novel perspectives and topics of interests for these two communities. This cHiPSet compendium presents a set of selected case studies related to healthcare, biological data, computational advertising, multimedia, finance, bioinformatics, and telecommunications
    corecore