2 research outputs found
Big Data Processing Attribute Based Access Control Security
The purpose of this research is to analyze the security of next-generation big data processing (BDP) and examine the feasibility of applying advanced security features to meet the needs of modern multi-tenant, multi-level data analysis. The research methodology was to survey the status of security mechanisms in BDP systems and identify areas that require further improvement. Access control (AC) security services were identified as a priority area, specifically Attribute Based Access Control (ABAC). The exemplar BDP system analyzed is the Apache Hadoop ecosystem. We created data generation software and analysis programs, and posted the detailed experiment configuration on GitHub. Overall, our research indicates that before a BDP system such as Hadoop can be used in an operational environment, significant security configuration is required. We believe that the tools are available to achieve a secure system with ABAC, using Apache Ranger and Apache Atlas. However, these systems are immature and require verification by an independent third party. We identified the following specific actions for overall improvement: consistent provisioning of security services through a data analyst workstation, a common backplane of security services, and a management console. These areas are only partially satisfied in the current Hadoop ecosystem; continued AC improvements through the open source community and rigorous independent testing should further address remaining security challenges. Robust security will enable further use of distributed, cluster BDP, such as Apache Hadoop and Hadoop-like systems, to meet future government and business requirements.
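The ABAC model surveyed above grants access based on attributes of the subject, the resource, and the requested action rather than on identity alone. The following is a minimal illustrative sketch of that decision logic, with a default-deny outcome; the rule structure and attribute names are hypothetical and do not reflect Apache Ranger's actual policy format or API.

```python
# Minimal ABAC decision sketch (illustrative only; not Ranger's policy model).
# A request is allowed only if some rule's subject and resource attribute
# conditions all match and the rule covers the requested action.

def abac_allow(policies, subject, resource, action):
    """Return True if any policy rule permits this request; default deny."""
    for rule in policies:
        if (action in rule["actions"]
                and all(subject.get(k) == v for k, v in rule["subject"].items())
                and all(resource.get(k) == v for k, v in rule["resource"].items())):
            return True
    return False

# Hypothetical policy: analysts on the fraud project may read fraud-tagged data.
policies = [
    {
        "subject": {"role": "analyst", "project": "fraud"},
        "resource": {"tag": "fraud"},
        "actions": {"read"},
    },
]

print(abac_allow(policies,
                 {"role": "analyst", "project": "fraud"},
                 {"tag": "fraud"}, "read"))   # True
print(abac_allow(policies,
                 {"role": "analyst", "project": "fraud"},
                 {"tag": "hr"}, "read"))      # False: resource tag mismatch
```

In practice the attribute sources would be external (e.g., tags managed in Apache Atlas and policies in Apache Ranger); the sketch only shows the match-then-permit evaluation pattern.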
Towards A Cross-Domain MapReduce Framework
The Apache™ Hadoop® framework provides parallel processing and distributed data storage capabilities that data analytics applications can utilize to process massive sets of raw data. These Big Data applications typically run as a set of MapReduce jobs to take advantage of Hadoop's ease of service deployment and large-scale parallelism. Yet, Hadoop has not been adapted for multilevel secure (MLS) environments where data of different security classifications co-exist. To solve this problem, we have used the Security Enhanced Linux (SELinux) kernel extension in a prototype cross-domain Hadoop on which multiple instances of Hadoop applications run at different sensitivity levels. Their accesses to Hadoop resources are constrained by the underlying MLS policy enforcement
mechanism. A benefit of our prototype is its extension of the Hadoop Distributed File System to provide a cross-domain read-down capability for Hadoop applications without requiring complex Hadoop server components to be trustworthy.
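The read-down capability described above follows the classic MLS simple security rule: a process may read data at or below its own sensitivity level, but never above it. The sketch below illustrates only that dominance check; the level names and ordering are hypothetical and are not the prototype's actual SELinux labels or policy.

```python
# Hypothetical sketch of the MLS read-down rule (Bell-LaPadula simple
# security property). Level names and their ordering are illustrative.

LEVELS = {"unclassified": 0, "confidential": 1, "secret": 2, "top_secret": 3}

def may_read(process_level, file_level):
    """Read-down: a process may read files at or below its own level."""
    return LEVELS[process_level] >= LEVELS[file_level]

print(may_read("secret", "unclassified"))  # True: read down is permitted
print(may_read("confidential", "secret"))  # False: no read up
```

In the prototype this style of check is enforced by the SELinux MLS policy in the kernel, which is why the Hadoop server components themselves need not be trusted to make the decision.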