Search CORE

957 research outputs found

Big Data and Large-scale Data Analytics: Efficiency of Sustainable Scalability and Security of Centralized Clouds and Edge Deployment Architectures

Author: Awaysheh Feras Mahmoud Naji
Publication venue
Publication date: 01/01/2020
Field of study

One of the significant shifts of the next-generation computing technologies will certainly be in the development of Big Data (BD) deployment architectures. Apache Hadoop, the BD landmark, evolved as a widely deployed BD operating system. Its new features include federation structure and many associated frameworks, which provide Hadoop 3.x with the maturity to serve different markets. This dissertation addresses two leading issues involved in exploiting BD and large-scale data analytics realm using the Hadoop platform. Namely, (i)Scalability that directly affects the system performance and overall throughput using portable Docker containers. (ii) Security that spread the adoption of data protection practices among practitioners using access controls. An Enhanced Mapreduce Environment (EME), OPportunistic and Elastic Resource Allocation (OPERA) scheduler, BD Federation Access Broker (BDFAB), and a Secure Intelligent Transportation System (SITS) of multi-tiers architecture for data streaming to the cloud computing are the main contribution of this thesis study

Repositorio Institucional da Universidade de Santiago de Compostela

Security Privacy Process Involvement in Cloud Security for Data Preservation against Data Malicious Activity

Author: K Prabu et al.
Publication venue: Auricle Global Society of Education and Research
Publication date: 02/11/2023
Field of study

Cloud data sent from the person is attacked, leading to data hacking. Data classification can be made by malware detection, leading to the data warehouse technique and data storage. The cloud data from a particular internet protocol address cannot be hacked. Only random cloud data is hacked. Even though this leads to some illegal issues. The method of managing the cloud data and maintaining the factor to hack illegal cloud data has been proposed. The method of malware detection and ML-based end-to-end malware detection are used in calculating the time efficiency. The malware detection and defence method has been introduced for managing the data tracking and the system's formation to hack unwanted data. The time efficiency calculation for the data transmitted in the network has been enabled for the cloud data sent and received. The data from each router makes the data store 12% of the unwanted compared to the original messages. The factor for managing the individual aspect to produce the data is 30% of the database. This will contain 20% of the data in formulating the cloud storage system, which makes the data classifications. 4% of redundant data from the database has been enveloped for the data classifications. Meanwhile, the data attack can be evaluated using the malware detector and also manages classification method for evaluation of data and formation of the system to produce data from the appearance of Secure data clouds

International Journal on Recent and Innovation Trends in Computing and Communication

The Family of MapReduce and Large Scale Data Processing Systems

Author: Anna Liu
Ayman G. Fayoumi
King Abdulaziz
See Profile
Sherif Sakr
Sherif Sakr
South Wales
South Wales
Publication venue
Publication date: 12/02/2013
Field of study

In the last two decades, the continuous increase of computational power has produced an overwhelming flow of data which has called for a paradigm shift in the computing architecture and large scale data processing mechanisms. MapReduce is a simple and powerful programming model that enables easy development of scalable parallel applications to process vast amounts of data on large clusters of commodity machines. It isolates the application from the details of running a distributed program such as issues on data distribution, scheduling and fault tolerance. However, the original implementation of the MapReduce framework had some limitations that have been tackled by many research efforts in several followup works after its introduction. This article provides a comprehensive survey for a family of approaches and mechanisms of large scale data processing mechanisms that have been implemented based on the original idea of the MapReduce framework and are currently gaining a lot of momentum in both research and industrial communities. We also cover a set of introduced systems that have been implemented to provide declarative programming interfaces on top of the MapReduce framework. In addition, we review several large scale data processing systems that resemble some of the ideas of the MapReduce framework for different purposes and application scenarios. Finally, we discuss some of the future research directions for implementing the next generation of MapReduce-like solutions.Comment: arXiv admin note: text overlap with arXiv:1105.4252 by other author

arXiv.org e-Print Archive

CiteSeerX

Architectural Design Decisions for Self-Serve Data Platforms in Data Meshes

Author: Di Nucci Dario
Heuvel Willem-Jan van den
Kumara Indika
Tamburri Damian Andrew
van Eijk Tom
Publication venue
Publication date: 07/02/2024
Field of study

Data mesh is an emerging decentralized approach to managing and generating value from analytical enterprise data at scale. It shifts the ownership of the data to the business domains closest to the data, promotes sharing and managing data as autonomous products, and uses a federated and automated data governance model. The data mesh relies on a managed data platform that offers services to domain and governance teams to build, share, and manage data products efficiently. However, designing and implementing a self-serve data platform is challenging, and the platform engineers and architects must understand and choose the appropriate design options to ensure the platform will enhance the experience of domain and governance teams. For these reasons, this paper proposes a catalog of architectural design decisions and their corresponding decision options by systematically reviewing 43 industrial gray literature articles on self-serve data platforms in data mesh. Moreover, we used semi-structured interviews with six data engineering experts with data mesh experience to validate, refine, and extend the findings from the literature. Such a catalog of design decisions and options drawn from the state of practice shall aid practitioners in building data meshes while providing a baseline for further research on data mesh architectures.Comment: 21st IEEE International Conference on Software Architecture (ICSA 2024), 13 page

arXiv.org e-Print Archive

Big Data Now, 2015 Edition

Author: Nicole Tache
Publication venue: O\u27Relly
Publication date: 01/11/2016
Field of study

Now in its fifth year, O’Reilly’s annual Big Data Now report recaps the trends, tools, applications, and forecasts we’ve talked about over the past year. For 2015, we’ve included a collection of blog posts, authored by leading thinkers and experts in the field, that reflect a unique set of themes we’ve identified as gaining significant attention and traction. Our list of 2015 topics include: Data-driven cultures Data science Data pipelines Big data architecture and infrastructure The Internet of Things and real time Applications of big data Security, ethics, and governance Is your organization on the right track? Get a hold of this free report now and stay in tune with the latest significant developments in big data

Open Library

Cloud technology options towards Free Flow of Data

Author: DPSP Cluster Working Group
Publication venue
Publication date
Field of study

This whitepaper collects the technology solutions that the projects in the Data Protection, Security and Privacy Cluster propose to address the challenges raised by the working areas of the Free Flow of Data initiative. The document describes the technologies, methodologies, models, and tools researched and developed by the clustered projects mapped to the ten areas of work of the Free Flow of Data initiative. The aim is to facilitate the identification of the state-of-the-art of technology options towards solving the data security and privacy challenges posed by the Free Flow of Data initiative in Europe. The document gives reference to the Cluster, the individual projects and the technologies produced by them

ZENODO

Security in Cloud Computing: Evaluation and Integration

Author: Halabi Talal
Publication venue
Publication date: 01/08/2018
Field of study

Au cours de la dernière décennie, le paradigme du Cloud Computing a révolutionné la manière dont nous percevons les services de la Technologie de l’Information (TI). Celui-ci nous a donné l’opportunité de répondre à la demande constamment croissante liée aux besoins informatiques des usagers en introduisant la notion d’externalisation des services et des données. Les consommateurs du Cloud ont généralement accès, sur demande, à un large éventail bien réparti d’infrastructures de TI offrant une pléthore de services. Ils sont à même de configurer dynamiquement les ressources du Cloud en fonction des exigences de leurs applications, sans toutefois devenir partie intégrante de l’infrastructure du Cloud. Cela leur permet d’atteindre un degré optimal d’utilisation des ressources tout en réduisant leurs coûts d’investissement en TI. Toutefois, la migration des services au Cloud intensifie malgré elle les menaces existantes à la sécurité des TI et en crée de nouvelles qui sont intrinsèques à l’architecture du Cloud Computing. C’est pourquoi il existe un réel besoin d’évaluation des risques liés à la sécurité du Cloud durant le procédé de la sélection et du déploiement des services. Au cours des dernières années, l’impact d’une efficace gestion de la satisfaction des besoins en sécurité des services a été pris avec un sérieux croissant de la part des fournisseurs et des consommateurs. Toutefois, l’intégration réussie de l’élément de sécurité dans les opérations de la gestion des ressources du Cloud ne requiert pas seulement une recherche méthodique, mais aussi une modélisation méticuleuse des exigences du Cloud en termes de sécurité. C’est en considérant ces facteurs que nous adressons dans cette thèse les défis liés à l’évaluation de la sécurité et à son intégration dans les environnements indépendants et interconnectés du Cloud Computing. D’une part, nous sommes motivés à offrir aux consommateurs du Cloud un ensemble de méthodes qui leur permettront d’optimiser la sécurité de leurs services et, d’autre part, nous offrons aux fournisseurs un éventail de stratégies qui leur permettront de mieux sécuriser leurs services d’hébergements du Cloud. L’originalité de cette thèse porte sur deux aspects : 1) la description innovatrice des exigences des applications du Cloud relativement à la sécurité ; et 2) la conception de modèles mathématiques rigoureux qui intègrent le facteur de sécurité dans les problèmes traditionnels du déploiement des applications, d’approvisionnement des ressources et de la gestion de la charge de travail au coeur des infrastructures actuelles du Cloud Computing. Le travail au sein de cette thèse est réalisé en trois phases.----------ABSTRACT: Over the past decade, the Cloud Computing paradigm has revolutionized the way we envision IT services. It has provided an opportunity to respond to the ever increasing computing needs of the users by introducing the notion of service and data outsourcing. Cloud consumers usually have online and on-demand access to a large and distributed IT infrastructure providing a plethora of services. They can dynamically configure and scale the Cloud resources according to the requirements of their applications without becoming part of the Cloud infrastructure, which allows them to reduce their IT investment cost and achieve optimal resource utilization. However, the migration of services to the Cloud increases the vulnerability to existing IT security threats and creates new ones that are intrinsic to the Cloud Computing architecture, thus the need for a thorough assessment of Cloud security risks during the process of service selection and deployment. Recently, the impact of effective management of service security satisfaction has been taken with greater seriousness by the Cloud Service Providers (CSP) and stakeholders. Nevertheless, the successful integration of the security element into the Cloud resource management operations does not only require methodical research, but also necessitates the meticulous modeling of the Cloud security requirements. To this end, we address throughout this thesis the challenges to security evaluation and integration in independent and interconnected Cloud Computing environments. We are interested in providing the Cloud consumers with a set of methods that allow them to optimize the security of their services and the CSPs with a set of strategies that enable them to provide security-aware Cloud-based service hosting. The originality of this thesis lies within two aspects: 1) the innovative description of the Cloud applications’ security requirements, which paved the way for an effective quantification and evaluation of the security of Cloud infrastructures; and 2) the design of rigorous mathematical models that integrate the security factor into the traditional problems of application deployment, resource provisioning, and workload management within current Cloud Computing infrastructures. The work in this thesis is carried out in three phases

PolyPublie