1,416 research outputs found

    Big Data and Large-scale Data Analytics: Efficiency of Sustainable Scalability and Security of Centralized Clouds and Edge Deployment Architectures

    Get PDF
    One of the significant shifts of the next-generation computing technologies will certainly be in the development of Big Data (BD) deployment architectures. Apache Hadoop, the BD landmark, evolved as a widely deployed BD operating system. Its new features include federation structure and many associated frameworks, which provide Hadoop 3.x with the maturity to serve different markets. This dissertation addresses two leading issues involved in exploiting BD and large-scale data analytics realm using the Hadoop platform. Namely, (i)Scalability that directly affects the system performance and overall throughput using portable Docker containers. (ii) Security that spread the adoption of data protection practices among practitioners using access controls. An Enhanced Mapreduce Environment (EME), OPportunistic and Elastic Resource Allocation (OPERA) scheduler, BD Federation Access Broker (BDFAB), and a Secure Intelligent Transportation System (SITS) of multi-tiers architecture for data streaming to the cloud computing are the main contribution of this thesis study

    New cross-layer techniques for multi-criteria scheduling in large-scale systems

    Get PDF
    The global ecosystem of information technology (IT) is in transition to a new generation of applications that require more and more intensive data acquisition, processing and storage systems. As a result of that change towards data intensive computing, there is a growing overlap between high performance computing (HPC) and Big Data techniques in applications, since many HPC applications produce large volumes of data, and Big Data needs HPC capabilities. The hypothesis of this PhD. thesis is that the potential interoperability and convergence of the HPC and Big Data systems are crucial for the future, being essential the unification of both paradigms to address a broad spectrum of research domains. For this reason, the main objective of this Phd. thesis is purposing and developing a monitoring system to allow the HPC and Big Data convergence, thanks to giving information about behaviors of applications in a system which execute both kind of them, giving information to improve scalability, data locality, and to allow adaptability to large scale computers. To achieve this goal, this work is focused on the design of resource monitoring and discovery to exploit parallelism at all levels. These collected data are disseminated to facilitate global improvements at the whole system, and, thus, avoid mismatches between layers. The result is a two-level monitoring framework (both at node and application level) with a low computational load, scalable, and that can communicate with different modules thanks to an API provided for this purpose. All data collected is disseminated to facilitate the implementation of improvements globally throughout the system, and thus avoid mismatches between layers, which combined with the techniques applied to deal with fault tolerance, makes the system robust and with high availability. On the other hand, the developed framework includes a task scheduler capable of managing the launch of applications, their migration between nodes, as well as the possibility of dynamically increasing or decreasing the number of processes. All these thanks to the cooperation with other modules that are integrated into LIMITLESS, and whose objective is to optimize the execution of a stack of applications based on multi-criteria policies. This scheduling mode is called coarse-grain scheduling based on monitoring. For better performance and in order to further reduce the overhead during the monitorization, different optimizations have been applied at different levels to try to reduce communications between components, while trying to avoid the loss of information. To achieve this objective, data filtering techniques, Machine Learning (ML) algorithms, and Neural Networks (NN) have been used. In order to improve the scheduling process and to design new multi-criteria scheduling policies, the monitoring information has been combined with other ML algorithms to identify (through classification algorithms) the applications and their execution phases, doing offline profiling. Thanks to this feature, LIMITLESS can detect which phase is executing an application and tries to share the computational resources with other applications that are compatible (there is no performance degradation between them when both are running at the same time). This feature is called fine-grain scheduling, and can reduce the makespan of the use cases while makes efficient use of the computational resources that other applications do not use.El ecosistema global de las tecnologías de la información (IT) se encuentra en transición a una nueva generación de aplicaciones que requieren sistemas de adquisición de datos, procesamiento y almacenamiento cada vez más intensivo. Como resultado de ese cambio hacia la computación intensiva de datos, existe una superposición, cada vez mayor, entre la computación de alto rendimiento (HPC) y las técnicas Big Data en las aplicaciones, pues muchas aplicaciones HPC producen grandes volúmenes de datos, y Big Data necesita capacidades HPC. La hipótesis de esta tesis es que hay un gran potencial en la interoperabilidad y convergencia de los sistemas HPC y Big Data, siendo crucial para el futuro tratar una unificación de ambos para hacer frente a un amplio espectro de problemas de investigación. Por lo tanto, el objetivo principal de esta tesis es la propuesta y desarrollo de un sistema de monitorización que facilite la convergencia de los paradigmas HPC y Big Data gracias a la provisión de datos sobre el comportamiento de las aplicaciones en un entorno en el que se pueden ejecutar aplicaciones de ambos mundos, ofreciendo información útil para mejorar la escalabilidad, la explotación de la localidad de datos y la adaptabilidad en los computadores de gran escala. Para lograr este objetivo, el foco se ha centrado en el diseño de mecanismos de monitorización y localización de recursos para explotar el paralelismo en todos los niveles de la pila del software. El resultado es un framework de monitorización en dos niveles (tanto a nivel de nodo como de aplicación) con una baja carga computacional, escalable, y que se puede comunicar con distintos módulos gracias a una API proporcionada para tal objetivo. Todos datos recolectados se difunden para facilitar la realización de mejoras de manera global en todo el sistema, y así evitar desajustes entre capas, lo que combinado con las técnicas aplicadas para lidiar con la tolerancia a fallos, hace que el sistema sea robusto y con una alta disponibilidad. Por otro lado, el framework desarrollado incluye un planificador de tareas capaz de gestionar el lanzamiento de aplicaciones, la migración de las mismas entre nodos, además de la posibilidad de incrementar o disminuir su número de procesos de forma dinámica. Todo ello gracias a la cooperación con otros módulos que se integran en LIMITLESS, y cuyo objetivo es optimizar la ejecución de una pila de aplicaciones en base a políticas multicriterio. Esta funcionalidad se llama planificación de grano grueso. Para un mejor desempeño y con el objetivo de reducir más aún la carga durante la ejecución, se han aplicado distintas optimizaciones en distintos niveles para tratar de reducir las comunicaciones entre componentes, a la vez que se trata de evitar la pérdida de información. Para lograr este objetivo se ha hecho uso de técnicas de filtrado de datos, algoritmos de Machine Learning (ML), y Redes Neuronales (NN). Finalmente, para obtener mejores resultados en la planificación de aplicaciones y para diseñar nuevas políticas de planificación multi-criterio, los datos de monitorización recolectados han sido combinados con nuevos algoritmos de ML para identificar (por medio de algoritmos de clasificación) aplicaciones y sus fases de ejecución. Todo ello realizando tareas de profiling offline. Gracias a estas técnicas, LIMITLESS puede detectar en qué fase de su ejecución se encuentra una determinada aplicación e intentar compartir los recursos de computacionales con otras aplicaciones que sean compatibles (no se produce una degradación del rendimiento entre ellas cuando ambas se ejecutan a la vez en el mismo nodo). Esta funcionalidad se llama planificación de grano fino y puede reducir el tiempo total de ejecución de la pila de aplicaciones en los casos de uso porque realiza un uso más eficiente de los recursos de las máquinas.This PhD dissertation has been partially supported by the Spanish Ministry of Science and Innovation under an FPI fellowship associated to a National Project with reference TIN2016-79637-P (from July 1, 2018 to October 10, 2021)Programa de Doctorado en Ciencia y Tecnología Informática por la Universidad Carlos III de MadridPresidente: Félix García Carballeira.- Secretario: Pedro Ángel Cuenca Castillo.- Vocal: María Cristina V. Marinesc

    Navigating the IoT landscape: Unraveling forensics, security issues, applications, research challenges, and future

    Full text link
    Given the exponential expansion of the internet, the possibilities of security attacks and cybercrimes have increased accordingly. However, poorly implemented security mechanisms in the Internet of Things (IoT) devices make them susceptible to cyberattacks, which can directly affect users. IoT forensics is thus needed for investigating and mitigating such attacks. While many works have examined IoT applications and challenges, only a few have focused on both the forensic and security issues in IoT. Therefore, this paper reviews forensic and security issues associated with IoT in different fields. Future prospects and challenges in IoT research and development are also highlighted. As demonstrated in the literature, most IoT devices are vulnerable to attacks due to a lack of standardized security measures. Unauthorized users could get access, compromise data, and even benefit from control of critical infrastructure. To fulfil the security-conscious needs of consumers, IoT can be used to develop a smart home system by designing a FLIP-based system that is highly scalable and adaptable. Utilizing a blockchain-based authentication mechanism with a multi-chain structure can provide additional security protection between different trust domains. Deep learning can be utilized to develop a network forensics framework with a high-performing system for detecting and tracking cyberattack incidents. Moreover, researchers should consider limiting the amount of data created and delivered when using big data to develop IoT-based smart systems. The findings of this review will stimulate academics to seek potential solutions for the identified issues, thereby advancing the IoT field.Comment: 77 pages, 5 figures, 5 table

    Security in Distributed, Grid, Mobile, and Pervasive Computing

    Get PDF
    This book addresses the increasing demand to guarantee privacy, integrity, and availability of resources in networks and distributed systems. It first reviews security issues and challenges in content distribution networks, describes key agreement protocols based on the Diffie-Hellman key exchange and key management protocols for complex distributed systems like the Internet, and discusses securing design patterns for distributed systems. The next section focuses on security in mobile computing and wireless networks. After a section on grid computing security, the book presents an overview of security solutions for pervasive healthcare systems and surveys wireless sensor network security

    Malware detection at runtime for resource-constrained mobile devices: data-driven approach

    Get PDF
    The number of smart and connected mobile devices is increasing, bringing enormous possibilities to users in various domains and transforming everything that we get in touch with into smart. Thus, we have smart watches, smart phones, smart homes, and finally even smart cities. Increased smartness of mobile devices means that they contain more valuable information about their users, more decision making capabilities, and more control over sometimes even life-critical systems. Although, on one side, all of these are necessary in order to enable mobile devices maintain their main purpose to help and support people, on the other, it opens new vulnerabilities. Namely, with increased number and volume of smart devices, also the interest of attackers to abuse them is rising, making their security one of the main challenges. The main mean that the attackers use in order to abuse mobile devices is malicious software, shortly called malware. One way to protect against malware is by using static analysis, that investigates the nature of software by analyzing its static features. However, this technique detects well only known malware and it is prone to obfuscation, which means that it is relatively easy to create a new malicious sample that would be able to pass the radar. Thus, alone, is not powerful enough to protect the users against increasing malicious attacks. The other way to cope with malware is through dynamic analysis, where the nature of the software is decided based on its behavior during its execution on a device. This is a promising solution, because while the code of the software can be easily changed to appear as new, the same cannot be done with ease with its behavior when being executed. However, in order to achieve high accuracy dynamic analysis usually requires computational resources that are beyond suitable for battery-operated mobile devices. This is further complicated if, in addition to detecting the presence of malware, we also want to understand which type of malware it is, in order to trigger suitable countermeasures. Finally, the decisions on potential infections have to happen early enough, to guarantee minimal exposure to the attacks. Fulfilling these requirements in a mobile, battery-operated environments is a challenging task, for which, to the best of our knowledge, a suitable solution is not yet proposed. In this thesis, we pave the way towards such a solution by proposing a dynamic malware detection system that is able to early detect malware that appears at runtime and that provides useful information to discriminate between diverse types of malware while taking into account limited resources of mobile devices. On a mobile device we monitor a set of the representative features for presence of malware and based on them we trigger an alarm if software infection is observed. When this happens, we analyze a set of previously stored information relevant for malware classification, in order to understand what type of malware is being executed. In order to make the detection efficient and suitable for resource-constrained environments of mobile devices, we minimize the set of observed system parameters to only the most informative ones for both detection and classification. Additionally, since sampling period of monitoring infrastructure is directly connected to the power consumption, we take it into account as an important parameter of the development of the detection system. In order to make detection effective, we use dynamic features related to memory, CPU, system calls and network as they reflect well the behavior of a system. Our experiments show that the monitoring with a sampling period of eight seconds gives a good trade-off between detection accuracy, detection time and consumed power. Using it and by monitoring a set of only seven dynamic features (six related to the behavior of memory and one of CPU), we are able to provide a detection solution that satisfies the initial requirements and to detect malware at runtime with F- measure of 0.85, within 85.52 seconds of its execution, and with consumed average power of 20mW. Apart from observed features containing enough information to discriminate between malicious and benign applications, our results show that they can also be used to discriminate between diverse behavior of malware, reflected in different malware families. Using small number of features we are able to identify the presence of the malicious records from the considered family with precision of up to 99.8%. In addition to the standalone use of the proposed detection solution, we have also used it in a hybrid scenario where the applications were first analyzed by a static method, and it was able to detect correctly all the malware previously undetected by static analysis with false positive rate of 3.81% and average detection time of 44.72s. The method, we have designed, tested and validated, has been applied on a smartphone running on Android Operating System. However, since in the design of this method efficient usage of available computational resources was one of our main criteria, we are confident that the method as such can be applied also on the other battery-operated mobile devices of Internet of Things, in order to provide an effective and efficient system able to counter the ever-increasing and ever-evolving number and a variety of malicious attacks

    Attribute Based Encryption for Secure Data Access in Cloud

    Get PDF
    Cloud computing is a progressive computing worldview, which empowers adaptable, on-request, and ease use of Information Technology assets. However, the information transmitted to some cloud servers, and various protection concerns are arising out of it. Different plans given the property-based encryption have been proposed to secure the Cloud Storage. In any case, most work spotlights on the information substance security and the get to control, while less consideration towards the benefit control and the character protection. In this paper, a semi-anonymous benefit control conspires AnonyControl to address the information protection, as well as the client character security in existing access control plans. AnonyControl decentralizes the central authority to restrain the character spillage and accordingly accomplishes semi-anonymity. Furthermore, it likewise sums up the document get to control to the benefit control, by which advantages of all operations on the cloud information managed in a fine-grained way. Along these lines, display the AnonyControl-F, which ultimately keeps the character spillage and accomplish the full secrecy. Our security assessment demonstrates that both AnonyControl and AnonyControl-F are secure under the decisional bilinear Diffie-Hellman presumption, and our execution assessment shows the attainability of our plans. Index Terms: Anonymity, multi-authority, attribute-based encryption

    Integrated intelligent systems for industrial automation: the challenges of Industry 4.0, information granulation and understanding agents .

    Get PDF
    The objective of the paper consists in considering the challenges of new automation paradigm Industry 4.0 and reviewing the-state-of-the-art in the field of its enabling information and communication technologies, including Cyberphysical Systems, Cloud Computing, Internet of Things and Big Data. Some ways of multi-dimensional, multi-faceted industrial Big Data representation and analysis are suggested. The fundamentals of Big Data processing with using Granular Computing techniques have been developed. The problem of constructing special cognitive tools to build artificial understanding agents for Integrated Intelligent Enterprises has been faced
    corecore