1,416 research outputs found
Big Data and Large-scale Data Analytics: Efficiency of Sustainable Scalability and Security of Centralized Clouds and Edge Deployment Architectures
One of the significant shifts of the next-generation computing technologies will certainly be in
the development of Big Data (BD) deployment architectures. Apache Hadoop, the BD
landmark, evolved as a widely deployed BD operating system. Its new features include
federation structure and many associated frameworks, which provide Hadoop 3.x with the
maturity to serve different markets. This dissertation addresses two leading issues in
exploiting BD and large-scale data analytics on the Hadoop platform: (i) scalability,
which directly affects system performance and overall throughput, using portable Docker
containers; and (ii) security, which spreads the adoption of data protection practices
among practitioners, using access controls. The main contributions of this thesis are an
Enhanced MapReduce Environment (EME), the OPportunistic and Elastic Resource Allocation
(OPERA) scheduler, the BD Federation Access Broker (BDFAB), and a Secure Intelligent
Transportation System (SITS) with a multi-tier architecture for streaming data to the cloud.
New cross-layer techniques for multi-criteria scheduling in large-scale systems
The global ecosystem of information technology (IT) is in transition to a new generation
of applications that require more and more intensive data acquisition, processing and
storage systems. As a result of that change towards data intensive computing, there is a
growing overlap between high performance computing (HPC) and Big Data techniques in
applications, since many HPC applications produce large volumes of data, and Big Data
needs HPC capabilities.
The hypothesis of this PhD thesis is that the interoperability and convergence of HPC
and Big Data systems are crucial for the future, and that unifying both paradigms is
essential to address a broad spectrum of research domains. For this reason, the main
objective of this PhD thesis is to propose and develop a monitoring system that enables
HPC and Big Data convergence by providing information about the behavior of applications
in a system that executes both kinds, information that can be used to improve
scalability and data locality and to allow adaptability to large-scale computers. To achieve
this goal, this work focuses on the design of resource monitoring and discovery to
exploit parallelism at all levels. The result is a two-level monitoring framework (at both
node and application level) that has a low computational load, is scalable, and can
communicate with different modules through an API provided for this purpose. All collected
data is disseminated to facilitate improvements globally throughout the system and thus
avoid mismatches between layers. Combined with the techniques applied to deal with fault
tolerance, this makes the system robust and highly available.
On the other hand, the developed framework includes a task scheduler capable of managing
the launch of applications, their migration between nodes, as well as the possibility
of dynamically increasing or decreasing the number of processes. All of this is possible
thanks to the cooperation with other modules that are integrated into LIMITLESS, and whose
objective is to optimize the execution of a stack of applications based on multi-criteria
policies. This scheduling mode is called coarse-grain scheduling based on monitoring.
For better performance, and to further reduce the monitoring overhead, different
optimizations have been applied at different levels to reduce communications
between components while avoiding the loss of information. To
achieve this objective, data filtering techniques, Machine Learning (ML) algorithms, and
Neural Networks (NN) have been used.
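As a minimal sketch of how data filtering can cut communication between monitoring components, the example below uses a change-threshold ("deadband") filter: a sample is forwarded only when it differs from the last forwarded value by more than a tolerance. This technique is an assumption chosen for illustration and is not necessarily the filter used in LIMITLESS; the function name `deadband_filter`, the `tolerance` parameter, and the sample readings are all hypothetical.

```python
from typing import Iterable, List


def deadband_filter(samples: Iterable[float], tolerance: float) -> List[float]:
    """Return only the samples that would be sent to the aggregator.

    A sample is forwarded when it differs from the last forwarded value
    by more than `tolerance`; the first sample is always forwarded.
    (Hypothetical helper, not part of any real monitoring API.)
    """
    forwarded: List[float] = []
    for value in samples:
        if not forwarded or abs(value - forwarded[-1]) > tolerance:
            forwarded.append(value)
    return forwarded


# Hypothetical CPU-utilisation readings (percent) from one node.
cpu_load = [10.0, 10.5, 11.0, 25.0, 25.4, 24.8, 60.0]
sent = deadband_filter(cpu_load, tolerance=5.0)
print(f"forwarded {len(sent)} of {len(cpu_load)} samples: {sent}")
```

A larger tolerance sends fewer messages but loses more detail, which is the same trade-off between communication and information loss that the abstract describes.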
In order to improve the scheduling process and to design new multi-criteria scheduling
policies, the monitoring information has been combined with other ML algorithms to
identify (through classification algorithms) the applications and their execution phases
by means of offline profiling. Thanks to this feature, LIMITLESS can detect which phase
an application is executing and try to share the computational resources with other
applications that are compatible (there is no performance degradation between them when
both are running at the same time). This feature is called fine-grain scheduling, and it
can reduce the makespan of the use cases while making efficient use of the computational resources that
other applications do not use.
The global information technology (IT) ecosystem is in transition to a new generation
of applications that require increasingly intensive data acquisition, processing and
storage systems. As a result of this shift towards data-intensive computing, there is a
growing overlap between high performance computing (HPC) and Big Data techniques in
applications, since many HPC applications produce large volumes of data, and Big Data
needs HPC capabilities.
The hypothesis of this thesis is that there is great potential in the interoperability and
convergence of HPC and Big Data systems, and that unifying both is crucial for the future
in order to address a broad spectrum of research problems. Therefore, the main objective
of this thesis is to propose and develop a monitoring system that facilitates the
convergence of the HPC and Big Data paradigms by providing data on the behavior of
applications in an environment where applications from both worlds can run, offering
useful information to improve scalability, the exploitation of data locality, and
adaptability on large-scale computers. To achieve this objective, the focus has been on
the design of resource monitoring and discovery mechanisms to exploit parallelism at all
levels of the software stack. The result is a two-level monitoring framework (at both
node and application level) that has a low computational load, is scalable, and can
communicate with different modules through an API provided for this purpose. All collected
data is disseminated to facilitate improvements globally throughout the system and thus
avoid mismatches between layers, which, combined with the techniques applied to deal with
fault tolerance, makes the system robust and highly available.
On the other hand, the developed framework includes a task scheduler capable of managing
the launch of applications, their migration between nodes, and the possibility of
dynamically increasing or decreasing their number of processes. All of this is possible
thanks to the cooperation with other modules that are integrated into LIMITLESS, and whose
objective is to optimize the execution of a stack of applications based on multi-criteria
policies. This functionality is called coarse-grain scheduling.
For better performance, and to further reduce the load during execution, different
optimizations have been applied at different levels to reduce communications between
components while trying to avoid the loss of information. To achieve this objective,
data filtering techniques, Machine Learning (ML) algorithms, and Neural Networks (NN)
have been used.
Finally, to obtain better results in application scheduling and to design new
multi-criteria scheduling policies, the collected monitoring data has been combined with
new ML algorithms to identify (by means of classification algorithms) applications and
their execution phases, all through offline profiling tasks. Thanks to these techniques,
LIMITLESS can detect which execution phase a given application is in and try to share
computational resources with other compatible applications (no performance degradation
occurs between them when both run at the same time on the same node). This functionality
is called fine-grain scheduling, and it can reduce the total execution time of the
application stack in the use cases because it makes more efficient use of the machines'
resources.
This PhD dissertation has been partially supported by the Spanish Ministry of Science and Innovation under an FPI fellowship associated to a National Project with reference TIN2016-79637-P (from July 1, 2018 to October 10, 2021).
PhD Program in Computer Science and Technology, Universidad Carlos III de Madrid.
Chair: Félix García Carballeira. Secretary: Pedro Ángel Cuenca Castillo. Member: María Cristina V. Marinesc
Navigating the IoT landscape: Unraveling forensics, security issues, applications, research challenges, and future
Given the exponential expansion of the internet, the possibilities of
security attacks and cybercrimes have increased accordingly. Moreover, poorly
implemented security mechanisms in the Internet of Things (IoT) devices make
them susceptible to cyberattacks, which can directly affect users. IoT
forensics is thus needed for investigating and mitigating such attacks. While
many works have examined IoT applications and challenges, only a few have
focused on both the forensic and security issues in IoT. Therefore, this paper
reviews forensic and security issues associated with IoT in different fields.
Future prospects and challenges in IoT research and development are also
highlighted. As demonstrated in the literature, most IoT devices are vulnerable
to attacks due to a lack of standardized security measures. Unauthorized users
could get access, compromise data, and even benefit from control of critical
infrastructure. To fulfil the security-conscious needs of consumers, IoT can be
used to develop a smart home system by designing a FLIP-based system that is
highly scalable and adaptable. Utilizing a blockchain-based authentication
mechanism with a multi-chain structure can provide additional security
protection between different trust domains. Deep learning can be utilized to
develop a network forensics framework with a high-performing system for
detecting and tracking cyberattack incidents. Moreover, researchers should
consider limiting the amount of data created and delivered when using big data
to develop IoT-based smart systems. The findings of this review will stimulate
academics to seek potential solutions for the identified issues, thereby
advancing the IoT field.
Comment: 77 pages, 5 figures, 5 table
Security in Distributed, Grid, Mobile, and Pervasive Computing
This book addresses the increasing demand to guarantee privacy, integrity, and availability of resources in networks and distributed systems. It first reviews security issues and challenges in content distribution networks, describes key agreement protocols based on the Diffie-Hellman key exchange and key management protocols for complex distributed systems like the Internet, and discusses securing design patterns for distributed systems. The next section focuses on security in mobile computing and wireless networks. After a section on grid computing security, the book presents an overview of security solutions for pervasive healthcare systems and surveys wireless sensor network security
Malware detection at runtime for resource-constrained mobile devices: data-driven approach
The number of smart and connected mobile devices is increasing, bringing enormous possibilities to users in various domains and making everything we come into contact with smart. Thus, we have smart watches, smart phones, smart homes, and even smart cities. The increased smartness of mobile devices means that they contain more valuable information about their users, more decision-making capabilities, and more control over systems that are sometimes even life-critical. Although, on one side, all of these are necessary to enable mobile devices to maintain their main purpose of helping and supporting people, on the other, they open up new vulnerabilities. Namely, with the increased number and volume of smart devices, the interest of attackers in abusing them is also rising, making their security one of the main challenges. The main means that attackers use to abuse mobile devices is malicious software, called malware for short. One way to protect against malware is static analysis, which investigates the nature of software by analyzing its static features. However, this technique detects only known malware well and is vulnerable to obfuscation, which means that it is relatively easy to create a new malicious sample that passes under the radar. Thus, on its own, it is not powerful enough to protect users against increasing malicious attacks. The other way to cope with malware is dynamic analysis, where the nature of the software is decided based on its behavior during execution on a device. This is a promising solution because, while the code of the software can easily be changed to appear new, the same cannot easily be done with its behavior when executed. However, in order to achieve high accuracy, dynamic analysis usually requires computational resources beyond what is suitable for battery-operated mobile devices.
This is further complicated if, in addition to detecting the presence of malware, we also want to understand which type of malware it is, in order to trigger suitable countermeasures. Finally, decisions on potential infections have to happen early enough to guarantee minimal exposure to the attacks. Fulfilling these requirements in a mobile, battery-operated environment is a challenging task for which, to the best of our knowledge, a suitable solution has not yet been proposed. In this thesis, we pave the way towards such a solution by proposing a dynamic malware detection system that detects malware early as it appears at runtime and provides useful information to discriminate between diverse types of malware, while taking into account the limited resources of mobile devices. On a mobile device, we monitor a set of features representative of the presence of malware and, based on them, trigger an alarm if a software infection is observed. When this happens, we analyze a set of previously stored information relevant for malware classification in order to understand what type of malware is being executed. To make detection efficient and suitable for the resource-constrained environments of mobile devices, we reduce the set of observed system parameters to only those most informative for both detection and classification. Additionally, since the sampling period of the monitoring infrastructure is directly connected to power consumption, we treat it as an important parameter in the development of the detection system. To make detection effective, we use dynamic features related to memory, CPU, system calls and network, as they reflect the behavior of a system well. Our experiments show that monitoring with a sampling period of eight seconds gives a good trade-off between detection accuracy, detection time and consumed power.
Using this sampling period and monitoring a set of only seven dynamic features (six related to the behavior of memory and one to CPU), we are able to provide a detection solution that satisfies the initial requirements and detects malware at runtime with an F-measure of 0.85, within 85.52 seconds of its execution, and with an average power consumption of 20 mW. Apart from the observed features containing enough information to discriminate between malicious and benign applications, our results show that they can also be used to discriminate between the diverse behaviors of malware, reflected in different malware families. Using a small number of features, we are able to identify the presence of malicious records from the considered family with a precision of up to 99.8%. In addition to the standalone use of the proposed detection solution, we have also used it in a hybrid scenario where the applications were first analyzed by a static method, and it correctly detected all the malware previously undetected by static analysis, with a false positive rate of 3.81% and an average detection time of 44.72 s. The method we have designed, tested and validated has been applied on a smartphone running the Android operating system. However, since efficient usage of available computational resources was one of our main criteria in the design of this method, we are confident that the method can also be applied to other battery-operated mobile devices in the Internet of Things, in order to provide an effective and efficient system able to counter the ever-increasing and ever-evolving number and variety of malicious attacks
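The abstract reports detection quality as an F-measure. As a brief illustration, and assuming the usual F1 definition (the harmonic mean of precision and recall; the thesis may use a weighted variant), the metric can be computed from confusion-matrix counts as sketched below. The function name `f_measure` and the counts are hypothetical and are not taken from the thesis.

```python
def f_measure(true_pos: int, false_pos: int, false_neg: int) -> float:
    """F1 score: harmonic mean of precision and recall.

    Returns 0.0 when there are no true positives, since precision and
    recall are then both zero (or undefined).
    """
    if true_pos == 0:
        return 0.0
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts for a small batch of monitored samples:
# 8 malware samples flagged correctly, 2 benign samples flagged
# by mistake, and 1 malware sample missed.
score = f_measure(true_pos=8, false_pos=2, false_neg=1)
print(f"F-measure: {score:.3f}")
```

Because it combines precision and recall, the F-measure penalizes a detector that achieves one at the expense of the other, which matters when false alarms and missed infections are both costly.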
Attribute Based Encryption for Secure Data Access in Cloud
Cloud computing is a revolutionary computing paradigm that enables flexible, on-demand, and low-cost use of Information Technology resources. However, the data is outsourced to cloud servers, and various privacy concerns arise from this. Various schemes based on attribute-based encryption have been proposed to secure cloud storage. However, most work focuses on data content privacy and access control, while less attention is paid to privilege control and identity privacy. This paper presents a semi-anonymous privilege control scheme, AnonyControl, to address not only data privacy but also user identity privacy in existing access control schemes. AnonyControl decentralizes the central authority to limit identity leakage and thereby achieves semi-anonymity. Furthermore, it generalizes file access control to privilege control, by which the privileges of all operations on the cloud data can be managed in a fine-grained manner. The paper then presents AnonyControl-F, which fully prevents identity leakage and achieves full anonymity. Our security analysis shows that both AnonyControl and AnonyControl-F are secure under the decisional bilinear Diffie-Hellman assumption, and our performance evaluation shows the feasibility of our schemes.
Index Terms: Anonymity, multi-authority, attribute-based encryption
Integrated intelligent systems for industrial automation: the challenges of Industry 4.0, information granulation and understanding agents
The objective of the paper is to consider the challenges of the new automation paradigm Industry 4.0 and to review the state of the art in its enabling information and communication technologies, including Cyber-physical Systems, Cloud Computing, Internet of Things and Big Data. Some ways of multi-dimensional, multi-faceted representation and analysis of industrial Big Data are suggested. The fundamentals of Big Data processing using Granular Computing techniques have been developed. The problem of constructing special cognitive tools to build artificial understanding agents for Integrated Intelligent Enterprises is addressed