Ray: A Distributed Execution Engine for the Machine Learning Ecosystem
In recent years, growing data volumes and more sophisticated computational procedures have greatly increased the demand for computational power. Machine learning and artificial intelligence applications, for example, are notorious for their computational requirements. At the same time, Moore's law is ending and processor speeds are stalling. As a result, distributed computing has become ubiquitous. While the cloud makes distributed hardware infrastructure widely accessible and therefore offers the potential of horizontal scale, developing these distributed algorithms and applications remains surprisingly hard. This is due to the inherent complexity of concurrent algorithms, the engineering challenges that arise when communicating between many machines, requirements such as fault tolerance and straggler mitigation that arise at large scale, and the lack of a general-purpose distributed execution engine that can support a wide variety of applications.
In this thesis, we study the requirements for a general-purpose distributed computation model and present a solution that is easy to use yet expressive and resilient to faults. At its core, our model takes familiar concepts from serial programming, namely functions and classes, and generalizes them to the distributed world, thereby unifying stateless and stateful distributed computation. This model not only supports many machine learning workloads like training or serving, but is also a good fit for cross-cutting machine learning applications like reinforcement learning and data processing applications like streaming or graph processing. We implement this computational model as an open-source system called Ray, which matches or exceeds the performance of specialized systems in many application domains, while also offering horizontal scalability and strong fault tolerance properties.
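As a rough illustration of the model this abstract describes (functions run as stateless tasks, classes run as stateful actors), the sketch below uses only Python's standard library. It does not use Ray's API: `submit_task`, `Actor`, `Counter` and `square` are hypothetical names, and local threads stand in for remote workers.

```python
from concurrent.futures import Future, ThreadPoolExecutor
import queue
import threading

# Worker pool standing in for cluster nodes (assumption: threads, not machines).
_pool = ThreadPoolExecutor(max_workers=4)


def submit_task(fn, *args):
    """Run a stateless function asynchronously and return a future for its result."""
    return _pool.submit(fn, *args)


class Actor:
    """Wrap an object so its method calls run one at a time on a dedicated thread."""

    def __init__(self, instance):
        self._instance = instance
        self._mailbox = queue.Queue()
        threading.Thread(target=self._run, daemon=True).start()

    def _run(self):
        # Process calls in arrival order, so the wrapped state is never
        # touched by two calls at once.
        while True:
            method, args, future = self._mailbox.get()
            try:
                future.set_result(getattr(self._instance, method)(*args))
            except Exception as exc:
                future.set_exception(exc)

    def call(self, method, *args):
        """Enqueue a method call and return a future for its result."""
        future = Future()
        self._mailbox.put((method, args, future))
        return future


class Counter:
    def __init__(self):
        self.total = 0

    def add(self, amount):
        self.total += amount
        return self.total


def square(x):
    return x * x


# Stateless tasks: independent calls that may run in parallel.
squares = [f.result() for f in [submit_task(square, i) for i in range(5)]]

# Stateful actor: calls are serialized, so the counter accumulates consistently.
counter = Actor(Counter())
futures = [counter.call("add", s) for s in squares]
final_total = futures[-1].result()
```

Both kinds of call return a future, which is what lets stateless and stateful computation be composed through one interface in this sketch.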
Experimental Design and Comparative Testing of a Hybrid-Cooled Computer Cluster
With water cooling becoming an affordable option both at home and at scale, it is important to consider the possible benefits over air cooling. There are several methods of liquid cooling; notable ones include immersion, cold water cooling, and warm water cooling. The total cost of ownership is difficult to determine with these options, as each has a different impact on the data center. Considering a retrofit, rather than a new data center, introduces unforeseen variables that make cost analysis a challenge. Besides the cost of additional infrastructure and the cost of removing the old, the upfront costs could be daunting. Therefore a cost analysis would be a study of its own. This study, however, aims to reveal the tradeoffs in temperature, performance, and power usage between classical airflow-based heat sink mechanisms and water delivered directly at the heatsink. Having control over a discrete chiller will provide answers about CPU temperatures, power usage, and performance at various inlet water temperatures. To water or to air
Infrastructural Security for Virtualized Grid Computing
The goal of the grid computing paradigm is to make computer power as easy to access as an electrical power grid. Unlike the power grid, the computer grid uses remote resources located at a service provider. Malicious users can abuse the provided resources, which not only affects their own systems but also those of the provider and others.
Resources are utilized in an environment where sensitive programs and data from competitors are processed on shared resources, again creating the potential for misuse. This is one of the main security issues, since in a business environment competitors distrust each other, and the fear of industrial espionage is always present. Currently, human trust is the strategy used to deal with these threats. The relationship between grid users and resource providers ranges from highly trusted to highly untrusted. This wide range of trust relationships arises because grid computing itself changed from a research topic with few users to a widely deployed product that included early commercial adoption. The traditional open research communities have very low security requirements, while in contrast, business customers often operate on sensitive data that represents intellectual property; thus, their security demands are very high. In traditional grid computing, most users share the same resources concurrently. Consequently, information regarding other users and their jobs can usually be acquired quite easily. For example, a user can see which processes are running on another user's system. For business users, this is unacceptable since even the meta-data of their jobs is classified. As a consequence, most commercial customers are not convinced that their intellectual property in the form of software and data is protected in the grid.
This thesis proposes a novel infrastructural security solution that advances the concept of virtualized grid computing. The work started in 2007 and led to the development of the XGE, a virtual grid management software. The XGE itself uses operating system virtualization to provide a virtualized landscape. Users' jobs are no longer executed in a shared manner; they are executed within special sandboxed environments. To satisfy the requirements of a traditional grid setup, the solution can be coupled with an installed scheduler and grid middleware on the grid head node. To protect the prominent grid head node, a novel dual-laned demilitarized zone is introduced to make attacks more difficult. In a traditional grid setup, the head node and the computing nodes are installed in the same network, so a successful attack could also endanger the user's software and data. While the zone complicates attacks, it is, like all security solutions, not a perfect solution. Therefore, a network intrusion detection system is enhanced with grid-specific signatures. A novel software called Fence is introduced that supports end-to-end encryption, which means that all data remains encrypted until it reaches its final destination. It transfers data securely between the user's computer, the head node and the nodes within the shielded, internal network. A lightweight kernel rootkit detection system ensures that only trusted kernel modules can be loaded, so untrusted modules such as kernel rootkits can no longer be loaded. Furthermore, a malware scanner for virtualized grids scans for signs of malware in all running virtual machines. Using virtual machine introspection, that scanner remains invisible to most types of malware and has full access to all system calls on the monitored system. To speed up detection, the load is distributed to multiple detection engines simultaneously.
To enable multi-site service-oriented grid applications, the novel concept of public virtual nodes is presented. This is a virtualized grid node with a public IP address shielded by a set of dynamic firewalls. It is possible to create a set of connected, public nodes, either present on one or more remote grid sites. A special web service allows users to modify their own rule set in both directions and in a controlled manner.
The main contribution of this thesis is the presentation of solutions that improve the security of grid computing infrastructures. This includes the XGE, software that transforms a traditional grid into a virtualized grid. Design and implementation details, including experimental evaluations, are given for all approaches. Nearly all parts of the software are available as open source software. A summary of the contributions and an outlook on future work conclude this thesis.
Investigation of spacecraft cluster autonomy through an acoustic imaging interferometric testbed
Thesis (S.M.)--Massachusetts Institute of Technology, Dept. of Aeronautics and Astronautics, 1999. Includes bibliographical references (p. 169-173). The development and use of a novel testbed architecture is presented. Separated spacecraft interferometers have been proposed for applications in sparse aperture radar or astronomical observations. Modeled after these systems, an integrated hardware and software interferometry testbed is developed. Utilizing acoustic sources and sensors as a simplified analog to radio or optical systems, the Acoustic Imaging Testbed's simplest function is that of a Michelson interferometer. Robot arms control the motion of microphones. Through successive measurements an acoustic image can be formed. On top of this functionality, a layered software architecture is developed. This software creates a virtual environment that mimics the command, control and communications functions appropriate to a space interferometer. Autonomous spacecraft agents interact within this environment as the logical equivalent of distributed satellites. Optimal imaging configurations are validated. A scalable approach to cluster autonomy is discussed. by John Enright. S.M
Modelli e strumenti di programmazione parallela per piattaforme many-core (Parallel programming models and tools for many-core platforms)
The negotiation between power consumption, performance, programmability, and portability drives all computing industry designs, in particular the mobile and embedded systems domains.
Two design paradigms have proven particularly promising in this context: architectural heterogeneity and many-core processors.
Parallel programming models are key to effectively harness the computational power of heterogeneous many-core SoC.
This thesis presents a set of techniques and HW/SW extensions that enable performance improvements and that simplify programmability for heterogeneous many-core platforms.
The thesis contributions cover vertically the entire software stack for many-core platforms, from hardware abstraction layers running on top of bare-metal, to programming models; from hardware extensions for efficient parallelism support to middleware that enables optimized resource management within many-core platforms.
First, we present mechanisms to decrease parallelism overheads on parallel programming runtimes for many-core platforms, targeting fine-grain parallelism.
Second, we present programming model support that enables the offload of computational kernels within heterogeneous many-core systems.
Third, we present a novel approach to dynamically sharing and managing many-core platforms when multiple applications coded with different programming models execute concurrently.
All these contributions were validated using STMicroelectronics STHORM, a real embodiment of a state-of-the-art many-core system. Hardware extensions and architectural explorations were evaluated using VirtualSoC, a SystemC-based cycle-accurate simulator of many-core platforms.
Extempore: The design, implementation and application of a cyber-physical programming language
There is a long history of experimental and exploratory programming supported by systems that expose interaction through a programming language interface. These live programming systems enable software developers to create, extend, and modify the behaviour of executing software by changing source code without perceptual breaks for recompilation. These live programming systems have taken many forms, but have generally been limited in their ability to express low-level programming concepts and the generation of efficient native machine code. These shortcomings have limited the effectiveness of live programming in domains that require highly efficient numerical processing and explicit memory management.
The most general questions addressed by this thesis are what a systems language designed for live programming might look like and how such a language might influence the development of live programming in performance sensitive domains requiring real-time support, direct hardware control, or high performance computing. This thesis answers these questions by exploring the design, implementation and application of Extempore, a new systems programming language, designed specifically for live interactive programming.
Cooperative localisation in underwater robotic swarms for ocean bottom seismic imaging.
Spatial information must be collected alongside the data modality of interest in a wide variety of sub-sea applications, such as deep sea exploration, environmental monitoring, geological and ecological research, and sample collection. Ocean-bottom seismic surveys are vital for oil and gas exploration, and for productivity enhancement of an existing production facility. Ocean-bottom seismic sensors are deployed on the seabed to acquire those surveys. Node deployment methods used in industry today are costly, time-consuming and unusable in deep oceans. This study proposes the autonomous deployment of ocean-bottom seismic nodes, implemented by a swarm of Autonomous Underwater Vehicles (AUVs). In autonomous deployment of ocean-bottom seismic nodes, a swarm of sensor-equipped AUVs is deployed to achieve ocean-bottom seismic imaging through collaboration and communication. However, the severely limited bandwidth of underwater acoustic communications and the high cost of maritime assets limit the number of AUVs that can be deployed for experiments. A holistic fuzzy-based localisation framework for large underwater robotic swarms (i.e. with hundreds of AUVs) to dynamically fuse multiple position estimates of an autonomous underwater vehicle is proposed. Simplicity, flexibility and scalability are the three main advantages inherent in the proposed localisation framework, when compared to other traditional and commonly adopted underwater localisation methods, such as the Extended Kalman Filter. The proposed fuzzy-based localisation algorithm improves the entire swarm mean localisation error and standard deviation (by 16.53% and 35.17% respectively) at a swarm size of 150 AUVs when compared to the Extended Kalman Filter based localisation with round-robin scheduling. The proposed fuzzy-based localisation method requires retuning of the fuzzy rules and fuzzy set parameters if the deployment scenario is changed.
Therefore a cooperative localisation scheme that relies on a scalar localisation confidence value is proposed. A swarm subset is navigationally aided by ultra-short baseline and a swarm subset (i.e. navigation beacons) is configured to broadcast navigation aids (i.e. range-only), once their confidence values are higher than a predetermined confidence threshold. The confidence value and navigation beacons subset size are two key parameters for the proposed algorithm, so they are optimised using the evolutionary multi-objective optimisation algorithm NSGA-II to enhance its localisation performance. Confidence value-based localisation is proposed to control the cooperation dynamics among the swarm agents, in terms of aiding acoustic exteroceptive sensors. Given the error characteristics of a commercially available ultra-short baseline system and the covariance matrix of a trilaterated underwater vehicle position, dead reckoning navigation - aided by Extended Kalman Filter-based acoustic exteroceptive sensors - is performed and controlled by the vehicle's confidence value. The proposed confidence-based localisation algorithm has significantly improved the entire swarm mean localisation error when compared to the fuzzy-based and round-robin Extended Kalman Filter-based localisation methods (by 67.10% and 59.28% respectively, at a swarm size of 150 AUVs). The proposed fuzzy-based and confidence-based localisation algorithms for cooperative underwater robotic swarms are validated on a co-simulation platform. A physics-based co-simulation platform that considers an environment's hydrodynamics, industrial grade inertial measurement unit and underwater acoustic communications characteristics is implemented for validation and optimisation purposes.
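The idea of confidence-gated beacons broadcasting range-only aids, from which a vehicle position is trilaterated, can be sketched as follows. This is a simplified 2D, noise-free illustration and not the thesis's algorithm, which also involves Extended Kalman Filtering and NSGA-II tuning. The function names, the agent dictionary layout, and the threshold of 0.5 are all assumptions made for the example.

```python
import math


def select_beacons(agents, threshold):
    """Keep only agents whose localisation confidence exceeds the threshold."""
    return [a for a in agents if a["confidence"] > threshold]


def trilaterate(beacons, ranges):
    """Estimate a 2D position from three beacon positions and measured ranges.

    Subtracting the first circle equation from the other two gives a linear
    2x2 system, solved here with Cramer's rule.
    """
    (x1, y1), (x2, y2), (x3, y3) = beacons
    r1, r2, r3 = ranges
    a11, a12 = 2 * (x2 - x1), 2 * (y2 - y1)
    a21, a22 = 2 * (x3 - x1), 2 * (y3 - y1)
    b1 = r1**2 - r2**2 + x2**2 - x1**2 + y2**2 - y1**2
    b2 = r1**2 - r3**2 + x3**2 - x1**2 + y3**2 - y1**2
    det = a11 * a22 - a12 * a21
    if abs(det) < 1e-12:
        raise ValueError("beacons are collinear; position is not unique")
    return ((b1 * a22 - a12 * b2) / det, (a11 * b2 - a21 * b1) / det)


# Hypothetical swarm: three confident agents and one that stays silent.
agents = [
    {"position": (0.0, 0.0), "confidence": 0.90},
    {"position": (10.0, 0.0), "confidence": 0.80},
    {"position": (5.0, 5.0), "confidence": 0.20},
    {"position": (0.0, 10.0), "confidence": 0.95},
]
true_position = (3.0, 4.0)

beacons = select_beacons(agents, threshold=0.5)
positions = [b["position"] for b in beacons]
# Range-only aids: each beacon contributes just its distance to the vehicle.
ranges = [math.hypot(true_position[0] - x, true_position[1] - y) for x, y in positions]
estimate = trilaterate(positions, ranges)
```

With noisy ranges or more than three beacons, a least-squares solve would replace the exact 2x2 system.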
New cross-layer techniques for multi-criteria scheduling in large-scale systems
The global ecosystem of information technology (IT) is in transition to a new generation
of applications that require more and more intensive data acquisition, processing and
storage systems. As a result of that change towards data intensive computing, there is a
growing overlap between high performance computing (HPC) and Big Data techniques in
applications, since many HPC applications produce large volumes of data, and Big Data
needs HPC capabilities.
The hypothesis of this PhD thesis is that the potential interoperability and convergence
of the HPC and Big Data systems are crucial for the future, and that unifying both
paradigms is essential to address a broad spectrum of research domains. For this reason, the
main objective of this PhD thesis is to propose and develop a monitoring system that
enables HPC and Big Data convergence by providing information about the behavior of
applications in a system that executes both kinds, information that helps to improve
scalability and data locality and to allow adaptability to large-scale computers. To achieve
this goal, this work focuses on the design of resource monitoring and discovery to
exploit parallelism at all levels. The result is a two-level monitoring framework (both at
node and application level) that has a low computational load, is scalable, and can
communicate with different modules thanks to an API provided for this purpose. All
collected data is disseminated to facilitate the implementation of improvements globally
throughout the system and thus avoid mismatches between layers, which, combined with the
techniques applied to deal with fault tolerance, makes the system robust and highly
available.
In addition, the developed framework includes a task scheduler capable of managing
the launch of applications, their migration between nodes, and the dynamic increase or
decrease of their number of processes. All of this is possible thanks to the
cooperation with other modules that are integrated into LIMITLESS, and whose objective
is to optimize the execution of a stack of applications based on multi-criteria policies. This
scheduling mode is called coarse-grain scheduling based on monitoring.
For better performance, and in order to further reduce the overhead of monitoring,
different optimizations have been applied at different levels to reduce
communications between components while avoiding the loss of information. To
achieve this objective, data filtering techniques, Machine Learning (ML) algorithms, and
Neural Networks (NN) have been used.
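The abstract does not say which data filtering technique is used. As one plausible illustration of how filtering can cut monitoring traffic without losing significant information, the sketch below applies a simple deadband filter: a sample is transmitted only when it differs from the last transmitted value by more than a tolerance. The function name, the tolerance and the sample values are hypothetical.

```python
def filter_samples(samples, tolerance):
    """Return the (index, value) pairs worth transmitting.

    A sample is sent only if it differs from the last transmitted value by
    more than `tolerance`; the first sample is always sent.
    """
    sent = []
    last = None
    for index, value in enumerate(samples):
        if last is None or abs(value - last) > tolerance:
            sent.append((index, value))
            last = value
    return sent


# Hypothetical CPU load readings (percent) collected on one node.
cpu_load = [10.0, 10.5, 11.0, 25.0, 25.4, 24.8, 60.0, 59.5]
transmitted = filter_samples(cpu_load, tolerance=2.0)
```

Here three of the eight readings are sent, and a receiver can reconstruct the load to within the tolerance by holding the last received value.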
In order to improve the scheduling process and to design new multi-criteria scheduling
policies, the monitoring information has been combined with other ML algorithms to
identify (through classification algorithms) the applications and their execution phases,
using offline profiling. Thanks to this feature, LIMITLESS can detect which phase an
application is executing and tries to share the computational resources with other applications
that are compatible (there is no performance degradation between them when both are
running at the same time). This feature is called fine-grain scheduling, and can reduce the
makespan of the use cases while making efficient use of the computational resources that
other applications do not use.
This PhD dissertation has been partially supported by the Spanish Ministry of Science and Innovation under an FPI fellowship associated to a National Project with reference TIN2016-79637-P (from July 1, 2018 to October 10, 2021).
Doctoral Programme in Computer Science and Technology at Universidad Carlos III de Madrid.
Chair: Félix García Carballeira. Secretary: Pedro Ángel Cuenca Castillo. Member: María Cristina V. Marinesc