39 research outputs found

    The Architecture of Complexity Revisited: Design Primitives for Ultra-Large-Scale Systems

    Get PDF
    As software-intensive systems continue to grow in scale and complexity the techniques that we have used to design and analyze them in the past no longer suffice. In this paper we look at examples of existing ultra-large-scale systems—systems of enormous size and complexity. We examine instances of such systems that have arisen spontaneously in nature and those that have been human-constructed. We distill from these example systems the design primitives that underlie them. We capture these design primitives as a set of tactics— fundamental architectural building-blocks—and argue that to efficiently build and analyze such systems in the future we should strongly consider employing such building-blocks

    Self-organising agent communities for autonomic resource management

    No full text
    The autonomic computing paradigm addresses the operational challenges presented by increasingly complex software systems by proposing that they be composed of many autonomous components, each responsible for the run-time reconfiguration of its own dedicated hardware and software components. Consequently, regulation of the whole software system becomes an emergent property of local adaptation and learning carried out by these autonomous system elements. Designing appropriate local adaptation policies for the components of such systems remains a major challenge. This is particularly true where the system’s scale and dynamism compromise the efficiency of a central executive and/or prevent components from pooling information to achieve a shared, accurate evidence base for their negotiations and decisions.In this paper, we investigate how a self-regulatory system response may arise spontaneously from local interactions between autonomic system elements tasked with adaptively consuming/providing computational resources or services when the demand for such resources is continually changing. We demonstrate that system performance is not maximised when all system components are able to freely share information with one another. Rather, maximum efficiency is achieved when individual components have only limited knowledge of their peers. Under these conditions, the system self-organises into appropriate community structures. By maintaining information flow at the level of communities, the system is able to remain stable enough to efficiently satisfy service demand in resource-limited environments, and thus minimise any unnecessary reconfiguration whilst remaining sufficiently adaptive to be able to reconfigure when service demand changes

    Resource discovery for distributed computing systems: A comprehensive survey

    Get PDF
    Large-scale distributed computing environments provide a vast amount of heterogeneous computing resources from different sources for resource sharing and distributed computing. Discovering appropriate resources in such environments is a challenge which involves several different subjects. In this paper, we provide an investigation on the current state of resource discovery protocols, mechanisms, and platforms for large-scale distributed environments, focusing on the design aspects. We classify all related aspects, general steps, and requirements to construct a novel resource discovery solution in three categories consisting of structures, methods, and issues. Accordingly, we review the literature, analyzing various aspects for each category

    Mathematical analysis of scheduling policies in peer-to-peer video streaming networks

    Get PDF
    Las redes de pares son comunidades virtuales autogestionadas, desarrolladas en la capa de aplicación sobre la infraestructura de Internet, donde los usuarios (denominados pares) comparten recursos (ancho de banda, memoria, procesamiento) para alcanzar un fin común. La distribución de video representa la aplicación más desafiante, dadas las limitaciones de ancho de banda. Existen básicamente tres servicios de video. El más simple es la descarga, donde un conjunto de servidores posee el contenido original, y los usuarios deben descargar completamente este contenido previo a su reproducción. Un segundo servicio se denomina video bajo demanda, donde los pares se unen a una red virtual siempre que inicien una solicitud de un contenido de video, e inician una descarga progresiva en línea. El último servicio es video en vivo, donde el contenido de video es generado, distribuido y visualizado simultáneamente. En esta tesis se estudian aspectos de diseño para la distribución de video en vivo y bajo demanda. Se presenta un análisis matemático de estabilidad y capacidad de arquitecturas de distribución bajo demanda híbridas, asistidas por pares. Los pares inician descargas concurrentes de múltiples contenidos, y se desconectan cuando lo desean. Se predice la evolución esperada del sistema asumiendo proceso Poisson de arribos y egresos exponenciales, mediante un modelo determinístico de fluidos. Un sub-modelo de descargas secuenciales (no simultáneas) es globalmente y estructuralmente estable, independientemente de los parámetros de la red. Mediante la Ley de Little se determina el tiempo medio de residencia de usuarios en un sistema bajo demanda secuencial estacionario. Se demuestra teóricamente que la filosofía híbrida de cooperación entre pares siempre desempeña mejor que la tecnología pura basada en cliente-servidor

    A survey of distributed data aggregation algorithms

    Get PDF
    Distributed data aggregation is an important task, allowing the decentralized determination of meaningful global properties, which can then be used to direct the execution of other applications. The resulting values are derived by the distributed computation of functions like COUNT, SUM, and AVERAGE. Some application examples deal with the determination of the network size, total storage capacity, average load, majorities and many others. In the last decade, many different approaches have been proposed, with different trade-offs in terms of accuracy, reliability, message and time complexity. Due to the considerable amount and variety of aggregation algorithms, it can be difficult and time consuming to determine which techniques will be more appropriate to use in specific settings, justifying the existence of a survey to aid in this task. This work reviews the state of the art on distributed data aggregation algorithms, providing three main contributions. First, it formally defines the concept of aggregation, characterizing the different types of aggregation functions. Second, it succinctly describes the main aggregation techniques, organizing them in a taxonomy. Finally, it provides some guidelines toward the selection and use of the most relevant techniques, summarizing their principal characteristics.info:eu-repo/semantics/publishedVersio

    SoS: self-organizing substrates

    Get PDF
    Large-scale networked systems often, both by design or chance exhibit self-organizing properties. Understanding self-organization using tools from cybernetics, particularly modeling them as Markov processes is a first step towards a formal framework which can be used in (decentralized) systems research and design.Interesting aspects to look for include the time evolution of a system and to investigate if and when a system converges to some absorbing states or stabilizes into a dynamic (and stable) equilibrium and how it performs under such an equilibrium state. Such a formal framework brings in objectivity in systems research, helping discern facts from artefacts as well as providing tools for quantitative evaluation of such systems. This thesis introduces such formalism in analyzing and evaluating peer-to-peer (P2P) systems in order to better understand the dynamics of such systems which in turn helps in better designs. In particular this thesis develops and studies the fundamental building blocks for a P2P storage system. In the process the design and evaluation methodology we pursue illustrate the typical methodological approaches in studying and designing self-organizing systems, and how the analysis methodology influences the design of the algorithms themselves to meet system design goals (preferably with quantifiable guarantees). These goals include efficiency, availability and durability, load-balance, high fault-tolerance and self-maintenance even in adversarial conditions like arbitrarily skewed and dynamic load and high membership dynamics (churn), apart of-course the specific functionalities that the system is supposed to provide. The functionalities we study here are some of the fundamental building blocks for various P2P applications and systems including P2P storage systems, and hence we call them substrates or base infrastructure. These elemental functionalities include: (i) Reliable and efficient discovery of resources distributed over the network in a decentralized manner; (ii) Communication among participants in an address independent manner, i.e., even when peers change their physical addresses; (iii) Availability and persistence of stored objects in the network, irrespective of availability or departure of individual participants from the system at any time; and (iv) Freshness of the objects/resources' (up-to-date replicas). Internet-scale distributed index structures (often termed as structured overlays) are used for discovery and access of resources in a decentralized setting. We propose a rapid construction from scratch and maintenance of the P-Grid overlay network in a self-organized manner so as to provide efficient search of both individual keys as well as a whole range of keys, doing so providing good load-balancing characteristics for diverse kind of arbitrarily skewed loads - storage and replication, query forwarding and query answering loads. For fast overlay construction we employ recursive partitioning of the key-space so that the resulting partitions are balanced with respect to storage load and replication. The proper algorithmic parameters for such partitioning is derived from a transient analysis of the partitioning process which has Markov property. Preservation of ordering information in P-Grid such that queries other than exact queries, like range queries can be efficiently and rather trivially handled makes P-Grid suitable for data-oriented applications. Fast overlay construction is analogous to building an index on a new set of keys making P-Grid suitable as the underlying indexing mechanism for peer-to-peer information retrieval applications among other potential applications which may require frequent indexing of new attributes apart regular updates to an existing index. In order to deal with membership dynamics, in particular changing physical address of peers across sessions, the overlay itself is used as a (self-referential) directory service for maintaining the participating peers' physical addresses across sessions. Exploiting this self-referential directory, a family of overlay maintenance scheme has been designed with lower communication overhead than other overlay maintenance strategies. The notion of dynamic equilibrium study for overlays under continuous churn and repairs, modeled as a Markov process, was introduced in order to evaluate and compare the overlay maintenance schemes. While the self-referential directory was originally invented to realize overlay maintenance schemes with lower overheads than existing overlay maintenance schemes, the self-referential directory is generic in nature and can be used for various other purposes, e.g., as a decentralized public key infrastructure. Persistence of peer identity across sessions, in spite of changes in physical address, provides a logical independence of the overlay network from the underlying physical network. This has many other potential usages, for example, efficient maintenance mechanisms for P2P storage systems and P2P trust and reputation management. We specifically look into the dynamics of maintaining redundancy for storage systems and design a novel lazy maintenance strategy. This strategy is algorithmically a simple variant of existing maintenance strategies which adapts to the system dynamics. This randomized lazy maintenance strategy thus explores the cost-performance trade-offs of the storage maintenance operations in a self-organizing manner. We model the storage system (redundancy), under churn and maintenance, as a Markov process. We perform an equilibrium study to show that the system operates in a more stable dynamic equilibrium with our strategy than for the existing maintenance scheme for comparable overheads. Particularly, we show that our maintenance scheme provides substantial performance gains in terms of maintenance overhead and system's resilience in presence of churn and correlated failures. Finally, we propose a gossip mechanism which works with lower communication overhead than existing approaches for communication among a relatively large set of unreliable peers without assuming any specific structure for their mutual connectivity. We use such a communication primitive for propagating replica updates in P2P systems, facilitating management of mutable content in P2P systems. The peer population affected by a gossip can be modeled as a Markov process. Studying the transient spread of gossips help in choosing proper algorithm parameters to reduce communication overhead while guaranteeing coverage of online peers. Each of these substrates in themselves were developed to find practical solutions for real problems. Put together, these can be used in other applications, including a P2P storage system with support for efficient lookup and inserts, membership dynamics, content mutation and updates, persistence and availability. Many of the ideas have already been implemented in real systems and several others are in the way to be integrated into the implementations. There are two principal contributions of this dissertation. It provides design of the P2P systems which are useful for end-users as well as other application developers who can build upon these existing systems. Secondly, it adapts and introduces the methodology of analysis of a system's time-evolution (tools typically used in diverse domains including physics and cybernetics) to study the long run behavior of P2P systems, and uses this methodology to (re-)design appropriate algorithms and evaluate them. We observed that studying P2P systems from the perspective of complex systems reveals their inner dynamics and hence ways to exploit such dynamics for suitable or better algorithms. In other words, the analysis methodology in itself strongly influences and inspires the way we design such systems. We believe that such an approach of orchestrating self-organization in internet-scale systems, where the algorithms and the analysis methodology have strong mutual influence will significantly change the way future such systems are developed and evaluated. We envision that such an approach will particularly serve as an important tool for the nascent but fast moving P2P systems research and development community

    Descoberta de recursos para sistemas de escala arbitrarias

    Get PDF
    Doutoramento em InformáticaTecnologias de Computação Distribuída em larga escala tais como Cloud, Grid, Cluster e Supercomputadores HPC estão a evoluir juntamente com a emergência revolucionária de modelos de múltiplos núcleos (por exemplo: GPU, CPUs num único die, Supercomputadores em single die, Supercomputadores em chip, etc) e avanços significativos em redes e soluções de interligação. No futuro, nós de computação com milhares de núcleos podem ser ligados entre si para formar uma única unidade de computação transparente que esconde das aplicações a complexidade e a natureza distribuída desses sistemas com múltiplos núcleos. A fim de beneficiar de forma eficiente de todos os potenciais recursos nesses ambientes de computação em grande escala com múltiplos núcleos ativos, a descoberta de recursos é um elemento crucial para explorar ao máximo as capacidade de todos os recursos heterogéneos distribuídos, através do reconhecimento preciso e localização desses recursos no sistema. A descoberta eficiente e escalável de recursos ´e um desafio para tais sistemas futuros, onde os recursos e as infira-estruturas de computação e comunicação subjacentes são altamente dinâmicas, hierarquizadas e heterogéneas. Nesta tese, investigamos o problema da descoberta de recursos no que diz respeito aos requisitos gerais da escalabilidade arbitrária de ambientes de computação futuros com múltiplos núcleos ativos. A principal contribuição desta tese ´e a proposta de uma entidade de descoberta de recursos adaptativa híbrida (Hybrid Adaptive Resource Discovery - HARD), uma abordagem de descoberta de recursos eficiente e altamente escalável, construída sobre uma sobreposição hierárquica virtual baseada na auto-organizaçãoo e auto-adaptação de recursos de processamento no sistema, onde os recursos computacionais são organizados em hierarquias distribuídas de acordo com uma proposta de modelo de descriçãoo de recursos multi-camadas hierárquicas. Operacionalmente, em cada camada, que consiste numa arquitetura ponto-a-ponto de módulos que, interagindo uns com os outros, fornecem uma visão global da disponibilidade de recursos num ambiente distribuído grande, dinâmico e heterogéneo. O modelo de descoberta de recursos proposto fornece a adaptabilidade e flexibilidade para executar consultas complexas através do apoio a um conjunto de características significativas (tais como multi-dimensional, variedade e consulta agregada) apoiadas por uma correspondência exata e parcial, tanto para o conteúdo de objetos estéticos e dinâmicos. Simulações mostram que o HARD pode ser aplicado a escalas arbitrárias de dinamismo, tanto em termos de complexidade como de escala, posicionando esta proposta como uma arquitetura adequada para sistemas futuros de múltiplos núcleos. Também contribuímos com a proposta de um regime de gestão eficiente dos recursos para sistemas futuros que podem utilizar recursos distribuíos de forma eficiente e de uma forma totalmente descentralizada. Além disso, aproveitando componentes de descoberta (RR-RPs) permite que a nossa plataforma de gestão de recursos encontre e aloque dinamicamente recursos disponíeis que garantam os parâmetros de QoS pedidos.Large scale distributed computing technologies such as Cloud, Grid, Cluster and HPC supercomputers are progressing along with the revolutionary emergence of many-core designs (e.g. GPU, CPUs on single die, supercomputers on chip, etc.) and significant advances in networking and interconnect solutions. In future, computing nodes with thousands of cores may be connected together to form a single transparent computing unit which hides from applications the complexity and distributed nature of these many core systems. In order to efficiently benefit from all the potential resources in such large scale many-core-enabled computing environments, resource discovery is the vital building block to maximally exploit the capabilities of all distributed heterogeneous resources through precisely recognizing and locating those resources in the system. The efficient and scalable resource discovery is challenging for such future systems where the resources and the underlying computation and communication infrastructures are highly-dynamic, highly-hierarchical and highly-heterogeneous. In this thesis, we investigate the problem of resource discovery with respect to the general requirements of arbitrary scale future many-core-enabled computing environments. The main contribution of this thesis is to propose Hybrid Adaptive Resource Discovery (HARD), a novel efficient and highly scalable resource-discovery approach which is built upon a virtual hierarchical overlay based on self-organization and self-adaptation of processing resources in the system, where the computing resources are organized into distributed hierarchies according to a proposed hierarchical multi-layered resource description model. Operationally, at each layer, it consists of a peer-to-peer architecture of modules that, by interacting with each other, provide a global view of the resource availability in a large, dynamic and heterogeneous distributed environment. The proposed resource discovery model provides the adaptability and flexibility to perform complex querying by supporting a set of significant querying features (such as multi-dimensional, range and aggregate querying) while supporting exact and partial matching, both for static and dynamic object contents. The simulation shows that HARD can be applied to arbitrary scales of dynamicity, both in terms of complexity and of scale, positioning this proposal as a proper architecture for future many-core systems. We also contributed to propose a novel resource management scheme for future systems which efficiently can utilize distributed resources in a fully decentralized fashion. Moreover, leveraging discovery components (RR-RPs) enables our resource management platform to dynamically find and allocate available resources that guarantee the QoS parameters on demand

    Ontology engineering and routing in distributed knowledge management applications

    Get PDF
    corecore