    Towards Efficient Delivery of Dynamic Web Content

    Advantages of cache cooperation on edge cache networks serving dynamic web content were studied. The design of a cooperative edge cache grid, a large-scale cooperative edge cache network for delivering highly dynamic web content with varying server update frequencies, was presented. A cache clouds-based architecture was proposed to promote low-cost cache cooperation in the cooperative edge cache grid. An Internet landmarks-based scheme, called the selective landmarks-based server-distance sensitive clustering scheme, for grouping edge caches into cooperative clouds was presented. A dynamic hashing technique for efficient, load-balanced, and reliable document lookups and updates was presented. A utility-based scheme for cooperative document placement in cache clouds was proposed. The proposed architecture and techniques were evaluated through trace-based simulations using both real-world and synthetic traces. Results showed that the proposed techniques provide significant performance benefits. A framework for automatically detecting cache-effective fragments in dynamic web pages was presented. Two types of fragments in web pages, namely shared fragments and lifetime-personalization fragments, were identified and formally defined. A hierarchical fragment-aware web page model, called the augmented-fragment tree model, was proposed. An efficient algorithm to detect maximal fragments that are shared among multiple documents was proposed. A practical algorithm for detecting fragments based on their lifetime and personalization characteristics was designed. The proposed framework and algorithms were evaluated through experiments on real web sites. The effect of adopting the detected fragments on web caches and origin servers was experimentally studied. Ph.D. Committee Chair: Dr. Ling Liu; Committee Member: Dr. Arun Iyengar; Committee Member: Dr. Calton Pu; Committee Member: Dr. H. Venkateswaran; Committee Member: Dr. Mustaque Ahamad
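The abstract leaves the dynamic hashing technique unspecified; as a loose illustration of how load-balanced document lookup within a cache cloud could work, the following consistent-hashing sketch (in Python; the cache names and API are hypothetical, not the thesis's scheme) maps documents to edge caches:

    import bisect
    import hashlib

    class HashRing:
        """Map documents to edge caches via consistent hashing.

        A minimal sketch of load-balanced document lookup in a cache
        cloud; the dynamic hashing scheme of the thesis additionally
        covers reliability and update propagation.
        """

        def __init__(self, caches, points_per_cache=100):
            # Each cache gets many points on the ring to smooth the load.
            self._ring = sorted(
                (self._hash(f"{cache}#{i}"), cache)
                for cache in caches
                for i in range(points_per_cache))

        @staticmethod
        def _hash(key):
            return int(hashlib.md5(key.encode()).hexdigest(), 16)

        def lookup(self, document_url):
            # The cache owning the first ring point at or after the
            # document's hash is responsible for the document.
            point = self._hash(document_url)
            idx = bisect.bisect_left(self._ring, (point, ""))
            return self._ring[idx % len(self._ring)][1]

    ring = HashRing(["edge-cache-a", "edge-cache-b", "edge-cache-c"])
    print(ring.lookup("http://example.com/news/42"))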

    SoS: self-organizing substrates

    Large-scale networked systems often, whether by design or by chance, exhibit self-organizing properties. Understanding self-organization using tools from cybernetics, particularly modeling them as Markov processes, is a first step towards a formal framework which can be used in (decentralized) systems research and design. Interesting aspects to look for include the time evolution of a system: whether and when the system converges to some absorbing states or stabilizes into a dynamic (and stable) equilibrium, and how it performs in such an equilibrium state. Such a formal framework brings objectivity into systems research, helping discern facts from artefacts as well as providing tools for the quantitative evaluation of such systems. This thesis introduces such formalism in analyzing and evaluating peer-to-peer (P2P) systems in order to better understand the dynamics of such systems, which in turn helps in better designs. In particular, this thesis develops and studies the fundamental building blocks for a P2P storage system. In the process, the design and evaluation methodology we pursue illustrates the typical methodological approaches in studying and designing self-organizing systems, and how the analysis methodology influences the design of the algorithms themselves to meet system design goals (preferably with quantifiable guarantees). These goals include efficiency, availability and durability, load-balance, high fault-tolerance and self-maintenance even in adversarial conditions like arbitrarily skewed and dynamic load and high membership dynamics (churn), apart, of course, from the specific functionalities that the system is supposed to provide. The functionalities we study here are some of the fundamental building blocks for various P2P applications and systems, including P2P storage systems, and hence we call them substrates or base infrastructure. These elemental functionalities include: (i) reliable and efficient discovery of resources distributed over the network in a decentralized manner; (ii) communication among participants in an address-independent manner, i.e., even when peers change their physical addresses; (iii) availability and persistence of stored objects in the network, irrespective of the availability or departure of individual participants from the system at any time; and (iv) freshness of the stored objects/resources (up-to-date replicas). Internet-scale distributed index structures (often termed structured overlays) are used for the discovery and access of resources in a decentralized setting. We propose rapid construction from scratch and maintenance of the P-Grid overlay network in a self-organized manner, so as to provide efficient search for both individual keys and whole ranges of keys, while providing good load-balancing characteristics for diverse kinds of arbitrarily skewed load: storage and replication, query forwarding and query answering. For fast overlay construction we employ recursive partitioning of the key-space so that the resulting partitions are balanced with respect to storage load and replication. The proper algorithmic parameters for such partitioning are derived from a transient analysis of the partitioning process, which has the Markov property. The preservation of ordering information in P-Grid, such that queries other than exact queries (like range queries) can be handled efficiently and rather trivially, makes P-Grid suitable for data-oriented applications.
Fast overlay construction is analogous to building an index on a new set of keys, making P-Grid suitable as the underlying indexing mechanism for peer-to-peer information retrieval applications, among other potential applications that may require frequent indexing of new attributes apart from regular updates to an existing index. In order to deal with membership dynamics, in particular peers' physical addresses changing across sessions, the overlay itself is used as a (self-referential) directory service for maintaining the participating peers' physical addresses across sessions. Exploiting this self-referential directory, a family of overlay maintenance schemes has been designed with lower communication overhead than other overlay maintenance strategies. The notion of a dynamic equilibrium study for overlays under continuous churn and repairs, modeled as a Markov process, was introduced in order to evaluate and compare the overlay maintenance schemes. While the self-referential directory was originally invented to realize overlay maintenance schemes with lower overheads than existing ones, it is generic in nature and can be used for various other purposes, e.g., as a decentralized public key infrastructure. Persistence of peer identity across sessions, in spite of changes in physical address, provides a logical independence of the overlay network from the underlying physical network. This has many other potential uses, for example, efficient maintenance mechanisms for P2P storage systems and P2P trust and reputation management. We specifically look into the dynamics of maintaining redundancy for storage systems and design a novel lazy maintenance strategy. This strategy is algorithmically a simple variant of existing maintenance strategies which adapts to the system dynamics. This randomized lazy maintenance strategy thus explores the cost-performance trade-offs of the storage maintenance operations in a self-organizing manner. We model the storage system (redundancy), under churn and maintenance, as a Markov process. We perform an equilibrium study to show that the system operates in a more stable dynamic equilibrium with our strategy than with the existing maintenance scheme, for comparable overheads. In particular, we show that our maintenance scheme provides substantial performance gains in terms of maintenance overhead and the system's resilience in the presence of churn and correlated failures. Finally, we propose a gossip mechanism which works with lower communication overhead than existing approaches for communication among a relatively large set of unreliable peers, without assuming any specific structure for their mutual connectivity. We use such a communication primitive for propagating replica updates in P2P systems, facilitating the management of mutable content. The peer population affected by a gossip can be modeled as a Markov process. Studying the transient spread of gossips helps in choosing proper algorithm parameters to reduce communication overhead while guaranteeing coverage of online peers. Each of these substrates was developed to find practical solutions for real problems. Put together, they can be used in other applications, including a P2P storage system with support for efficient lookups and inserts, membership dynamics, content mutation and updates, persistence and availability.
Many of the ideas have already been implemented in real systems, and several others are on the way to being integrated into the implementations. There are two principal contributions of this dissertation. First, it provides designs of P2P systems which are useful for end-users as well as for other application developers, who can build upon these existing systems. Secondly, it adapts and introduces the methodology of analyzing a system's time-evolution (tools typically used in diverse domains including physics and cybernetics) to study the long-run behavior of P2P systems, and uses this methodology to (re-)design appropriate algorithms and evaluate them. We observed that studying P2P systems from the perspective of complex systems reveals their inner dynamics, and hence ways to exploit such dynamics for suitable or better algorithms. In other words, the analysis methodology in itself strongly influences and inspires the way we design such systems. We believe that such an approach of orchestrating self-organization in internet-scale systems, where the algorithms and the analysis methodology have a strong mutual influence, will significantly change the way such future systems are developed and evaluated. We envision that such an approach will particularly serve as an important tool for the nascent but fast-moving P2P systems research and development community
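As a flavour of the transient analysis used above to choose gossip parameters, here is a toy push-gossip simulation in Python (the fanout/online-probability model is an illustrative assumption, not the thesis's actual mechanism):

    import random

    def gossip_spread(n_peers, fanout, online_prob=0.9, max_rounds=50, seed=1):
        """Toy transient analysis of a push gossip.

        In each round, every informed peer pushes the gossip to
        `fanout` uniformly random peers; a push reaches its target
        with probability `online_prob`. Yields the fraction of
        informed peers after each round.
        """
        rng = random.Random(seed)
        informed = {0}  # peer 0 injects the gossip
        for round_no in range(1, max_rounds + 1):
            newly = set()
            for _ in range(len(informed) * fanout):
                target = rng.randrange(n_peers)
                if rng.random() < online_prob:
                    newly.add(target)
            informed |= newly
            yield round_no, len(informed) / n_peers
            if len(informed) == n_peers:
                break

    for round_no, fraction in gossip_spread(n_peers=1000, fanout=3):
        print(f"round {round_no}: {fraction:.1%} informed")

Plotting the informed fraction per round for several fanout values is the kind of study that lets one pick the smallest fanout that still covers (almost) all online peers within a target number of rounds.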

    High-Performance Modelling and Simulation for Big Data Applications

    This open access book was prepared as a Final Publication of the COST Action IC1406 “High-Performance Modelling and Simulation for Big Data Applications (cHiPSet)” project. Long considered important pillars of the scientific method, Modelling and Simulation have evolved from traditional discrete numerical methods to complex data-intensive continuous analytical optimisations. Resolution, scale, and accuracy have become essential to predict and analyse natural and complex systems in science and engineering. When their level of abstraction rises to provide a better discernment of the domain at hand, their representation becomes increasingly demanding of computational and data resources. On the other hand, High Performance Computing typically entails the effective use of parallel and distributed processing units coupled with efficient storage, communication and visualisation systems to underpin complex data-intensive applications in distinct scientific and technical domains. A seamless interaction of High Performance Computing with Modelling and Simulation is therefore arguably required in order to store, compute, analyse, and visualise large data sets in science and engineering. Funded by the European Commission, cHiPSet has provided a dynamic trans-European forum for its members and distinguished guests to openly discuss novel perspectives and topics of interest for these two communities. This cHiPSet compendium presents a set of selected case studies related to healthcare, biological data, computational advertising, multimedia, finance, bioinformatics, and telecommunications

    Scalable hosting of web applications

    Modern Web sites have evolved from simple monolithic systems to complex multi-tiered systems. In contrast to traditional Web sites, these sites do not simply deliver pre-written content but dynamically generate content using (one or more) multi-tiered Web applications. In this thesis, we addressed the question: how can multi-tiered Web applications be hosted in a scalable manner? Scaling up a Web application requires scaling its individual tiers. To this end, various research works have proposed techniques that employ replication or caching solutions at different tiers. However, most of these techniques aim to optimize the performance of individual tiers and not the entire application. A key observation made in our research is that there exists no elixir technique that performs best for all Web applications. Effective hosting of a Web application requires careful selection and deployment of several techniques at different tiers. To this end, we present several caching and replication strategies, such as GlobeCBC, GlobeDB and GlobeTP, to improve the scalability of different tiers of a Web application. While these techniques and systems improve the performance of the individual tiers (and eventually the application), an application's administrator is interested not only in the performance of its individual tiers but also in its end-to-end performance. To this end, we propose a resource provisioning approach that allows us to choose the best resource configuration for hosting a Web application such that its end-to-end response time can be optimized with minimum usage of resources. The proposed approach is based on an analytical model for multi-tier systems, which allows us to derive expressions for estimating the mean end-to-end response time and its variance. Steen, M.R. van [Promotor]; Pierre, G.E.O. [Copromotor]
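The analytical model itself is not reproduced in the abstract; to illustrate the general flavour of such models, the following sketch treats each tier as an independent M/M/1 queue and sums per-tier response times (a textbook simplification with made-up numbers, not the thesis's exact model):

    def mean_response_time(arrival_rate, service_rates, visit_ratios):
        """Mean end-to-end response time of a multi-tier application.

        Textbook approximation: tier i is an M/M/1 queue with
        arrival rate lambda_i = arrival_rate * visit_ratios[i] and
        service rate mu_i = service_rates[i]; its mean response
        time 1 / (mu_i - lambda_i) is weighted by the visit ratio.
        """
        total = 0.0
        for mu, visits in zip(service_rates, visit_ratios):
            lam = arrival_rate * visits
            if lam >= mu:
                raise ValueError("a tier is saturated; no steady state")
            total += visits / (mu - lam)
        return total

    # Hypothetical 3-tier site: web server, application server, database.
    print(mean_response_time(
        arrival_rate=50.0,                   # requests per second
        service_rates=[200.0, 120.0, 80.0],  # per-tier capacity (req/s)
        visit_ratios=[1.0, 1.0, 0.6]))       # 60% of requests hit the DB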

    Entrega de conteúdos multimédia em over-the-top: caso de estudo das gravações automáticas

    Doctorate in Electrical Engineering. Over-The-Top (OTT) multimedia delivery is a very appealing approach for providing ubiquitous, flexible, and globally accessible services capable of low-cost and unrestrained device targeting. In spite of its appeal, the underlying delivery architecture must be carefully planned and optimized to maintain a high Quality-of-Experience (QoE) and rational resource usage, especially when migrating from services running on managed networks with established quality guarantees. To address the lack of holistic research works on OTT multimedia delivery systems, this Thesis focuses on an end-to-end optimization challenge, considering a migration use-case of a popular Catch-up TV service from managed IP Television (IPTV) networks to OTT. A global study is conducted on the importance of Catch-up TV and its impact in today's society, demonstrating the growing popularity of this time-shift service, its relevance in the multimedia landscape, and its fitness as an OTT migration use-case. Catch-up TV consumption logs are obtained from a Pay-TV operator's live production IPTV service with over 1 million subscribers to characterize demand and extract insights from service utilization at a scale and scope not yet addressed in the literature. This characterization is used to build demand forecasting models relying on machine learning techniques to enable static and dynamic optimization of OTT multimedia delivery solutions; these models produce accurate bandwidth and storage requirement forecasts, and may be used to achieve considerable power and cost savings whilst maintaining a high QoE. A novel caching algorithm, Most Popularly Used (MPU), is proposed, implemented, and shown to outperform established caching algorithms in both simulation and experimental scenarios. The need for accurate QoE measurements in OTT scenarios supporting HTTP Adaptive Streaming (HAS) motivates the creation of a new QoE model capable of taking into account the impact of key HAS aspects. By addressing the complete content delivery pipeline in the envisioned content-aware OTT Content Delivery Network (CDN), this Thesis demonstrates that significant improvements are possible in next-generation multimedia delivery solutions
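The abstract does not define MPU precisely; as a rough sketch of a popularity-driven cache in the same spirit (the admission and eviction rules below are assumptions for illustration, not the thesis's algorithm), consider:

    from collections import Counter

    class PopularityCache:
        """Popularity-driven cache sketch in the spirit of MPU.

        Admits a new item only if it is more popular than the least
        popular cached item, which is then evicted. Popularity is a
        plain request counter here; the thesis's MPU algorithm may
        measure and refresh popularity differently.
        """

        def __init__(self, capacity):
            self.capacity = capacity
            self.store = {}
            self.hits = Counter()  # request counts per content item

        def get(self, key):
            self.hits[key] += 1
            return self.store.get(key)

        def put(self, key, value):
            if key in self.store or len(self.store) < self.capacity:
                self.store[key] = value
                return
            # Cache full: evict the least popular item, but only if
            # the candidate has proven more popular than it.
            victim = min(self.store, key=lambda k: self.hits[k])
            if self.hits[key] > self.hits[victim]:
                del self.store[victim]
                self.store[key] = value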

    Ninth Workshop and Tutorial on Practical Use of Coloured Petri Nets and the CPN Tools, Aarhus, Denmark, October 20-22, 2008

    This booklet contains the proceedings of the Ninth Workshop on Practical Use of Coloured Petri Nets and the CPN Tools, October 20-22, 2008. The workshop is organised by the CPN group at the Department of Computer Science, University of Aarhus, Denmark. The papers are also available in electronic form via the web pages: http://www.daimi.au.dk/CPnets/workshop0

    Dynamic Compilation for Functional Programs

    This thesis is about dynamic translation and optimization of functional programs, i.e., compilation that takes place at run time. The goal of the optimization is increased run-time efficiency, which is obtained by compiler-directed elimination of programming language abstractions. Object-oriented programming languages have been implemented for several decades using run-time compilation techniques. With the introduction of the Java programming language and its virtual machine-based execution model, the practicability of this implementation method for real-world applications has been proved. Many aspects of modern programming languages, such as dynamic loading and linking of code (even across networks), reflection, and security solutions (e.g., sandboxing), can be realized efficiently only by using dynamic transformation techniques. The goal of this work is to show that purely functional programming languages can be efficiently implemented in a similar way, and that these languages even offer advantages when compared to more common object-oriented languages: efficiency, security and correctness of programs are easier to ensure in the functional setting.
Towards this goal, we design and develop implementation techniques that enable dynamic compilation and optimization of functional programming languages: we describe an intermediate representation for functional programs (typed dynamic continuation-passing style) which is well suited for dynamic compilation. Based on this representation, we have developed an extension for incremental and selective code generation. The main contribution of this work is to show how dynamic specialization of polymorphic functions and data structures can considerably increase the run-time efficiency of functional programs. We present the results of experimental measurements on a prototypical implementation, which demonstrate that functional programs can be efficiently compiled dynamically
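To give a flavour of continuation-passing style, here is a toy example in Python rather than in the thesis's typed intermediate representation:

    def fact_cps(n, k):
        """Factorial in continuation-passing style (CPS).

        Instead of returning results, every step passes its result
        to an explicit continuation `k`; this is the code shape of
        a CPS intermediate representation, where control flow is
        made explicit and easy for a compiler to manipulate.
        """
        if n == 0:
            return k(1)
        return fact_cps(n - 1, lambda result: k(n * result))

    print(fact_cps(5, lambda result: result))  # prints 120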

    Design-time performance analysis of component-based real-time systems

    In current real-time systems, performance metrics are among the most challenging properties to specify, predict and measure. Performance properties depend on various factors, like environmental context, load profile, middleware, operating system, hardware platform and sharing of internal resources. Performance failures and unsatisfied performance requirements cause delays, cost overruns, and even abandonment of projects. In order to avoid these performance-related project failures, the performance properties should be obtained and analyzed already at the early design phase of a project. In this thesis we employ principles of component-based software engineering (CBSE), which enable building software systems from individual components. The advantage of CBSE is that individual components can be modeled, reused and traded. The main objective of this thesis is to develop a method that enables prediction of the performance properties of a system, based on the performance properties of the involved individual components. The prediction method serves rapid prototyping and performance analysis of the architecture or related alternatives, without performing the usual testing and implementation stages. The involved research questions are as follows. How should the behaviour and performance properties of individual components be specified in order to enable automated composition of these properties into an analyzable model of a complete system? How can the models of individual components be synthesized into a model of a complete system in an automated way, such that the resulting system model can be analyzed against the performance properties? The thesis presents a new framework called DeepCompass, which realizes the concept of predictable assembly throughout all phases of the system design. The cornerstones of the framework are the composable models of individual software components and hardware blocks. The models are specified at component development time and shipped in a component package. At the component composition phase, the models of the constituent components are synthesized into an executable system model. Since the thesis focuses on performance properties, we introduce performance-related types of component models, such as behaviour, performance and resource models. The dynamics of the system execution are captured in scenario models. The essential advantage of the introduced models is that, through the behaviour of individual components and the scenario models, the behaviour of the complete system is synthesized in the executable system model. Further simulation-based analysis of the obtained executable system model provides application-specific and system-specific performance property values. To support the performance analysis, we have developed the CARAT software toolkit, which provides and automates the algorithms for model synthesis and simulation. Besides this, the toolkit provides graphical tools for designing alternative architectures and for visualization of the obtained performance properties. We have conducted an empirical case study on the use of scenarios in industry to analyze system performance at the early design phase. It was found that industrial architects make extensive use of scenarios for performance evaluation. Based on the inputs of the architects, we have provided a set of guidelines for the identification and use of performance-critical scenarios.
At the end of this thesis, we validate the DeepCompass framework by performing three case studies on performance prediction of real-time systems: an MPEG-4 video decoder, a Car Radio Navigation system and a JPEG application. For each case study, we constructed models of the individual components, defined the SW/HW architecture, and used the CARAT toolkit to synthesize and simulate the executable system model. The simulation provided the predicted performance properties, which we later compared with the actual performance properties of the realized systems. With respect to resource usage properties and average task latencies, the prediction error stayed within 30% of the actual performance. Concerning the peak loads on the processor nodes, the actual values were sometimes three times larger than the predicted values. In conclusion, the framework has proven to be effective in rapid architecture prototyping and performance analysis of a complete system: in the case studies we spent on average no more than 4-5 days per complete iteration cycle, including the design of several architecture alternatives. The framework can handle different architectural styles, which makes it widely applicable. A conceptual limitation of the framework is that it assumes that the models of individual components are already available at the design phase
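As a much-simplified illustration of composing component performance models along scenarios (not the actual DeepCompass/CARAT synthesis; the component names and numbers are invented), one can sum component execution times per scenario and derive processor utilization:

    # Hypothetical component performance models: execution time in ms.
    COMPONENT_WCET = {"decoder": 8.0, "renderer": 5.0, "mixer": 2.5}

    def scenario_latency(scenario):
        """End-to-end latency of a scenario: a chain of component calls."""
        return sum(COMPONENT_WCET[component] for component in scenario)

    def processor_utilization(periodic_scenarios):
        """Utilization induced by periodic scenarios (latency / period)."""
        return sum(scenario_latency(scenario) / period
                   for scenario, period in periodic_scenarios)

    video_frame = ["decoder", "renderer"]   # runs every 40 ms (25 fps)
    audio_frame = ["decoder", "mixer"]      # runs every 20 ms
    print(scenario_latency(video_frame))    # 13.0 ms
    print(processor_utilization([(video_frame, 40.0), (audio_frame, 20.0)]))

A utilization close to or above 1.0 in such a back-of-the-envelope model signals that a scenario mix will miss deadlines on a single processor, which is the kind of early-design insight the framework aims to provide (with far more fidelity).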