
    A Scalable Solution For Interactive Video Streaming

    This dissertation presents an overall solution for interactive Near Video On Demand (NVOD) systems, where limited server and network resources prevent the system from servicing all customers' requests. The interactive nature of recent workloads complicates matters further, since interactive requests require additional resources to handle. This dissertation analyzes system performance under a realistic workload using different stream merging techniques and scheduling policies. It considers a wide range of system parameters and studies their impact on waiting and blocking metrics. To improve the experience of waiting customers, we propose a new scheduling policy for waiting customers that is fairer and delivers decent performance. Blocking is a major issue in interactive NVOD systems, and we propose several techniques to minimize it. In particular, we study the maximum Interactive Stream (I-Stream) length (Threshold) that should be allowed, in order to prevent a few requests from occupying the expensive I-Streams for a prolonged period and starving other requests of a chance to use this valuable resource. Using a reasonable I-Stream threshold proves very effective in improving blocking metrics. Moreover, we introduce an I-Stream provisioning policy that dynamically shifts resources based on current system requirements. The proposed policy proves highly effective in improving overall system performance. To account for both average waiting time and average blocking time, we introduce a new metric (Aggregate Delay). We also study the client-side cache management policy. We utilize the customer's cache to service most interactive requests, which reduces the load on the server. We propose three purging algorithms to clear data when the cache gets full. Purge Oldest removes the oldest data in the cache, whereas Purge Furthest clears the data furthest from the client's playback point. In contrast, Adaptive Purge tries to avoid purging any data that includes the customer's playback point or the playback point of any stream the client is listening to. Additionally, we study the impact of the purge block, the minimum amount of data to be cleared at a time, on system performance. Finally, we study the effect of bookmarking on system performance. A video segment that is searched for and watched repeatedly is called a hotspot and is pointed to by a bookmark. We introduce three enhancements to support bookmarking effectively. Specifically, we propose a new purging algorithm that avoids purging hotspot data once it is cached. On top of that, we fetch hotspot data for customers not listening to any stream. Furthermore, we reserve multicast channels to fetch hotspot data.
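
    To make the three purging policies concrete, the following Python sketch evicts cached video segments under each rule; all names and data structures are invented for illustration, not the dissertation's code. The purge block is modelled by rounding the amount to evict up to a configured minimum.

        from dataclasses import dataclass, field

        @dataclass
        class Segment:
            offset: int         # playback offset (seconds) where the segment starts
            length: int         # segment length in seconds
            inserted_at: float  # time the segment entered the cache

        @dataclass
        class ClientCache:
            purge_block: int    # minimum amount of data cleared per purge
            segments: list = field(default_factory=list)

            def purge_oldest(self, needed):
                # Purge Oldest: evict the least recently inserted data first.
                self._evict(sorted(self.segments, key=lambda s: s.inserted_at), needed)

            def purge_furthest(self, playback_point, needed):
                # Purge Furthest: evict the data furthest from the playback point.
                self._evict(sorted(self.segments,
                                   key=lambda s: abs(s.offset - playback_point),
                                   reverse=True), needed)

            def adaptive_purge(self, protected_points, needed):
                # Adaptive Purge: never evict a segment covering the client's
                # playback point or that of any stream the client listens to.
                def covered(s):
                    return any(s.offset <= p < s.offset + s.length
                               for p in protected_points)
                self._evict([s for s in self.segments if not covered(s)], needed)

            def _evict(self, victims, needed):
                freed, target = 0, max(needed, self.purge_block)
                for s in victims:
                    if freed >= target:
                        break
                    self.segments.remove(s)
                    freed += s.length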

    Facebook (A)Live? Are Live Social Broadcasts Really Broadcasts?

    The era of live broadcast is back, but with two major changes. First, unlike traditional TV broadcasts, content is now streamed over the Internet, enabling it to reach a wider audience. Second, thanks to various user-generated content platforms, it has become possible for anyone to get involved, streaming their own content to the world. This emerging trend of going live usually happens via social platforms, where users perform live social broadcasts predominantly from their mobile devices, allowing their friends (and the general public) to engage with the stream in real time. With the growing popularity of such platforms, the burden on the current Internet infrastructure is expected to multiply. With this in mind, we explore one such prominent platform: Facebook Live. We gather 3TB of data, representing one month of global activity, and explore the characteristics of live social broadcasts. From this, we derive simple yet effective principles that can decrease the network burden. We then dissect global and hyper-local properties of videos while on-air, capturing the geography of both the broadcasters (the users who produce the video) and the viewers (the users who interact with it). Finally, we study social engagement while the video is live and identify the key differences when the same video goes on-demand. A common theme throughout the paper is that, despite its name, many attributes of Facebook Live deviate from the concepts of both live and broadcast. (Published at The Web Conference 2018 (WWW 2018); please cite the WWW version.)

    A survey on cost-effective context-aware distribution of social data streams over energy-efficient data centres

    Social media have emerged in the last decade as a viable and ubiquitous means of communication. The ease of user content generation within these platforms, e.g. check-in information, multimedia data, etc., along with the proliferation of Global Positioning System (GPS)-enabled, always-connected capture devices, has led to data streams of unprecedented volume and a radical change in information sharing. Social data streams raise a variety of practical challenges, including the derivation of real-time meaningful insights from effectively gathered social information, as well as a paradigm shift for content distribution that leverages the contextual data associated with user preferences, geographical characteristics and devices in general. In this article we present a comprehensive survey that outlines the state of the art and organizes the challenges concerning social media streams and the infrastructure of the data centres supporting efficient access to data streams, in terms of content distribution, data diffusion, data replication, energy efficiency and network infrastructure. We systematize the existing literature and identify and analyse the main research points and industrial efforts in the area as far as modelling, simulation and performance evaluation are concerned.

    Performance characterization of multi-container deployment schemes for online learning inference

    Online machine learning (ML) inference services provide users with an interactive way to request predictions in real time. To meet the notable computational requirements of such services, they are increasingly being deployed in the Cloud. In this context, the efficient provisioning and optimization of ML inference services in the Cloud is critical to achieve the required performance and serve dynamic end-user queries. Existing provisioning solutions focus on framework parameter tuning and infrastructure resource scaling, without considering deployments based on containerization technologies, which promise reproducibility and portability for ML inference services. There is limited knowledge about the impact of distinct container-level deployment schemes on the performance of online ML inference services, particularly on how to exploit multi-container deployments and their relation to processor and memory affinity. In light of this, in this paper we experimentally investigate the containerization of ML inference services and analyze the performance of multi-container deployments that partition the threads belonging to an online learning application into multiple containers on each node. This paper shares the findings and lessons learned from running realistic client patterns against an image classification model across numerous deployment configurations, especially including the impact of container granularity and its potential to exploit processor and memory affinity. Our results indicate that fine-grained multi-container deployments and affinity are useful for improving performance (both throughput and latency). In particular, our experiments on single-node and four-node clusters show up to 69% and 87% performance improvement, respectively, compared to the single-container deployment. (This work was partially supported by Lenovo as part of the Lenovo-BSC collaboration agreement, by the Spanish Government under contract PID2019-107255GB-C22, and by the Generalitat de Catalunya under contract 2021-SGR-00478 and under grant 2020 FI-B 00257.)
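
    As a rough illustration of the kind of fine-grained, affinity-aware deployment the paper evaluates, the sketch below splits an inference service across containers, pinning each one to a disjoint core range and the local NUMA node via Docker's cpuset flags. The image name, core counts and the OMP_NUM_THREADS thread knob are assumptions for illustration, not the paper's tooling.

        def deployment_commands(image, total_cores=16, containers=4, numa_nodes=2):
            """Emit one docker run command per container, each pinned to a
            disjoint core range and the NUMA node local to those cores."""
            per = total_cores // containers
            cmds = []
            for i in range(containers):
                first, last = i * per, i * per + per - 1
                node = i * numa_nodes // containers   # keep memory local to the cores
                cmds.append(
                    f"docker run -d --cpuset-cpus={first}-{last} "
                    f"--cpuset-mems={node} "
                    f"-e OMP_NUM_THREADS={per} "      # assumed thread knob of the app
                    f"{image}"
                )
            return cmds

        # Example: a four-container deployment of a hypothetical inference image.
        for cmd in deployment_commands("inference-service:latest"):
            print(cmd)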

    Measurements and analysis of a major adult video portal

    Today the Internet is a large multimedia delivery infrastructure, with websites such as YouTube appearing at the top of most measurement studies. However, most traffic studies have ignored an important domain: adult multimedia distribution. Whereas such services were traditionally provided primarily via bespoke websites, they have recently converged towards what is known as "Porn 2.0". These services allow users to upload, view, rate and comment on videos for free (much like YouTube). Despite their scale, we still lack even a basic understanding of their operation. This paper addresses this gap by performing a large-scale study of one of the most popular Porn 2.0 websites: YouPorn. Our measurements reveal a global delivery infrastructure that we have repeatedly crawled to collect statistics on 183k videos. We use this data to characterise the corpus, as well as to inspect popularity trends and how they relate to other features, e.g. categories and ratings. To explore our discoveries further, we conduct a small-scale user study, highlighting key system implications.

    Towards Efficient and Scalable Data-Intensive Content Delivery: State-of-the-Art, Issues and Challenges

    This chapter presents the authors' work for the Case Study entitled "Delivering Social Media with Scalability" within the framework of the High-Performance Modelling and Simulation for Big Data Applications (cHiPSet) COST Action 1406. We identify some core research areas and outline the publications we produced within the framework of the aforementioned Action. The ease of user content generation within social media platforms, e.g. check-in information, multimedia data, etc., along with the proliferation of Global Positioning System (GPS)-enabled, always-connected capture devices, has led to data streams of unprecedented volume and a radical change in information sharing. Social data streams raise a variety of practical challenges: the derivation of real-time meaningful insights from effectively gathered social information, a paradigm shift for content distribution that leverages the contextual data associated with user preferences, geographical characteristics and devices in general, etc. In this article we present the methodology we followed, the results of our work, and the outline of a comprehensive survey that depicts the state of the art and organizes the challenges concerning social media streams and the infrastructure of the data centres supporting efficient access to data streams, in terms of content distribution, data diffusion, data replication, energy efficiency and network infrastructure. The challenges of enabling better provisioning of social media data were identified based on the context of users accessing these resources. The existing literature has been systematized, and the main research points and industrial efforts in the area have been identified and analyzed. In our works within the framework of the Action, we proposed potential solutions to the problems of the area and described how these fit into the general ecosystem.

    User experience driven CPU frequency scaling on mobile devices towards better energy efficiency

    With the development of modern smartphones, mobile devices have become ubiquitous in our daily lives. With high processing capabilities and a vast number of applications, users now need them for both business and personal tasks. Unfortunately, battery technology did not scale at the same speed as computational power. Hence, modern smartphone batteries often last for less than a day before they need to be recharged. One of the most power-hungry components is the central processing unit (CPU). Multiple techniques are applied to reduce CPU energy consumption, among them dynamic voltage and frequency scaling (DVFS). This technique reduces energy consumption by dynamically changing the CPU supply voltage depending on the currently running workload. Reducing the voltage, however, also makes it necessary to reduce the clock frequency, which can have a significant impact on task performance. Current DVFS algorithms deliver a good user experience; however, as experiments conducted later in this thesis show, they do not deliver optimal energy efficiency for an interactive mobile workload. This thesis presents methods and tools to determine where energy can be saved during mobile workload execution when using DVFS. Furthermore, an improved DVFS technique is developed that achieves higher energy efficiency than the current standard. One important question when developing a DVFS technique is: how much can a task be slowed down to save energy before the negative effect on performance becomes intolerable? The ultimate goal when optimising a mobile system is to provide a high quality of experience (QoE) to the end user. In that context, task slowdowns become intolerable when they have a perceptible effect on QoE. Experiments conducted in this thesis answer this question by identifying workload periods in which performance changes are directly perceptible by the end user and periods in which they are imperceptible, namely interaction lags and interaction idle periods. Interaction lags are the time it takes the system to process a user interaction and display the corresponding response. Idle periods are the periods between interactions, during which the user perceives the system as idle and ready for the next input. By knowing where those periods are and how they are affected by frequency changes, a more energy-efficient DVFS governor can be developed. This thesis begins by introducing a methodology that measures the duration of interaction lags as perceived by the user and uses them as an indicator to benchmark the quality of experience of a workload execution. A representative benchmark workload is generated comprising 190 minutes of interactions collected from real users. In conjunction with this QoE benchmark, a DVFS Oracle study is conducted, which finds a frequency profile for an interactive mobile workload with the maximum energy savings achievable without a perceptible performance impact on the user. The developed Oracle performance profile achieves a QoE indistinguishable from always running at the fastest frequency while needing 45% less energy. Furthermore, this Oracle is used as a baseline to evaluate how well current mobile frequency governors perform. It shows that none of these governors perform particularly well and that up to 32% energy savings are possible. Equipped with a benchmark and an optimisation baseline, a user-perception-aware DVFS technique is developed in the second part of this thesis.
    Initially, a runtime heuristic is introduced which is able to detect interaction lags as the user would perceive them. Using this heuristic, a reinforcement-learning-driven governor is developed which learns good frequency settings for interaction lags and idle periods from sample observations. It consumes up to 22% less energy than the current standard governors on mobile devices while maintaining a low impact on QoE.
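
    A minimal sketch of the idea behind such a governor, written as a bandit-style reinforcement learner in Python: it learns one frequency preference per period type (interaction lag vs. idle), trading energy against perceptible lag. The reward shape, the 100 ms perceptibility threshold and all names are illustrative assumptions, not the thesis's implementation.

        import random

        FREQS = [0.3, 0.6, 1.0, 1.5, 2.0]   # available CPU frequencies (GHz)
        PERCEPTIBLE_LAG_MS = 100            # assumed perceptibility threshold
        ALPHA, EPSILON = 0.1, 0.1           # learning rate, exploration rate

        # One value estimate per (period type, frequency) pair.
        q = {(s, f): 0.0 for s in ("lag", "idle") for f in FREQS}

        def choose_freq(period_type):
            if random.random() < EPSILON:                         # explore
                return random.choice(FREQS)
            return max(FREQS, key=lambda f: q[(period_type, f)])  # exploit

        def observe(period_type, freq, lag_ms, energy_mj):
            # Always penalise energy; penalise lag only once it becomes perceptible,
            # so imperceptible slowdowns in idle periods remain free energy savings.
            reward = -energy_mj
            if period_type == "lag" and lag_ms > PERCEPTIBLE_LAG_MS:
                reward -= 1000.0
            q[(period_type, freq)] += ALPHA * (reward - q[(period_type, freq)])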

    On the role of performance interference in consolidated environments

    Get PDF
    Joint thesis (cotutelle) between Universitat Politècnica de Catalunya and KTH Royal Institute of Technology. With the advent of resource-shared environments such as the Cloud, virtualization has become the de facto standard for server consolidation. While consolidation improves utilization, it causes performance interference between Virtual Machines (VMs) due to contention in shared resources such as CPU, Last Level Cache (LLC) and memory bandwidth. Over-provisioning resources for performance-sensitive applications can guarantee Quality of Service (QoS); however, it results in low machine utilization. Thus, assuring QoS for performance-sensitive applications while allowing co-location has been a challenging problem. In this thesis, we identify ways to mitigate performance interference without undue over-provisioning and also point out the need to model and account for performance interference to improve the reliability and accuracy of elastic scaling. The end goal of this research is to leverage these observations to provide efficient resource management that is both performance- and cost-aware. Our main contributions are threefold. First, we improve overall machine utilization by executing best-effort applications alongside latency-critical applications without violating their performance requirements; our solution dynamically adapts to and leverages changing workload/phase behaviour to execute best-effort applications without causing excessive performance interference. Second, we identify that certain performance metrics used for elastic scaling decisions may become unreliable if performance interference is unaccounted for; by modelling performance interference, we show that these metrics become reliable in a multi-tenant environment. Third, we identify and demonstrate the impact of interference on the accuracy of elastic scaling and propose a solution that significantly reduces performance violations at a reduced cost.
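
    One way to picture the second contribution: an autoscaler should subtract the estimated interference share from its utilisation signal before deciding on a scale-out. The linear, cache-miss-based correction below is an invented stand-in for the thesis's interference model, shown only to illustrate the mechanism.

        def effective_utilisation(observed_util, llc_miss_rate, beta=0.5):
            # Part of the observed CPU utilisation is co-runner interference
            # (stalls from LLC/memory-bandwidth contention), not real demand.
            return max(0.0, observed_util - beta * llc_miss_rate)

        def desired_replicas(observed_util, llc_miss_rate, target=0.6, replicas=4):
            util = effective_utilisation(observed_util, llc_miss_rate)
            return max(1, round(replicas * util / target))

        # Naively, 85% observed utilisation would trigger a scale-out; with the
        # interference share removed, the current four replicas still suffice.
        print(desired_replicas(observed_util=0.85, llc_miss_rate=0.5))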

    A framework for the dynamic management of Peer-to-Peer overlays

    Peer-to-Peer (P2P) applications have been associated with inefficient operation, interference with other network services and large operational costs for network providers. This thesis presents a framework which can help ISPs address these issues by means of intelligent management of peer behaviour. The proposed approach involves limited control of P2P overlays without interfering with the fundamental characteristics of peer autonomy and decentralised operation. At the core of the management framework lies the Active Virtual Peer (AVP). Essentially intelligent peers operated by the network providers, the AVPs interact with the overlay from within, minimising redundant or inefficient traffic, enhancing overlay stability and facilitating the efficient and balanced use of available peer and network resources. They offer an "insider's" view of the overlay and permit the management of P2P functions in a compatible and non-intrusive manner. AVPs can support multiple P2P protocols and coordinate to perform functions collectively. To account for the multi-faceted nature of P2P applications and to allow the incorporation of modern techniques and protocols as they appear, the framework is based on a modular architecture. Core modules for overlay control and transit traffic minimisation are presented; towards the latter, a number of suitable P2P content caching strategies are proposed. Using a purpose-built P2P network simulator and small-scale experiments, it is demonstrated that the introduction of AVPs inside the network can significantly reduce inter-AS traffic, minimise costly multi-hop flows, increase overlay stability and load balancing, and offer improved peer transfer performance.
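
    As one plausible illustration of a transit-minimising caching strategy at an AVP (the thesis proposes several; this one is invented for illustration), the sketch below serves popular content pieces from inside the AS and pays the inter-AS fetch cost only on a miss.

        from collections import Counter, OrderedDict

        class AVPCache:
            def __init__(self, capacity):
                self.capacity = capacity
                self.store = OrderedDict()   # piece_id -> data, in LRU order
                self.requests = Counter()    # observed popularity of each piece

            def get(self, piece_id, fetch_inter_as):
                self.requests[piece_id] += 1
                if piece_id in self.store:          # hit: traffic stays in the AS
                    self.store.move_to_end(piece_id)
                    return self.store[piece_id]
                data = fetch_inter_as(piece_id)     # miss: costly inter-AS transfer
                if len(self.store) >= self.capacity:
                    # Among the least recently used pieces, evict the least popular.
                    lru_candidates = list(self.store)[:5]
                    victim = min(lru_candidates, key=lambda p: self.requests[p])
                    del self.store[victim]
                self.store[piece_id] = data
                return data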

    System-level power management using online machine learning for prediction and adaptation

    Nowadays, embedded devices need to be portable, battery-powered and high-performance. This need for high performance makes power management a matter of critical priority. Power management algorithms exist, but most approaches focus on an energy-performance trade-off that is oblivious to the applications running on the system; others are application-specific, and their solutions cannot be applied to other applications. This work proposes Shepherd, a cross-layer runtime management system for reducing energy consumption while offering soft real-time performance. It is cross-layer because it takes performance requirements from the application and learns to adjust the power management knobs to provide the expected performance at the minimum energy cost. Shepherd is implemented as a Linux governor running at OS level; this layer offers a low-overhead interface to change the CPU voltage and frequency dynamically. As opposed to the reactive behaviour of Linux governors, Shepherd adapts to application-specific performance requirements dynamically and proactively selects the power state that fulfils these requirements while consuming the least power. Proactiveness is achieved by using an adaptive exponentially weighted moving average (AEWMA) to anticipate the upcoming workload. These adaptations are facilitated by a model-free reinforcement learning algorithm that exploits the optimal decisions once it has learned them. To enable Shepherd to work with different applications, a programming framework was designed that allows programmers to make their applications power-aware, enabling them to send their performance requirements and annotations to Shepherd and obtain the desired cross-layer soft real-time performance. Shepherd is implemented within Linux kernel 3.7.10, interfacing with the application and hardware to select an appropriate voltage-frequency setting for the executing application. The performance of Shepherd is demonstrated on an ARM Cortex-A8 processor. Experiments conducted with multimedia applications demonstrate that Shepherd reduces energy consumption by up to 30% compared with existing governors. Also, the framework has been used to adapt example applications to work with Shepherd, achieving 60% energy savings compared to the existing approaches.
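
    A minimal sketch of an adaptive EWMA predictor of the kind Shepherd uses to anticipate the upcoming workload; the adaptation rule (growing the smoothing factor when recent prediction error spikes) is one common choice, not necessarily the thesis's exact formulation.

        class AEWMA:
            """Adaptive EWMA: the smoothing factor grows when recent prediction
            error spikes (to track phase changes quickly) and shrinks when the
            workload is stable (to filter out noise)."""

            def __init__(self, alpha_min=0.1, alpha_max=0.9):
                self.alpha_min, self.alpha_max = alpha_min, alpha_max
                self.estimate = None
                self.err = 0.0   # smoothed absolute prediction error

            def update(self, observed):
                if self.estimate is None:
                    self.estimate = observed
                    return self.estimate
                error = abs(observed - self.estimate)
                self.err = 0.9 * self.err + 0.1 * error
                ratio = error / self.err if self.err else 1.0
                alpha = min(self.alpha_max, max(self.alpha_min, self.alpha_min * ratio))
                self.estimate += alpha * (observed - self.estimate)
                return self.estimate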