
    State Management for Efficient Event Pattern Detection

    Event stream processing systems continuously evaluate queries over event streams to detect user-specified patterns with low latency. The challenge is that query processing is stateful: the system maintains partial matches whose number grows exponentially with the number of processed events. State management is complicated by the dynamicity of streams and the need to integrate remote data. First, heterogeneous event sources yield dynamic streams with unpredictable input rates, data distributions, and query selectivities. During peak times, exhaustive processing is unreasonable, and systems must resort to best-effort processing. Second, queries may require remote data to select a specific event for a pattern. Such dependencies are problematic: fetching the remote data interrupts the stream processing, yet without event selection based on remote data, the growth of partial matches is amplified. In this dissertation, I present strategies for optimised state management in event pattern detection. First, I enable best-effort processing with load shedding that discards both input events and partial matches. I carefully select the shedding elements to satisfy a latency bound while striving for a minimal loss in result quality. Second, to efficiently integrate remote data, I decouple the fetching of remote data from its use in query evaluation by a caching mechanism. To this end, I hide the transmission latency by prefetching remote data based on anticipated use and by lazy evaluation that postpones the event selection based on remote data to avoid interruptions. A cost model is used to determine when to fetch which remote data items and how long to keep them in the cache. I evaluated the above techniques with queries over synthetic and real-world data. I show that the load shedding technique significantly improves the recall of pattern detection over baseline approaches, while the technique for remote data integration significantly reduces the pattern detection latency.
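
    To make the load-shedding idea concrete, here is a minimal sketch in Python; it is not the dissertation's actual algorithm. Queued input events and partial matches carry an estimated utility, and the lowest-utility items are dropped whenever an estimated latency exceeds the bound. The class name, utility scores, and per-item cost are illustrative assumptions.

```python
from itertools import count

class LoadShedder:
    """Keep queued input events and partial matches together with an estimated
    utility, and drop the lowest-utility items whenever the estimated
    processing latency exceeds the latency bound."""

    def __init__(self, latency_bound_ms, cost_per_item_ms=0.05):
        self.latency_bound_ms = latency_bound_ms
        self.cost_per_item_ms = cost_per_item_ms   # assumed average cost per queued item
        self.queue = []                            # entries: (utility, seq, kind, item)
        self._seq = count()                        # tie-breaker for equal utilities

    def _estimated_latency_ms(self):
        return len(self.queue) * self.cost_per_item_ms

    def add(self, kind, utility, item):
        """kind is 'event' or 'partial_match'; utility estimates how likely the
        item is to contribute to a complete pattern match."""
        self.queue.append((utility, next(self._seq), kind, item))
        self._shed_if_needed()

    def _shed_if_needed(self):
        if self._estimated_latency_ms() <= self.latency_bound_ms:
            return
        self.queue.sort(reverse=True)              # keep the highest-utility items in front
        while self.queue and self._estimated_latency_ms() > self.latency_bound_ms:
            self.queue.pop()                       # drop the current lowest-utility item
```

    A real system would derive the utility from query selectivities and the current input rate; in this sketch it is simply supplied by the caller.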

    Mitigating Interference During Virtual Machine Live Migration through Storage Offloading

    Today's cloud landscape has evolved computing infrastructure into a dynamic, high-utilization, service-oriented paradigm. This shift has enabled the commoditization of large-scale storage and distributed computation, allowing engineers to tackle previously untenable problems without large upfront investment. A key enabler of flexibility in the cloud is the ability to transfer running virtual machines across subnets or even datacenters using live migration. However, live migration can be a costly process, one that has the potential to interfere with other applications not involved with the migration. This work investigates storage interference through experimentation with real-world systems and well-established benchmarks. In order to address migration interference in general, a buffering technique is presented that offloads the migration's reads, eliminating interference in the majority of scenarios.
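
    The following Python sketch illustrates the general buffering idea under stated assumptions; it is not the paper's implementation. Migration reads are served from a large sequential read-ahead buffer so they contend less with co-located application I/O. The class name, buffer size, and `source` interface are hypothetical.

```python
import io

class MigrationReadBuffer:
    """Serve the migration's many small disk reads from one large sequential
    read-ahead buffer so that they contend less with co-located workload I/O.
    The class name, buffer size, and `source` interface are hypothetical."""

    def __init__(self, source: io.RawIOBase, buffer_size: int = 64 * 1024 * 1024):
        self.source = source
        self.buffer_size = buffer_size
        self.buffer = b""
        self.offset = 0

    def read(self, n: int) -> bytes:
        # Refill with a single large sequential read instead of many small ones.
        if self.offset + n > len(self.buffer):
            remaining = self.buffer[self.offset:]
            self.buffer = remaining + (self.source.read(self.buffer_size) or b"")
            self.offset = 0
        chunk = self.buffer[self.offset:self.offset + n]
        self.offset += len(chunk)
        return chunk
```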

    Prefetching techniques for client server object-oriented database systems

    The performance of many object-oriented database applications suffers from the page fetch latency which is determined by the expense of disk access. In this work we suggest several prefetching techniques to avoid, or at least to reduce, page fetch latency. In practice no prediction technique is perfect and no prefetching technique can entirely eliminate delay due to page fetch latency. Therefore we are interested in the trade-off between the level of accuracy required for obtaining good results in terms of elapsed time reduction and the processing overhead needed to achieve this level of accuracy. If prefetching accuracy is high, the total elapsed time of an application can be reduced significantly; if prefetching accuracy is low, many incorrect pages are prefetched and the extra load on the client, network, server, and disks degrades overall system performance. Access patterns of object-oriented databases are often complex and usually hard to predict accurately. The ..
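
    As a rough illustration of the accuracy/overhead trade-off, the Python sketch below prefetches a candidate page only when a simple predictor's confidence exceeds a threshold; it is not one of the techniques proposed in the work, and the pairwise transition table and threshold value are illustrative assumptions.

```python
class ThresholdPrefetcher:
    """Prefetch a candidate page only when the predictor's confidence exceeds a
    threshold, trading prefetching accuracy against the extra load that wrong
    prefetches put on the client, network, server, and disks. The pairwise
    transition table is a deliberately simple stand-in for a real predictor."""

    def __init__(self, threshold=0.6):
        self.threshold = threshold
        self.transitions = {}      # page -> {next_page: observed count}
        self.last_page = None

    def record_fetch(self, page):
        """Call on every demand fetch to update the access-pattern model."""
        if self.last_page is not None:
            nexts = self.transitions.setdefault(self.last_page, {})
            nexts[page] = nexts.get(page, 0) + 1
        self.last_page = page

    def pages_to_prefetch(self):
        """Return the pages predicted to be needed next with high confidence."""
        nexts = self.transitions.get(self.last_page, {})
        total = sum(nexts.values())
        if total == 0:
            return []
        return [page for page, count in nexts.items() if count / total >= self.threshold]
```

    Raising the threshold reduces wasted prefetches and the extra load they cause, at the price of hiding less of the page fetch latency.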

    Human Mobility and Application Usage Prediction Algorithms for Mobile Devices

    Mobile devices such as smartphones and smart watches are ubiquitous companions in humans' daily life. Since 2014, there have been more mobile devices on Earth than humans. Mobile applications utilize the sensors and actuators of these devices to support individuals in their daily life. In particular, 24% of Android applications leverage users' mobility data. For instance, this data allows applications to understand which places an individual typically visits, which enables providing her with transportation information or location-based advertisements, or controlling smart home heating systems. These and similar scenarios require the possibility to access the Internet from everywhere and at any time: 83% of the applications available in the Android Play Store require the Internet to operate properly. Mobile applications such as Google Now or Apple Siri utilize human mobility data to anticipate where a user will go next or which information she is likely to access en route to her destination. However, predicting human mobility is a challenging task. Existing mobility prediction solutions are typically optimized a priori for a particular application scenario and mobility prediction task. There is no approach that allows for automatically composing a mobility prediction solution depending on the underlying prediction task and other parameters. Such an approach is required if mobile devices are to support a plethora of mobile applications, each of which supports its users by leveraging mobility predictions in a distinct application scenario. Mobile applications also rely strongly on the availability of the Internet to work properly, yet mobile cellular network providers are struggling to provide the necessary cellular resources. Mobile applications generated a monthly average mobile traffic volume that ranged between 1 GB in Asia and 3.7 GB in North America in 2015, and the Ericsson Mobility Report Q1 2016 predicts that by the end of 2021 this mobile traffic volume will experience a 12-fold increase. The consequences are higher costs for both providers and consumers and a reduced quality of service due to congested mobile cellular networks. Several countermeasures can be applied to cope with these problems. For instance, mobile applications apply caching strategies to prefetch application content by predicting which applications will be used next. However, existing solutions suffer from two major shortcomings: they either (1) do not incorporate traffic volume information into their prefetching decisions and thus generate a substantial amount of cellular traffic, or (2) require a modification of mobile application code. In this thesis, we present novel human mobility and application usage prediction algorithms for mobile devices. These two major contributions address the aforementioned problems of (1) selecting a human mobility prediction model and (2) prefetching mobile application content to reduce cellular traffic. First, we address the selection of human mobility prediction models. We report on an extensive analysis of the influence of temporal, spatial, and phone context data on the performance of mobility prediction algorithms. Building upon our analysis results, we present (1) SELECTOR, a novel algorithm for selecting individual human mobility prediction models, and (2) MAJOR, an ensemble learning approach for human mobility prediction. Furthermore, we introduce population mobility models and demonstrate their practical applicability. In particular, we analyze techniques that focus on the detection of wrong human mobility predictions; among these techniques, an ensemble learning algorithm called LOTUS is designed and evaluated. Second, we present EBC, a novel algorithm for prefetching mobile application content. EBC's goal is to reduce cellular traffic consumption while maintaining application content freshness. With respect to existing solutions, EBC introduces novel techniques (1) to apply different prefetching strategies depending on the available network type and (2) to incorporate application traffic volume predictions into the prefetching decisions. EBC also achieves a reduction in application launch time at the cost of a negligible increase in energy consumption. Developing human mobility and application usage prediction algorithms requires access to human mobility and application usage data. To this end, we leverage three publicly available data sets in this thesis. Furthermore, we address the shortcomings of these data sets, namely (1) the lack of ground-truth mobility data and (2) the lack of human mobility data at short-term events like conferences. With JK2013 and the UbiComp Data Collection Campaign (UbiDCC), we contribute two human mobility data sets that address these shortcomings. We also develop and make publicly available a mobile application called LOCATOR, which was used to collect our data sets. In summary, the contributions of this thesis provide a step further towards supporting mobile applications and their users. With SELECTOR, we contribute an algorithm that allows optimizing the quality of human mobility predictions by appropriately selecting parameters. To reduce the cellular traffic footprint of mobile applications, we contribute EBC, a novel approach for prefetching mobile application content by leveraging application usage predictions. Furthermore, we provide insights about how and to what extent wrong and uncertain human mobility predictions can be detected. Lastly, with our mobile application LOCATOR and two human mobility data sets, we contribute practical tools for researchers in the human mobility prediction domain.
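
    The sketch below shows, in Python, roughly how network type and predicted traffic volume could be combined in a prefetching decision; it is a hedged illustration in the spirit of EBC, not EBC itself, and all parameter names and thresholds are made up.

```python
def should_prefetch(network_type, predicted_launch_prob, predicted_traffic_mb,
                    wifi_threshold=0.3, cellular_threshold=0.7, cellular_budget_mb=5.0):
    """Decide whether to prefetch an application's content now. On Wi-Fi the
    decision is permissive; on cellular, only highly likely apps whose predicted
    traffic volume fits a small budget are prefetched. All thresholds are
    illustrative and not EBC's actual parameters."""
    if network_type == "wifi":
        return predicted_launch_prob >= wifi_threshold
    if network_type == "cellular":
        return (predicted_launch_prob >= cellular_threshold
                and predicted_traffic_mb <= cellular_budget_mb)
    return False

# Example: a likely-to-be-launched app with a 2 MB feed update on a cellular link.
print(should_prefetch("cellular", predicted_launch_prob=0.8, predicted_traffic_mb=2.0))
```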

    Surfing the optimization space of a multiple-GPU parallel implementation of a X-ray tomography reconstruction algorithm

    The increasing popularity of massively parallel architectures based on accelerators has opened up the possibility of significantly improving the performance of X-ray computed tomography (CT) applications towards achieving real-time imaging. However, achieving this goal is a challenging process, as most CT applications have not been designed to exploit the amount of parallelism available in these architectures. In this paper we present the massively parallel implementation and optimization of Mangoose(++), a CT application for reconstructing 3D volumes from 2D images collected by scanners based on cone-beam geometry. The main contributions of this paper are the following. First, we develop a modular application design that allows exploiting the functional parallelism inside the application and facilitates the parallelization of individual application phases. Second, we identify a set of optimizations that can be applied individually and in combination for optimally deploying the application on a massively parallel multi-GPU system. Third, we present a study of surfing the optimization space of the modularized application and demonstrate that a significant benefit can be obtained from employing an appropriate combination of application optimizations. (C) 2014 Elsevier Inc. All rights reserved. This work was partially funded by the Spanish Ministry of Science and Technology under the grant TIN2010-16497, the AMIT project (CEN-20101014) from the CDTI-CENIT program, the RECAVA-RETIC Network (RD07/0014/2009), projects TEC2010-21619-C04-01, TEC2011-28972-C02-01, and PI11/00616 from the Spanish Ministerio de Ciencia e Innovacion, and the ARTEMIS program (S2009/DPI-1802) from the Comunidad de Madrid.
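
    As a hedged illustration of what "surfing the optimization space" can look like, the Python sketch below exhaustively evaluates combinations of independent optimizations and keeps the fastest one; the optimization names and the benchmark hook are hypothetical and not taken from Mangoose(++).

```python
from itertools import combinations

# Hypothetical, independently toggleable optimizations of a reconstruction pipeline.
OPTIMIZATIONS = ("pinned_memory", "async_transfers", "kernel_fusion", "multi_gpu_split")

def surf_optimization_space(benchmark):
    """benchmark(opts) must run the reconstruction with the given set of
    optimizations enabled and return its runtime in seconds. This function
    exhaustively tries every combination and returns the fastest one."""
    best_opts, best_time = frozenset(), benchmark(frozenset())
    for k in range(1, len(OPTIMIZATIONS) + 1):
        for opts in combinations(OPTIMIZATIONS, k):
            runtime = benchmark(frozenset(opts))
            if runtime < best_time:
                best_opts, best_time = frozenset(opts), runtime
    return best_opts, best_time
```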

    ATCache: Reducing DRAM-cache Latency via a Small SRAM Tag Cache

    3D-stacking technology has enabled the option of embedding a large DRAM onto the processor. Prior works have proposed to use this as a DRAM cache. Because of its large size (a DRAM cache can be in the order of hundreds of megabytes), the total size of the tags associated with it can also be quite large (in the order of tens of megabytes). The large size of the tags creates a problem: should we maintain the tags in the DRAM and pay the cost of a DRAM tag access in the critical path, or should we maintain the tags in faster SRAM and pay the area cost of a large SRAM for this purpose? Prior works have primarily chosen the former and proposed a variety of techniques for reducing the cost of a DRAM tag access. In this paper, we first establish (with the help of a study) that maintaining the tags in SRAM, because of its smaller access latency, leads to overall better performance. Motivated by this study, we ask if it is possible to maintain tags in SRAM without incurring a high area overhead. Our key idea is simple: we propose to cache the tags in a small SRAM tag cache, and we show that there is enough spatial and temporal locality amongst tag accesses to merit this idea. We propose the ATCache, a small SRAM tag cache. Similar to a conventional cache, the ATCache caches recently accessed tags to exploit temporal locality; it exploits spatial locality by prefetching tags from nearby cache sets. In order to avoid the high miss latency and cache pollution caused by excessive prefetching, we use a simple technique to throttle the number of sets prefetched. Our proposed ATCache (which consumes 0.4% of the overall tag size) can satisfy over 60% of DRAM cache tag accesses on average.
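
    A minimal Python sketch of an ATCache-style tag cache is shown below; it is a functional illustration, not the paper's hardware design. An LRU structure over per-set tag entries captures temporal locality, a small fixed prefetch degree models the throttled prefetching of nearby sets, and all sizes are illustrative.

```python
from collections import OrderedDict

class TagCache:
    """Sketch of an ATCache-style SRAM tag cache: an LRU cache over DRAM-cache
    tag sets that, on a miss, also prefetches the tags of a few neighbouring
    sets, with a throttle on how many sets are prefetched. Sizes are made up."""

    def __init__(self, capacity_sets=1024, prefetch_degree=2):
        self.capacity = capacity_sets
        self.prefetch_degree = prefetch_degree   # throttled number of prefetched sets
        self.entries = OrderedDict()             # set_index -> tags of that DRAM-cache set

    def lookup(self, set_index, read_tags_from_dram):
        if set_index in self.entries:            # temporal locality: hit in the SRAM tag cache
            self.entries.move_to_end(set_index)
            return self.entries[set_index]
        # Miss: prefetch a few neighbouring sets (spatial locality), then fetch the requested set.
        for s in range(set_index + 1, set_index + 1 + self.prefetch_degree):
            self._insert(s, read_tags_from_dram(s))
        self._insert(set_index, read_tags_from_dram(set_index))
        return self.entries[set_index]

    def _insert(self, set_index, tags):
        self.entries[set_index] = tags
        self.entries.move_to_end(set_index)
        while len(self.entries) > self.capacity:
            self.entries.popitem(last=False)     # evict the least recently used set
```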

    DAMOV: A New Methodology and Benchmark Suite for Evaluating Data Movement Bottlenecks

    Data movement between the CPU and main memory is a first-order obstacle to improving performance, scalability, and energy efficiency in modern systems. Computer systems employ a range of techniques to reduce overheads tied to data movement, spanning from traditional mechanisms (e.g., deep multi-level cache hierarchies, aggressive hardware prefetchers) to emerging techniques such as Near-Data Processing (NDP), where some computation is moved close to memory. Our goal is to methodically identify potential sources of data movement over a broad set of applications and to comprehensively compare traditional compute-centric data movement mitigation techniques to more memory-centric techniques, thereby developing a rigorous understanding of the best techniques to mitigate each source of data movement. With this goal in mind, we perform the first large-scale characterization of a wide variety of applications, across a wide range of application domains, to identify fundamental program properties that lead to data movement to/from main memory. We develop the first systematic methodology to classify applications based on the sources contributing to data movement bottlenecks. From our large-scale characterization of 77K functions across 345 applications, we select 144 functions to form the first open-source benchmark suite (DAMOV) for main memory data movement studies. We select a diverse range of functions that (1) represent different types of data movement bottlenecks, and (2) come from a wide range of application domains. Using NDP as a case study, we identify new insights about the different data movement bottlenecks and use these insights to determine the most suitable data movement mitigation mechanism for a particular application. We open-source DAMOV and the complete source code for our new characterization methodology at https://github.com/CMU-SAFARI/DAMOV.
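
    To illustrate what a bottleneck classification might look like in code, the Python sketch below buckets a function by common profiling metrics; the thresholds and categories are illustrative assumptions and do not reproduce DAMOV's actual methodology.

```python
def classify_data_movement(llc_mpki, arithmetic_intensity, bandwidth_utilization):
    """Bucket a function by the likely source of its data movement bottleneck,
    using common profiling metrics. The thresholds and categories below are
    illustrative assumptions, not DAMOV's actual classification rules."""
    if llc_mpki < 1.0:
        return "compute-bound: the cache hierarchy already mitigates data movement"
    if bandwidth_utilization > 0.7:
        return "bandwidth-bound: a candidate for near-data processing"
    if arithmetic_intensity < 0.25:
        return "latency-bound: little data reuse, prefetching is of limited help"
    return "cache-capacity-bound: may benefit from larger on-chip caches"

# Example: a streaming kernel with a high miss rate and saturated memory bandwidth.
print(classify_data_movement(llc_mpki=25.0, arithmetic_intensity=0.1,
                             bandwidth_utilization=0.9))
```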
