85 research outputs found

    Balancing Interactive Performance and Budgeted Resources in Mobile Computing.

    In this dissertation, we explore the various limited resources involved in mobile applications (battery energy, cellular data usage, and, critically, user attention) and we devise principled methods for managing the tradeoffs involved in creating a good user experience. Building quality mobile applications requires developers to understand complex interactions between network usage, performance, and resource consumption. Because of this difficulty, developers commonly choose simple but suboptimal approaches that strictly prioritize performance or resource conservation. These extremes are symptoms of a lack of system-provided abstractions for handling the complexity inherent in performance/resource tradeoffs. By providing abstractions that help applications manage these tradeoffs, mobile systems can significantly improve user-visible performance without exhausting resource budgets. This dissertation explores three such abstractions in detail. We first present Intentional Networking, a system that provides synchronization primitives and intelligent scheduling for multi-network traffic. Next, we present Informed Mobile Prefetching, a system that helps applications decide when to prefetch data and how aggressively to spend limited battery energy and cellular data resources toward that end. Finally, we present Meatballs, a library that helps applications consider the cloudy nature of predictions when making decisions, selectively employing redundancy to mitigate uncertainty and provide more reliable performance. Overall, experiments show that these abstractions can significantly reduce interactive delay without overspending the available energy and data resources.
    PhD, Computer Science and Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/108956/1/brettdh_1.pd

    Transaction-filtering data mining and a predictive model for intelligent data management

    This thesis first proposes a new data mining paradigm, transaction-filtering association rule mining, which addresses the time cost of the repeated scans of the original transaction database in conventional association rule mining algorithms. An in-memory transaction filter is designed to discard infrequent items in the pruning steps. This filter is a data structure that is updated at the end of each iteration. Results on an IBM benchmark show an execution time reduction of 10% to 19% compared with the base case. Next, a data mining-based predictive model is established, contributing to intelligent data management within the context of the Centre for Grid Computing. The capability of discovering unseen rules, patterns and correlations makes data mining techniques favourable in areas where massive amounts of data are generated. The past behaviours of two typical scenarios (network file systems and Data Grids) have been analyzed to build the model. The future popularity of files can be forecast with an accuracy of 90% by deploying the above predictor on the given real system traces. A further step towards intelligent policy design is achieved by analyzing the prediction results of files' future popularity. Real system trace-based simulations have shown improvements of 2-4 times in data response time in the network file system scenario and a 24% mean job time reduction in Data Grids compared with conventional cases.
    EThOS - Electronic Theses Online Service, GB, United Kingdom
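
The transaction filter described in this abstract can be illustrated with a minimal sketch. It assumes a plain level-wise (Apriori-style) miner in which the in-memory copy of the transactions is shrunk at the end of each pass; the function name `filtered_apriori`, the use of absolute support counts, and the exact filtering rule are assumptions made for illustration and are not taken from the thesis.

```python
from collections import Counter
from itertools import combinations


def filtered_apriori(transactions, min_support):
    """Return {itemset (frozenset): support count} for all frequent itemsets."""
    # In-memory working copy of the database: this is the "filter" that
    # shrinks after every pass, so later passes scan less data.
    filtered = [frozenset(t) for t in transactions]
    frequent = {}
    k = 1
    while filtered:
        # Count every k-item combination present in the filtered transactions.
        counts = Counter(
            frozenset(c)
            for t in filtered
            for c in combinations(sorted(t), k)
        )
        level = {s: n for s, n in counts.items() if n >= min_support}
        if not level:
            break
        frequent.update(level)
        # Update the filter at the end of the iteration: keep only items that
        # still occur in some frequent k-itemset, then drop transactions that
        # are too short to contain any (k+1)-itemset.
        alive = frozenset().union(*level)
        filtered = [t & alive for t in filtered]
        filtered = [t for t in filtered if len(t) > k]
        k += 1
    return frequent


result = filtered_apriori(
    [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "d"]], min_support=2
)
```

Dropping an item is safe here because every item of a frequent (k+1)-itemset must already belong to a frequent k-itemset, so the counts of later passes are unaffected.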

    Dynamic data placement and discovery in wide-area networks

    The workloads of online services and applications such as social networks, sensor data platforms and web search engines have become increasingly global and dynamic, setting new challenges to providing users with low latency access to data. To achieve this, these services typically leverage a multi-site wide-area networked infrastructure. Data access latency in such an infrastructure depends on the network paths between users and data, which are determined by the data placement and discovery strategies. Current strategies are static: they offer low latencies upon deployment but perform worse under a dynamic workload. We propose dynamic data placement and discovery strategies for wide-area networked infrastructures, which adapt to the data access workload. We achieve this with data activity correlation (DAC), an application-agnostic approach for determining the correlations between data items based on access pattern similarities. By dynamically clustering data according to DAC, network traffic in clusters is kept local. We utilise DAC as a key component in reducing access latencies for two application scenarios, emphasising different aspects of the problem. The first scenario assumes the fixed placement of data at sites, and thus focusses on data discovery. This is the case for a global sensor discovery platform, which aims to provide low latency discovery of sensor metadata. We present a self-organising hierarchical infrastructure consisting of multiple DAC clusters, maintained with an online and distributed split-and-merge algorithm. This reduces the number of sites visited, and thus latency, during discovery for a variety of workloads. The second scenario focusses on data placement. This is the case for global online services that leverage a multi-data centre deployment to provide users with low latency access to data. We present a geo-dynamic partitioning middleware, which maintains DAC clusters with an online elastic partition algorithm. It supports the geo-aware placement of partitions across data centres according to the workload. This provides globally distributed users with low latency access to data for static and dynamic workloads.
    Open Access

    Navigating Diverse Datasets in the Face of Uncertainty

    When exploring big volumes of data, one of the challenging aspects is their diversity of origin. Multiple files that have not yet been ingested into a database system may contain information of interest to a researcher, who must curate, understand and sieve their content before being able to extract knowledge. Performance is one of the greatest difficulties in exploring these datasets. On the one hand, examining non-indexed, unprocessed files can be inefficient. On the other hand, any processing done before the data is understood introduces latency and potentially unnecessary work if the chosen schema poorly matches the data. We have surveyed the state of the art and, fortunately, there exist multiple proposed solutions to handle data in situ performantly. Another major difficulty is matching files from multiple origins, since their schema and layout may not be compatible or properly documented. Most surveyed solutions overlook this problem, especially for numeric, uncertain data, as is typical in fields like astronomy. The main objective of our research is to assist data scientists during the exploration of unprocessed, numerical, raw data distributed across multiple files based solely on its intrinsic distribution. In this thesis, we first introduce the concept of Equally-Distributed Dependencies (EDD), which provides the foundations to match this kind of dataset. We propose PresQ, a novel algorithm that finds quasi-cliques on hypergraphs based on their expected statistical properties. The probabilistic approach of PresQ can be successfully exploited to mine EDD between diverse datasets when the underlying populations can be assumed to be the same. Finally, we propose a two-sample statistical test based on Self-Organizing Maps (SOM). This method can outperform, in terms of power, other classifier-based two-sample tests, being in some cases comparable to kernel-based methods, with the advantage of being interpretable. Both PresQ and the SOM-based statistical test can provide insights that drive serendipitous discoveries.

    Dynamic generation of personalized hybrid recommender systems


    Proactive Mechanisms for Video-on-Demand Content Delivery

    Video delivery over the Internet is the dominant source of network load all over the world. In particular, VoD streaming services such as YouTube, Netflix, and Amazon Video have propelled the proliferation of VoD in many people's everyday lives. VoD allows watching video from a large catalogue of content at any time and on a multitude of devices, including smart TVs, laptops, and smartphones. Studies show that many people under the age of 32 grew up with VoD services and have never subscribed to a traditional cable TV service. This shift in video consumption behavior is continuing with an ever-growing number of users. To satisfy this large demand, VoD service providers usually rely on CDNs, which make VoD streaming scalable by operating geographically distributed networks of several hundreds of thousands of servers. They thereby deliver content from locations close to the users, which keeps traffic local and enables a fast playback start. CDNs experience heavy utilization during the day and are usually reactive to user demand, which is not optimal: it leads to expensive over-provisioning to cope with traffic peaks and to overreactive content eviction that decreases the CDN's performance. To sustain future VoD streaming projections with hundreds of millions of users, new approaches are required to increase content delivery efficiency. To this end, this thesis identifies three key research areas that have the potential to address the future demand for VoD content. Our first contribution is the design of vFetch, a privacy-preserving prefetching mechanism for mobile devices. It focuses explicitly on OTT VoD providers such as YouTube. vFetch learns the user's interest in different content channels and uses these insights to prefetch content on the user's terminal. To do so, it continually monitors the user's behavior and the device's mobile connectivity pattern to allow for resource-efficient download scheduling. vFetch thereby illustrates how personalized prefetching can reduce the mobile data volume and relieve mobile networks by offloading peak-hour traffic. Our second contribution focuses on proactive in-network caching. To this end, we present the design of the ProCache mechanism, which divides the available cache storage among separate content categories. The available storage is allocated to these divisions based on their contribution to the overall cache efficiency. We propose a general workflow that covers multiple categories of a mixed content workload, in addition to a workflow tailored for music video content, the dominant traffic source on YouTube. ProCache thereby shows how content-awareness can contribute to efficient in-network caching. Our third contribution targets the application of multicast for VoD scenarios. Many users request popular VoD content with only small differences in their playback start time, which offers a potential for multicast. We therefore present the design of the VoDCast mechanism, which leverages this potential to multicast parts of popular VoD content. VoDCast thereby illustrates how ISPs can collaborate with CDNs to coordinate on content that should be delivered by ISP-internal multicast.
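
The category-based cache division described for ProCache can be pictured with a small sketch. The abstract does not say how a category's contribution to cache efficiency is measured or how storage is split, so this example assumes contribution is a per-category hit count and that slots are shared out proportionally; the function name `allocate_cache` and both of those choices are hypothetical, not the thesis's method.

```python
def allocate_cache(total_slots, hits_by_category):
    """Split total_slots among categories in proportion to their hit counts."""
    categories = sorted(hits_by_category)
    total_hits = sum(hits_by_category.values())
    if total_hits == 0:
        # Assumption: with no hit information, share the cache equally.
        weights = {c: 1 for c in categories}
        total_hits = len(categories)
    else:
        weights = hits_by_category
    # Exact (fractional) share of each category.
    exact = {c: total_slots * weights[c] / total_hits for c in categories}
    # Round down first, then hand the leftover slots to the categories
    # with the largest fractional remainders so the total is preserved.
    alloc = {c: int(exact[c]) for c in categories}
    leftover = total_slots - sum(alloc.values())
    by_remainder = sorted(
        categories, key=lambda c: exact[c] - alloc[c], reverse=True
    )
    for c in by_remainder[:leftover]:
        alloc[c] += 1
    return alloc


shares = allocate_cache(100, {"music": 60, "news": 30, "sports": 10})
even = allocate_cache(10, {"a": 1, "b": 1, "c": 1})
```

Re-running such an allocation periodically on fresh hit counts would let the divisions follow the workload, which is one plausible reading of allocating storage by contribution.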