    Exploiting Data Mining Techniques for Broadcasting Data in Mobile Computing Environments

    Mobile computers can be equipped with wireless communication devices that enable users to access data services from any location. In wireless communication, the server-to-client (downlink) bandwidth is much higher than the client-to-server (uplink) bandwidth. This asymmetry makes the dissemination of data to client machines a desirable approach. However, dissemination of data by broadcasting may induce high access latency when the number of broadcast data items is large. In this paper, we propose two methods that aim to reduce the client access latency of broadcast data. Our methods are based on analyzing the broadcast history (i.e., the chronological sequence of items that have been requested by clients) using data mining techniques. In the first method, the data items on the broadcast disk are organized so that items requested in close succession are placed near each other. The second method focuses on improving the cache hit ratio in order to decrease access latency: it enables clients to prefetch data from the broadcast disk based on rules extracted from previous data request patterns. The proposed methods are evaluated on a Web log to estimate their effectiveness. Performance experiments show that the proposed rule-based methods are effective in improving system performance in terms of both the average latency and the cache hit ratio of mobile clients.
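
    A minimal, hypothetical sketch of the rule-based idea (not the paper's actual algorithm; the history, confidence threshold and cache policy below are invented for illustration): it mines first-order "item A is followed by item B" rules from a request history, then lets a client prefetch the predicted successor of every item it accesses.

    ```python
    # Sketch: mine "A -> B" rules from a broadcast request history and use
    # them to prefetch items into a small client cache (FIFO eviction).
    from collections import Counter, defaultdict

    def mine_rules(history, min_conf=0.5):
        """Return {item: most likely next item} for rules above min_conf."""
        pair_counts = defaultdict(Counter)
        for a, b in zip(history, history[1:]):
            pair_counts[a][b] += 1
        rules = {}
        for a, nexts in pair_counts.items():
            b, count = nexts.most_common(1)[0]
            if count / sum(nexts.values()) >= min_conf:
                rules[a] = b
        return rules

    class PrefetchingClient:
        """Client cache that also prefetches the rule-predicted successor."""
        def __init__(self, rules, capacity=8):
            self.rules, self.capacity, self.cache = rules, capacity, []

        def access(self, item):
            hit = item in self.cache
            self._admit(item)
            if item in self.rules:          # prefetch predicted next item
                self._admit(self.rules[item])
            return hit

        def _admit(self, item):
            if item in self.cache:
                self.cache.remove(item)
            elif len(self.cache) >= self.capacity:
                self.cache.pop(0)           # FIFO eviction
            self.cache.append(item)

    history = ["index", "news", "sports", "news", "sports", "news", "sports"]
    client = PrefetchingClient(mine_rules(history))
    hits = sum(client.access(item) for item in history)
    print(f"cache hits with rule-based prefetching: {hits}/{len(history)}")
    ```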

    NVRAM as an enabler to new horizons in graph processing

    Reinforcement learning for proactive content caching in wireless networks

    Proactive content caching (PC) at the edge of wireless networks, that is, at the base stations (BSs) and/or user equipments (UEs), is a promising strategy for handling the ever-growing mobile data traffic and for improving the quality-of-service of content delivery over wireless networks. However, factors such as limited storage capacity and time-variations in both wireless channel conditions and content demand profiles pose challenges that must be addressed in order to realise the benefits of PC at the wireless edge. This thesis aims to develop PC solutions that address these challenges. We consider PC directly at UEs equipped with finite-capacity cache memories. This is done within the framework of a dynamic system, where mobile users randomly request contents from a non-stationary content library; new contents are added to the library over time, and each content may remain in the library for a random lifetime within which it may be requested. Contents are delivered through wireless channels with time-varying quality, and whenever contents are transmitted, the system incurs a transmission cost that depends on the number of bits downloaded and the channel quality of the receiving user(s) at that time. We formulate each considered problem as a Markov decision process with the objective of minimising the long-term expected average cost on the system. We then use reinforcement learning (RL) to solve this highly challenging problem with prohibitively large state and action spaces. In particular, we employ policy approximation techniques for compact representation of complex policy structures, and policy gradient RL methods to train the system. In the single-user setting we consider, we show the optimality of a threshold-based PC scheme that is adaptive to system dynamics. We use this result to characterise and design a multicast-aware PC scheme, based on a deep RL framework, for the multi-user setting. We perform extensive numerical simulations of the proposed schemes. Our results show not only significant improvements over state-of-the-art reactive content delivery approaches, but also near-optimality of the proposed RL solutions, based on comparisons with lower bounds.
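
    The threshold structure of such a policy can be illustrated with a toy model that is only loosely inspired by the abstract (the cost function, channel distribution and learning rate below are assumptions, not the thesis' MDP formulation): a user will request one content unit in a later slot, downloading at channel quality q costs 1/q, and a stochastic threshold policy trained with a REINFORCE-style gradient decides whether to prefetch now or serve the request reactively later.

    ```python
    # Toy policy-gradient sketch: learn a channel-quality threshold for
    # proactive caching. Assumed model, not the thesis' formulation.
    import math
    import random

    random.seed(0)
    theta = 0.0                                  # threshold parameter
    alpha = 0.05                                 # learning rate

    def sigmoid(x):
        return 1.0 / (1.0 + math.exp(-x))

    for episode in range(5000):
        q_now = random.uniform(0.2, 1.0)         # channel quality now
        q_later = random.uniform(0.2, 1.0)       # quality at request time
        p = sigmoid(5.0 * (q_now - theta))       # P(prefetch now)
        prefetch = random.random() < p
        cost = 1.0 / q_now if prefetch else 1.0 / q_later
        # REINFORCE: step along grad log pi(action) weighted by reward.
        grad_log_pi = -5.0 * (1.0 - p) if prefetch else 5.0 * p
        theta += alpha * (-cost) * grad_log_pi

    print(f"learned prefetch threshold on channel quality: {theta:.2f}")
    ```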

    Prefetching techniques for client server object-oriented database systems

    The performance of many object-oriented database applications suffers from page fetch latency, which is determined by the cost of disk access. In this work we suggest several prefetching techniques to avoid, or at least reduce, page fetch latency. In practice no prediction technique is perfect, and no prefetching technique can entirely eliminate the delay due to page fetch latency. We are therefore interested in the trade-off between the level of accuracy required to obtain good results in terms of elapsed-time reduction and the processing overhead needed to achieve that level of accuracy. If prefetching accuracy is high, the total elapsed time of an application can be reduced significantly; if prefetching accuracy is low, many incorrect pages are prefetched and the extra load on the client, network, server and disks degrades overall system performance. Access patterns of object-oriented databases are often complex and usually hard to predict accurately. The ..
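
    A first-order Markov page predictor gated by a confidence threshold is one simple way to expose this trade-off; the sketch below is purely illustrative (the paper's techniques are not reproduced, and the trace and threshold values are invented). Raising the threshold issues fewer, more accurate prefetches at the price of missed opportunities.

    ```python
    # Sketch: confidence-gated Markov prefetcher over a page-access trace.
    from collections import Counter, defaultdict

    class MarkovPrefetcher:
        def __init__(self, min_confidence):
            self.successors = defaultdict(Counter)  # page -> next-page counts
            self.min_confidence = min_confidence
            self.prev = None

        def on_page_fetch(self, page):
            """Record the transition; return a page to prefetch, or None."""
            if self.prev is not None:
                self.successors[self.prev][page] += 1
            self.prev = page
            nexts = self.successors[page]
            if nexts:
                candidate, count = nexts.most_common(1)[0]
                if count / sum(nexts.values()) >= self.min_confidence:
                    return candidate     # confident enough to spend I/O
            return None                  # too uncertain: skip the prefetch

    trace = [1, 2, 1, 3, 1, 2, 1, 3, 1, 2]
    for threshold in (0.5, 0.9):
        prefetcher = MarkovPrefetcher(threshold)
        issued = [p for page in trace
                  if (p := prefetcher.on_page_fetch(page)) is not None]
        print(f"threshold={threshold}: {len(issued)} prefetches -> {issued}")
    ```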

    Letter from the Special Issue Editor

    Editorial work for DEBULL on a special issue on data management on Storage Class Memory (SCM) technologies.

    Performance, memory efficiency and programmability: the ambitious triptych of combining vertex-centricity with HPC

    The field of graph processing has grown significantly due to the flexibility and wide applicability of the graph data structure, and community interest in developing new approaches to graph processing applications has grown with it. In 2010, Google introduced the vertex-centric programming model through its framework Pregel. This model expresses computation from the perspective of a vertex, while inter-vertex communication is achieved via data exchanges along incoming and outgoing edges, using the message-passing abstraction provided. Pregel's high-level programming interface, designed around a set of simple functions, offers ease of programmability to the user. The aim is to enable the development of graph processing applications without requiring expertise in optimisation or parallel programming; such challenges are instead abstracted from the user and offloaded to the underlying framework. However, fine-grained synchronisation, unpredictable memory access patterns and multiple sources of load imbalance make it difficult to implement the vertex-centric model efficiently on high-performance computing platforms without sacrificing programmability. This research focuses on combining vertex-centricity and High-Performance Computing (HPC), resulting in the development of a shared-memory framework, iPregel, which demonstrates that performance and memory efficiency similar to those of non-vertex-centric approaches can be achieved while preserving the programmability benefits of the vertex-centric model. Non-volatile memory is then explored to extend single-node capabilities, with multiple versions of iPregel implemented to experiment with various data movement strategies. Distributed-memory parallelism is then investigated to overcome the resource limitations of single-node processing. A second framework, DiP, ports iPregel's applicable optimisations to distributed memory and prioritises performance at high scalability. This research has resulted in a set of techniques and optimisations illustrated through the shared-memory framework iPregel and the distributed-memory framework DiP. The former closes a gap of several orders of magnitude in both performance and memory efficiency, and is even able to process a graph of 750 billion edges using non-volatile memory. The latter proves that this competitiveness can also be scaled beyond a single node, enabling the processing of the largest graph generated in this research, comprising 1.6 trillion edges. Most importantly, both frameworks achieve these performance and capability gains while preserving programmability, which is the cornerstone of the vertex-centric programming model. This research therefore demonstrates that by combining vertex-centricity and HPC, it is possible to attain performance, memory efficiency and programmability together.
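
    The "think like a vertex" model the abstract builds on is easy to show in miniature. The sketch below runs PageRank as a per-vertex kernel under a toy synchronous superstep executor; it mirrors the Pregel-style compute/message API in spirit only and is not iPregel's or DiP's implementation.

    ```python
    # Sketch: vertex-centric PageRank under a toy synchronous executor.
    def pagerank_compute(incoming, superstep, out_degree, n):
        """Per-vertex kernel: combine messages, emit along out-edges."""
        if superstep == 0:
            value = 1.0 / n
        else:
            value = 0.15 / n + 0.85 * sum(incoming)
        return value, value / max(out_degree, 1)   # new value, message

    def run_supersteps(edges, n, steps=20):
        out = {v: [] for v in range(n)}
        for src, dst in edges:
            out[src].append(dst)
        values = {v: 0.0 for v in range(n)}
        inbox = {v: [] for v in range(n)}
        for superstep in range(steps):
            next_inbox = {v: [] for v in range(n)}
            for v in range(n):                     # run every vertex kernel
                values[v], msg = pagerank_compute(inbox[v], superstep,
                                                  len(out[v]), n)
                for dst in out[v]:                 # message passing on edges
                    next_inbox[dst].append(msg)
            inbox = next_inbox                     # barrier between supersteps
        return values

    ranks = run_supersteps([(0, 1), (1, 2), (2, 0), (2, 1)], n=3)
    print({v: round(r, 3) for v, r in ranks.items()})
    ```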

    From online social network analysis to a user-centric private sharing system

    Online social networks (OSNs) have become a massive repository of data constructed from individuals' inputs: posts, photos, feedback, locations, etc. By analyzing such data, meaningful knowledge is generated that can affect individuals' beliefs, desires, happiness and choices: a data circulation that starts from individuals and ends in individuals! The OSN owners, as the sole authority with full control over the stored data, make the data available for research, advertisement and other purposes. The individuals, however, are left out of this circle, even though they generate the data and shape the OSN structure. In this thesis, we start by introducing approximation algorithms for finding the most influential individuals in a social graph and for modeling the spread of information. To do so, we consider the communities of individuals that form in a social graph. The social graph is extracted from data stored and controlled centrally, which can cause privacy breaches and lead to individuals' concerns. We therefore introduce UPSS, the user-centric private sharing system, which treats individuals as the real data owners and provides secure and private data sharing on untrusted servers. The UPSS public API allows application developers to implement applications as diverse as OSNs, document redaction systems with integrity properties, censorship-resistant systems, health care auditing systems, distributed version control systems with flexible access controls, and a filesystem in userspace. Accessing users' data is possible only with explicit user consent. We implemented the two latter cases to show the applicability of UPSS. Supporting different storage models in UPSS enables a local, remote and global filesystem in userspace with one unique core filesystem implementation, mounted with different block stores. By designing and implementing UPSS, we show that security and privacy can be addressed at the same time in systems that need selective, secure and collaborative information sharing, without requiring complete trust.
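
    For the influence-maximisation part, the standard reference point is the greedy hill-climbing approximation under the independent-cascade model; the sketch below shows that textbook baseline only (the thesis' community-based algorithms are not reproduced, and the graph and parameters are invented).

    ```python
    # Sketch: greedy influence maximisation, independent-cascade model.
    import random

    random.seed(1)

    def simulate_cascade(graph, seeds, p=0.2):
        """One Monte-Carlo cascade; returns the number of activated nodes."""
        active, frontier = set(seeds), list(seeds)
        while frontier:
            newly_active = []
            for u in frontier:
                for v in graph.get(u, []):
                    if v not in active and random.random() < p:
                        active.add(v)
                        newly_active.append(v)
            frontier = newly_active
        return len(active)

    def greedy_seeds(graph, k, trials=200):
        """Repeatedly add the node with the largest marginal spread."""
        nodes = set(graph) | {v for vs in graph.values() for v in vs}
        seeds = []
        for _ in range(k):
            def spread(s):
                return sum(simulate_cascade(graph, seeds + [s])
                           for _ in range(trials)) / trials
            seeds.append(max(nodes - set(seeds), key=spread))
        return seeds

    graph = {0: [1, 2], 1: [3], 2: [3, 4], 3: [5], 4: [5], 5: []}
    print("chosen seed set:", greedy_seeds(graph, k=2))
    ```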

    The Family of MapReduce and Large Scale Data Processing Systems

    In the last two decades, the continuous increase in computational power has produced an overwhelming flow of data, which has called for a paradigm shift in computing architectures and large-scale data processing mechanisms. MapReduce is a simple and powerful programming model that enables the easy development of scalable parallel applications that process vast amounts of data on large clusters of commodity machines. It isolates the application from the details of running a distributed program, such as data distribution, scheduling and fault tolerance. However, the original implementation of the MapReduce framework had some limitations that have been tackled by many research efforts in follow-up work since its introduction. This article provides a comprehensive survey of a family of approaches and mechanisms for large-scale data processing that have been implemented based on the original idea of the MapReduce framework and are currently gaining momentum in both the research and industrial communities. We also cover a set of systems that provide declarative programming interfaces on top of the MapReduce framework. In addition, we review several large-scale data processing systems that resemble some of the ideas of the MapReduce framework for different purposes and application scenarios. Finally, we discuss some future research directions for implementing the next generation of MapReduce-like solutions.
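
    The programming model the survey is built around is easiest to see in the canonical word-count example. The sketch below emulates the map, shuffle and reduce phases in plain single-process Python; the function names are illustrative and do not correspond to any particular framework's API.

    ```python
    # Sketch: word count in the MapReduce style, single process.
    from itertools import groupby
    from operator import itemgetter

    def map_fn(document):
        for word in document.split():
            yield word.lower(), 1            # emit (key, value) pairs

    def reduce_fn(word, counts):
        yield word, sum(counts)              # fold all values for one key

    def run_mapreduce(documents):
        # "Shuffle": group intermediate pairs by key, as the runtime would
        # after distributing map tasks across the cluster.
        pairs = sorted(kv for doc in documents for kv in map_fn(doc))
        result = {}
        for word, group in groupby(pairs, key=itemgetter(0)):
            for key, value in reduce_fn(word, (c for _, c in group)):
                result[key] = value
        return result

    docs = ["the quick brown fox", "the lazy dog", "the fox"]
    print(run_mapreduce(docs))
    ```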

    3rd Many-core Applications Research Community (MARC) Symposium. (KIT Scientific Reports; 7598)

    This manuscript collects recent scientific work on the Intel Single-chip Cloud Computer (SCC) and describes novel approaches to programming and run-time organization.