
    Stochastic Modeling of Hybrid Cache Systems

    In recent years, there has been an increasing demand for big memory systems to perform large-scale data analytics. Since DRAM is expensive, some researchers suggest using other memory technologies, such as non-volatile memory (NVM), to build large-memory computing systems. However, whether NVM can be a viable alternative to DRAM (both economically and technically) remains an open question. To answer this question, it is important to consider how to design a memory system from a "system perspective", that is, incorporating the different performance characteristics and price ratios of hybrid memory devices. This paper presents an analytical model of a "hybrid page cache system" to understand the diverse design space and performance impact of a hybrid cache system. We consider (1) various architectural choices, (2) design strategies, and (3) configurations of different memory devices. Using this model, we provide guidelines on how to design a hybrid page cache that reaches a good trade-off between high system throughput (in I/O operations per second, or IOPS) and fast cache reactivity, which is defined by the time to fill the cache. We also show how one can configure the DRAM capacity and NVM capacity under a fixed budget. We pick PCM as an example of NVM and conduct numerical analysis. Our analysis indicates that incorporating PCM in a page cache system significantly improves system performance, and that in some cases allocating more PCM to the page cache yields a larger benefit. Moreover, for the common setting of the performance-price ratio of PCM, the "flat architecture" is the better choice, but the "layered architecture" outperforms it if PCM write performance can be significantly improved in the future.
    Comment: 14 pages; mascots 201

    A Hybrid Recommender System for Patient-Doctor Matchmaking in Primary Care

    We partner with a leading European healthcare provider and design a mechanism to match patients with family doctors in primary care. We define the matchmaking process for several distinct use cases given different levels of available information about patients. Then, we adopt a hybrid recommender system to present each patient a list of family doctor recommendations. In particular, we model patient trust of family doctors using a large-scale dataset of consultation histories, while accounting for the temporal dynamics of their relationships. Our proposed approach shows higher predictive accuracy than both a heuristic baseline and a collaborative filtering approach, and the proposed trust measure further improves model performance.
    Comment: This paper is accepted at DSAA 2018 as a full paper, Proc. of the 5th IEEE International Conference on Data Science and Advanced Analytics (DSAA), Turin, Ital

    Reconsidering big data security and privacy in cloud and mobile cloud systems

    Large-scale distributed systems, in particular cloud and mobile cloud deployments, provide great services that improve people's quality of life and organizational efficiency. In order to match the performance needs, cloud computing engages with the perils of peer-to-peer (P2P) computing and brings up P2P cloud systems as an extension of the federated cloud. Having a decentralized architecture built on independent nodes and resources without any specific central control and monitoring, these cloud deployments are able to handle resource provisioning at a very low cost. Hence, we see a vast number of mobile applications and services that are ready to scale to billions of mobile devices painlessly. Among these, data-driven applications are the most successful in terms of popularity and monetization. However, data-rich applications expose other problems to consider, including storage, big data processing, and the crucial task of protecting private or sensitive information. In this work, first, we go through the existing layered cloud architectures and present a solution addressing big data storage. Secondly, we explore the use of a P2P Cloud System (P2PCS) for big data processing and analytics. Thirdly, we propose an efficient hybrid mobile cloud computing model based on the cloudlet concept, and we apply this model to health care systems as a case study. Then, the model is simulated using the Mobile Cloud Computing Simulator (MCCSIM). According to the experimental power and delay results, the hybrid cloud model performs up to 75% better than traditional cloud models. Lastly, we enhance our proposals by presenting and analyzing security and privacy countermeasures against possible attacks.

    Integrated Face Analytics Networks through Cross-Dataset Hybrid Training

    Face analytics benefits many multimedia applications. It consists of a number of tasks, such as facial emotion recognition and face parsing, and most existing approaches treat these tasks independently, which limits their deployment in real scenarios. In this paper we propose an integrated Face Analytics Network (iFAN), which performs multiple face analytics tasks jointly, using a novel, carefully designed network architecture that facilitates informative interaction among the tasks. The proposed integrated network explicitly models the interactions between tasks so that their correlations can be fully exploited to boost performance. In addition, to overcome the absence of datasets with comprehensive training data for the various tasks, we propose a novel cross-dataset hybrid training strategy. It allows "plug-in and play" of multiple datasets annotated for different tasks, without requiring a common dataset fully labeled for all the tasks. We experimentally show that the proposed iFAN achieves state-of-the-art performance on multiple face analytics tasks using a single integrated model. Specifically, iFAN achieves an overall F-score of 91.15% on the Helen dataset for face parsing, a normalized mean error of 5.81% on the MTFL dataset for facial landmark localization, and an accuracy of 45.73% on the BNU dataset for emotion recognition with a single model.
    Comment: 10 page

    Exploring Application Performance on Emerging Hybrid-Memory Supercomputers

    Next-generation supercomputers will feature more hierarchical and heterogeneous memory systems, with different memory technologies working side by side. A critical question is whether, at large scale, existing HPC applications and emerging data-analytics workloads will see performance improvement or degradation on these systems. We propose a systematic and fair methodology to identify the trend of application performance on emerging hybrid-memory systems. We model the memory system of next-generation supercomputers as a combination of "fast" and "slow" memories. We then analyze the performance and dynamic execution characteristics of a variety of workloads, from traditional scientific applications to emerging data analytics, to compare traditional and hybrid-memory systems. Our results show that data analytics applications can clearly benefit from the new system design, especially at large scale. Moreover, hybrid-memory systems do not penalize traditional scientific applications, which may also show performance improvement.
    Comment: 18th International Conference on High Performance Computing and Communications, IEEE, 201

    Unleashing the Power of Hashtags in Tweet Analytics with Distributed Framework on Apache Storm

    Twitter is a popular social network platform where users can interact and post texts of up to 280 characters called tweets. Hashtags, hyperlinked words in tweets, have become increasingly crucial for tweet retrieval and search. Using hashtags for tweet topic classification is a challenging problem because of context dependence among words, slang, abbreviations, and emoticons in a short tweet, along with the evolving use of hashtags. Since Twitter generates millions of tweets daily, tweet analytics is a fundamental big data stream problem that often requires real-time distributed processing. This paper proposes a distributed online approach to tweet topic classification with hashtags. Implemented on Apache Storm, a distributed real-time framework, our approach incrementally identifies and updates a set of strong predictors in the Naïve Bayes model for classifying each incoming tweet instance. Preliminary experiments show promising results, with up to 97% accuracy and a 37% increase in throughput on eight processors.
    Comment: IEEE International Conference on Big Data 201
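
    The incremental update mentioned in this abstract can be illustrated with a minimal single-process sketch. It is not the authors' implementation: it omits Apache Storm and the selection of strong predictors, and the class name `OnlineNaiveBayes`, its methods, the smoothing constant `alpha`, and the sample tokens are all hypothetical, chosen only to show how a multinomial Naïve Bayes model can absorb one labeled tweet at a time without a retraining pass.

```python
import math
from collections import defaultdict


class OnlineNaiveBayes:
    """Multinomial Naive Bayes updated one labeled example at a time.

    Hypothetical illustration only; not the method from the paper.
    """

    def __init__(self, alpha=1.0):
        self.alpha = alpha                      # Laplace smoothing (assumed value)
        self.class_counts = defaultdict(int)    # examples seen per topic
        self.token_counts = defaultdict(lambda: defaultdict(int))
        self.total_tokens = defaultdict(int)    # tokens seen per topic
        self.vocab = set()

    def update(self, tokens, label):
        """Fold one labeled tweet into the running counts."""
        self.class_counts[label] += 1
        for tok in tokens:
            self.token_counts[label][tok] += 1
            self.total_tokens[label] += 1
            self.vocab.add(tok)

    def predict(self, tokens):
        """Return the most probable topic, or None before any update."""
        if not self.class_counts:
            return None
        n = sum(self.class_counts.values())
        v = max(len(self.vocab), 1)             # avoid a zero denominator
        best_label, best_score = None, -math.inf
        for label, count in self.class_counts.items():
            score = math.log(count / n)         # log prior
            denom = self.total_tokens[label] + self.alpha * v
            for tok in tokens:
                seen = self.token_counts[label].get(tok, 0)
                score += math.log((seen + self.alpha) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


if __name__ == "__main__":
    nb = OnlineNaiveBayes()
    nb.update(["#nba", "game", "score"], "sports")
    nb.update(["#election", "vote"], "politics")
    print(nb.predict(["#nba", "tonight"]))
```

    Because the model is just a set of counters, each incoming tweet costs time proportional to its length, which is what makes this family of classifiers a natural fit for stream processing.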