1,454 research outputs found

    Evaluation of Storage Systems for Big Data Analytics

    Get PDF
    abstract: Recent trends in big data storage systems show a shift from disk centric models to memory centric models. The primary challenges faced by these systems are speed, scalability, and fault tolerance. It is interesting to investigate the performance of these two models with respect to some big data applications. This thesis studies the performance of Ceph (a disk centric model) and Alluxio (a memory centric model) and evaluates whether a hybrid model provides any performance benefits with respect to big data applications. To this end, an application TechTalk is created that uses Ceph to store data and Alluxio to perform data analytics. The functionalities of the application include offline lecture storage, live recording of classes, content analysis and reference generation. The knowledge base of videos is constructed by analyzing the offline data using machine learning techniques. This training dataset provides knowledge to construct the index of an online stream. The indexed metadata enables the students to search, view and access the relevant content. The performance of the application is benchmarked in different use cases to demonstrate the benefits of the hybrid model.Dissertation/ThesisMasters Thesis Computer Science 201

    Cost-Effective Cache Deployment in Mobile Heterogeneous Networks

    Full text link
    This paper investigates one of the fundamental issues in cache-enabled heterogeneous networks (HetNets): how many cache instances should be deployed at different base stations, in order to provide guaranteed service in a cost-effective manner. Specifically, we consider two-tier HetNets with hierarchical caching, where the most popular files are cached at small cell base stations (SBSs) while the less popular ones are cached at macro base stations (MBSs). For a given network cache deployment budget, the cache sizes for MBSs and SBSs are optimized to maximize network capacity while satisfying the file transmission rate requirements. As cache sizes of MBSs and SBSs affect the traffic load distribution, inter-tier traffic steering is also employed for load balancing. Based on stochastic geometry analysis, the optimal cache sizes for MBSs and SBSs are obtained, which are threshold-based with respect to cache budget in the networks constrained by SBS backhauls. Simulation results are provided to evaluate the proposed schemes and demonstrate the applications in cost-effective network deployment

    Search strategies in unstructured overlays

    Get PDF
    Trabalho de projecto de mestrado em Engenharia Informática, apresentado à Universidade de Lisboa, através da Faculdade de Ciências, 2008Unstructured peer-to-peer networks have a low maintenance cost, high resilience and tolerance to the continuous arrival and departure of nodes. In these networks search is usually performed by flooding, which generates a high number of duplicate messages. To improve scalability, unstructured overlays evolved to a two-tiered architecture where regular nodes rely on special nodes, called supernodes or superpeers, to locate resources, thus reducing the scope of flooding based searches. While this approach takes advantage of node heterogeneity, it makes the overlay less resilient to accidental and malicious faults, and less attractive to users concerned with the consumption of their resources and who may not desire to commit additional resources that are required by nodes selected as superpeers. Another point of concern is churn, defined as the constant entry and departure of nodes. Churn affects both structured and unstructured overlay networks and, in order to build resilient search protocols, it must be taken into account. This dissertation proposes a novel search algorithm, called FASE, which combines a replication policy and a search space division technique to achieve low hop counts using a small number of messages, on unstructured overlays with nonhierarquical topologies. The problem of churn is mitigated by a distributed monitoring algorithm designed with FASE in mind. Simulation results validate FASE efficiency when compared to other search algorithms for peer-to-peer networks. The evaluation of the distributed monitoring algorithm shows that it maintains FASE performance when subjected to churn.Os sistemas peer-to-peer, como aplicações de partilha e distribuição de conteúdos ou voz-sobre-IP, são construídos sobre redes sobrepostas. Redes sobrepostas são redes virtuais que existem sobre uma rede subjacente, em que a topologia da rede sobreposta não tem de ter uma correspondência com a topologia da rede subjacente. Ao contrário das suas congéneres estruturadas, as redes sobrepostas não-estru-turadas não restringem a localização dos seus participantes, ou seja, não limitam a escolha de vizinhos de um dado nó, o que torna a sua manutenção mais simples. O baixo custo de manutenção das redes sobrepostas não-estruturadas torna estas especialmente adequadas para a construção de sistemas peer-to-peer capazes de tolerar o comportamento dinâmico dos seus participantes, uma vez que estas redes são permanentemente afectadas pela entrada e saída de nós na rede, um fénomeno conhecido como churn. O algoritmo de pesquisa mais comum em redes sobrepostas não-estruturadas consiste em inundar a rede, o que origina uma grande quantidade de mensagens duplicadas por cada pesquisa. A escalabilidade destes algoritmos é limitada porque consomem demasiados recursos da rede em sistemas com muitos participantes. Para reduzir o número de mensagens, as redes sobrepostas não-estruturadas podem ser organizadas em topologias hierárquicas. Nestas topologias alguns nós da rede, chamados supernós, assumem um papel mais importante, responsabilizando-se pela localização de objectos. A utilização de supernós cria novos problemas, como a sua selecção e a dependência da rede de uma pequena percentagem dos nós. Esta dissertação apresenta um novo algoritmo de pesquisa, chamado FASE, criado para operar sobre redes sobrepostas não estruturadas com topologias não-hierárquicas. Este algoritmo combina uma política de replicação com uma técnica de divisão do espaço de procura para resolver pesquisas ao alcançe de um número reduzido de saltos com o menor custo possível. Adicionalmente, o algoritmo procura nivelar a contribuição dos participantes, já que todos contribuem de uma forma semelhante para o desempenho da pesquisa. A estratégia seguida pelo algo- ritmo consiste em dividir tanto os nós da rede como as chaves dos seus conteúdos por diferentes “frequências” e replicar chaves nas respectivas frequências, sem, no entanto, limitar a localização de um nó ou impor uma estrutura à rede ou mesmo aplicar uma definição rígida de chave. Com o objectivo de mitigar o problema do churn, é apresentado um algoritmo de monitorização distribuído para as réplicas originadas pelo FASE. Os algoritmos propostos são avaliados através de simulações, que validam a eficiência do FASE quando comparado com outros algoritmos de pesquisa em redes sobrepostas não-estruturadas. É também demonstrado que o FASE mantém o seu desempenho em redes sob o efeito do churn quando combinado com o algoritmo de monitorização

    5G optimized caching and downlink resource sharing for smart cities

    Get PDF

    CRAID: Online RAID upgrades using dynamic hot data reorganization

    Get PDF
    Current algorithms used to upgrade RAID arrays typically require large amounts of data to be migrated, even those that move only the minimum amount of data required to keep a balanced data load. This paper presents CRAID, a self-optimizing RAID array that performs an online block reorganization of frequently used, long-term accessed data in order to reduce this migration even further. To achieve this objective, CRAID tracks frequently used, long-term data blocks and copies them to a dedicated partition spread across all the disks in the array. When new disks are added, CRAID only needs to extend this process to the new devices to redistribute this partition, thus greatly reducing the overhead of the upgrade process. In addition, the reorganized access patterns within this partition improve the array’s performance, amortizing the copy overhead and allowing CRAID to offer a performance competitive with traditional RAIDs. We describe CRAID’s motivation and design and we evaluate it by replaying seven real-world workloads including a file server, a web server and a user share. Our experiments show that CRAID can successfully detect hot data variations and begin using new disks as soon as they are added to the array. Also, the usage of a dedicated partition improves the sequentiality of relevant data access, which amortizes the cost of reorganizations. Finally, we prove that a full-HDD CRAID array with a small distributed partition (<1.28% per disk) can compete in performance with an ideally restriped RAID-5 and a hybrid RAID-5 with a small SSD cache.Peer ReviewedPostprint (published version

    On Optimal and Fair Service Allocation in Mobile Cloud Computing

    Get PDF
    This paper studies the optimal and fair service allocation for a variety of mobile applications (single or group and collaborative mobile applications) in mobile cloud computing. We exploit the observation that using tiered clouds, i.e. clouds at multiple levels (local and public) can increase the performance and scalability of mobile applications. We proposed a novel framework to model mobile applications as a location-time workflows (LTW) of tasks; here users mobility patterns are translated to mobile service usage patterns. We show that an optimal mapping of LTWs to tiered cloud resources considering multiple QoS goals such application delay, device power consumption and user cost/price is an NP-hard problem for both single and group-based applications. We propose an efficient heuristic algorithm called MuSIC that is able to perform well (73% of optimal, 30% better than simple strategies), and scale well to a large number of users while ensuring high mobile application QoS. We evaluate MuSIC and the 2-tier mobile cloud approach via implementation (on real world clouds) and extensive simulations using rich mobile applications like intensive signal processing, video streaming and multimedia file sharing applications. Our experimental and simulation results indicate that MuSIC supports scalable operation (100+ concurrent users executing complex workflows) while improving QoS. We observe about 25% lower delays and power (under fixed price constraints) and about 35% decrease in price (considering fixed delay) in comparison to only using the public cloud. Our studies also show that MuSIC performs quite well under different mobility patterns, e.g. random waypoint and Manhattan models
    corecore