1,700 research outputs found

    Storage Solutions for Big Data Systems: A Qualitative Study and Comparison

    Developing big data systems is challenging because of the variety of application areas and domains the technology promises to serve. Fundamental design decisions typically include choosing appropriate storage and computing infrastructures. In this age of heterogeneous systems that integrate different technologies into an optimized solution for a specific real-world problem, big data systems are no exception. On the storage side, the primary facet is the storage infrastructure, and NoSQL appears to be the technology that meets its requirements. However, every big data application has different data characteristics, so its data fits a different data model. This paper presents a feature and use-case analysis and comparison of the four main data models: document-oriented, key-value, graph, and wide-column. It also provides a feature analysis of 80 NoSQL solutions, elaborating on the criteria a developer must consider when choosing among them. Big data storage typically needs to communicate with the execution engine and with other processing and visualization technologies to form a comprehensive solution. This brings the second facet of big data storage, big data file formats, into the picture. The second half of the paper compares the advantages, shortcomings, and possible use cases of the available big data file formats for Hadoop, which is the foundation for most big data computing technologies. Decentralized storage and blockchain are seen as the next generation of big data storage, and their challenges and future prospects are also discussed.

    A Ring to Rule Them All - Revising OpenStack Internals to Operate Massively Distributed Clouds: The Discovery Initiative - Where Do We Are ?

    The deployment of micro/nano data centers in network points of presence offers an opportunity to deliver a more sustainable and efficient infrastructure for Cloud Computing. Among the challenges that must be addressed to favor the adoption of such a model, one is critical: developing a system that turns such a complex and diverse network of resources into a collection of abstracted computing facilities that are convenient to administer and use. In this report, we introduce the premises of such a system. The novelty of our work is that, instead of developing a system from scratch, we revised the OpenStack solution to operate such an infrastructure in a distributed manner, leveraging P2P mechanisms. More precisely, we describe how we revised the Nova service to use a distributed key/value store instead of the centralized SQL backend. We present experiments that validated the correct behavior of our prototype and showed promising performance on several clusters composed of servers from the Grid'5000 testbed. We believe that such a strategy is promising and paves the way to a first large-scale, WAN-wide IaaS manager.
    [Translated from the French abstract] The current trend for meeting the growing demand for utility computing is to build ever larger data centers in a limited number of strategic locations. This approach undoubtedly satisfies current demand while keeping the management of these resources centralized, but it remains far from providing infrastructures that meet current and future constraints in terms of efficiency, jurisdiction, and sustainability. The goal of the DISCOVERY initiative is to design the LUC OS, a distributed resource management system that will make it possible to take advantage of any network node forming the Internet backbone in order to provide a new generation of utility computing, better able to account for the geographical dispersion of users and their ever-growing demand. After recalling the objectives of the DISCOVERY initiative and explaining why federation-style approaches are not suited to operating a utility computing infrastructure integrated into the network, we present the premises of our system. In particular, we explain why and how we chose to start work on revisiting the design of the OpenStack solution. In our view, building our work on this solution is a sound strategy given the complexity of IaaS platform management systems and the velocity of open-source solutions.
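
    The replacement of a centralized SQL backend by a key/value store, as described for the Nova service, can be pictured with a small sketch. This is an illustration only, not code from OpenStack or from the report: the class `KeyValueBackend`, the `table:id` key layout, and the in-memory dict standing in for a distributed store are all assumptions.

```python
import json


class KeyValueBackend:
    """Hypothetical stand-in for a distributed key/value store (plain dict here)."""

    def __init__(self):
        self._store = {}

    def put(self, table, record_id, record):
        # Assumed key layout: "<table>:<id>"; the value is the serialized record.
        self._store[f"{table}:{record_id}"] = json.dumps(record)

    def get(self, table, record_id):
        raw = self._store.get(f"{table}:{record_id}")
        return None if raw is None else json.loads(raw)

    def scan(self, table):
        # Emulates "SELECT * FROM table" with a key-prefix scan.
        prefix = f"{table}:"
        return [json.loads(v) for k, v in self._store.items() if k.startswith(prefix)]


backend = KeyValueBackend()
backend.put("instances", "vm-1", {"host": "node-a", "state": "active"})
backend.put("instances", "vm-2", {"host": "node-b", "state": "stopped"})
print(backend.get("instances", "vm-1"))
print(len(backend.scan("instances")))
```

    A real deployment would also have to replace joins and transactions, which a prefix scan does not cover.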

    A Framework for Turn-Based Local Multiplayer Games

    Mobile devices are present in people's everyday lives and are no longer used purely to communicate. They are now also a means of entertainment, used for listening to music, watching videos, or playing games. Games can be played alone (single-player games) or with other people (multiplayer games), from strangers to family and friends. Local multiplayer games are a popular choice because they bring physically close groups of people together to play and interact. However, there are some concerns to address. Local multiplayer games connect devices, but that alone is not enough to ensure correct gameplay. These games need to distribute the game state between the devices and solve the issues that follow from that: matching players, managing game state (making sure players receive the current state within a reasonable time frame so that the next moves can be performed), and dealing with player inflow and outflow, among other problems. To handle these issues reliably, in this thesis we propose Peppermint, a framework and runtime system for programming local multiplayer games on the mobile edge. It was developed on top of Basil GardenBed, a data storage and dissemination system for the mobile edge developed at NOVA LINCS, which provides communication between devices. The challenges stemming from the games' execution, in turn, are addressed by our framework, which is validated by developing and evaluating one game against a set of functional metrics. The results obtained while testing the framework, mostly in a simulated setting, show that it is able to create and store matches and lets players join, leave, and play in them. It also discards the generated data when a match ends, so that the network does not end up cluttered with data that is no longer accessed. Together, these characteristics form a framework with a set of core features that can be expanded in future work.

    Evaluating the benefits of key-value databases for scientific applications

    The convergence of Big Data applications with High-Performance Computing requires new methodologies to store, manage, and process large amounts of information. Traditional storage solutions are unable to scale, which results in complex coding strategies. For example, the brain atlas of the Human Brain Project faces the challenge of processing large amounts of high-resolution brain images. Given the computing needs, we study the effects of replacing a traditional storage system with a distributed Key-Value database on a cell segmentation application. The original code uses HDF5 files on GPFS through an intricate interface that imposes synchronizations. By contrast, when Apache Cassandra or ScyllaDB is used through Hecuba, the application code is greatly simplified. Thanks to the Key-Value data model, the number of synchronizations is reduced and the time dedicated to I/O scales as the number of nodes increases. This project/research has received funding from the European Union's Horizon 2020 Framework Programme for Research and Innovation under the Specific Grant Agreement No. 720270 (Human Brain Project SGA1) and the Specific Grant Agreement No. 785907 (Human Brain Project SGA2). This work has also been supported by the Spanish Government (SEV2015-0493), by the Spanish Ministry of Science and Innovation (contract TIN2015-65316-P), and by Generalitat de Catalunya (contract 2017-SGR-1414). Postprint (author's final draft)

    PVW: Designing Virtual World Server Infrastructure

    This paper presents a high-level overview of PVW (Partitioned Virtual Worlds), a distributed system architecture for the management of virtual worlds. PVW is designed to support arbitrarily large and complex virtual worlds while accommodating a dynamic and highly variable user population and content distribution density. The PVW approach enables the task of simulating and managing the virtual world to be distributed over many servers by spatially partitioning the environment into a hierarchical structure. This structure is useful both for balancing the simulation load across many nodes and for features such as geometric simplification and distribution of dynamic content.
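
    Hierarchical spatial partitioning of this kind can be sketched with a quadtree. The sketch below is an assumption for illustration; the abstract does not say which structure PVW uses, and the `Region` class, its `capacity` threshold, and its `min_size` guard are hypothetical names. A region splits into four children once it holds more entities than `capacity`, so dense areas end up spread over more leaf regions, each of which could be assigned to a server.

```python
from dataclasses import dataclass, field


@dataclass
class Region:
    """A square region of the world; a leaf region is one unit of server load."""
    x: float
    y: float
    size: float
    capacity: int = 4       # assumed per-region entity limit before splitting
    min_size: float = 1.0   # assumed lower bound so coincident points cannot split forever
    entities: list = field(default_factory=list)
    children: list = field(default_factory=list)

    def contains(self, px, py):
        return self.x <= px < self.x + self.size and self.y <= py < self.y + self.size

    def insert(self, px, py):
        if not self.contains(px, py):
            return False
        if self.children:
            # Delegate to whichever child covers the point.
            return any(child.insert(px, py) for child in self.children)
        self.entities.append((px, py))
        if len(self.entities) > self.capacity and self.size > self.min_size:
            self._split()
        return True

    def _split(self):
        half = self.size / 2
        self.children = [
            Region(self.x + dx, self.y + dy, half, self.capacity, self.min_size)
            for dx in (0, half)
            for dy in (0, half)
        ]
        pending, self.entities = self.entities, []
        for px, py in pending:
            any(child.insert(px, py) for child in self.children)

    def leaves(self):
        if not self.children:
            return [self]
        return [leaf for child in self.children for leaf in child.leaves()]


world = Region(0, 0, 100)
for point in [(10, 10), (20, 20), (30, 30), (40, 40), (60, 60)]:
    world.insert(*point)
print(len(world.leaves()))
```

    Mapping leaves to servers and rebalancing as users move are left out; they are where most of the design effort in a real system would go.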

    On Constructing Persistent Identifiers with Persistent Resolution Targets

    Persistent Identifiers (PIDs) are the foundation for referencing digital assets in scientific publications, books, and digital repositories. In practice, PIDs contain metadata and resolution targets in the form of URLs that point to data sets located on the network. In contrast to PIDs, the target URLs typically change over time; thus, PIDs need continuous maintenance, an effort that is increasing tremendously with the advancement of e-Science and the advent of the Internet of Things (IoT). Nowadays, billions of sensors and data sets are subject to PID assignment. This paper presents a new approach that embeds location-independent targets into PIDs, allowing the creation of maintenance-free PIDs using content-centric network technology and overlay networks. To demonstrate the validity of the approach, the Handle PID System is used in conjunction with Magnet Link access information encoding, state-of-the-art decentralized data distribution with BitTorrent, and Named Data Networking (NDN) as a location-independent data access technology for networks. In contrast to existing approaches, neither a green-field implementation of PIDs nor major modifications of the Handle System are required to enable location-independent data dissemination with maintenance-free PIDs. Comment: Published IEEE paper of the FedCSIS 2016 (SoFAST-WS'16) conference, 11-14 September 2016, Gdansk, Poland. Also available online: http://ieeexplore.ieee.org/document/7733372
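
    The idea of a location-independent resolution target can be illustrated with a Magnet link, which names content by hash rather than by host. The following is a minimal sketch under stated assumptions, not the paper's implementation: `magnet_for` and `pid_record` are hypothetical helpers, the handle value is made up, and a SHA-1 of the raw bytes stands in for a real BitTorrent info-hash (which is computed over a torrent's bencoded info dictionary).

```python
import hashlib
from urllib.parse import urlencode


def magnet_for(content: bytes, name: str) -> str:
    """Build a Magnet URI whose target depends only on the content, not its location."""
    # Assumption: SHA-1 of the raw bytes is used in place of a real info-hash.
    digest = hashlib.sha1(content).hexdigest()
    query = urlencode({"xt": f"urn:btih:{digest}", "dn": name}, safe=":")
    return f"magnet:?{query}"


def pid_record(handle: str, content: bytes, name: str) -> dict:
    """Hypothetical PID record: the target is a Magnet URI instead of an http URL."""
    return {"handle": handle, "target": magnet_for(content, name)}


record = pid_record("20.500.12345/example", b"sensor readings", "readings.csv")
print(record["target"])
```

    Because the target is derived from the content, moving the data set to another host leaves the record unchanged, which is the maintenance-free property the abstract describes.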

    Applying Supernode Architecture for Scalable Multiplayer Computer Game

    Scalability, fast response time, and low cost are of utmost importance in designing a successful massively multiplayer online game. The underlying architecture plays an important role in meeting these conditions. Peer-to-peer architectures have low infrastructure costs and can achieve high scalability thanks to their distributed and collaborative nature. They can also achieve fast response times by creating direct connections between players. However, these architectures face many challenges. The thesis therefore investigates existing peer-to-peer architecture solutions for massively multiplayer online games. The study examines two hybrid architectures. In the first, a supernode approach is used together with a central server. In the second, by contrast, there is no central server and a pure peer-to-peer architecture is deployed. Moreover, the thesis proposes a solution for a massively multiplayer online game based on multicast peer discovery and supernodes. The whole system is also implemented in a simulation, which provides results for future analysis.