1,700 research outputs found
Storage Solutions for Big Data Systems: A Qualitative Study and Comparison
Big data systems development is full of challenges in view of the variety of
application areas and domains that this technology promises to serve.
Typically, fundamental design decisions involved in big data systems design
include choosing appropriate storage and computing infrastructures. In this age
of heterogeneous systems that integrate different technologies into an optimized
solution to a specific real-world problem, big data systems are no exception.
For the storage aspect of a big data system, the primary facet is the storage
infrastructure, and NoSQL appears to be the technology that fulfills its
requirements. However,
every big data application has variable data characteristics and thus, the
corresponding data fits into a different data model. This paper presents a
feature and use-case analysis and comparison of the four main data models,
namely document-oriented, key-value, graph and wide-column. Moreover, a feature
analysis of 80 NoSQL solutions is provided, elaborating on the criteria
and points that a developer must consider when choosing among them.
Typically, big data storage needs to communicate with the execution engine and
other processing and visualization technologies to create a comprehensive
solution. This brings the second facet of big data storage, big data file
formats, into the picture. The second half of the research paper compares the
advantages, shortcomings and possible use cases of available big data file
formats for Hadoop, which is the foundation for most big data computing
technologies. Decentralized storage and blockchain are seen as the next
generation of big data storage, and their challenges and future prospects are
also discussed.
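As a rough illustration of the four data models named in the abstract above, the sketch below stores the same fact ("user 42 follows user 7") in each model, using plain Python containers as stand-ins for real stores. The record layout, the key format `user:42` and the helper `followed_by` are assumptions made for this example; they are not taken from the paper or from any particular NoSQL product.

```python
import json

# Key-value: the store sees only an opaque value behind one key.
key_value = {"user:42": '{"name": "Ada", "follows": [7]}'}

# Document-oriented: the store understands the nested structure.
document = {"_id": 42, "name": "Ada", "follows": [7]}

# Wide-column: row key -> column family -> column -> value.
wide_column = {"42": {"profile": {"name": "Ada"}, "follows": {"7": True}}}

# Graph: nodes and typed edges are first-class.
graph = {
    "nodes": {42: {"name": "Ada"}, 7: {"name": "Alan"}},
    "edges": [(42, "FOLLOWS", 7)],
}


def followed_by(user_id):
    """Answer 'whom does user_id follow?' once per data model."""
    return {
        # The application must parse the opaque value itself.
        "key_value": json.loads(key_value[f"user:{user_id}"])["follows"],
        # The field can be addressed directly.
        "document": document["follows"] if document["_id"] == user_id else [],
        # Each followed user is a column name in the 'follows' family.
        "wide_column": [int(c) for c in wide_column[str(user_id)]["follows"]],
        # The answer is a one-hop traversal over FOLLOWS edges.
        "graph": [dst for src, rel, dst in graph["edges"]
                  if src == user_id and rel == "FOLLOWS"],
    }
```

The same question is answered by parsing a value, reading a field, listing columns, or traversing an edge, which is the kind of difference a developer weighs when matching data characteristics to a model.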
FutureGRID: A Program for long-term research into GRID systems architecture
Proceedings of the 2003 UK e-Science All Hands Meeting, 31st August - 3rd September, Nottingham, UK
This is a project to carry out research into long-term GRID architecture, in the University of Cambridge
Computer Laboratory and the Cambridge eScience Center, with support from the Microsoft Research
Laboratory, Cambridge.
It is part of a larger vision for future systems architectures for public computing platforms, including
both scientific GRID and commodity-level computing such as games, peer-to-peer computing and storage
services, based on work in the laboratories in recent years on massively scalable distributed systems for storage, computation, content distribution and collaboration [26].
A Ring to Rule Them All - Revising OpenStack Internals to Operate Massively Distributed Clouds: The Discovery Initiative - Where Do We Are ?
The deployment of micro/nano data centers in network points of presence offers an opportunity to deliver a more sustainable and efficient infrastructure for Cloud Computing. Among the challenges that must be addressed to favor the adoption of such a model, a critical one is the development of a system that turns such a complex and diverse network of resources into a collection of abstracted computing facilities that are convenient to administer and use. In this report, we introduce the premises of such a system. The novelty of our work is that, instead of developing a system from scratch, we revised the OpenStack solution in order to operate such an infrastructure in a distributed manner, leveraging P2P mechanisms. More precisely, we describe how we revised the Nova service by leveraging a distributed key/value store instead of the centralized SQL backend. We present experiments that validate the correct behavior of our prototype and show promising performance on several clusters composed of servers of the Grid'5000 testbed. We believe that such a strategy is promising and paves the way to a first large-scale and WAN-wide IaaS manager.
The current trend for meeting the growing demand for utility computing is to build ever larger data centers in a limited number of strategic locations. This approach undoubtedly satisfies current demand while keeping the management of these resources centralized, but it remains far from able to provide infrastructures that meet current and future constraints in terms of efficiency, jurisdiction or sustainability.
The goal of the DISCOVERY initiative is to design the LUC OS, a distributed resource management system that will make it possible to take advantage of any network node forming the Internet backbone in order to provide a new generation of utility computing, better able to take into account the geographical dispersion of users and their ever-growing demand. After recalling the objectives of the DISCOVERY initiative and explaining why federation-style approaches are not suited to operating a utility computing infrastructure integrated into the network, we present the premises of our system. In particular, we explain why and how we chose to begin work on revisiting the design of the OpenStack solution. In our view, building our work on this solution is a sound strategy given the complexity of IaaS platform management systems and the velocity of open-source solutions.
A Framework for Turn-Based Local Multiplayer Games
Mobile devices are present in people's everyday lives and are no longer a tool
used purely to communicate. They are now also used for entertainment:
listening to music, watching videos or playing games. Games
can be played alone (single-player games) or with other people (multiplayer games), from
strangers to family and friends. Local multiplayer games are a popular choice because
they bring groups of physically close people together to play and interact.
However, there are some concerns to address. Local multiplayer games connect devices, but that alone isn't enough to ensure correct gameplay. These games need to
distribute the game state between the devices and solve the issues that ensue.
These include matching players, managing game state (making sure players get the current state within a reasonable time frame so that the next moves can be performed), dealing
with player inflow and outflow, among other problems.
To reliably handle the aforementioned issues, in this thesis we propose Peppermint, a
framework and runtime system for programming local multiplayer games on the mobile edge.
It was developed on top of Basil GardenBed, a data storage and dissemination system
for the mobile edge developed at NOVA LINCS, which provides communication between
devices. The challenges stemming from the games' execution are, in turn,
addressed by our framework, which is validated by the development and evaluation of
one game according to a set of functional metrics.
The results obtained while testing our framework, mostly in a simulated setting,
show that the framework is able to create and store matches, letting players join, leave
and play in them. It also discards the generated data when the match ends, so that the
network doesn't end up cluttered with data that is no longer accessed. These
characteristics constitute a framework with a set of core features that can be expanded in
future work.
Mobile devices are present in people's everyday lives and are no longer used only
to communicate. They are now also used for entertainment, letting people listen to
music, watch videos or play games. Games can be for a single player or can be played
by several people (multiplayer games), from strangers to family and friends. Local
multiplayer games are a popular choice because they allow groups of physically close
people to get together and interact.
However, there are problems to solve. Local multiplayer games connect devices, but
that alone is not enough to guarantee their correctness. The games need to distribute
their state among the devices and resolve the issues that follow from that. These
include grouping players, managing the game state (ensuring that players receive the
most recent state in a timely manner so that the next moves can be made), handling
the flow of players, among other problems.
To solve these problems, in this thesis we present Peppermint, a framework and
runtime system for implementing local multiplayer games on devices connected to a
network at the mobile edge. It was developed on top of Basil GardenBed, a data storage
and dissemination system for the mobile edge developed at NOVA LINCS, which provides
communication between devices. The challenges arising from the games' execution, in
turn, are addressed by our framework and validated through the development and
evaluation of one game according to a set of metrics on its operation.
The results, obtained predominantly in a simulated environment, show that the
framework can create and store matches, letting other players join, leave and play.
It also deletes the data created when matches end, so that the network does not fill
up with data that will no longer be accessed. Together this forms a framework with a
set of basic features that can be expanded in future work.
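The match lifecycle described in this abstract (create and store a match, let players join, leave and play, discard the data when the match ends) can be sketched as below. This is a minimal single-process illustration under stated assumptions: the names `Match` and `MatchStore`, the version counter used to reject moves made against stale state, and the rule that a match ends when its last player leaves are all hypothetical and are not Peppermint's actual API.

```python
from dataclasses import dataclass, field


@dataclass
class Match:
    match_id: str
    players: list = field(default_factory=list)
    state: dict = field(default_factory=dict)
    version: int = 0   # incremented on every accepted move
    turn: int = 0      # index into players of whoever moves next


class MatchStore:
    """Hypothetical in-memory stand-in for a replicated match store."""

    def __init__(self):
        self._matches = {}

    def __contains__(self, match_id):
        return match_id in self._matches

    def create(self, match_id, host):
        self._matches[match_id] = Match(match_id, [host])
        return self._matches[match_id]

    def join(self, match_id, player):
        match = self._matches[match_id]
        if player not in match.players:
            match.players.append(player)

    def leave(self, match_id, player):
        match = self._matches[match_id]
        idx = match.players.index(player)
        match.players.remove(player)
        if not match.players:
            self.end(match_id)      # assumption: an empty match is over
            return
        if idx < match.turn:
            match.turn -= 1         # keep pointing at the same next player
        match.turn %= len(match.players)

    def play(self, match_id, player, seen_version, update):
        """Accept a move only from the current player on the current state."""
        match = self._matches[match_id]
        if match.players[match.turn] != player or seen_version != match.version:
            return False
        match.state.update(update)
        match.version += 1
        match.turn = (match.turn + 1) % len(match.players)
        return True

    def end(self, match_id):
        # Discard match data so it does not linger once nobody accesses it.
        self._matches.pop(match_id, None)
```

Checking `seen_version` is one simple way to make sure a player moved on the current state; a distributed implementation would additionally have to propagate state between devices, which this sketch does not attempt.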
Evaluating the benefits of key-value databases for scientific applications
The convergence of Big Data applications with High-Performance Computing requires new methodologies to store, manage and process large amounts of information. Traditional storage solutions are unable to scale, which results in complex coding strategies. For example, the brain atlas of the Human Brain Project faces the challenge of processing large amounts of high-resolution brain images. Given the computing needs, we study the effects of replacing a traditional storage system with a distributed Key-Value database on a cell segmentation application. The original code uses HDF5 files on GPFS through an intricate interface, imposing synchronizations. On the other hand, by using Apache Cassandra or ScyllaDB through Hecuba, the application code is greatly simplified. Thanks to the Key-Value data model, the number of synchronizations is reduced and the time dedicated to I/O scales when increasing the number of nodes.
This project/research has received funding from the European Union's Horizon
2020 Framework Programme for Research and Innovation under the Specific
Grant Agreement No. 720270 (Human Brain Project SGA1) and the Specific
Grant Agreement No. 785907 (Human Brain Project SGA2). This work has also
been supported by the Spanish Government (SEV2015-0493), by the Spanish
Ministry of Science and Innovation (contract TIN2015-65316-P), and by Generalitat
de Catalunya (contract 2017-SGR-1414).
Postprint (author's final draft)
PVW: Designing Virtual World Server Infrastructure
This paper presents a high-level overview of PVW (Partitioned Virtual Worlds), a distributed system architecture for the management of virtual worlds. PVW is designed to support arbitrarily large and complex virtual worlds while accommodating a dynamic and highly variable user population and content distribution density. The PVW approach enables the task of simulating and managing the virtual world to be distributed over many servers by spatially partitioning the environment into a hierarchical structure. This structure is useful both for balancing the simulation load across many nodes and for features such as geometric simplification and distribution of dynamic content.
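One common way to partition a space hierarchically is a quadtree, sketched below for a square 2D world. The abstract does not say which structure PVW uses, so the quadtree itself, the `Region` class, the split threshold `capacity`, the minimum region size and the round-robin `assign` helper are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Region:
    x: float                 # lower-left corner
    y: float
    size: float              # side length of the square region
    capacity: int = 4        # split once a leaf holds more than this
    entities: list = field(default_factory=list)
    children: list = field(default_factory=list)

    def insert(self, ex, ey):
        if self.children:
            self._child_for(ex, ey).insert(ex, ey)
            return
        self.entities.append((ex, ey))
        # The size guard stops endless splitting on coincident points.
        if len(self.entities) > self.capacity and self.size > 1.0:
            self._split()

    def _split(self):
        half = self.size / 2
        # Child order is row-major: index = row * 2 + col.
        self.children = [
            Region(self.x + dx * half, self.y + dy * half, half, self.capacity)
            for dy in (0, 1) for dx in (0, 1)
        ]
        pending, self.entities = self.entities, []
        for ex, ey in pending:
            self._child_for(ex, ey).insert(ex, ey)

    def _child_for(self, ex, ey):
        half = self.size / 2
        col = 1 if ex >= self.x + half else 0
        row = 1 if ey >= self.y + half else 0
        return self.children[row * 2 + col]

    def leaves(self):
        if not self.children:
            return [self]
        return [leaf for child in self.children for leaf in child.leaves()]


def assign(root, servers):
    """Map each leaf region to a server (round-robin as a placeholder policy)."""
    return {(leaf.x, leaf.y, leaf.size): servers[i % len(servers)]
            for i, leaf in enumerate(root.leaves())}
```

Dense areas split into smaller regions, so a crowded part of the world ends up spread over more leaves (and potentially more servers) than an empty one; a real system would replace the round-robin policy with one driven by measured load.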
On Constructing Persistent Identifiers with Persistent Resolution Targets
Persistent Identifiers (PIDs) are the foundation for referencing digital assets in
scientific publications, books, and digital repositories. In their realization,
PIDs contain metadata and resolving targets in the form of URLs that point to data
sets located on the network. In contrast to PIDs, the target URLs typically
change over time; thus, PIDs need continuous maintenance -- an effort that is
increasing tremendously with the advancement of e-Science and the advent of the
Internet-of-Things (IoT). Nowadays, billions of sensors and data sets are
subject to PID assignment. This paper presents a new approach of embedding
location independent targets into PIDs that allows the creation of
maintenance-free PIDs using content-centric network technology and overlay
networks. For proving the validity of the presented approach, the Handle PID
System is used in conjunction with Magnet Link access information encoding,
state-of-the-art decentralized data distribution with BitTorrent, and Named
Data Networking (NDN) as location-independent data access technology for
networks. In contrast to existing approaches, no green-field implementation of PIDs
or major modification of the Handle System is required to enable
location-independent data dissemination with maintenance-free PIDs.
Comment: Published IEEE paper of the FedCSIS 2016 (SoFAST-WS'16) conference,
11.-14. September 2016, Gdansk, Poland. Also available online:
http://ieeexplore.ieee.org/document/7733372
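The idea of a location-independent target can be illustrated with a content-derived identifier: the target names the data by its hash rather than by a host, so it stays valid wherever a copy is stored. The sketch below is a simplified stand-in, not the paper's method. The `urn:sha256:` form, the helpers `make_target` and `resolve`, and the PID `21.T12345/example` are hypothetical; real BitTorrent magnet links carry a `urn:btih:` info-hash computed over the torrent's bencoded info dictionary, not over the raw file.

```python
import hashlib
from urllib.parse import parse_qs, urlencode, urlparse


def make_target(data: bytes, name: str) -> str:
    """Build a magnet-style URI that names the data by its SHA-256 digest."""
    digest = hashlib.sha256(data).hexdigest()
    query = urlencode({"xt": f"urn:sha256:{digest}", "dn": name}, safe=":")
    return "magnet:?" + query


def resolve(target: str, replicas):
    """Return the first replica whose content matches the target's digest."""
    query = parse_qs(urlparse(target).query)
    wanted = query["xt"][0].rsplit(":", 1)[1]
    for blob in replicas:
        if hashlib.sha256(blob).hexdigest() == wanted:
            return blob
    return None


# Hypothetical PID record: the target never needs updating when copies move.
dataset = b"temperature,21.5\n"
record = {"pid": "21.T12345/example", "target": make_target(dataset, "t.csv")}
```

Because the target is checked against the content itself, any replica can serve the data and a tampered or unrelated copy is rejected, which is what removes the need to maintain a URL as hosts change.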
Applying Supernode Architecture for Scalable Multiplayer Computer Game
System scalability, fast response time and low cost are important attributes to take into account when building massively multiplayer online games. Architecture plays a major role in such systems. Peer-to-peer architectures are inexpensive and can grow incrementally thanks to their distributed and cooperative nature. They can also respond quickly thanks to direct connections between players. However, such architectures have several problems. This thesis examines existing peer-to-peer solutions for massively multiplayer online games. It also examines two hybrid architectures: the first uses supernodes together with a central connection point, while the second relies on a network-branch connection point and has no central connection point. In addition, the thesis presents a supernode solution for massively multiplayer online games based on the multicast principle. To enable future analyses, the whole system has been implemented as a simulation.
Scalability, fast response time and low cost are of utmost importance in designing a successful massively multiplayer online game. The underlying architecture plays an important role in meeting these conditions. Peer-to-peer architectures have low infrastructure costs and can achieve high scalability due to their distributed and collaborative nature. They can also achieve fast response times by creating direct connections between players. However, these architectures face many challenges. Therefore, the paper investigates existing peer-to-peer architecture solutions for massively multiplayer online games. The study examines two hybrid architectures. In the first, a supernode approach is used with a central server. In contrast, the second has no central server and deploys a pure peer-to-peer architecture.
Moreover, the thesis proposes a solution based on multicast peer discovery and supernodes for a massively multiplayer online game. The whole system is also implemented as a simulation, which provides results for future analysis.