55,435 research outputs found

    Storage Solutions for Big Data Systems: A Qualitative Study and Comparison

    Full text link
    Big data systems development is full of challenges in view of the variety of application areas and domains that this technology promises to serve. Typically, fundamental design decisions involved in big data systems design include choosing appropriate storage and computing infrastructures. In this age of heterogeneous systems that integrate different technologies for optimized solution to a specific real world problem, big data system are not an exception to any such rule. As far as the storage aspect of any big data system is concerned, the primary facet in this regard is a storage infrastructure and NoSQL seems to be the right technology that fulfills its requirements. However, every big data application has variable data characteristics and thus, the corresponding data fits into a different data model. This paper presents feature and use case analysis and comparison of the four main data models namely document oriented, key value, graph and wide column. Moreover, a feature analysis of 80 NoSQL solutions has been provided, elaborating on the criteria and points that a developer must consider while making a possible choice. Typically, big data storage needs to communicate with the execution engine and other processing and visualization technologies to create a comprehensive solution. This brings forth second facet of big data storage, big data file formats, into picture. The second half of the research paper compares the advantages, shortcomings and possible use cases of available big data file formats for Hadoop, which is the foundation for most big data computing technologies. Decentralized storage and blockchain are seen as the next generation of big data storage and its challenges and future prospects have also been discussed

    The Hierarchic treatment of marine ecological information from spatial networks of benthic platforms

    Get PDF
    Measuring biodiversity simultaneously in different locations, at different temporal scales, and over wide spatial scales is of strategic importance for the improvement of our understanding of the functioning of marine ecosystems and for the conservation of their biodiversity. Monitoring networks of cabled observatories, along with other docked autonomous systems (e.g., Remotely Operated Vehicles [ROVs], Autonomous Underwater Vehicles [AUVs], and crawlers), are being conceived and established at a spatial scale capable of tracking energy fluxes across benthic and pelagic compartments, as well as across geographic ecotones. At the same time, optoacoustic imaging is sustaining an unprecedented expansion in marine ecological monitoring, enabling the acquisition of new biological and environmental data at an appropriate spatiotemporal scale. At this stage, one of the main problems for an effective application of these technologies is the processing, storage, and treatment of the acquired complex ecological information. Here, we provide a conceptual overview on the technological developments in the multiparametric generation, storage, and automated hierarchic treatment of biological and environmental information required to capture the spatiotemporal complexity of a marine ecosystem. In doing so, we present a pipeline of ecological data acquisition and processing in different steps and prone to automation. We also give an example of population biomass, community richness and biodiversity data computation (as indicators for ecosystem functionality) with an Internet Operated Vehicle (a mobile crawler). Finally, we discuss the software requirements for that automated data processing at the level of cyber-infrastructures with sensor calibration and control, data banking, and ingestion into large data portals.Peer ReviewedPostprint (published version
    • …
    corecore