6,205 research outputs found

    Storage Solutions for Big Data Systems: A Qualitative Study and Comparison

    Full text link
    Big data systems development is full of challenges in view of the variety of application areas and domains that this technology promises to serve. Typically, fundamental design decisions involved in big data systems design include choosing appropriate storage and computing infrastructures. In this age of heterogeneous systems that integrate different technologies for optimized solution to a specific real world problem, big data system are not an exception to any such rule. As far as the storage aspect of any big data system is concerned, the primary facet in this regard is a storage infrastructure and NoSQL seems to be the right technology that fulfills its requirements. However, every big data application has variable data characteristics and thus, the corresponding data fits into a different data model. This paper presents feature and use case analysis and comparison of the four main data models namely document oriented, key value, graph and wide column. Moreover, a feature analysis of 80 NoSQL solutions has been provided, elaborating on the criteria and points that a developer must consider while making a possible choice. Typically, big data storage needs to communicate with the execution engine and other processing and visualization technologies to create a comprehensive solution. This brings forth second facet of big data storage, big data file formats, into picture. The second half of the research paper compares the advantages, shortcomings and possible use cases of available big data file formats for Hadoop, which is the foundation for most big data computing technologies. Decentralized storage and blockchain are seen as the next generation of big data storage and its challenges and future prospects have also been discussed

    Vectorwise: Beyond Column Stores

    Get PDF
    textabstractThis paper tells the story of Vectorwise, a high-performance analytical database system, from multiple perspectives: its history from academic project to commercial product, the evolution of its technical architecture, customer reactions to the product and its future research and development roadmap. One take-away from this story is that the novelty in Vectorwise is much more than just column-storage: it boasts many query processing innovations in its vectorized execution model, and an adaptive mixed row/column data storage model with indexing support tailored to analytical workloads. Another one is that there is a long road from research prototype to commercial product, though database research continues to achieve a strong innovative influence on product development

    A Technology Proposal for a Management Information System for the Director’s Office, NAL.

    Get PDF
    This technology proposal attempts in giving a viable solution for a Management Information System (MIS) for the Director's Office. In today's IT scenario, an Organization's success greatly depends on its ability to get accurate and timely data on its operations of varied nature and to manage this data effectively to guide its activities and meet its goals. To cater to the information needs of an Organization or an Office like the Director's Office, information systems are developed and deployed to gather and process data in ways that produce a variety of information to the end-user. MIS can therefore can be defined as an integrated user-machine system for providing information to support operations, management and decision-making functions in an Organization. The system in a nutshell, utilizes computer hardware and software, manual procedures, models for analysis planning, control and decision-making and a database. Using state-of-the-art front-end and back-end web based tools, this technology proposal attempts to provide a single-point Information Management, Information Storage, Information Querying and Information Retrieval interface to the Director and his office for handling all information traffic flow in and out of the Director's Office

    CaosDB - Research Data Management for Complex, Changing, and Automated Research Workflows

    Full text link
    Here we present CaosDB, a Research Data Management System (RDMS) designed to ensure seamless integration of inhomogeneous data sources and repositories of legacy data. Its primary purpose is the management of data from biomedical sciences, both from simulations and experiments during the complete research data lifecycle. An RDMS for this domain faces particular challenges: Research data arise in huge amounts, from a wide variety of sources, and traverse a highly branched path of further processing. To be accepted by its users, an RDMS must be built around workflows of the scientists and practices and thus support changes in workflow and data structure. Nevertheless it should encourage and support the development and observation of standards and furthermore facilitate the automation of data acquisition and processing with specialized software. The storage data model of an RDMS must reflect these complexities with appropriate semantics and ontologies while offering simple methods for finding, retrieving, and understanding relevant data. We show how CaosDB responds to these challenges and give an overview of the CaosDB Server, its data model and its easy-to-learn CaosDB Query Language. We briefly discuss the status of the implementation, how we currently use CaosDB, and how we plan to use and extend it

    Principles of Multimedia News Systems for Business Applications

    Get PDF
    In the past few years considerable demand for business oriented multimedia information systems has developed. A multimedia information system is one that can create, import, integrate, store, retrieve, edit, and delete two or more types of media materials in digital form, such as audio, image, full-motion video, and text information. Multimedia information systems play a central role in many business activities. They represent a very special class complex computing systems. This paper surveys a special type of multimedia information systems: multimedia news systems. Multimedia news systems deal with architectures to manage complex multimedia news databases, online presentation and distribution services or the integration of several existing services to meta-services using intelligent news retrieval engines. The leading presentation platform in multimedia news presentation is news networks providing television services and Internet content distribution. The primary focus is on advanced multimedia news systems infrastructure, document standards, application architecture and principles for multimedia news on the Web that suggest long-term trends in this increasingly important area.multimedia, information systems, business applications

    RETHINK big: European roadmap for hardware anc networking optimizations for big data

    Get PDF
    This paper discusses the results of the RETHINK big Project, a 2-year Collaborative Support Action funded by the European Commission in order to write the European Roadmap for Hardware and Networking optimizations for Big Data. This industry-driven project was led by the Barcelona Supercomputing Center (BSC), and it included large industry partners, SMEs and academia. The roadmap identifies business opportunities from 89 in-depth interviews with 70 European industry stakeholders in the area of Big Data and predicts the future technologies that will disrupt the state of the art in Big Data processing in terms of hardware and networking optimizations. Moreover, it presents coordinated technology development recommendations (focused on optimizations in networking and hardware) that would be in the best interest of European Big Data companies to undertake in concert as a matter of competitive advantage.This project has received funding from the European Union’s Seventh Framework Programme for research, technological development and demonstration under grant agreement n° 619788. It has also been supported by the Spanish Government (grant SEV2015-0493 of the Severo Ochoa Program), by the Spanish Ministry of Science and Innovation (contract TIN2015-65316) and by Generalitat de Catalunya (contracts 2014-SGR-1051 and 2014-SGR-1272).Peer ReviewedPostprint (author's final draft

    Extract, Transform, and Load data from Legacy Systems to Azure Cloud

    Get PDF
    Internship report presented as partial requirement for obtaining the Master’s degree in Information Management, with a specialization in Knowledge Management and Business IntelligenceIn a world with continuously evolving technologies and hardened competitive markets, organisations need to continually be on guard to grasp cutting edge technology and tools that will help them to surpass any competition that arises. Modern data platforms that incorporate cloud technologies, support organisations to strive and get ahead of their competitors by providing solutions that help them capture and optimally use untapped data, and scalable storages to adapt to ever-growing data quantities. Also, adopt data processing and visualisation tools that help to improve the decision-making process. With many cloud providers available in the market, from small players to major technology corporations, this offers much flexibility to organisations to choose the best cloud technology that will align with their use cases and overall products and services strategy. This internship came up at the time when one of Accenture’s significant client in the financial industry decided to migrate from legacy systems to a cloud-based data infrastructure that is Microsoft Azure cloud. During this internship, development of the data lake, which is a core part of the MDP, was done to understand better the type of challenges that can be faced when migrating data from on-premise legacy systems to a cloud-based infrastructure. Also, provided in this work, are the main recommendations and guidelines when it comes to performing a large scale data migration
    corecore