22,120 research outputs found

    Building data warehouses in the era of big data: an approach for scalable and flexible big data warehouses

    During the last few years, the concept of Big Data Warehousing gained significant attention from the scientific community, highlighting the need to make design changes to the traditional Data Warehouse (DW) due to its limitations, in order to achieve new characteristics relevant in Big Data contexts (e.g., scalability on commodity hardware, real-time performance, and flexible storage). The state-of-the-art in Big Data Warehousing reflects the young age of the concept, as well as ambiguity and the lack of common approaches to build Big Data Warehouses (BDWs). Consequently, an approach to design and implement these complex systems is of major relevance to business analytics researchers and practitioners. This tutorial targets the design and implementation of BDWs, presenting a general approach that researchers and practitioners can follow in their Big Data Warehousing projects, and exploring several demonstration cases focusing on system design and data modelling examples in areas such as smart cities, retail, finance, and manufacturing, among others.

    Improving lifecycle query in integrated toolchains using linked data and MQTT-based data warehousing

    The development of increasingly complex IoT systems requires large engineering environments. These environments generally consist of tools from different vendors and are not necessarily integrated well with each other. In order to automate various analyses, queries across resources from multiple tools have to be executed in parallel to the engineering activities. In this paper, we identify the necessary requirements on such a query capability and evaluate different architectures according to these requirements. We propose an improved lifecycle query architecture, which builds upon the existing Tracked Resource Set (TRS) protocol, and complements it with the MQTT messaging protocol in order to allow the data in the warehouse to be kept updated in real-time. As part of the case study focusing on the development of an IoT automated warehouse, this architecture was implemented for a toolchain integrated using RESTful microservices and linked data. Comment: 12 pages, worksho

    A unified view of data-intensive flows in business intelligence systems : a survey

    Data-intensive flows are central processes in today’s business intelligence (BI) systems, deploying different technologies to deliver data, from a multitude of data sources, in user-preferred and analysis-ready formats. To meet complex requirements of next-generation BI systems, we often need an effective combination of the traditionally batched extract-transform-load (ETL) processes that populate a data warehouse (DW) from integrated data sources, and more real-time and operational data flows that integrate source data at runtime. Both academia and industry thus must have a clear understanding of the foundations of data-intensive flows and the challenges of moving towards next-generation BI environments. In this paper we present a survey of today’s research on data-intensive flows and the related fundamental fields of database theory. The study is based on a proposed set of dimensions describing the important challenges of data-intensive flows in the next-generation BI setting. As a result of this survey, we envision an architecture of a system for managing the lifecycle of data-intensive flows. The results further provide a comprehensive understanding of data-intensive flows, recognizing challenges that are still to be addressed, and how the current solutions can be applied for addressing these challenges. Peer Reviewed. Postprint (author's final draft)