    Integrating data warehouses with web data : a survey

    This paper surveys the most relevant research on combining Data Warehouse (DW) and Web data. It studies the XML technologies currently used to integrate, store, query, and retrieve Web data, and their application to DWs. The paper reviews different distributed DW architectures and the use of XML languages as an integration tool in these systems. It also introduces the problem of dealing with semistructured data in a DW. It studies Web data repositories, the design of multidimensional databases for XML data sources, and the XML extensions of OnLine Analytical Processing techniques. The paper addresses the application of information retrieval technology in a DW to exploit text-rich document collections. The authors hope that the paper will help to discover the main limitations and opportunities offered by the combination of the DW and Web fields, as well as to identify open research lines.

    Business Intelligence for Small and Middle-Sized Entreprises

    Data warehouses are the core of decision support systems, which nowadays are used by all kinds of enterprises in the entire world. Although many studies have been conducted on the need of decision support systems (DSSs) for small businesses, most of them adopt existing solutions and approaches, which are appropriate for large-scale enterprises, but are inadequate for small and middle-sized enterprises. Small enterprises require cheap, lightweight architectures and tools (hardware and software) providing online data analysis. In order to ensure these features, we review web-based business intelligence approaches. For real-time analysis, the traditional OLAP architecture is cumbersome and storage-costly; therefore, we also review in-memory processing. Consequently, this paper discusses the existing approaches and tools working in main memory and/or with web interfaces (including freeware tools), relevant for small and middle-sized enterprises in decision making.

    Scalable Model-Based Management of Correlated Dimensional Time Series in ModelarDB+

    To monitor critical infrastructure, high quality sensors sampled at a high frequency are increasingly used. However, as they produce huge amounts of data, only simple aggregates are stored. This removes outliers and fluctuations that could indicate problems. As a remedy, we present a model-based approach for managing time series with dimensions that exploits correlation in and among time series. Specifically, we propose compressing groups of correlated time series using an extensible set of model types within a user-defined error bound (possibly zero). We name this new category of model-based compression methods for time series Multi-Model Group Compression (MMGC). We present the first MMGC method GOLEMM and extend model types to compress time series groups. We propose primitives for users to effectively define groups for differently sized data sets, and based on these, an automated grouping method using only the time series dimensions. We propose algorithms for executing simple and multi-dimensional aggregate queries on models. Last, we implement our methods in the Time Series Management System (TSMS) ModelarDB (ModelarDB+). Our evaluation shows that compared to widely used formats, ModelarDB+ provides up to 13.7 times faster ingestion due to high compression, 113 times better compression due to the adaptivity of GOLEMM, 630 times faster aggregates by using models, and close to linear scalability. It is also extensible and supports online query processing. Comment: 12 Pages, 28 Figures, and 1 Table
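As a rough illustration of the idea of compressing a time series with a model under a user-defined error bound and then answering an aggregate from the models alone, here is a minimal Python sketch. It is not GOLEMM and not ModelarDB+ code: it uses a single, assumed model type (one constant per segment, chosen as the midpoint of the segment's value range), handles one series rather than a group, and the names `Segment`, `compress`, and `approximate_sum` are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    start: int    # index of the first point the segment covers
    length: int   # number of consecutive points covered
    value: float  # constant that stands in for every covered point


def compress(values: List[float], error_bound: float) -> List[Segment]:
    """Greedily cover the series with constant-value segments.

    Assumption for this sketch: the error bound is absolute, so every
    original point must lie within error_bound of its segment constant.
    A segment keeps growing while (max - min) <= 2 * error_bound, which
    guarantees the midpoint is within the bound of every covered point.
    """
    segments: List[Segment] = []
    if not values:
        return segments
    start = 0
    lo = hi = values[0]
    for i in range(1, len(values)):
        new_lo, new_hi = min(lo, values[i]), max(hi, values[i])
        if new_hi - new_lo > 2 * error_bound:
            # The new point does not fit: close the segment, start a new one.
            segments.append(Segment(start, i - start, (lo + hi) / 2))
            start, lo, hi = i, values[i], values[i]
        else:
            lo, hi = new_lo, new_hi
    segments.append(Segment(start, len(values) - start, (lo + hi) / 2))
    return segments


def approximate_sum(segments: List[Segment]) -> float:
    """Answer SUM from the models only, without reconstructing points."""
    return sum(s.length * s.value for s in segments)


readings = [10.0, 10.5, 11.0, 20.0, 20.0, 19.0]
models = compress(readings, error_bound=0.5)
print(len(models), approximate_sum(models))
```

With an error bound of zero this sketch only merges runs of identical values, so it becomes lossless; the SUM computed from the models can differ from the true sum by at most the number of points times the error bound.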

    CFBM - A Framework for Data Driven Approach in Agent-Based Modeling and Simulation

    Recently, there has been a shift from a modeling-driven approach to a data-driven approach in Agent-Based Modeling and Simulation (ABMS). This trend towards data-driven approaches in simulation aims at feeding more and more of the data available from observation systems into simulation models [1, 2]. In a data-driven approach, the empirical data collected from the target system are used not only for the design of the simulation models but also for initialization and for evaluation of the output of the simulation platform. That raises the question of how to manage empirical data and simulation data, and how to compare those data, in such an agent-based simulation platform. In this paper, we first introduce a logical framework for the data-driven approach in agent-based modeling and simulation. The introduced framework is based on the combination of a Business Intelligence solution and a multi-agent based platform, and is called CFBM (Combination Framework of Business intelligence and Multi-agent based platform). Secondly, we demonstrate the application of CFBM to the data-driven approach via the development of Brown Plant Hopper Surveillance Models (BSMs), where CFBM is used not only to manage and integrate all the empirical data collected from the target system and the data produced by the simulation model, but also to initialize and validate the models. The successful development of CFBM not only remedies the limitation of agent-based modeling and simulation with regard to data management but also addresses the development of complex simulation systems with large amounts of input and output data supporting a data-driven approach.

    Relatório de Estágio - Solução de BI Roaming Data Science (RoaDS) em ambiente Vodafone

    A telecom company (Vodafone) needed to implement a Business Intelligence solution for Roaming data across a wide set of different data sources. Based on the data visualization of this solution, its key users with decision power can analyze the business and assess the need for infrastructure and software expansion. This document aims to present the scientific papers produced during the various stages of production of the solution (state of the art, architecture design, and implementation results). This Business Intelligence solution was designed and implemented with OLAP methodologies and technologies in a Data Warehouse composed of Data Marts arranged in a constellation; the visualization layer was custom made in JavaScript (VueJS). As a basis for the results, a questionnaire was created to be filled in by the key users of the solution. Based on this questionnaire, it was possible to ascertain that user acceptance was satisfactory. The proposed objectives for the implementation of the BI solution, with all its requirements, were achieved, with the infrastructure itself created from scratch in Kubernetes. This BI platform can be expanded using column storage databases created specifically with OLAP workloads in mind, removing the need for an OLAP cube layer. Based on Machine Learning algorithms, the platform will be able to perform the predictions needed to make decisions about Vodafone's Roaming infrastructure.

    A Holistic Approach to OLAP Sessions Composition: The Falseto Experience

    OLAP is the main paradigm for flexible and effective exploration of multidimensional cubes in data warehouses. During an OLAP session the user analyzes the results of a query and determines a new query that will give her a better understanding of information. Given the huge size of the data space, this exploration process is often tedious and may leave the user disoriented and frustrated. This paper presents an OLAP tool named Falseto (Former AnalyticaL Sessions for lEss Tedious Olap) that is meant to assist query and session composition by letting the user summarize, browse, query, and reuse former analytical sessions. Falseto's implementation on top of a formal framework is detailed. We also report the experiments we ran to obtain and analyze real OLAP sessions and assess Falseto with them. Finally, we discuss how Falseto can be seen as a starting point for bridging OLAP with exploratory search, a search paradigm centered on the user and the evolution of her knowledge.