8 research outputs found

    Enabling instant- and interval-based semantics in multidimensional data models: the T+MultiDim Model

    Time is a vital facet of every human activity. Data warehouses, which are huge repositories of historical information, must provide analysts with rich mechanisms for managing the temporal aspects of information. In this paper, we (i) propose T+MultiDim, a multidimensional conceptual data model enabling both instant- and interval-based semantics over temporal dimensions, and (ii) provide suitable OLAP (On-Line Analytical Processing) operators for querying temporal information. T+MultiDim supports the design of the typical concepts of a data warehouse, including temporal dimensions, and adds the possibility of conceptually connecting different temporal dimensions in order to exploit temporally aggregated data. The proposed approach makes it possible to specify and evaluate powerful OLAP queries over information from data warehouses. In particular, we define a set of OLAP operators to deal with interval-based temporal data. These operators allow the user to derive new measure values associated with different intervals or instants, according to different temporal semantics. Moreover, using examples from the healthcare domain, we propose and discuss the SQL specification of all the temporal OLAP operators we define. (C) 2019 Elsevier Inc. All rights reserved.
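As a rough illustration of what "deriving new measure values for different intervals according to different temporal semantics" can mean, the Python sketch below restricts an interval-based fact to a query interval under two semantics. It is not the T+MultiDim operator set or its SQL specification: the names `IntervalFact` and `restrict` and the two semantics labels (`"constant"`, `"cumulative"`) are assumptions made only for this example.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class IntervalFact:
    """Hypothetical fact: a measure value attached to the closed day interval [start, end]."""
    start: date
    end: date
    value: float


def days(start: date, end: date) -> int:
    """Number of days in the closed interval [start, end]."""
    return (end - start).days + 1


def restrict(fact: IntervalFact, start: date, end: date, semantics: str) -> Optional[float]:
    """Derive the measure for the part of `fact` that overlaps [start, end].

    "constant":   the value holds at every instant of the interval,
                  so the overlapping part keeps the same value.
    "cumulative": the value is a total for the whole interval,
                  so it is split in proportion to the overlap length.
    Returns None when the two intervals do not overlap.
    """
    lo = max(fact.start, start)
    hi = min(fact.end, end)
    if lo > hi:
        return None
    if semantics == "constant":
        return fact.value
    if semantics == "cumulative":
        return fact.value * days(lo, hi) / days(fact.start, fact.end)
    raise ValueError(f"unknown semantics: {semantics!r}")


# Assumed healthcare-style example: a 10-day therapy with a measure of 100.0.
therapy = IntervalFact(date(2019, 1, 1), date(2019, 1, 10), 100.0)
window = (date(2019, 1, 6), date(2019, 1, 15))  # overlaps the last 5 days

as_rate = restrict(therapy, *window, semantics="constant")
as_total = restrict(therapy, *window, semantics="cumulative")
```

Under the assumed "constant" reading the measure is unchanged on the overlap, while under the assumed "cumulative" reading half of the total is attributed to the five overlapping days.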

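    Agricultura biológica em Portugal: a importância da utilização de ferramentas de business intelligence na integração e visualização de dados [Organic farming in Portugal: the importance of using business intelligence tools for data integration and visualization]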

    Project Work presented as the partial requirement for obtaining a Master's degree in Information Management, specialization in Knowledge Management and Business Intelligence.
    This project report demonstrates a practical application: the implementation of a new data management system for the Direção-Geral da Agricultura e Desenvolvimento Rural (DGADR), specifically for the Observatório Nacional de Produção Biológica (ONPB). The system is a Data Warehouse, that is, an integrated organization of data designed to optimize its analysis. The ONPB plays an important role in Portuguese agriculture, since its main purpose is to collect, process and disseminate the available information on organic production. Implementing a Data Warehouse over the existing operational data collection, together with an extraction, transformation and loading (ETL) process, will make the handling of the collected information more dynamic and make it possible to obtain competitive advantages. The expected improvements include a better data collection process, higher data quality, and the potential to build analytical information systems that serve as decision support systems for users.

    Data analysis para la gestión del eje de investigación en La Universidad Técnica del Norte [Data analysis for managing the research axis at the Universidad Técnica del Norte]

    The aim is to develop a decision-making (BI) software tool for managing the research axis of the UTN. In higher education institutions, process automation is a basic strategy for remaining within their sphere of influence and for delivering their products or services. In this context, day-to-day transactional data becomes an important asset for companies and institutions, so implementations of Business Intelligence (BI) systems for decision-making are increasingly common. This research aims to adapt the concept of the E-portfolio as a means of integrating technological resources and services that supports management and decision-making for the research axis of the Universidad Técnica del Norte (UTN). The research took a qualitative, descriptive approach, with literature reviews of the main concepts of the data analysis process. To gather information, the stakeholders involved were interviewed in order to obtain the business model and its key indicators for the management and decision-making of the UTN research axis. Implementing the E-portfolio improved the management of the research axis by presenting information aligned with the strategic objectives, and integrating Business Intelligence made it possible to transform existing data into knowledge oriented to the business model for decision-making.

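    Uma abordagem baseada em métricas para explorar alternativas de esquemas de dados no processo de conversão de RDB para NoSQL [A metrics-based approach to exploring alternative data schemas in the RDB-to-NoSQL conversion process]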

    Advisor: Prof. Dr. Leticia Mara Peres. Co-advisor: Prof. Dr. Marcos Didonet Del Fabro. Thesis (doctorate), Universidade Federal do Paraná, Setor de Ciências Exatas, Programa de Pós-Graduação em Informática. Defense: Curitiba, 21/10/2020. Includes references: p. 102-105. Area of concentration: Computer Science.
    Abstract: With the emergence of new applications, new requirements on storage systems have also emerged. Scenarios involving structured, semi-structured and unstructured data are increasingly common. Relational databases (RDB), widely used to store data from different applications, no longer adequately address all the issues imposed by these scenarios. As an alternative, NoSQL (Not only SQL) databases have emerged, which are flexible with respect to the data model and designed to provide high scalability and availability. Relational and NoSQL databases will coexist for a long time and, as a consequence, new approaches to converting the relational model to NoSQL data models have been proposed. However, most of these approaches target a specific NoSQL data model and provide little support for customizing the conversion process, such as the selection of fields, tables and instances, and other aspects of customizing the data schema produced.
    In addition, there are several ways to structure the data (that is, to define data schemas) when converting RDB to NoSQL. Choosing an appropriate data schema is not trivial and involves several aspects, such as the data access pattern, the desired level of data redundancy, the size of the resulting NoSQL database, and the application maintenance effort, among others. This thesis defines an approach to convert and migrate relational data to document-oriented and column-family NoSQL models, which includes a step for evaluating candidate NoSQL schemas. The approach uses directed acyclic graphs (DAGs) to specify the structure of the entities that will be migrated to the NoSQL data model and also to represent the application's access pattern (queries). To evaluate candidate schemas, a set of metrics and scores was defined, which aims to measure the coverage of the NoSQL schema in relation to the set of queries. Because both the NoSQL schema and the queries are defined through DAGs, evaluations and comparisons can be performed objectively. To evaluate the approach, we performed experiments involving RDB to NoSQL conversion scenarios composed of different candidate NoSQL schemas. The results showed that the approach is effective at identifying scenarios in which implementing the queries requires greater effort, assisting the user in selecting NoSQL schemas before executing the data migration. Keywords: Data transformation. Relational databases. NoSQL databases. Database conversion. Metrics. Evaluation.
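The idea of scoring a candidate schema by how well it covers the queries, when both are expressed as DAGs, can be sketched as follows. The abstract does not give the thesis's actual metrics, so the edge-overlap ratio below is only an illustrative stand-in, and the names `coverage`, `score`, and the example entities are hypothetical.

```python
from typing import Dict, List, Set, Tuple

# A DAG is given as parent entity -> set of entities nested under it (assumed encoding).
Dag = Dict[str, Set[str]]


def edges(dag: Dag) -> Set[Tuple[str, str]]:
    """All (parent, child) edges of a DAG."""
    return {(parent, child) for parent, children in dag.items() for child in children}


def coverage(schema: Dag, query: Dag) -> float:
    """Fraction of the query's edges that the schema provides directly.

    1.0 means every parent-to-child navigation the query needs is already
    present in the schema; lower values suggest more implementation effort.
    """
    needed = edges(query)
    if not needed:
        return 1.0
    return len(needed & edges(schema)) / len(needed)


def score(schema: Dag, queries: List[Dag]) -> float:
    """Mean coverage of a candidate schema over a set of queries."""
    return sum(coverage(schema, q) for q in queries) / len(queries)


# Two hypothetical candidate schemas for the same relational source.
schema_a: Dag = {"orders": {"customers", "order_lines"}, "order_lines": {"products"}}
schema_b: Dag = {"customers": {"orders"}}

# Hypothetical access pattern of the application.
queries: List[Dag] = [
    {"orders": {"customers"}},
    {"orders": {"order_lines"}, "order_lines": {"products"}},
    {"customers": {"orders"}},
]

score_a = score(schema_a, queries)
score_b = score(schema_b, queries)
best = "schema_a" if score_a >= score_b else "schema_b"
```

In this toy setup the first candidate covers two of the three queries completely and the second covers only one, so the first would be preferred before any data is migrated.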

    MLED_BI: A Novel Business Intelligence Design Approach to Support Multilingualism

    With emerging markets and expanding international cooperation, there is a requirement to support Business Intelligence (BI) applications in multiple languages, a process which we refer to as Multilingualism (ML). ML in BI is understood in this research as the ability to store descriptive content (such as descriptions of attributes in BI reports) in more than one language at Data Warehousing (DWH) level and to use this information at presentation level to provide reports, queries or dashboards in more than one language. Design strategies for data warehouses are typically based on the assumption of a single language environment. The motivations for this research are the design and performance challenges encountered when implementing ML in a BI data warehouse environment. These include design issues, slow response times, delays in updating reports and changing languages between reports, the complexity of amending existing reports and the performance overhead. The literature review identified that the underlying cause of these problems is that existing approaches used to enable ML in BI are primarily ad-hoc workarounds which introduce dependency between elements and lead to excessive redundancy. From the literature review, it was concluded that a satisfactory solution to the challenge of ML in BI requires a design approach based on data independence, the concept of immunity from changes, and that such a solution does not currently exist. This thesis presents MLED_BI (Multilingual Enabled Design for Business Intelligence). MLED_BI is a novel design approach which supports data independence and immunity from changes in the design of ML data warehouses and BI systems. MLED_BI extends existing data warehouse design approaches by revising the role of the star schema and introducing an ML design layer to support the separation of language elements. This also facilitates ML at presentation level by enabling the use of an ML content management system. Compared to existing workarounds for ML, the MLED_BI design approach has a theoretical underpinning which allows languages to be added, amended and deleted without requiring a redesign of the star schema; provides support for the manipulation of ML content; improves performance; and streamlines data warehouse operations such as ETL (Extract, Transform, Load). Minor contributions include the development of a novel BI framework to address the limitations of existing BI frameworks and the development of a tool to evaluate changes to BI reporting solutions. The MLED_BI design approach was developed based on the literature review, and a mixed methods approach was used for validation. Technical elements were validated experimentally using performance metrics, while end user acceptance was validated qualitatively with end users and technical users from a number of countries, reflecting the ML basis of the research. MLED_BI requires more resources at the design and initial implementation stage than existing ML workarounds, but this is outweighed by improved performance and by the much greater flexibility in ML made possible by the data independence approach of MLED_BI. The MLED_BI design approach enhances existing BI design approaches for use in ML environments.
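One way to picture "separating language elements" so that a language can be added without redesigning the star schema is to keep descriptive text in its own table keyed by dimension key and language code. The sketch below uses Python's standard `sqlite3` module. It is not the MLED_BI design itself, whose details the abstract does not give; the table and column names (`dim_product`, `product_text`, `fact_sales`) and the function `sales_report` are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Language-neutral star schema: no descriptive text in the dimension.
CREATE TABLE dim_product (
    product_key INTEGER PRIMARY KEY,
    sku         TEXT NOT NULL
);
CREATE TABLE fact_sales (
    product_key INTEGER NOT NULL REFERENCES dim_product(product_key),
    amount      REAL NOT NULL
);
-- Assumed language layer: one row per (dimension member, language).
CREATE TABLE product_text (
    product_key INTEGER NOT NULL REFERENCES dim_product(product_key),
    lang        TEXT NOT NULL,
    description TEXT NOT NULL,
    PRIMARY KEY (product_key, lang)
);
""")

conn.execute("INSERT INTO dim_product VALUES (1, 'SKU-1')")
conn.executemany("INSERT INTO fact_sales VALUES (?, ?)", [(1, 10.0), (1, 5.0)])
conn.execute("INSERT INTO product_text VALUES (1, 'en', 'Green tea')")


def sales_report(lang: str):
    """Total sales per product, labelled in the requested language."""
    cursor = conn.execute(
        """
        SELECT t.description, SUM(f.amount)
        FROM fact_sales AS f
        JOIN product_text AS t ON t.product_key = f.product_key
        WHERE t.lang = ?
        GROUP BY t.description
        ORDER BY t.description
        """,
        (lang,),
    )
    return cursor.fetchall()


# Adding a language is a data change only: no ALTER TABLE on the star schema.
conn.execute("INSERT INTO product_text VALUES (1, 'pt', 'Chá verde')")

report_en = sales_report("en")
report_pt = sales_report("pt")
```

The contrast is with a workaround such as one description column per language in the dimension table, where each new language would require a schema change and amended reports.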

    Cost-benefit analysis of data warehouse design methodologies

    Methodologies for data warehouse design have multiplied in recent years, and each of them proposes a different point of view. Among the methodologies present in the literature, the most promising are the hybrid ones, because they represent the only way to ensure that a multidimensional schema is both consistent with the data sources and aligned with user business goals, and those able to support the designer by providing some kind of automation. However, the results obtainable with these methodologies can differ substantially in terms of schema quality and required effort. In this paper, we provide metrics for evaluating the quality of multidimensional schemata in relation to the effort spent in the design process and the degree of automation of the methodology. As a case study, we apply our evaluation to the major emerging hybrid methodologies for data warehouse schema design.