157 research outputs found

    Using Ontologies for the Design of Data Warehouses

    Implementing a data warehouse is a complex task that forces designers to acquire wide knowledge of the domain; it therefore requires a high level of expertise and is prone to failure. Based on our experience, we have identified a set of situations encountered in real-world projects in which we believe that the use of ontologies will improve several aspects of data warehouse design. The aim of this article is to describe several shortcomings of current data warehouse design approaches and to discuss the benefit of using ontologies to overcome them. This work is a starting point for discussing the suitability of ontologies in data warehouse design. Comment: 15 pages, 2 figures

    An online analytical processing multi-dimensional data warehouse for malaria data

    Malaria is a vector-borne disease that contributes substantially to the global burden of morbidity and mortality. The management of malaria-related data from heterogeneous, autonomous, and distributed data sources poses unique challenges and requirements. Although online data storage systems exist that address specific malaria-related issues, a globally integrated online resource to address different aspects of the disease does not exist. In this article, we describe the design, implementation, and applications of a multidimensional, online analytical processing data warehouse, named the VecNet Data Warehouse (VecNet-DW). It is the first online, globally-integrated platform that provides efficient search, retrieval and visualization of historical, predictive, and static malaria-related data, organized in data marts. Historical and static data are modelled using star schemas, while predictive data are modelled using a snowflake schema. The major goals, characteristics, and components of the DW are described along with its data taxonomy and ontology, the external data storage systems and the logical modelling and physical design phases. Results are presented as screenshots of a Dimensional Data browser, a Lookup Tables browser, and a Results Viewer interface. The power of the DW emerges from integrated querying of the different data marts and structuring those queries to the desired dimensions, enabling users to search, view, analyse, and store large volumes of aggregated data, and responding better to the increasing demands of users
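As a rough illustration of the star-schema modelling and dimension-structured querying mentioned in the abstract above, the sketch below builds a toy star schema in an in-memory SQLite database and rolls a fact table up to two dimensions. The table names, column names, and values are hypothetical placeholders and are not taken from VecNet-DW.

```python
import sqlite3

# Toy star schema: one fact table referencing two dimension tables.
# All names and values here are illustrative assumptions, not VecNet-DW's schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_location (location_id INTEGER PRIMARY KEY, country TEXT, region TEXT);
CREATE TABLE dim_time (time_id INTEGER PRIMARY KEY, year INTEGER);
CREATE TABLE fact_cases (
    location_id INTEGER REFERENCES dim_location(location_id),
    time_id INTEGER REFERENCES dim_time(time_id),
    cases INTEGER
);
INSERT INTO dim_location VALUES (1, 'Country A', 'North'), (2, 'Country A', 'South'), (3, 'Country B', 'East');
INSERT INTO dim_time VALUES (1, 2010), (2, 2011);
INSERT INTO fact_cases VALUES (1, 1, 10), (2, 1, 5), (3, 1, 7), (1, 2, 4);
""")

def cases_by_country_year(connection):
    """Roll the fact table up from region level to (country, year)."""
    return connection.execute("""
        SELECT l.country, t.year, SUM(f.cases)
        FROM fact_cases AS f
        JOIN dim_location AS l ON l.location_id = f.location_id
        JOIN dim_time AS t ON t.time_id = f.time_id
        GROUP BY l.country, t.year
        ORDER BY l.country, t.year
    """).fetchall()

result = cases_by_country_year(conn)
```

A snowflake schema would differ only in that a dimension such as `dim_location` is itself normalized into further tables (for example, a separate country table), adding one more join to the same query.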

    Integration of heterogeneous multidimensional data marts

    Data analysts often require access to integrated multidimensional data from local and external data warehouses. The integration process is often undertaken by expert database practitioners who need to analyze the structure of the data, and match schemas and data, before creating an integrated view of the data for visualization and analysis. Such a manual process may be acceptable for databases used in transaction processing applications but does not help decision makers who need access to the information quickly and cost-effectively in a constantly changing environment. This thesis addresses several challenges towards automating the integration of data warehouses based on a dimensional model known as the Star schema. We recognize that the structure of multidimensional data, namely dimension hierarchies, is critical to the accuracy of the integration but is not always available or accessible. To address this problem, we infer dimension hierarchies from their instances, and demonstrate that they are sufficient to ensure the accuracy of the integration even though they may vary from the intended hierarchies. To improve the accuracy of matching Star schemas, we propose a more precise representation of Star schemas and demonstrate its effectiveness by comparing it against existing approaches that treat Star schemas as relational models. To match instances of dimensions, we demonstrate that a graph matching algorithm is effective and performs with a high level of accuracy. We propose algorithms that enforce the tree structure of integrated data, which is necessary for correct aggregation, and reduce false positives occurring during instance matching. The effectiveness of our algorithms is shown through experiments with real-life data. Even when schemas and hierarchies match perfectly, there are often dimensions with mismatching data, which restrict the scope of the integration. We propose to relax the requirement for dimension compatibility, and introduce measures that quantify the loss of data resulting from the less strict requirement. These measures enable data analysts to identify lossless fragments of data and thereby extend the scope of the integrated data. To provide a more comprehensive view of data for analysis, we link the integrated data with the data exclusive to each source by extending the navigation operation for multidimensional data. These contributions help shift the integration problem away from expert database practitioners and towards empowered data analysts, who can combine multidimensional data from multiple sources in real time and in a cost-effective manner
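One simple way to picture inferring dimension hierarchies from instances is a functional-dependency check: level X can roll up to level Y if every value of X co-occurs with exactly one value of Y. The sketch below is only an illustration under that assumption, not the thesis's algorithm; the function names and sample rows are hypothetical.

```python
from itertools import permutations

def rolls_up(rows, child, parent):
    """Return True if every child value maps to exactly one parent value."""
    seen = {}
    for row in rows:
        c, p = row[child], row[parent]
        # setdefault records the first parent seen for c; a later, different
        # parent means child -> parent is not a function.
        if seen.setdefault(c, p) != p:
            return False
    return True

def infer_rollups(rows, levels):
    """List all (child, parent) level pairs supported by the instances."""
    return [(c, p) for c, p in permutations(levels, 2) if rolls_up(rows, c, p)]

# Hypothetical instances of a geography dimension.
rows = [
    {"city": "Lyon", "region": "ARA", "country": "France"},
    {"city": "Grenoble", "region": "ARA", "country": "France"},
    {"city": "Paris", "region": "IDF", "country": "France"},
    {"city": "Porto", "region": "Norte", "country": "Portugal"},
]

edges = infer_rollups(rows, ["city", "region", "country"])
```

Because the check sees only the available instances, it can report roll-ups that hold by coincidence in a small sample, which is one way an inferred hierarchy may differ from the intended one.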

    Presenting Business Insights on Advanced Pricing Agreements Using a Business Intelligence Framework

    Project Work presented as the partial requirement for obtaining a Master's degree in Information Management, specialization in Knowledge Management and Business Intelligence. In companies that use advanced pricing agreements, pricing managers are responsible for periodically setting new and adjusted discounts. These companies are usually large, so the number of products and customers is extensive, which makes decision-making challenging for the pricing managers. To aid in this process, this project report uses a business intelligence framework to model the data into a dimensional model that provides pricing managers with business insights by giving them a more targeted and detailed view of the data through multiple contextual perspectives. The data sources were provided by a client of BI4ALL and consist of two different JDE extracts: an export of the advanced pricing agreements, which includes all the pricing rules, and an export of the sales data following those pricing rules. Both sources of data are used to implement the business intelligence framework. The final outcome of this project is presented in a dashboard with multiple visualizations, where the pricing manager can navigate and obtain data dynamically according to the information requested. This allows for better analysis and thus for better pricing adjustment and optimization

    OLAPing Reflexive Multidimensional Fact

    Multidimensional data models and implementations of social networks come with a set of specific constraints, such as missing data and reflexive relationships on fact instances. However, conventional OLAP operators and existing models do not provide solutions for handling these specificities. Further effort is therefore needed to extend these operators so that they take into account the specificities of the multidimensional modeling of tweets and their manipulation. To address this issue, we propose in this paper two new OLAP operators that enhance existing solutions for OLAP analyses involving a reflexive relationship on the fact instances. For each OLAP operator, we give a user-oriented definition, an algebraic formalization, and an implementation algorithm

    The Digital Persona and Trust Bank: A Privacy Management Framework

    Recently, the government of India embarked on an ambitious project of designing and deploying the Integrated National Agricultural Resources Information System (INARIS) data warehouse for the agricultural sector. The system’s purpose is to support macro-level planning. This paper presents some of the challenges faced in designing the data warehouse, specifically its dimensional and deployment challenges. We also present some early user evaluations of the warehouse. Governmental data warehouse implementations are rare, especially at the national level. Furthermore, their motivations differ significantly from those of the private sector. Designing the INARIS agricultural data warehouse posed unique and significant challenges because, traditionally, the collection and dissemination of information are localized

    A Data Centric Privacy Preserved Mining Model for Business Intelligence Applications

    In today's competitive environment, techniques such as data warehousing and on-line analytical processing (OLAP) have become a significant approach to decision support in data-centric applications and industries. In fact, decision support places somewhat different demands on database technology than OLAP-based applications do. Data-centric decision support schemes (DSS) such as data warehouses can play a significant role in extracting details from various areas and in standardizing data throughout the organization to achieve a single way of presenting details. Such efficient data presentation provides information for decision making in business intelligence (BI) applications across various industrial services. To enhance the effectiveness and robustness of computation in BI applications, optimization of data mining and its processing is a must. On the other hand, in a multiuser scenario, the security of data in the warehouse is also a critical issue, one that has not been explored to date. In this paper, a data-centric and service-oriented privacy-preserving model for BI applications is proposed. The optimization of data mining is accomplished by means of the C5.0 classification algorithm, and a comparative study is carried out against the C4.5 algorithm. Implementing the enhanced C5.0 algorithm with the BI model would provide much better performance, with a privacy preservation facility, for Business Intelligence applications

    Flexible Integration and Efficient Analysis of Multidimensional Datasets from the Web

    If numeric data from the Web are brought together, natural scientists can compare climate measurements with estimations, financial analysts can evaluate companies based on balance sheets and daily stock market values, and citizens can explore the GDP per capita from several data sources. However, heterogeneities and size of data remain a problem. This work presents methods to query a uniform view - the Global Cube - of available datasets from the Web and builds on Linked Data query approaches

    A client focused business intelligence & analytics solution for the hospitality sector

    Project Work presented as the partial requirement for obtaining a Master's degree in Information Management, specialization in Knowledge Management and Business Intelligence. One of the greatest needs of today's businesses is to know their customer, or the type of customer they want to reach, which makes a customer database a strategic weapon and one of the most important investments a company can make. The business world is becoming more competitive every day: we are constantly overwhelmed with advertisements for products we may like, promotions on products we usually buy, or discounts on the next purchase if we subscribe to the company’s newsletter. All of this creates customization for the client, and any company that is not able to do this cannot keep up with its competition. This report details the project developed at Pestana Hotel Group, which consisted of a Business Intelligence solution: more specifically, the development of a customer database with the creation of two tabular models using SQL Server tools, one specific to loyal customers and another, more general, with information about all Pestana customers, and two Power BI reports that allow the information obtained to be visualized in an effective and simplified way. This report contains a literature review that situates the reader on the subject addressed in this project, a chapter dedicated to the data modeling used to create the tabular models, and another on the creation of the reports

    Business intelligence to support NOVA IMS academic services BI system

    Project Work presented as the partial requirement for obtaining a Master's degree in Information Management, specialization in Knowledge Management and Business Intelligence. Kimball argues that Business Intelligence is one of the most important assets of any organization, allowing it to store, explore and add value to the organization’s data, which will ultimately help in the decision-making process. Nowadays, some organizations and, in this specific case, some schools are not yet using their data to its full potential. Business intelligence is one of the best-known tools to help schools with this issue, since some of them still use outdated information systems and do not yet apply business intelligence techniques to their increasing amounts of data so as to turn it into useful information and knowledge. In the present report, I intend to analyse the current NOVA IMS academic services data and the rationale behind the need to work with this data, so as to propose a solution that will ultimately help the school board or the academic services to make better-supported decisions. To do so, a Data Warehouse was developed that cleans and transforms the source database. Another important step in helping the academic services is to present a series of reports that surface information for the decision-making process