
    Conceptual Design Model of Operational Data Store for Business Intelligence Applications

    The development of business intelligence (BI) applications, which involves data sources, a Data Warehouse (DW), Data Marts (DM) and an Operational Data Store (ODS), poses a major challenge to BI developers. This is mainly due to the lack of established models, guidelines and techniques in the development process, as compared to system development in the discipline of software engineering. Furthermore, present BI applications emphasize the development of strategic information rather than operational and tactical information. Therefore, the main aim of this study is to propose a conceptual design model for BI applications using an ODS (CoDMODS). Through expert validation, the proposed conceptual design model, developed by means of a design science research approach, was found to satisfy nine quality model dimensions: it is easy to understand, covers clear steps, is relevant and timeless, and demonstrates flexibility, scalability, accuracy, completeness and consistency. Additionally, the two prototypes developed based on CoDMODS, for a water supply service (iUBIS) and for telecommunication maintenance (iPMS), recorded a high usability mean value of 5.912 on the Computer System Usability Questionnaire (CSUQ) instrument. The outcomes of this study, particularly the proposed model, contribute to the analysis and design methods for developing operational and tactical information in BI applications. The model can serve as a guideline for BI developers. Furthermore, the prototypes developed in the case studies can assist organizations in using quality information for business operations.

    A Goal and Ontology Based Approach for Generating ETL Process Specifications

    Data warehouse (DW) systems development involves several tasks, such as defining requirements, designing DW schemas, and specifying data transformation operations. Indeed, the success of DW systems is very much dependent on the proper design of the extracting, transforming, and loading (ETL) processes. However, common design-related problems in the ETL processes, such as defining user requirements and data transformation specifications, are far from being resolved. These problems are due to data heterogeneity in data sources, ambiguity of user requirements, and the complexity of data transformation activities. Current approaches are limited in reconciling DW requirement semantics when designing the ETL processes, which prolongs the generation of ETL process specifications. The semantic framework of DW systems established in this study is used to develop a requirement analysis method for designing the ETL processes (RAMEPs) from the different perspectives of organization, decision-maker, and developer, using goal and ontology approaches. The correctness of the RAMEPs approach was validated by using modified and newly developed compliant tools. RAMEPs was evaluated in three real case studies: a Student Affairs System, a Gas Utility System, and a Graduate Entrepreneur System. These case studies illustrate how the RAMEPs approach can be implemented for designing and generating the ETL process specifications. Moreover, the RAMEPs approach was reviewed by DW experts to assess the strengths and weaknesses of the method, and the new approach was accepted. The RAMEPs method demonstrates that ETL process specifications can be derived from the early phases of DW systems development by using the goal-ontology approach.
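
    The abstract treats an ETL process as extract, transform, and load operations whose specifications are generated from requirements. As a rough illustration only, and not the RAMEPs notation itself, the following Python sketch shows one way such a specification could be expressed declaratively and executed; all names (EtlStep, run_etl, the student source) are hypothetical.

```python
# Hypothetical sketch of a declarative ETL process specification and a tiny
# interpreter. Names are illustrative, not taken from RAMEPs.
from dataclasses import dataclass
from typing import Callable, Iterable

@dataclass
class EtlStep:
    name: str
    extract: Callable[[], Iterable[dict]]  # pull rows from a data source
    transform: Callable[[dict], dict]      # reconcile/clean a single row
    load: Callable[[dict], None]           # write into the target DW schema

def run_etl(steps: list[EtlStep]) -> None:
    """Execute each specified step: extract, transform, then load."""
    for step in steps:
        for row in step.extract():
            step.load(step.transform(row))

# Example: derive a 'student' dimension row from a raw registrar source.
students = [{"matric": "A1", "name": " alice "}]
warehouse: list[dict] = []
run_etl([EtlStep(
    name="load_student_dim",
    extract=lambda: iter(students),
    transform=lambda r: {"matric": r["matric"], "name": r["name"].strip().title()},
    load=warehouse.append,
)])
print(warehouse)  # [{'matric': 'A1', 'name': 'Alice'}]
```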

    Conceptual design model using operational data store (CoDMODS) for developing business intelligence applications

    Building a Business Intelligence (BI) application is very challenging, as BI is a young discipline that does not yet offer well-established strategies and techniques for the development process when compared to the software engineering discipline. Furthermore, information requirements analysis for BI applications, which integrate data from heterogeneous sources, differs significantly from requirements analysis for a conventional information system. A Conceptual Design Model using Operational Data Store (CoDMODS) for building BI applications that focus on operational information to support business operations is proposed. In this model, a combination of community interaction and a data integration approach is used to identify the requirements for developing the BI application. Furthermore, how the operational data store can be used for operational and tactical information, and how its data can be transferred to a data warehouse to support analytical information and decision making, is also presented. Finally, to verify and validate the proposed model, a case study approach using web application development in selected subject areas is elaborated.
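
    The ODS-to-DW transfer described above can be pictured with a small, purely illustrative sketch (not taken from CoDMODS): detailed operational rows in the ODS serve day-to-day queries, and a periodic job aggregates them into warehouse facts for analysis. The table and field names are invented.

```python
# Illustrative sketch: an operational data store holding current, detailed
# records, periodically aggregated into a data warehouse fact table.
from collections import defaultdict

ods_payments = [  # detailed operational rows, refreshed near real time
    {"area": "North", "month": "2024-01", "amount": 120.0},
    {"area": "North", "month": "2024-01", "amount": 80.0},
    {"area": "South", "month": "2024-01", "amount": 55.0},
]

def transfer_to_dw(ods_rows):
    """Aggregate operational rows into (area, month) facts for the DW."""
    facts = defaultdict(float)
    for row in ods_rows:
        facts[(row["area"], row["month"])] += row["amount"]
    return [{"area": a, "month": m, "total_amount": t}
            for (a, m), t in facts.items()]

print(transfer_to_dw(ods_payments))
# [{'area': 'North', 'month': '2024-01', 'total_amount': 200.0},
#  {'area': 'South', 'month': '2024-01', 'total_amount': 55.0}]
```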

    Temporal and Contextual Dependencies in Relational Data Modeling

    Although a solid theoretical foundation of relational data modeling has existed for decades, a critical reassessment from the perspective of temporal requirements reveals shortcomings in its integrity constraints. We identify the need for this work by discussing how existing relational databases fail to ensure correctness of data when the data to be stored is time sensitive. The analysis presented in this work becomes particularly important in present times where, because of relational databases' inadequacy in catering to all requirements, new forms of database systems such as temporal databases, active databases, real-time databases, and NoSQL (non-relational) databases have been introduced. In relational databases, temporal requirements have been dealt with either at the application level using scripts or through manual assistance, but no attempts have been made to address them at the design level. These requirements are the ones that need metadata to change as time progresses, which remains unsupported by Relational Database Management Systems (RDBMS) to date. Starting with the shortcomings of data, entity, and referential integrity in relational data modeling, we propose a new form of integrity that works at a more detailed level of granularity. We also present several important concepts, including temporal dependency, contextual dependency, and cell-level integrity. We then introduce cellular constraints to implement the proposed integrity and dependencies, and show how they can be incorporated into the relational data model to enable RDBMSs to handle temporal requirements in the future. Overall, we provide a formal description to address the temporal requirements problem in the relational data model, and design a framework for solving this problem. We have supplemented our proposition with examples, experiments and results.
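
    To make the idea of a cell-level, time-dependent constraint concrete, here is a minimal sketch assuming a hypothetical 'discount' cell whose valid value range changes as time progresses. This is our illustration of the general idea, not the thesis's cellular-constraint notation.

```python
from datetime import date

# Hypothetical cell-level temporal constraint: the 'discount' cell may hold a
# non-zero value only while a promotion window is open; afterwards the valid
# range itself changes (the cell is constrained to 0). Standard declarative
# RDBMS constraints cannot express this without triggers or scripts.
PROMO_END = date(2024, 6, 30)

def check_discount_cell(value: float, today: date) -> bool:
    """Cell-level integrity: the valid range depends on the current date."""
    if today <= PROMO_END:
        return 0.0 <= value <= 0.3   # promotion window: up to 30 percent
    return value == 0.0              # window closed: cell constrained to 0

assert check_discount_cell(0.2, date(2024, 5, 1))       # valid in window
assert not check_discount_cell(0.2, date(2024, 7, 1))   # invalid after window
```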

    Community and data integration approach using requirement centric operational data store model (ReCODS-Model) for business intelligence applications

    Building a Business Intelligence (BI) application is very challenging, as BI is a young discipline that does not yet offer well-established strategies and techniques for the development process when compared to the software engineering discipline. Furthermore, information requirements analysis for BI applications, which integrate data from heterogeneous sources, differs significantly from requirements analysis for a conventional information system. A Requirement Centric Operational Data Store Model (ReCODS-M) for building BI applications that focus on operational information to support business operations is proposed. In this model, a combination of community interaction and a data integration approach is used to identify the requirements for developing the BI application. Furthermore, how the operational data store can be used for operational and tactical information, and how its data can be transferred to a data warehouse to support analytical information and decision making, is also presented. Finally, to verify and validate the proposed model, a case study approach using web application development in selected subject areas is elaborated.

    Formal design of data warehouse and OLAP systems: a dissertation presented in partial fulfilment of the requirements for the degree of Doctor of Philosophy in Information Systems at Massey University, Palmerston North, New Zealand

    A data warehouse is a single data store where data from multiple data sources is integrated for online analytical processing (OLAP) of an entire organisation. The rationale for being single and integrated is to ensure a consistent view of organisational business performance, independent of the different angles of business perspectives. Due to its wide coverage of subjects, data warehouse design is a highly complex, lengthy and error-prone process. Furthermore, the business analytical tasks change over time, which results in changes in the requirements for the OLAP systems. Thus, data warehouse and OLAP systems are rather dynamic, and the design process is continuous. In this thesis, we propose a method that is integrated, formal and application-tailored, to overcome the complexity problem, deal with the system dynamics, and improve the quality of the system and its chance of success. Our method comprises three important parts: the general ASM method with types, the application-tailored design framework for data warehouse and OLAP, and the schema integration method with a set of provably correct refinement rules. By using the ASM method, we are able to model both data and operations in a uniform conceptual framework, which enables us to design an integrated approach for data warehouse and OLAP design. The freedom given by the ASM method allows us to model the system at an abstract level that is easy to understand for both users and designers. More specifically, the language allows us to use terms from the user domain, not biased by the terms used in computer systems. The pseudo-code-like transition rules, which give the simplest form of operational semantics in ASMs, are close enough to programming languages for designers to understand. Furthermore, these rules are rooted in mathematics, which helps improve the quality of the system design. By extending the ASMs with types, the modelling language is tailored for data warehousing with terms that are well developed for data-intensive applications, which makes it easy to model schema evolution as refinements in dynamic data warehouse design. By providing the application-tailored design framework, we break down the design complexity by business processes (also called subjects in data warehousing) and by design concerns. By designing the data warehouse by subjects, our method resembles Kimball's "bottom-up" approach; however, with the schema integration method, it resolves the stovepipe issue of that approach. By building up a data warehouse iteratively in an integrated framework, our method not only results in an integrated data warehouse, but also resolves the issues of complexity and delayed ROI (Return On Investment) in Inmon's "top-down" approach. By dealing with user change requests in the same way as new subjects, and by modelling data and operations explicitly in a three-tier architecture, namely the data sources, the data warehouse and the OLAP (Online Analytical Processing) layer, our method facilitates dynamic design with system integrity. By introducing a notion of refinement specific to schema evolution, namely schema refinement, for capturing the notion of schema dominance in schema integration, we are able to build a set of correctness-proven refinement rules. By providing this set of refinement rules, we simplify the designers' work in verifying the correctness of a design. Nevertheless, we do not aim for a complete set, since there are many different ways to perform schema integration, nor do we prescribe a single way of integration, so as to allow designer-favoured designs. Furthermore, given its flexibility, our method can easily be extended to newly emerging design issues.
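
    The pseudo-code-like transition rules mentioned above can be pictured with a tiny sketch, assuming a made-up loading state machine. It mimics the ASM style of guarded updates that are evaluated against the old state and applied in parallel; it is not the thesis's typed ASM language.

```python
# Illustrative ASM-style transition rules (names are ours, not the thesis's):
# all guards are evaluated on the old state, and the resulting update set is
# applied at once, giving one machine step.
state = {"mode": "loading", "rows_pending": 3}

def step(s: dict) -> dict:
    """One ASM step: fire every rule whose guard holds on the old state."""
    updates = {}
    if s["mode"] == "loading" and s["rows_pending"] > 0:   # rule 1 guard
        updates["rows_pending"] = s["rows_pending"] - 1    # rule 1 update
    if s["mode"] == "loading" and s["rows_pending"] == 0:  # rule 2 guard
        updates["mode"] = "ready"                          # rule 2 update
    return {**s, **updates}

while state["mode"] != "ready":
    state = step(state)
print(state)  # {'mode': 'ready', 'rows_pending': 0}
```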

    Business Intelligence on Non-Conventional Data

    The revolution in digital communications witnessed over the last decade has had a significant impact on the world of Business Intelligence (BI). In the big data era, the amount and diversity of data that can be collected and analyzed for the decision-making process transcend the restricted and structured set of internal data that BI systems are conventionally limited to. This thesis investigates the unique challenges imposed by three specific categories of non-conventional data: social data, linked data and schemaless data. Social data comprises the user-generated contents published through websites and social media, which can provide a fresh and timely perception of people's tastes and opinions. In Social BI (SBI), the analysis focuses on topics, meant as specific concepts of interest within the subject area. In this context, this thesis proposes the meta-star, an alternative strategy to the traditional star schema for modeling hierarchies of topics to enable OLAP analyses. The thesis also presents an architectural framework of a real SBI project and a cross-disciplinary benchmark for SBI. Linked data employs the Resource Description Framework (RDF) to provide a public network of interlinked, structured, cross-domain knowledge. In this context, this thesis proposes an interactive and collaborative approach to building aggregation hierarchies from linked data. Schemaless data refers to the storage of data in NoSQL databases that do not force a predefined schema, but let database instances embed their own local schemata. In this context, this thesis proposes an approach to determine the schema profile of a document-based database; the goal is to facilitate users in a schema-on-read analysis process by understanding the rules that drove the usage of the different schemata. A final and complementary contribution of this thesis is an innovative technique in the field of recommendation systems to overcome user disorientation in the analysis of a large and heterogeneous wealth of data.
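
    The schema-profile idea for schemaless stores can be illustrated with a small sketch, assuming a toy collection of JSON-like documents: each document's local schema is reduced to a (field, type) signature, and the variants are counted. This is our simplification, not the thesis's profiling algorithm.

```python
# Hedged sketch of schema profiling for a schemaless, document-based store:
# each document embeds its own local schema; we extract one signature per
# document and count how often each schema variant occurs.
from collections import Counter

docs = [
    {"user": "u1", "rating": 5},
    {"user": "u2", "rating": 4, "comment": "nice"},
    {"user": "u3", "rating": 3},
]

def schema_signature(doc: dict) -> tuple:
    """A document's local schema: its sorted (field, type) pairs."""
    return tuple(sorted((k, type(v).__name__) for k, v in doc.items()))

profile = Counter(schema_signature(d) for d in docs)
for schema, count in profile.items():
    print(count, schema)
# 2 (('rating', 'int'), ('user', 'str'))
# 1 (('comment', 'str'), ('rating', 'int'), ('user', 'str'))
```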

    On Pattern Mining in Graph Data to Support Decision-Making

    In recent years, graph data models have become increasingly important in both research and industry. Their core is a generic data structure of things (vertices) and connections among those things (edges). Rich graph models such as the property graph model promise extraordinary analytical power, because relationships can be evaluated without knowledge of a domain-specific database schema. This dissertation studies the use of graph models for the data integration and data mining of business data. Although a typical company's business data implicitly describes a graph, it is usually stored in multiple relational databases. We therefore propose the first semi-automated approach to transform data from multiple relational databases into a single graph whose vertices represent domain objects and whose edges represent their mutual relationships. This transformation is the basis of our conceptual framework BIIIG (Business Intelligence with Integrated Instance Graphs). We further propose a graph-based approach to data integration, executed after the transformation. In established data mining approaches, interrelated input data is mostly represented by tuples of measure values and dimension values. In the context of graphs, these values must be attached to the graph structure, and aggregated measure values are graph attributes. Since the latter was not supported by any existing model, we propose the use of collections of property graphs, which act as the data structure of the novel Extended Property Graph Model (EPGM). The model supports vertices and edges that may appear in different graphs, as well as graph properties. We further propose operators that benefit from this data structure, for example, graph-based aggregation of measure values. A primitive operation of graph pattern mining is frequent subgraph mining (FSM); however, existing algorithms provided no support for directed multigraphs, so we extend the popular gSpan algorithm to overcome this limitation. Some patterns might not be frequent while their generalizations are; such generalized graph patterns can be mined by attaching vertices to taxonomies. We propose Generalized Multidimensional Frequent Subgraph Mining (GM-FSM), in particular the first solution to generalized FSM that supports not only directed multigraphs but also multiple dimensional taxonomies. In scenarios that compare patterns of different categories, e.g., fraudulent or not, FSM is not sufficient, since pattern frequencies may differ by category; moreover, determining all pattern frequencies without frequency pruning is not an option, due to the computational complexity of FSM. Thus, we develop Characteristic Subgraph Mining (CSM), an FSM extension that extracts patterns characteristic of a specific category according to a user-defined interestingness function. Parts of this work were done in the context of GRADOOP, a framework for distributed graph analytics. To make the primitive operation of frequent subgraph mining available to this framework, we developed Distributed In-Memory gSpan (DIMSpan), a frequent subgraph miner tailored to the characteristics of shared-nothing clusters and distributed dataflow systems. Finally, we present the results of use-case evaluations conducted in cooperation with a large-scale enterprise, including a report of the practical experiences gained in implementing and applying the proposed algorithms.
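
    As a rough intuition for property graphs and pattern frequencies, the sketch below encodes a tiny labelled, directed multigraph in plain Python and counts one-edge patterns (source label, edge label, target label). Real FSM, such as the gSpan variants above, mines arbitrary subgraphs with isomorphism handling; the data and names here are invented and unrelated to GRADOOP's APIs.

```python
# Minimal property-graph sketch with a naive frequency count of one-edge
# patterns. This only hints at frequent subgraph mining: real FSM handles
# arbitrary subgraphs, not just single edges.
from collections import Counter

vertices = {  # vertex id -> (label, properties)
    1: ("Customer", {"name": "Acme"}),
    2: ("Order",    {"total": 99.0}),
    3: ("Order",    {"total": 10.0}),
}
edges = [  # directed multigraph: (source_id, edge_label, target_id)
    (1, "placed", 2),
    (1, "placed", 3),
]

pattern_counts = Counter(
    (vertices[s][0], label, vertices[t][0]) for s, label, t in edges
)
print(pattern_counts)  # Counter({('Customer', 'placed', 'Order'): 2})
```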