635 research outputs found
Knowledge and Metadata Integration for Warehousing Complex Data
With the ever-growing availability of so-called complex data, especially on
the Web, decision-support systems such as data warehouses must store and
process data that are not only numerical or symbolic. Warehousing and analyzing
such data requires the joint exploitation of metadata and domain-related
knowledge, which must thereby be integrated. In this paper, we survey the types
of knowledge and metadata that are needed for managing complex data, discuss
the issue of knowledge and metadata integration, and propose a CWM-compliant
integration solution that we incorporate into an XML complex data warehousing
framework we previously designed. Comment: 6th International Conference on Information Systems Technology and
its Applications (ISTA 07), Kharkiv, Ukraine (2007)
A unified view of data-intensive flows in business intelligence systems : a survey
Data-intensive flows are central processes in today’s business intelligence (BI) systems, deploying different technologies to deliver data, from a multitude of data sources, in user-preferred and analysis-ready formats. To meet the complex requirements of next generation BI systems, we often need an effective combination of the traditionally batched extract-transform-load (ETL) processes that populate a data warehouse (DW) from integrated data sources, and more real-time and operational data flows that integrate source data at runtime. Both academia and industry thus must have a clear understanding of the foundations of data-intensive flows and the challenges of moving towards next generation BI environments. In this paper we present a survey of today’s research on data-intensive flows and the related fundamental fields of database theory. The study is based on a proposed set of dimensions describing the important challenges of data-intensive flows in the next generation BI setting. As a result of this survey, we envision an architecture of a system for managing the lifecycle of data-intensive flows. The results further provide a comprehensive understanding of data-intensive flows, recognizing challenges that are still to be addressed and how current solutions can be applied to address them. Peer Reviewed. Postprint (author's final draft)
Using Ontologies for the Design of Data Warehouses
Obtaining an implementation of a data warehouse is a complex task that forces
designers to acquire wide knowledge of the domain, thus requiring a high level
of expertise and making it a failure-prone task. Based on our experience, we
have identified a set of situations that we have faced in real-world projects
in which we believe that the use of ontologies will improve several aspects of
the design of data warehouses. The aim of this article is to describe several
shortcomings of current data warehouse design approaches and discuss the
benefit of using ontologies to overcome them. This work is a starting point for
discussing the convenience of using ontologies in data warehouse design. Comment: 15 pages, 2 figures
Quality measures for ETL processes: from goals to implementation
Extraction-transformation-loading (ETL) processes play an increasingly important role in supporting modern business operations. These business processes are centred around artifacts with high variability and diverse lifecycles, which correspond to key business entities. The apparent complexity of these activities has been examined through the prism of business process management, mainly focusing on functional requirements and performance optimization. However, the quality dimension has not yet been thoroughly investigated, and there is a need for a more human-centric approach to bring these processes closer to business users' requirements. In this paper, we take a first step in this direction by defining a sound model for ETL process quality characteristics and quantitative measures for each characteristic, based on existing literature. Our model shows dependencies among quality characteristics and can provide the basis for subsequent analysis using goal modeling techniques. We showcase the use of goal modeling for ETL process design through a use case, where we employ a goal model that includes quantitative components (i.e., indicators) for evaluation and analysis of alternative design decisions. Peer Reviewed. Postprint (author's final draft)
Using Semantic Web technologies in the development of data warehouses: A systematic mapping
The exploration and use of Semantic Web technologies have attracted considerable attention from researchers examining data warehouse (DW) development. However, the impact of this research and the maturity level of its results are still unclear. The objective of this study is to examine recently published research articles that take into account the use of Semantic Web technologies in the DW arena with the intention of summarizing their results, classifying their contributions to the field according to publication type, evaluating the maturity level of the results, and identifying future research challenges. Three main conclusions were derived from this study: (a) there is a major technological gap that inhibits the wide adoption of Semantic Web technologies in the business domain; (b) there is limited evidence that the results of the analyzed studies are applicable and transferable to industrial use; and (c) interest in researching the relationship between DWs and Semantic Web has decreased because new paradigms, such as linked open data, have attracted the interest of researchers. This study was supported by the Universidad de La Frontera, Chile, Grant Numbers: DI15-0020 and DI17-0043.
The potential of semantic paradigm in warehousing of big data
Big data have analytical potential that was hard to realize with previously available technologies. After new storage paradigms intended for big data, such as NoSQL databases, emerged, traditional systems were pushed out of focus. Current research focuses on reconciling the two at different levels or on replacing one paradigm with the other. Similarly, the emergence of NoSQL databases has started to push traditional (relational) data warehouses out of research and even practical focus. Data warehousing is known for its strict modelling process, which captures the essence of the business processes. For that reason, a mere integration to bridge the NoSQL gap is not enough. It is necessary to deal with this issue at a higher abstraction level during the modelling phase. NoSQL databases generally lack a clear, unambiguous schema, which makes their contents difficult to comprehend and their integration and analysis harder. This motivated the use of semantic web technologies to enrich NoSQL database contents with additional meaning and context. This paper reviews the application of semantics in data integration and data warehousing and analyses its potential for integrating NoSQL data and traditional data warehouses, with some focus on document stores. It also proposes future research directions for the big data warehouse modelling phases.
Incorporation of ontologies in data warehouse/business intelligence systems - A systematic literature review
Semantic Web (SW) techniques, such as ontologies, are used in Information Systems (IS) to cope with the growing need for sharing and reusing data and knowledge in various research areas. Despite the increasing emphasis on unstructured data analysis in IS, structured data and its analysis remain critical for organizational performance management. This systematic literature review aims at analyzing the incorporation and impact of ontologies in Data Warehouse/Business Intelligence (DW/BI) systems, contributing to the current literature by providing a classification of works based on the field of each case study, SW techniques used, and the authors’ motivations for using them, with a focus on DW/BI design, development and exploration tasks. A search strategy was developed, including the definition of keywords, inclusion and exclusion criteria, and the selection of search engines. Ontologies are mainly defined using the Web Ontology Language (OWL) standard to support multiple DW/BI tasks, such as Dimensional Modeling, Requirement Analysis, Extract-Transform-Load, and BI Application Design. Reviewed authors present a variety of motivations for ontology-driven solutions in DW/BI, such as eliminating or solving data heterogeneity/semantics problems, increasing interoperability, facilitating integration, or providing semantic content for requirements and data analysis. Further, implications for practice and a research agenda are indicated.
A Goal and Ontology Based Approach for Generating ETL Process Specifications
Data warehouse (DW) systems development involves several tasks such as defining requirements, designing DW schemas, and specifying data transformation operations. Indeed, the success of DW systems depends heavily on the proper design of the extracting, transforming, and loading (ETL) processes. However, common design-related problems in the ETL processes, such as defining user requirements and data transformation specifications, are far from being resolved. These problems are due to data heterogeneity in data sources, ambiguity of user requirements, and the complexity of data transformation activities. Current approaches have limitations in reconciling DW requirement semantics for the design of the ETL processes. As a result, generating the ETL process specifications takes longer. The semantic framework of DW systems established in this study is used to develop the requirement analysis method for designing the ETL processes (RAMEPs) from the different perspectives of organization, decision-maker, and developer by using goal and ontology approaches. The correctness of the RAMEPs approach was validated by using modified and newly developed compliant tools. RAMEPs was evaluated in three real case studies, i.e., Student Affairs System, Gas Utility System, and Graduate Entrepreneur System. These case studies were used to illustrate how the RAMEPs approach can be implemented for designing and generating the ETL process specifications. Moreover, the RAMEPs approach was reviewed by DW experts to assess the strengths and weaknesses of the method, and the new approach was accepted. The RAMEPs method shows that the ETL process specifications can be derived from the early phases of DW systems development by using the goal-ontology approach.