Definition and validation of measures for ETL processes in data warehouses

Abstract

In data warehousing, ETL (Extract, Transform, and Load) processes are in charge of extracting from the data sources the data that will be stored in the data warehouse. Because of their relevance, the quality of these processes should be formally assessed from the early stages of development, in order to avoid bad decisions caused by incorrect data. In this paper, a set of measures is presented to evaluate the structural complexity of ETL process models at the conceptual level. The study is accompanied by a controlled experiment whose aim is the empirical validation of the proposed measures. These measures can help designers predict the effort associated with the maintenance tasks of ETL processes. The proposal is based on UML (Unified Modeling Language) activity diagrams for modeling ETL processes, and on the FMESP (Framework for the Modeling and Evaluation of Software Processes) framework for the validation of the measures.
