
    Computer Science's Digest Volume 3

    This series of textbooks was created for the students of the Systems Engineering Program at the University of Nariño. They have been intentionally written in English to promote reading in a foreign language. The textbooks are a collection of reflections and workshops on specific situations in the field of computer science, based on the authors’ experiences. Their main purpose is academic. The reflections and workshops follow a didactic structure to facilitate teaching and learning, making use of English as a second language. This book covers Professional Issues in Computing and Programming the Internet.

    Reusing dynamic data marts for query management in an on-demand ETL architecture

    Data analysts often need to integrate an in-house data warehouse with external datasets, especially web-based datasets. Doing so can give them important insights into their performance relative to competitors and to their industry on a global scale, and lets them make predictions about sales, providing important decision support services. The quality of these insights depends on the quality of the data imported into the analysis dataset. There is a wealth of data freely available from government sources online but little unity between data sources, leading to a requirement for a data processing layer wherein various types of quality issues and heterogeneities can be resolved. Traditionally, this is achieved with an Extract-Transform-Load (ETL) series of processes which are performed on all of the available data, in advance, in a batch process typically run outside of business hours. While this is recognized as a powerful form of knowledge-based support, it is very expensive to build and maintain, and costly to update when new data sources become available. On-demand ETL offers a solution in that data is only acquired when needed and new sources can be added as they come online. However, this form of dynamic ETL is very difficult to deliver. In this research dissertation, we explore the possibilities of creating dynamic data marts, built from non-warehouse data, to support the inclusion of new sources. We then examine how these dynamic structures can be used for query fulfillment and how they can support an overall on-demand query mechanism. At each step of the research and development, we employ a robust validation using a real-world data warehouse from the agricultural domain with selected Agri web sources to test the dynamic elements of the proposed architecture.