14 research outputs found
Computer-Aided Warehouse Engineering (CAWE): Leveraging MDA and ADM for the Development of Data Warehouses
During the last decade, data warehousing has reached a high maturity and is a well-accepted technology in decision support systems. Nevertheless, development and maintenance are still tedious tasks since the systems grow over time and complex architectures have been established. The paper at hand adapts the concepts of Model Driven Architecture (MDA) and Architecture Driven Modernization (ADM), taken from the software engineering discipline, to the data warehousing discipline. We show the works already available, outline further research directions and give hints for the implementation of Computer-Aided Warehouse Engineering systems.
Conceptual Design Model of an Operational Data Store for Business Intelligence Applications (Model Reka Bentuk Konseptual Operasian Storan Data Bagi Aplikasi Kepintaran Perniagaan)
The development of business intelligence (BI) applications, involving data sources, Data Warehouse (DW), Data Mart (DM) and Operational Data Store (ODS), poses a major challenge to BI developers. This is mainly due to the lack of established models, guidelines and techniques in the development process compared with system development in the discipline of software engineering. Furthermore, present BI applications emphasize the development of strategic information rather than operational and tactical information. Therefore, the main aim of this study is to propose a conceptual design model for BI applications using ODS (CoDMODS). Through expert validation, the proposed conceptual design model, which was developed by means of a design science research approach, was found to satisfy nine quality model dimensions: it is easy to understand, covers clear steps, is relevant and timeless, and demonstrates flexibility, scalability, accuracy, completeness and consistency. Additionally, the two prototypes that were developed based on CoDMODS for water supply service (iUBIS) and telecommunication maintenance (iPMS) recorded a high mean usability score of 5.912 on the Computer System Usability Questionnaire (CSUQ) instrument. The outcomes of this study, particularly the proposed model, contribute to the analysis and design method for the development of operational and tactical information in BI applications. The model can be used as a guideline by BI developers. Furthermore, the prototypes that were developed in the case studies can assist the organizations in using quality information for business operations.
Making the Case for a Business Intelligence Framework
This research is intended to develop evidence for whether or not large organizations should spend a large amount of time and resources on building Business Intelligence Frameworks by examining project managers' perceptions of complex information systems. Project managers in a large organization fill a cross-functional reporting role that requires them to delve into information technology systems in complex ways when querying for simple metrics related to the projects they manage. Using an online survey, this study found that project managers' perceptions changed more positively towards IT systems performing automatic queries, web-based queries, IT systems, and business intelligence system dashboards if they did not already have a business intelligence framework in place, and if they were less experienced. More experienced project managers had lower perceptions of current IT systems, automatic queries, web-based queries, and dashboards. There is evidence to suggest that business intelligence frameworks will be positively perceived by project managers with less experience, and where these systems have not already been introduced.
Leveraging query logs for user-centric OLAP
OLAP (On-Line Analytical Processing), the process of efficiently enabling common analytical operations on the multidimensional view of data, is a cornerstone of Business Intelligence. While OLAP is now a mature, efficiently implemented technology, very little attention has been paid to the effectiveness of the analysis and the user-friendliness of this technology, which is often considered tedious to use. This dissertation is a contribution to developing user-centric OLAP, focusing on the use of former queries logged by an OLAP server to enhance subsequent analyses. It shows how logs of OLAP queries can be modeled, constructed, manipulated, compared, and finally leveraged for personalization and recommendation. Logs are modeled as sets of analytical sessions, sessions being modeled as sequences of OLAP queries. Three main approaches are presented for modeling queries: as unevaluated collections of fragments (e.g., group by sets, sets of selection predicates, sets of measures), as sets of references obtained by partially evaluating the query over dimensions, or as query answers. Such logs can be constructed even from sets of SQL query expressions, by translating these expressions into a multidimensional algebra, and bridging the translations to detect analytical sessions. Logs can be searched, filtered, compared, combined, modified and summarized with a language inspired by the relational algebra and parametrized by binary relations over sessions. In particular, these relations can be specialization relations or based on similarity measures tailored for OLAP queries and analytical sessions.
Logs can be mined for various kinds of hidden knowledge that, depending on the query model used, accurately represent the user behavior extracted. This knowledge includes simple preferences, navigational habits and discoveries made during former explorations, and can be used in various query personalization or query recommendation approaches. Such approaches vary in terms of formulation effort, proactiveness, prescriptiveness and expressive power: query personalization, i.e., coping with a current query that returns too few or too many results, can use dedicated operators for expressing preferences, or be based on query expansion; query recommendation, i.e., suggesting queries to pursue an analytical session, can be based on information extracted from the current state of the database and the query, or be purely history based, i.e., leveraging the query log. While they can be immediately integrated into a complete architecture for User-Centric Query Answering in data warehouses, the models and approaches introduced in this dissertation can also be seen as a starting point for assessing the effectiveness of analytical sessions, with the ultimate goal of enhancing the overall decision-making process.
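As an illustration of the first query model in the abstract above (queries as unevaluated collections of fragments, sessions as sequences of queries, logs as sets of sessions), a minimal Python sketch might look as follows. The class name `OlapQuery`, the sample fragments and the choice of an averaged Jaccard index as similarity measure are assumptions made for this example only; the abstract does not specify the dissertation's actual measures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OlapQuery:
    """A query seen as three sets of fragments (hypothetical structure)."""
    group_by: frozenset    # group-by set, e.g. {"year", "country"}
    predicates: frozenset  # selection predicates, kept as plain strings
    measures: frozenset    # measures, e.g. {"sales"}


def jaccard(a: frozenset, b: frozenset) -> float:
    """Jaccard index of two sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def query_similarity(q1: OlapQuery, q2: OlapQuery) -> float:
    """Assumed similarity: mean Jaccard index over the three fragment sets."""
    parts = [
        jaccard(q1.group_by, q2.group_by),
        jaccard(q1.predicates, q2.predicates),
        jaccard(q1.measures, q2.measures),
    ]
    return sum(parts) / len(parts)


# Illustrative data: a drill-down from country to city within one session.
q1 = OlapQuery(frozenset({"year", "country"}),
               frozenset({"year = 2010"}),
               frozenset({"sales"}))
q2 = OlapQuery(frozenset({"year", "city"}),
               frozenset({"year = 2010"}),
               frozenset({"sales"}))

session = [q1, q2]  # a session is a sequence of queries
log = [session]     # a log is a collection of sessions
```

With such a representation, a history-based recommender could, for instance, look up logged sessions containing a query similar to the current one and suggest the queries that followed it; this is one possible use, not the method described in the dissertation.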
Investigating pluralistic data architectures in data warehousing
This thesis was submitted for the award of Doctor of Philosophy and was awarded by Brunel University London. Understanding and managing change is a strategic objective for many organisations to successfully compete in a market place; as a result, organisations are leveraging their data assets and implementing data warehouses to gain the business intelligence necessary to improve their businesses. Data warehouses are expensive initiatives, and one-half to two-thirds of most data warehousing efforts end in failure. In the absence of a well-formalised design methodology in the industry and in the context of the debate on data architecture in data warehousing, this thesis examines why multidimensional and relational data models define the data architecture landscape in the industry. The study develops a number of propositions from the literature and empirical data to understand the factors impacting the choice of logical data model in data warehousing. Using a comparative case study method as the means of collecting empirical data from the case organisations, the research proposes a conceptual model for logical data model adoption. The model provides a framework that guides decision making for adopting a logical data model for a data warehouse. The conceptual model identifies the characteristics of business requirements and decision pathways for multidimensional and relational data warehouses. It adds value by identifying the business requirements to which a multidimensional or a relational logical data model is empirically applicable.
Methodology for the conceptual design of data warehouses (Metodología para el diseño conceptual de almacenes de datos)
Since the introduction of the multidimensional data model as a modeling formalism for data warehouses (DWs), various methodological proposals have been made to capture the structure of the DW at the conceptual level. The proposed solutions start from different design aspects: the user requirements, the analysis of the operational database schema, or a combination of both (mixed techniques).
Model Driven Architecture (MDA) is a new standard for model-driven system development. MDA proposes three viewpoints: Computation Independent Model (CIM), Platform Independent Model (PIM) and Platform Specific Model (PSM).
This thesis falls within the area of DW design with MDA (a methodology for the conceptual design of DWs). The method follows a mixed methodology and consists of three phases. The first phase is devoted to examining the ER schema of the operational database, generating the candidate multidimensional schemas for the DW. This phase is addressed in the context of MDA: to that end, we have defined a set of transformation rules between the Entity-Relationship (ER) PIM and the On-Line Analytical Processing (OLAP) PIM.
In the second phase, the user requirements are gathered through interviews. The purpose of the interviews is to obtain information about the users' analysis needs. As the basis for this phase, we adapt a goal-based requirements elicitation method. The third phase contrasts the information obtained in the second phase with the candidate multidimensional schemas formed in the first phase, thereby generating the solution (supported by the operational databases) that best reflects the user requirements.
Zepeda Sánchez, LZ. (2008). Metodología para el diseño conceptual de almacenes de datos [Unpublished doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/2506 Palanci
Semantic metadata for supporting exploratory OLAP
Cotutelle (joint supervision): Universitat Politècnica de Catalunya and Aalborg Universitet. On-Line Analytical Processing (OLAP) is an approach widely used for data analysis. OLAP is based on the multidimensional (MD) data model where factual data are related to their analytical perspectives called dimensions and together they form an n-dimensional data space referred to as data cube. MD data are typically stored in a data warehouse, which integrates data from in-house data sources, and then analyzed by means of OLAP operations, e.g., sales data can be (dis)aggregated along the location dimension. As OLAP proved to be quite intuitive, it became broadly accepted by non-technical and business users. However, as users still encountered difficulties in their analysis, different approaches focused on providing user assistance. These approaches collect situational metadata about users and their actions and provide suggestions and recommendations that can help users' analysis. However, although extensively exploited and evidently needed, little attention is paid to metadata in this context. Furthermore, new emerging tendencies call for expanding the use of OLAP to consider external data sources and heterogeneous settings. This leads to the Exploratory OLAP approach that especially argues for the use of Semantic Web (SW) technologies to facilitate the description and integration of external sources. With data becoming publicly available on the (Semantic) Web, the number and diversity of non-technical users are also significantly increasing. Thus, the metadata to support their analysis become even more relevant.
This PhD thesis focuses on metadata for supporting Exploratory OLAP. The study explores the kinds of metadata artifacts used for user assistance and how they are exploited to provide assistance. Based on these findings, the study then aims at providing theoretical and practical means such as models, algorithms, and tools to address the gaps and challenges identified. First, based on a survey of existing user assistance approaches related to OLAP, the thesis proposes the analytical metadata (AM) framework. The framework includes the definition of the assistance process, the AM artifacts that are classified in a taxonomy, and the organization of the artifacts and related types of processing to support user assistance. Second, the thesis proposes a semantic metamodel for AM. Here, the Resource Description Framework (RDF) is used to represent the AM artifacts in a flexible and reusable manner, while the metamodeling abstraction level is used to overcome the heterogeneity of (meta)data models in the Exploratory OLAP context. Third, focusing on the schema as a fundamental metadata artifact for enabling OLAP, the thesis addresses some important challenges in constructing an MD schema on the SW using RDF. It provides the algorithms, method, and tool to construct an MD schema over statistical linked open data sets. In particular, the focus is on enabling even non-technical users to perform this task. Lastly, the thesis deals with queries as the second most relevant artifact for user assistance. In the spirit of Exploratory OLAP, the thesis proposes an RDF-based model for OLAP queries created by instantiating the previously proposed metamodel. This model supports the sharing and reuse of queries across the SW and facilitates the preparation of metadata for assistance purposes.
Finally, the results of this thesis provide metadata foundations for supporting Exploratory OLAP and advocate for greater attention to the modeling and use of semantics related to metadata.
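The OLAP operation mentioned in the abstract above, aggregating sales data along the location dimension, can be sketched in plain Python. The city-to-country hierarchy, the fact rows and the function name `roll_up_to_country` are invented for illustration and are not taken from the thesis; a real data warehouse would express this as a query over fact and dimension tables.

```python
from collections import defaultdict

# Hypothetical location hierarchy: city level -> country level.
LOCATION = {
    "Barcelona": "Spain",
    "Madrid": "Spain",
    "Aalborg": "Denmark",
}

# Hypothetical fact rows: (city, year, sales amount).
FACTS = [
    ("Barcelona", 2015, 120.0),
    ("Madrid", 2015, 80.0),
    ("Aalborg", 2015, 50.0),
    ("Barcelona", 2016, 30.0),
]


def roll_up_to_country(facts, hierarchy):
    """Aggregate sales from the city level up to the country level."""
    totals = defaultdict(float)
    for city, year, amount in facts:
        # Replace the city by its parent in the hierarchy and sum the measure.
        totals[(hierarchy[city], year)] += amount
    return dict(totals)


by_country = roll_up_to_country(FACTS, LOCATION)
```

Disaggregation (drill-down) goes the other way: starting from the country totals, the analyst returns to the finer city-level rows in `FACTS`.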
A Dementia Care Mapping (DCM) data warehouse as a resource for improving the quality of dementia care. Exploring requirements for secondary use of DCM data using a user-driven approach and discussing their implications for a data warehouse
The secondary use of Dementia Care Mapping (DCM) data, if that data were held in a data warehouse, could contribute to global efforts in monitoring and improving dementia care quality. This qualitative study identifies requirements for the secondary use of DCM data within a data warehouse using a user-driven approach. The thesis critically analyses various technical methodologies and then argues for the use, and further demonstrates the applicability, of a modified grounded theory as a user-driven methodology for a data warehouse. Interviews were conducted with 29 DCM researchers, trainers and practitioners in three phases. Nineteen interviews were conducted face to face and the others by Skype or telephone, with individual interviews lasting 45-60 minutes on average. The interview data was systematically analysed using open, axial and selective coding techniques and constant comparison methods.
The study data highlighted benchmarking, mappers’ support and research as three perceived potential secondary uses of DCM data within a data warehouse. DCM researchers identified concerns regarding the quality and security of DCM data for secondary uses, which led to identifying the requirements for additional provenance, ethical and contextual data to be included in a warehouse alongside DCM data to meet requirements for secondary uses of this data for research. The study data was also used to extrapolate three main factors, namely the individual mapper, the organization and electronic data management, that can influence the quality and availability of DCM data for secondary uses. The study makes further recommendations for designing a future DCM data warehouse.