544 research outputs found

    Modeling views in the layered view model for XML using UML

    In data engineering, view formalisms are used to provide flexibility to users and user applications by allowing them to extract and elaborate data from the stored data sources. Meanwhile, since its introduction, Extensible Markup Language (XML) has been fast emerging as the dominant standard for storing, describing, and interchanging data among various web and heterogeneous data sources. In combination with XML Schema, XML provides rich facilities for defining and constraining user-defined data semantics and properties, a feature that is unique to XML. In this context, it is interesting to investigate traditional database features, such as view models and view design techniques, for XML. However, traditional view formalisms are strongly coupled to the data language and its syntax, so supporting views for semi-structured data models proves difficult. Therefore, in this paper we propose a Layered View Model (LVM) for XML with conceptual and schemata extensions. Our work is three-fold. First, we propose an approach that separates the implementation and conceptual aspects of views, providing a clear separation of concerns and allowing the analysis and design of views to be separated from their implementation. Second, we define representations to express and construct these views at the conceptual level. Third, we define a view transformation methodology for XML views in the LVM, which carries out automated transformation to a view schema and a view query expression in an appropriate query language. Also, to validate and apply the LVM concepts, methods and transformations developed, we propose a view-driven application development framework with the flexibility to develop web and database applications for XML at varying levels of abstraction.

    Using Ontologies for the Design of Data Warehouses

    Obtaining an implementation of a data warehouse is a complex task that forces designers to acquire wide knowledge of the domain, thus requiring a high level of expertise and making it a failure-prone task. Based on our experience, we have identified a set of situations faced in real-world projects in which we believe that the use of ontologies will improve several aspects of the design of data warehouses. The aim of this article is to describe several shortcomings of current data warehouse design approaches and discuss the benefit of using ontologies to overcome them. This work is a starting point for discussing the convenience of using ontologies in data warehouse design. Comment: 15 pages, 2 figures

    Engineering XML solutions using views

    In industrial informatics, engineering data-intensive Enterprise Information Systems (EIS) is a challenging task without abstraction and partitioning. Further, the introduction of semi-structured data (namely XML) and its rapid adoption by commercial and industrial systems increased the complexity of data engineering. At the same time, the introduction of OMG's MDA presents an interesting paradigm for EIS and system modelling, where a system is designed at a higher level of abstraction. This raises an interesting problem: how to engineer XML data solutions under the MDA initiative, where models and frameworks require a higher level of abstraction. In this paper we investigate a view model that can provide a layered design methodology for modelling data-intensive XML solutions for the EIS paradigm, with a sufficient level of abstraction.

    Modeling ontology views: An abstract view model for semantic web

    The emergence of the Semantic Web (SW) and related technologies promises to make the web a meaningful experience. However, high-level modelling, design and querying techniques prove to be a challenge for organizations that are hoping to utilize the SW paradigm for their industrial applications. To address one such issue, in this paper we propose an abstract view model with conceptual extensions for the SW. First, we outline the view model, its properties and some modelling issues with the help of an industrial case study example. Then, we discuss how to construct such views (at the conceptual level) using a set of operators. Finally, we briefly discuss how this view model can be utilized in the MOVE [1] system to design and construct materialized ontology views to support ontology extraction.

    Ingeniería de requerimientos orientado a objetivos en almacenes de datos: un estudio comparativo (Goal-oriented requirements engineering in data warehouses: a comparative study)

    Data warehouses provide historical information about the organization that needs to be analyzed by decision makers; therefore, it is essential to develop them in the context of a strategic business plan. In recent years, a number of goal-oriented requirements engineering approaches have been proposed, which obtain the information requirements of a data warehouse using traditional techniques and goal modeling. This paper provides an overview and a comparative study of how requirements are treated in the existing approaches, to serve as a starting point for further research. This work was financed by the Universidad de La Frontera, DIU-FRO Project DI13-0047.

    Automating the multidimensional design of data warehouses

    Previous experiences in the data warehouse field have shown that the data warehouse multidimensional conceptual schema must be derived from a hybrid approach, i.e., by considering both the end-user requirements and the data sources as first-class citizens. As in any other system, requirements guarantee that the system devised meets the end-user's needs. In addition, since data warehouse design is a reengineering process, it must consider the underlying data sources of the organization: (i) to guarantee that the data warehouse can be populated from data available within the organization, and (ii) to allow the end-user to discover previously unknown analysis capabilities. Currently, several methods for supporting the data warehouse modeling task have been provided. However, they suffer from some significant drawbacks. In short, requirement-driven approaches assume that requirements are exhaustive (and therefore do not consider that the data sources may contain alternative interesting evidence for analysis), whereas data-driven approaches (i.e., those leading the design task from a thorough analysis of the data sources) rely on discovering as much multidimensional knowledge as possible from the data sources. As a consequence, data-driven approaches generate too many results, which mislead the user. Furthermore, automating the design task is essential in this scenario, as it removes the dependency on an expert's ability to properly apply the chosen method, and the need to analyze the data sources, which is a tedious and time-consuming task (and can be infeasible when working with large databases). In this sense, current automatable methods follow a data-driven approach, whereas current requirement-driven approaches overlook process automation, since they tend to work with requirements at a high level of abstraction. Indeed, this scenario is repeated in the data-driven and requirement-driven stages of current hybrid approaches, which suffer from the same drawbacks as pure data-driven or requirement-driven approaches. In this thesis we introduce two different approaches for automating the multidimensional design of the data warehouse: MDBE (Multidimensional Design Based on Examples) and AMDO (Automating the Multidimensional Design from Ontologies). Both approaches were devised to overcome the limitations from which current approaches suffer. Importantly, our approaches start from opposite initial assumptions, but both consider the end-user requirements and the data sources as first-class citizens. (1) MDBE follows a classical approach, in which the end-user requirements are well known beforehand. This approach benefits from the knowledge captured in the data sources, but guides the design task according to the requirements and, consequently, is able to handle semantically poorer data sources. In other words, given high-quality end-user requirements, we can guide the process from the knowledge they contain and compensate for data sources of poor semantic quality. (2) AMDO, as its counterpart, assumes a scenario in which the available data sources are semantically richer. Thus, the approach is guided by a thorough analysis of the data sources, whose output is then shaped and adapted according to the end-user requirements. In this context, given high-quality data sources, we can compensate for the lack of expressive end-user requirements. Importantly, our methods establish a combined and comprehensive framework that can be used to decide, according to the inputs provided in each scenario, which is the best approach to follow. For example, we cannot follow the same approach in a scenario where the end-user requirements are clear and well known as in a scenario in which they are not evident or cannot be easily elicited (e.g., when the users are not aware of the analysis capabilities of their own sources). Interestingly, the need to have requirements beforehand is lessened by having semantically rich data sources. In their absence, requirements gain relevance for extracting the multidimensional knowledge from the sources. Thus, we claim to provide two approaches whose combination turns out to be exhaustive with regard to the scenarios discussed in the literature. Postprint (published version).

    A study of multidimensional modeling approaches for data warehouse

    A data warehouse system is used to support organizational decision making. Hence, the system must extract and integrate information from heterogeneous data sources in order to uncover knowledge relevant to the decision-making process. However, developing a data warehouse is a difficult and complex process, especially in its conceptual design (multidimensional modeling). Various approaches have therefore been proposed to overcome this difficulty. This study surveys and compares multidimensional modeling approaches and highlights the issues, trends and solutions proposed to date. Its contribution is an account of the state of the art in multidimensional modeling design.

    Towards specification formalisms for data warehouse systems design

    Text in English with abstracts and keywords in English, Afrikaans and Setswana. Several studies have been conducted on formal methods; however, few of these studies have used formal methods in the data warehousing area, specifically in system development. Many reasons may account for this, such as that few experts know how to use them. Formal methods, which rely on mathematical notations, have been used in software development. Despite the advantages of using formal methods in software development, their application in the data warehousing area has been limited compared with the use of informal (natural language) and semi-formal notations. This research aims to determine the extent to which formal methods may mitigate failures that commonly occur in the development of data warehouse systems. As part of this research, an enhanced framework was proposed to facilitate the use of formal methods in the development of such systems. The enhanced framework focuses mainly on the requirements definition, the Unified Modelling Language (UML) constructs, the Star model and formal specification. A medium-sized case study of a data mart was used to validate the enhanced framework. This dissertation also discusses the object-orientation paradigm and UML notations. The requirements specification of a data warehouse system is presented in natural language and formal notation to show how a formal specification may be shifted from natural language to UML structures and thereafter to a Z specification, using an established strategy as a guideline for constructing the Z specification. School of Computing. M. Sc. (Computing)

    A Goal and Ontology Based Approach for Generating ETL Process Specifications

    Data warehouse (DW) systems development involves several tasks such as defining requirements, designing DW schemas, and specifying data transformation operations. Indeed, the success of DW systems depends heavily on the proper design of the extracting, transforming, and loading (ETL) processes. However, common design-related problems in the ETL processes, such as defining user requirements and data transformation specifications, are far from being resolved. These problems are due to data heterogeneity in data sources, ambiguity of user requirements, and the complexity of data transformation activities. Current approaches are limited in how they reconcile DW requirement semantics when designing the ETL processes. As a result, the generation of ETL process specifications has been prolonged. The semantic framework of DW systems established in this study is used to develop the requirement analysis method for designing the ETL processes (RAMEPs) from the different perspectives of organization, decision-maker, and developer by using goal and ontology approaches. The correctness of the RAMEPs approach was validated by using modified and newly developed compliant tools. RAMEPs was evaluated in three real case studies, i.e., Student Affairs System, Gas Utility System, and Graduate Entrepreneur System. These case studies were used to illustrate how the RAMEPs approach can be implemented for designing and generating the ETL process specifications. Moreover, the RAMEPs approach was reviewed by DW experts to assess the strengths and weaknesses of the method, and the new approach was accepted. The RAMEPs method demonstrates that the ETL process specifications can be derived from the early phases of DW systems development by using the goal-ontology approach.

    Designing secure data warehouses by using MDA and QVT

    Data Warehouse (DW) design is based on multidimensional (MD) modeling, which structures information into facts and dimensions. Due to the confidentiality of the data that a DW stores, it is crucial to specify security and audit measures from the early stages of design and to enforce them throughout the lifecycle. Moreover, the standard framework for software development, Model Driven Architecture (MDA), allows us to define transformations between models by proposing Query/View/Transformation (QVT). This proposal permits the definition of formal, elegant and unequivocal transformations between Platform Independent Models (PIM) and Platform Specific Models (PSM). This paper introduces a new framework for the design of secure DWs based on MDA and QVT, which covers all the design phases (conceptual, logical and physical) and specifies security measures in all of them. We first define two metamodels with which to represent security and audit measures at the conceptual and logical levels. We then define a transformation between these models that provides traceability of the security rules from the early stages of development to the final implementation. Finally, in order to show the benefits of our proposal, it is applied to a case study. This work has been partially supported by the METASIGN project (TIN2004-00779) from the Spanish Ministry of Education and Science, of the Regional Government of Valencia, and by the QUASIMODO and MISTICO projects of the Regional Science and Technology Ministry of Castilla-La Mancha (Spain).