
    XML documents schema design

    The eXtensible Markup Language (XML) is fast emerging as the dominant standard for storing, describing and interchanging data among various systems and databases on the internet. It offers schema languages such as Document Type Definition (DTD) or XML Schema Definition (XSD) for defining the syntax and structure of XML documents. To enable efficient use of XML documents in any application in a large-scale electronic environment, it is necessary to avoid data redundancies and update anomalies. Redundancy and anomalies in XML documents can lead not only to higher data storage cost but also to increased costs for data transfer and data manipulation. To overcome this problem, this thesis proposes to establish a formal framework for XML document schema design. To achieve this aim, we propose a method to improve and simplify XML schema design by incorporating a conceptual model of the DTD with a theory of database normalization. A conceptual diagram, the Graph-Document Type Definition (G-DTD), is proposed to describe the structure of XML documents at the schema level. For the G-DTD itself, we define a structure which incorporates attributes, simple elements, complex elements, and relationship types among them. Furthermore, semantic constraints are also precisely defined in order to capture semantic meanings among the defined XML objects. In addition, to provide a guideline to a well-designed schema for XML documents, we propose a set of normal forms for G-DTD on the basis of rules proposed by Arenas and Libkin, and by Lv et al. The corresponding normalization rules to transform a G-DTD into a normal form schema are also discussed. A case study is given to illustrate the applicability of the concept. As a result, we found that the new normal forms are more concise and practical, in particular as they allow the user to find an 'optimal' structure of XML elements/attributes at the schema level. To show that our approach is applicable for the database designer, we develop a prototype of XML document schema design using the Z formal specification language. Finally, using the same case study, this formal specification is tested to check the correctness and consistency of the specification. This gives confidence that our prototype can be implemented successfully to generate an automatic XML schema design.
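    To make the redundancy problem concrete, here is a minimal sketch (element names and data are invented for illustration, not taken from the thesis) of an XML design in which the same fact is stored repeatedly, next to a normalized alternative:

```python
# Hypothetical illustration of the redundancy that schema-level normal forms
# aim to eliminate; the element names are invented for this example.
import xml.etree.ElementTree as ET

# Unnormalized design: the course title is repeated in every enrolment,
# so an update to one copy can leave the others inconsistent.
unnormalized = ET.fromstring("""
<enrolments>
  <enrolment student="s1" course="c1" title="Databases"/>
  <enrolment student="s2" course="c1" title="Databases"/>
  <enrolment student="s3" course="c1" title="Data Bases"/>
</enrolments>
""")

# Normalized design: each course title is stored exactly once and
# enrolments refer to it by key, removing the redundancy.
normalized = ET.fromstring("""
<university>
  <course id="c1" title="Databases"/>
  <enrolment student="s1" course="c1"/>
  <enrolment student="s2" course="c1"/>
  <enrolment student="s3" course="c1"/>
</university>
""")

# Count how many title copies each course has in the unnormalized design.
titles = {}
for e in unnormalized.findall("enrolment"):
    titles.setdefault(e.get("course"), set()).add(e.get("title"))
print(titles)  # {'c1': {'Databases', 'Data Bases'}} -> inconsistent copies
```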

    Model Transformation Modularization as a Many-Objective Optimization Problem

    Model transformation programs are iteratively refined, restructured, and evolved for many reasons, such as fixing bugs and adapting existing transformation rules to new metamodel versions. Modular design is therefore a desirable property for model transformations, as it can significantly improve their evolution, comprehensibility, maintainability, reusability, and thus their overall quality. Although language support for the modularization of model transformations is emerging, model transformations are still created as monolithic artifacts containing a huge number of rules. To the best of our knowledge, the problem of automatically modularizing model transformation programs has not been addressed before in the literature. These programs, written in transformation languages such as ATL, are implemented as one main module containing a huge number of rules. To tackle this problem and improve the quality and maintainability of model transformation programs, we propose an automated search-based approach to modularize model transformations based on higher-order transformations. Their application and execution are guided by our search framework, which combines an in-place transformation engine and a search-based algorithm framework. We demonstrate the feasibility of our approach by using ATL as the concrete transformation language and NSGA-III as the search algorithm to find a trade-off between different well-known conflicting design metrics, used as fitness functions to evaluate the generated modularized solutions. To validate our approach, we apply it to a comprehensive dataset of model transformations. As the study shows, ATL transformations can be modularized automatically, efficiently, and effectively by our approach. We found that, on average, the majority of modules recommended by NSGA-III for all the ATL programs are considered correct, with more than 84% precision and 86% recall when compared to manual solutions provided by active developers. The statistical analysis of our experiments over several runs shows that NSGA-III performed significantly better than other multi-objective algorithms and random search. We were not able to compare with existing model transformation modularization approaches since our study is the first to address this problem. The software developers considered in our experiments confirm the relevance of the recommended modularization solutions for several maintenance activities, based on different scenarios and interviews.
    Comisión Interministerial de Ciencia y Tecnología TIN2015-70560-R; Junta de Andalucía P12-TIC-186
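    As a rough, hypothetical illustration of the kind of objective vector such a search evaluates, the sketch below assigns transformation rules to modules and scores the assignment; the rule names, dependency graph, and metric definitions are invented and far simpler than the design metrics used in the thesis:

```python
# Hypothetical sketch of objectives a many-objective search (e.g. NSGA-III)
# could evaluate when assigning transformation rules to modules.

# Which rules call or depend on which other rules (invented example).
dependencies = {
    "Class2Table": {"Attribute2Column"},
    "Attribute2Column": set(),
    "Package2Schema": {"Class2Table"},
    "Enum2Check": set(),
}

# One candidate modularization: rule -> module name.
candidate = {
    "Class2Table": "structure",
    "Attribute2Column": "structure",
    "Package2Schema": "structure",
    "Enum2Check": "constraints",
}

def objectives(assignment, deps):
    """Return (cohesion, coupling, number_of_modules); a search algorithm
    would maximize cohesion while minimizing coupling and module count."""
    cohesion = coupling = 0
    for rule, targets in deps.items():
        for target in targets:
            if assignment[rule] == assignment[target]:
                cohesion += 1   # dependency kept inside one module
            else:
                coupling += 1   # dependency crossing module boundaries
    return cohesion, coupling, len(set(assignment.values()))

print(objectives(candidate, dependencies))  # (2, 0, 2) for this candidate
```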

    Ontological approach for database integration

    Database integration is one of the research areas that has gained a lot of attention from researchers. Its goal is to represent the data from different database sources in one unified form. To achieve database integration we have to face two obstacles: the first is the distribution of data, and the second is heterogeneity. The Web addresses the distribution problem, and for heterogeneity there are many approaches that can be used to solve the database integration problem, such as data warehouses and federated databases. The problem with these two approaches is the lack of semantics. Therefore, our approach exploits the Semantic Web methodology. The hybrid ontology method can be used to solve the database integration problem. In this method two elements are available, the source (database) and the domain ontology, but the local ontology is missing. In fact, to ensure the success of this method the local ontologies should be produced. Our approach obtains the semantics from the logical model of the database to generate a local ontology. Then, validation and enhancement can be acquired from the semantics obtained from the conceptual model of the database. Thus, our approach is applied in two phases: the generation phase and the validation-enrichment phase. In the generation phase, we utilise reverse engineering techniques in order to capture the semantics hidden in the SQL definitions. Then, the approach reproduces the logical model of the database. Finally, our transformation system is applied to generate an ontology. In our transformation system, all the concepts of classes, relationships and axioms are generated. Firstly, the process of class creation contains many rules working together to produce classes. Our rules succeed in solving problems such as fragmentation and hierarchy. They also eliminate the superfluous classes arising from multi-valued attribute relations, and handle neglected cases such as relationships with additional attributes. The final class creation rule handles generic relation cases. The rules for relationships between concepts are generated while eliminating the relationships between integrated concepts. Finally, there are many rules that consider the relationship and attribute constraints, which should be transformed into axioms in the ontological model. The formal rules of our approach are domain independent; moreover, the approach produces a generic ontology that is not restricted to a specific ontology language. The rules take into account the gap between the database model and the ontological model; therefore, some database constructs do not have an equivalent in the ontological model. The second phase consists of the validation and enrichment processes. The best way to validate the transformation result is to use the semantics obtained from the conceptual model of the database. In the validation phase, the domain expert captures the missing or superfluous concepts (classes or relationships). In the enrichment phase, the generalisation method can be applied to classes that share common attributes. Also, complex or composite attributes can be represented as classes. We implement the transformation system in a tool called SQL2OWL in order to show the correctness and functionality of our approach. The evaluation of our system showed the success of our proposed approach. The evaluation uses several techniques. Firstly, a comparative study is carried out between the results produced by our approach and similar approaches. The second evaluation technique is a weighting score system which specifies the criteria that affect the transformation system. The final evaluation technique is a score scheme. We assess the quality of the transformation system by applying the compliance measure in order to show the strength of our approach compared to existing approaches. Finally, the measures of success that our approach considers are system scalability and completeness.
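    As a hedged illustration of the general idea behind such a transformation system, the sketch below maps relational table metadata to OWL classes and properties; the table definitions, naming scheme, and emitted Turtle are simplified inventions and do not reproduce the actual SQL2OWL rules:

```python
# Hypothetical relational-to-ontology mapping sketch (not the SQL2OWL rules).

tables = {
    "Student": {
        "columns": {"id": "INTEGER", "name": "VARCHAR"},
        "primary_key": ["id"],
        "foreign_keys": {},            # column -> referenced table
    },
    "Enrolment": {
        "columns": {"student_id": "INTEGER", "course_id": "INTEGER"},
        "primary_key": ["student_id", "course_id"],
        "foreign_keys": {"student_id": "Student", "course_id": "Course"},
    },
}

def table_to_turtle(name, meta):
    """Map a table to an OWL class, plain columns to datatype properties,
    and foreign-key columns to object properties."""
    lines = [f":{name} a owl:Class ."]
    for column, sql_type in meta["columns"].items():
        if column in meta["foreign_keys"]:
            target = meta["foreign_keys"][column]
            lines.append(
                f":{name}_{column} a owl:ObjectProperty ; "
                f"rdfs:domain :{name} ; rdfs:range :{target} ."
            )
        else:
            xsd = "xsd:integer" if sql_type == "INTEGER" else "xsd:string"
            lines.append(
                f":{name}_{column} a owl:DatatypeProperty ; "
                f"rdfs:domain :{name} ; rdfs:range {xsd} ."
            )
    return "\n".join(lines)

for name, meta in tables.items():
    print(table_to_turtle(name, meta))
```

    A fuller rule set, as described in the abstract above, would additionally detect association tables, multi-valued attribute relations and inheritance patterns instead of mapping every table to a class.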

    Adaptive Caching of Distributed Components

    Locality of reference is an important property of distributed applications. Caching is typically employed during the development of such applications to exploit this property by locally storing queried data: subsequent accesses can be accelerated by serving their results immediately from the local store. Current middleware architectures, however, hardly support this non-functional aspect. The thesis at hand therefore tries to outsource caching as a separate, configurable middleware service. Integration into the software development lifecycle provides for early capturing, modeling, and later reuse of caching-related metadata. At runtime, the implemented system can additionally adapt to changing access behaviour with respect to data cacheability properties, thus healing misconfigurations and optimizing itself towards an appropriate configuration. Speculative prefetching of data probably queried in the immediate future complements the presented approach.
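    A minimal, hypothetical sketch of such an adaptive caching layer is shown below; the adaptation policy and thresholds are invented for illustration and are not the middleware service described in the thesis:

```python
# Hypothetical adaptive caching proxy for remote calls: entries are served
# from a local store while fresh, and keys whose hit rate stays poor are
# demoted to uncached access. Policy and thresholds are invented.
import time

class AdaptiveCache:
    def __init__(self, fetch, ttl=5.0, min_hit_rate=0.2):
        self.fetch = fetch            # function performing the remote call
        self.ttl = ttl                # seconds a cached entry stays fresh
        self.min_hit_rate = min_hit_rate
        self.store = {}               # key -> (value, expiry_time)
        self.stats = {}               # key -> [hits, lookups]

    def get(self, key):
        hits, lookups = self.stats.setdefault(key, [0, 0])
        self.stats[key][1] += 1
        cacheable = lookups < 10 or hits / lookups >= self.min_hit_rate
        entry = self.store.get(key)
        if cacheable and entry and entry[1] > time.time():
            self.stats[key][0] += 1   # served from the local store
            return entry[0]
        value = self.fetch(key)       # remote access
        if cacheable:
            self.store[key] = (value, time.time() + self.ttl)
        else:
            self.store.pop(key, None) # stop caching keys with poor hit rates
        return value

cache = AdaptiveCache(fetch=lambda key: f"value-of-{key}")
print(cache.get("customer:42"))      # remote call
print(cache.get("customer:42"))      # served from cache within the TTL
```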

    Integration of Legacy and Heterogeneous Databases


    Model-driven round-trip engineering of REST APIs

    Web APIs have become an increasingly key asset for businesses, and their implementation and integration in companies' daily activities has thus been on the rise. In practice, most of these Web APIs are "REST-like", meaning that they adhere only partially to the Representational State Transfer (REST) architectural style. In fact, REST is a design paradigm and does not propose any standard, so developing and consuming REST APIs end up being challenging and time-consuming tasks for API providers and clients. Therefore, the aim of this thesis is to facilitate the design, implementation, composition and consumption of REST APIs by relying on Model-Driven Engineering (MDE). The thesis offers the following contributions: EMF-REST, APIDiscoverer, APITester, APIGenerator and APIComposer. Together, these contributions make up an ecosystem which advances the state of the art of automated software engineering for REST APIs.
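    As a purely illustrative sketch of the model-driven idea, the example below derives conventional REST routes from a small resource model; the model format and generated routes are invented for this example and are not the actual input or output of EMF-REST or APIGenerator:

```python
# Hypothetical sketch: derive REST endpoint scaffolding from a resource model.
# Model shape and route conventions are invented for illustration.

api_model = {
    "resource": "book",
    "attributes": {"title": "string", "year": "integer"},
    "operations": ["list", "get", "create"],
}

def generate_routes(model):
    """Map each requested operation to a (method, path) pair."""
    base = f"/{model['resource']}s"
    templates = {
        "list":   ("GET",    base),
        "get":    ("GET",    base + "/{id}"),
        "create": ("POST",   base),
        "delete": ("DELETE", base + "/{id}"),
    }
    return [templates[op] for op in model["operations"]]

for method, path in generate_routes(api_model):
    print(method, path)
# GET /books
# GET /books/{id}
# POST /books
```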

    Linked data wrapper curation: A platform perspective

    Linked Data Wrappers (LDWs) turn Web APIs into RDF end-points, leveraging the LOD cloud with current data. This potential is frequently undervalued, regarding LDWs as mere by-products of larger endeavors, e.g. developing mashup applications. However, LDWs are mainly data-driven, not contaminated by application semantics, hence with an important potential for reuse. If LDWs could be decoupled from their breakout projects, this would increase the chances of LDWs becoming truly RDF end-points. But this vision is still under threat from LDW fragility upon API upgrades, and the risk of unmaintained LDWs. LDW curation might help. Similar to dataset curation, LDW curation aims to clean up datasets but, in this case, the dataset is implicitly described by the LDW definition, and "stains" are not limited to those related to dataset quality but also include those related to the underlying API. This requires the existence of LDW Platforms that leverage existing code repositories with additional functionalities that cater for LDW definition, deployment and curation. This dissertation contributes to this vision through: (1) identifying a set of requirements for LDW Platforms; (2) instantiating these requirements in SYQL, a platform built upon Yahoo's YQL; (3) evaluating SYQL through a fully-developed proof of concept; and (4) validating the extent to which this approach facilitates LDW curation.
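    A minimal, hypothetical sketch of what an LDW does conceptually is given below: it maps a Web API's JSON payload onto RDF triples. The API payload, namespaces and mapping are invented, and SYQL wrappers are actually defined on top of Yahoo's YQL rather than in plain Python:

```python
# Hypothetical Linked Data Wrapper sketch: turn a Web API JSON response into
# RDF triples. Payload, vocabulary URIs and mapping are invented examples.
import json

# Stand-in for a live Web API response (normally fetched over HTTP).
api_response = json.loads("""
{"items": [{"id": "b1", "title": "Weaving the Web", "year": 1999}]}
""")

EX = "http://example.org/book/"          # invented resource namespace
DCT = "http://purl.org/dc/terms/"         # Dublin Core terms

def wrap(payload):
    """Yield (subject, predicate, object) triples for each API item."""
    for item in payload["items"]:
        subject = EX + item["id"]
        yield (subject, DCT + "title", item["title"])
        yield (subject, DCT + "date", str(item["year"]))

for s, p, o in wrap(api_response):
    print(f'<{s}> <{p}> "{o}" .')        # N-Triples-style output
```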