Search CORE

572 research outputs found

XML Matchers: approaches and challenges

Author: Agreste Santa
De Meo Pasquale
Ferrara Emilio
Ursino Domenico
Publication venue: 'Elsevier BV'
Publication date: 10/07/2014
Field of study

Schema Matching, i.e. the process of discovering semantic correspondences between concepts adopted in different data source schemas, has been a key topic in Database and Artificial Intelligence research areas for many years. In the past, it was largely investigated especially for classical database models (e.g., E/R schemas, relational databases, etc.). However, in the latest years, the widespread adoption of XML in the most disparate application fields pushed a growing number of researchers to design XML-specific Schema Matching approaches, called XML Matchers, aiming at finding semantic matchings between concepts defined in DTDs and XSDs. XML Matchers do not just take well-known techniques originally designed for other data models and apply them on DTDs/XSDs, but they exploit specific XML features (e.g., the hierarchical structure of a DTD/XSD) to improve the performance of the Schema Matching process. The design of XML Matchers is currently a well-established research area. The main goal of this paper is to provide a detailed description and classification of XML Matchers. We first describe to what extent the specificities of DTDs/XSDs impact on the Schema Matching task. Then we introduce a template, called XML Matcher Template, that describes the main components of an XML Matcher, their role and behavior. We illustrate how each of these components has been implemented in some popular XML Matchers. We consider our XML Matcher Template as the baseline for objectively comparing approaches that, at first glance, might appear as unrelated. The introduction of this template can be useful in the design of future XML Matchers. Finally, we analyze commercial tools implementing XML Matchers and introduce two challenging issues strictly related to this topic, namely XML source clustering and uncertainty management in XML Matchers.Comment: 34 pages, 8 tables, 7 figure

arXiv.org e-Print Archive

IRIS UniversitÃ Politecnica delle Marche

Abstraction Raising in General-Purpose Compilers

Author: Chelini Lorenzo
Publication venue: Eindhoven University of Technology
Publication date: 31/08/2021
Field of study

Pure OAI Repository

Efficient Pattern Matching in Python

Author: Bachmair L.
Bachmair L.
Behnel S.
Clausen M.
Kirchner H.
Klop J. W.
Kounalis E.
Nedjah N.
Rivoal F.
Robie J.
Publication venue
Publication date: 29/09/2017
Field of study

Pattern matching is a powerful tool for symbolic computations. Applications include term rewriting systems, as well as the manipulation of symbolic expressions, abstract syntax trees, and XML and JSON data. It also allows for an intuitive description of algorithms in the form of rewrite rules. We present the open source Python module MatchPy, which offers functionality and expressiveness similar to the pattern matching in Mathematica. In particular, it includes syntactic pattern matching, as well as matching for commutative and/or associative functions, sequence variables, and matching with constraints. MatchPy uses new and improved algorithms to efficiently find matches for large pattern sets by exploiting similarities between patterns. The performance of MatchPy is investigated on several real-world problems

arXiv.org e-Print Archive

Crossref

Measuring the Propagation of Information in Partial Evaluation

Author: Rohde Henning Korsholm
Publication venue: 'Aarhus University Library'
Publication date: 11/08/2005
Field of study

We present the first measurement-based analysis of the information propagated by a partial evaluator. Our analysis is based on measuring implementations of string-matching algorithms, based on the observation that the sequence of character comparisons accurately reflects maintained information. Notably, we can easily prove matchers to be different and we show that they display more variety and finesse than previously believed. As a consequence, we are able to pinpoint differences and inaccuracies in many results previously considered equivalent. Our analysis includes a framework that lets us obtain string matchers - notably the family of Boyer-Moore algorithms - in a systematic formalism-independent way from a few information-propagation primitives. By leveraging the existing research in string matching, we show that the landscape of information propagation is non-trivial in the sense that small changes in information propagation may dramatically change the properties of the resulting string matchers. We thus expect that this work will prove useful as a test and feedback mechanism for information propagation in the development of advanced program transformations, such as GPC or Supercompilation

Tidsskrift.dk (Det Kongelige Bibliotek)

Automatic Generation of Trace Links in Model-driven Software Development

Author: Grammel Birgit
Publication venue
Publication date: 17/02/2014
Field of study

Traceability data provides the knowledge on dependencies and logical relations existing amongst artefacts that are created during software development. In reasoning over traceability data, conclusions can be drawn to increase the quality of software. The paradigm of Model-driven Software Engineering (MDSD) promotes the generation of software out of models. The latter are specified through different modelling languages. In subsequent model transformations, these models are used to generate programming code automatically. Traceability data of the involved artefacts in a MDSD process can be used to increase the software quality in providing the necessary knowledge as described above. Existing traceability solutions in MDSD are based on the integral model mapping of transformation execution to generate traceability data. Yet, these solutions still entail a wide range of open challenges. One challenge is that the collected traceability data does not adhere to a unified formal definition, which leads to poorly integrated traceability data. This aggravates the reasoning over traceability data. Furthermore, these traceability solutions all depend on the existence of a transformation engine. However, not in all cases pertaining to MDSD can a transformation engine be accessed, while taking into account proprietary transformation engines, or manually implemented transformations. In these cases it is not possible to instrument the transformation engine for the sake of generating traceability data, resulting in a lack of traceability data. In this work, we address these shortcomings. In doing so, we propose a generic traceability framework for augmenting arbitrary transformation approaches with a traceability mechanism. To integrate traceability data from different transformation approaches, our approach features a methodology for augmentation possibilities based on a design pattern. The design pattern supplies the engineer with recommendations for designing the traceability mechanism and for modelling traceability data. Additionally, to provide a traceability mechanism for inaccessible transformation engines, we leverage parallel model matching to generate traceability data for arbitrary source and target models. This approach is based on a language-agnostic concept of three similarity measures for matching. To realise the similarity measures, we exploit metamodel matching techniques for graph-based model matching. Finally, we evaluate our approach according to a set of transformations from an SAP business application and the domain of MDSD

Technische Universität Dresden: Qucosa

EqFix: Fixing LaTeX Equation Errors by Examples

Author: He Fei
Zhu Fengmin
Publication venue
Publication date: 01/07/2021
Field of study

LaTeX is a widely-used document preparation system. Its powerful ability in mathematical equation editing is perhaps the main reason for its popularity in academia. Sometimes, however, even an expert user may spend much time on fixing an erroneous equation. In this paper, we present EqFix, a synthesis-based repairing system for LaTeX equations. It employs a set of fixing rules, and can suggest possible repairs for common errors in LaTeX equations. A domain specific language is proposed for formally expressing the fixing rules. The fixing rules can be automatically synthesized from a set of input-output examples. An extension of relaxer is also introduced to enhance the practicality of EqFix. We evaluate EqFix on real-world examples and find that it can synthesize rules with high generalization ability. Compared with a state-of-the-art string transformation synthesizer, EqFix solved 37% more cases and spent only one third of their synthesis time

arXiv.org e-Print Archive

OP2-Clang : a source-to-source translator using Clang/LLVM LibTooling

Author: Antao S. F.
Balogh G. D.
Bertolli C.
Mudalige Gihan R.
Reguly I. Z.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 14/02/2019
Field of study

Domain Specific Languages or Active Library frameworks have recently emerged as an important method for gaining performance portability, where an application can be efficiently executed on a wide range of HPC architectures without significant manual modifications. Embedded DSLs such as OP2, provides an API embedded in general purpose languages such as C/C++/Fortran. They rely on source-to-source translation and code refactorization to translate the higher-level API calls to platform specific parallel implementations. OP2 targets the solution of unstructured-mesh computations, where it can generate a variety of parallel implementations for execution on architectures such as CPUs, GPUs, distributed memory clusters and heterogeneous processors making use of a wide range of platform specific optimizations. Compiler tool-chains supporting source-to-source translation of code written in mainstream languages currently lack the capabilities to carry out such wide-ranging code transformations. Clang/LLVM’s Tooling library (LibTooling) has long been touted as having such capabilities but have only demonstrated its use in simple source refactoring tasks. In this paper we introduce OP2-Clang, a source-to-source translator based on LibTooling, for OP2’s C/C++ API, capable of generating target parallel code based on SIMD, OpenMP, CUDA and their combinations with MPI. OP2-Clang is designed to significantly reduce maintenance, particularly making it easy to be extended to generate new parallelizations and optimizations for hardware platforms. In this research, we demonstrate its capabilities including (1) the use of LibTooling’s AST matchers together with a simple strategy that use parallelization templates or skeletons to significantly reduce the complexity of generating radically different and transformed target code and (2) chart the challenges and solution to generating optimized parallelizations for OpenMP, SIMD and CUDA. Results indicate that OP2-Clang produces near-identical parallel code to that of OP2’s current source-to-source translator. We believe that the lessons learnt in OP2-Clang can be readily applied to developing other similar source-to-source translators, particularly for DSLs

Crossref

Warwick Research Archives Portal Repository

OP2-Clang : a source-to-source translator using Clang/LLVM LibTooling

Author: Antao S. F.
Balogh G. D.
Bertolli C.
Mudalige Gihan R.
Reguly I. Z.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2018
Field of study

Crossref

Warwick Research Archives Portal Repository

Repository of the Academy's Library