1,596 research outputs found

    Description and Optimization of Abstract Machines in a Dialect of Prolog

    In order to achieve competitive performance, abstract machines for Prolog and related languages end up being large and intricate, and incorporate sophisticated optimizations, both at the design and at the implementation levels. At the same time, efficiency considerations make it necessary to use low-level languages in their implementation. This makes them laborious to code, optimize, and, especially, maintain and extend. Writing the abstract machine (and ancillary code) in a higher-level language can help tame this inherent complexity. We show how the semantics of most basic components of an efficient virtual machine for Prolog can be described using (a variant of) Prolog. These descriptions are then compiled to C and assembled to build a complete bytecode emulator. Thanks to the high level of the language used and its closeness to Prolog, the abstract machine description can be manipulated using standard Prolog compilation and optimization techniques with relative ease. We also show how, by applying program transformations selectively, we obtain abstract machine implementations whose performance can match and even exceed that of state-of-the-art, highly-tuned, hand-crafted emulators. Comment: 56 pages, 46 figures, 5 tables. To appear in Theory and Practice of Logic Programming (TPLP).

    The 4th Conference of PhD Students in Computer Science


    The ciao preprocessor

    Abstract is not available

    Table Augmentation in Data Lakes

    Data lakes are centralized repositories that store large quantities of raw, unstructured, and structured data, allowing for ad-hoc data analysis, exploratory data analysis, and machine learning. However, the lack of metadata and schema in data lakes makes it challenging to work with tabular data and find related information stored in different tables. How to retrieve these tables efficiently at large scale under data lake settings remains an open problem. The thesis introduces a novel approach to table augmentation that enables efficient data integration from multiple sources in a data lake. Table augmentation involves adding new data to an existing table in a horizontal fashion, by retrieving tables that can be horizontally concatenated to a query table. The proposed approach, named TASH, consists of several components, including data lake hashing, join search, similarity, and augmentation. TASH is a framework based on a spatial index in which tables are mapped and queried. Its goal is to identify the most useful columns for subsequent machine learning tasks. The table retrieval process employs a combination of set containment search and similarity search. Candidate tables are initially identified using set containment search and then ranked based on their similarity to the query. Experimental results demonstrate that TASH can effectively identify joinable tables and select the most relevant features, thereby enabling efficient table augmentation in data lakes. This research contributes to the field of big data by providing a practical solution to the challenges of data integration and analysis in data lake environments.
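
    The two-stage retrieval described in this abstract (set containment search to find candidates, then similarity ranking) can be illustrated with a minimal sketch. This is not TASH's implementation: it uses a brute-force scan instead of a spatial index, and the function names, the 0.5 containment threshold, and the choice of Jaccard similarity for ranking are all assumptions made for illustration.

```python
from typing import Dict, List, Set, Tuple


def containment(query: Set[str], candidate: Set[str]) -> float:
    """Fraction of the query column's values that also occur in the candidate."""
    if not query:
        return 0.0
    return len(query & candidate) / len(query)


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity; an assumed stand-in for the abstract's 'similarity'."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def retrieve_joinable(
    query_column: Set[str],
    lake: Dict[str, Set[str]],
    min_containment: float = 0.5,  # hypothetical threshold
) -> List[Tuple[str, float]]:
    """Stage 1: keep columns that contain enough of the query's values.
    Stage 2: rank the survivors by similarity to the query (ties by name)."""
    candidates = [
        name
        for name, column in lake.items()
        if containment(query_column, column) >= min_containment
    ]
    scored = [(name, jaccard(query_column, lake[name])) for name in candidates]
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


if __name__ == "__main__":
    # Toy "data lake": each entry maps a column name to its set of values.
    lake = {
        "t1": {"a", "b", "c", "d", "e"},
        "t2": {"a", "b"},
        "t3": {"x", "y"},
        "t4": {"a", "b", "c", "d"},
    }
    for name, score in retrieve_joinable({"a", "b", "c", "d"}, lake):
        print(name, round(score, 2))
```

    In this toy setting, a column that shares no values with the query is dropped in the first stage and never reaches the ranking step, which is the point of filtering by containment before computing similarity.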

    Non-Local Configuration of Component Interfaces by Constraint Satisfaction

    © 2020 Springer-Verlag. The final publication is available at Springer via https://doi.org/10.1007/s10601-020-09309-y. Service-oriented computing is the paradigm that utilises services as fundamental elements for developing applications. Service composition, where data consistency becomes especially important, is still a key challenge for service-oriented computing. We maintain that there is one aspect of Web service communication on the data conformance side that has so far escaped researchers' attention. Aggregation of networked services gives rise to long pipelines, or quasi-pipeline structures, where there is a profitable form of inheritance called flow inheritance. In its presence, interface reconciliation ceases to be a local procedure, and hence it requires distributed constraint satisfaction of a special kind. We propose a constraint language for this, and present a solver which implements it. In addition, our approach provides a binding between the language and C++, whereby the assignment to the variables found by the solver is automatically translated into a transformation of C++ code. This makes the C++ Web service context compliant without any further communication. Besides, it uniquely permits a very high degree of flexibility of a C++ coded Web service without making public any part of its source code. Peer reviewed.

    Multiword expression processing: A survey

    Multiword expressions (MWEs) are a class of linguistic forms spanning conventional word boundaries that are both idiosyncratic and pervasive across different languages. The structure of linguistic processing that depends on the clear distinction between words and phrases has to be re-thought to accommodate MWEs. The issue of MWE handling is crucial for NLP applications, where it raises a number of challenges. The emergence of solutions in the absence of guiding principles motivates this survey, whose aim is not only to provide a focused review of MWE processing, but also to clarify the nature of interactions between MWE processing and downstream applications. We propose a conceptual framework within which challenges and research contributions can be positioned. It offers a shared understanding of what is meant by "MWE processing," distinguishing the subtasks of MWE discovery and identification. It also elucidates the interactions between MWE processing and two use cases: parsing and machine translation. Many of the approaches in the literature can be differentiated according to how MWE processing is timed with respect to underlying use cases. We discuss how such orchestration choices affect the scope of MWE-aware systems. For each of the two MWE processing subtasks and for each of the two use cases, we conclude on open issues and research perspectives.

    CBR and MBR techniques: review for an application in the emergencies domain

    The purpose of this document is to provide an in-depth analysis of current reasoning engine practice and the integration strategies of Case Based Reasoning and Model Based Reasoning that will be used in the design and development of the RIMSAT system. RIMSAT (Remote Intelligent Management Support and Training) is a European Commission funded project designed to: (a) provide an innovative, 'intelligent', knowledge based solution aimed at improving the quality of critical decisions; (b) enhance the competencies and responsiveness of individuals and organisations involved in highly complex, safety critical incidents, irrespective of their location. In other words, RIMSAT aims to design and implement a decision support system that applies Case Based Reasoning as well as Model Based Reasoning technology to the management of emergency situations. This document is part of a deliverable for the RIMSAT project, and although it has been written in close contact with the requirements of the project, it provides an overview broad enough to serve as a state of the art in integration strategies between CBR and MBR technologies. Postprint (published version).