4,395 research outputs found

    A review of the state of the art in Machine Learning on the Semantic Web: Technical Report CSTR-05-003

    Get PDF

    HUDDL for description and archive of hydrographic binary data

    Get PDF
    Many of the attempts to introduce a universal hydrographic binary data format have failed or have been only partially successful. In essence, this is because such formats either have to simplify the data to such an extent that they only support the lowest common subset of all the formats covered, or they attempt to be a superset of all formats and quickly become cumbersome. Neither choice works well in practice. This paper presents a different approach: a standardized description of (past, present, and future) data formats using the Hydrographic Universal Data Description Language (HUDDL), a descriptive language implemented using the Extensible Markup Language (XML). That is, XML is used to provide a structural and physical description of a data format, rather than the content of a particular file. Done correctly, this opens the possibility of automatically generating both multi-language data parsers and documentation for format specification based on their HUDDL descriptions, as well as providing easy version control of them. This solution also provides a powerful approach for archiving a structural description of data along with the data, so that binary data will be easy to access in the future. Intending to provide a relatively low-effort solution to index the wide range of existing formats, we suggest the creation of a catalogue of format descriptions, each of them capturing the logical and physical specifications for a given data format (with its subsequent upgrades). A C/C++ parser code generator is used as an example prototype of one of the possible advantages of the adoption of such a hydrographic data format catalogue

    Knowledge Rich Natural Language Queries over Structured Biological Databases

    Full text link
    Increasingly, keyword, natural language and NoSQL queries are being used for information retrieval from traditional as well as non-traditional databases such as web, document, image, GIS, legal, and health databases. While their popularity are undeniable for obvious reasons, their engineering is far from simple. In most part, semantics and intent preserving mapping of a well understood natural language query expressed over a structured database schema to a structured query language is still a difficult task, and research to tame the complexity is intense. In this paper, we propose a multi-level knowledge-based middleware to facilitate such mappings that separate the conceptual level from the physical level. We augment these multi-level abstractions with a concept reasoner and a query strategy engine to dynamically link arbitrary natural language querying to well defined structured queries. We demonstrate the feasibility of our approach by presenting a Datalog based prototype system, called BioSmart, that can compute responses to arbitrary natural language queries over arbitrary databases once a syntactic classification of the natural language query is made

    Cardinality heterogeneities in Web service composition: Issues and solutions

    Full text link
    Data exchanges between Web services engaged in a composition raise several heterogeneities. In this paper, we address the problem of data cardinality heterogeneity in a composition. Firstly, we build a theoretical framework to describe different aspects of Web services that relate to data cardinality, and secondly, we solve this problem by developing a solution for cardinality mediation based on constraint logic programming

    CLP-based protein fragment assembly

    Full text link
    The paper investigates a novel approach, based on Constraint Logic Programming (CLP), to predict the 3D conformation of a protein via fragments assembly. The fragments are extracted by a preprocessor-also developed for this work- from a database of known protein structures that clusters and classifies the fragments according to similarity and frequency. The problem of assembling fragments into a complete conformation is mapped to a constraint solving problem and solved using CLP. The constraint-based model uses a medium discretization degree Ca-side chain centroid protein model that offers efficiency and a good approximation for space filling. The approach adapts existing energy models to the protein representation used and applies a large neighboring search strategy. The results shows the feasibility and efficiency of the method. The declarative nature of the solution allows to include future extensions, e.g., different size fragments for better accuracy.Comment: special issue dedicated to ICLP 201

    Extending and Relating Semantic Models of Compensating CSP

    No full text
    Business transactions involve multiple partners coordinating and interacting with each other. These transactions have hierarchies of activities which need to be orchestrated. Usual database approaches (e.g.,checkpoint, rollback) are not applicable to handle faults in a long running transaction due to interaction with multiple partners. The compensation mechanism handles faults that can arise in a long running transaction. Based on the framework of Hoare's CSP process algebra, Butler et al introduced Compensating CSP (cCSP), a language to model long-running transactions. The language introduces a method to declare a transaction as a process and it has constructs for orchestration of compensation. Butler et al also defines a trace semantics for cCSP. In this thesis, the semantic models of compensating CSP are extended by defining an operational semantics, describing how the state of a program changes during its execution. The semantics is encoded into Prolog to animate the specification. The semantic models are further extended to define the synchronisation of processes. The notion of partial behaviour is defined to model the behaviour of deadlock that arises during process synchronisation. A correspondence relationship is then defined between the semantic models and proved by using structural induction. Proving the correspondence means that any of the presentation can be accepted as a primary definition of the meaning of the language and each definition can be used correctly at different times, and for different purposes. The semantic models and their relationships are mechanised by using the theorem prover PVS. The semantic models are embedded in PVS by using Shallow embedding. The relationships between semantic models are proved by mutual structural induction. The mechanisation overcomes the problems in hand proofs and improves the scalability of the approach

    A logic programming framework for modeling temporal objects

    Get PDF
    Published versio
    corecore