20,658 research outputs found

    Algorithms and implementation of functional dependency discovery in XML : a thesis presented in partial fulfilment of the requirements for the degree of Master of Information Sciences in Information Systems at Massey University

    Get PDF
    1.1 Background Following the advent of the web, there has been a great demand for data interchange between applications using internet infrastructure. XML (extensible Markup Language) provides a structured representation of data empowered by broad adoption and easy deployment. As a subset of SGML (Standard Generalized Markup Language), XML has been standardized by the World Wide Web Consortium (W3C) [Bray et al., 2004], XML is becoming the prevalent data exchange format on the World Wide Web and increasingly significant in storing semi-structured data. After its initial release in 1996, it has evolved and been applied extensively in all fields where the exchange of structured documents in electronic form is required. As with the growing popularity of XML, the issue of functional dependency in XML has recently received well deserved attention. The driving force for the study of dependencies in XML is it is as crucial to XML schema design, as to relational database(RDB) design [Abiteboul et al., 1995]

    Dependency parsing resources for French: Converting acquired lexical functional grammar F-Structure annotations and parsing F-Structures directly

    Get PDF
    Recent years have seen considerable success in the generation of automatically obtained wide-coverage deep grammars for natural language processing, given reliable and large CFG-like treebanks. For research within Lexical Functional Grammar framework, these deep grammars are typically based on an extended PCFG parsing scheme from which dependencies are extracted. However, increasing success in statistical dependency parsing suggests that such deep grammar approaches to statistical parsing could be streamlined. We explore this novel approach to deep grammar parsing within the framework of LFG in this paper, for French, showing that best results (an f-score of 69.46) for the established integrated architecture may be obtained for French

    Liberating Composition from Language Dictatorship

    Get PDF
    Historically, programming languages have been—although benevolent—dictators: fixing a lot of semantics into built-in language constructs. Over the years, (some) programming languages have freed the programmers from restrictions to use only built-in libraries, built-in data types, or built-in type checking rules. Even though, arguably, such freedom could lead to anarchy, or people shooting themselves in the foot, the contrary tends to be the case: a language that does not allow for extensibility, is depriving software engineers from the ability to construct proper abstractions and to structure software in the most optimal way. Instead, the software becomes less structured and maintainable than would be possible if the software engineer could express the behavior of the program with the most appropriate abstractions. The new idea proposed by this paper is to move composition from built-in language constructs to programmable, first-class abstractions in the language. As an emerging result, we present the Co-op concept of a language, which shows that it is possible with a relatively simple model to express a wide range of compositions as first-class concepts

    Trustworthy Refactoring via Decomposition and Schemes: A Complex Case Study

    Get PDF
    Widely used complex code refactoring tools lack a solid reasoning about the correctness of the transformations they implement, whilst interest in proven correct refactoring is ever increasing as only formal verification can provide true confidence in applying tool-automated refactoring to industrial-scale code. By using our strategic rewriting based refactoring specification language, we present the decomposition of a complex transformation into smaller steps that can be expressed as instances of refactoring schemes, then we demonstrate the semi-automatic formal verification of the components based on a theoretical understanding of the semantics of the programming language. The extensible and verifiable refactoring definitions can be executed in our interpreter built on top of a static analyser framework.Comment: In Proceedings VPT 2017, arXiv:1708.0688

    Four Lessons in Versatility or How Query Languages Adapt to the Web

    Get PDF
    Exposing not only human-centered information, but machine-processable data on the Web is one of the commonalities of recent Web trends. It has enabled a new kind of applications and businesses where the data is used in ways not foreseen by the data providers. Yet this exposition has fractured the Web into islands of data, each in different Web formats: Some providers choose XML, others RDF, again others JSON or OWL, for their data, even in similar domains. This fracturing stifles innovation as application builders have to cope not only with one Web stack (e.g., XML technology) but with several ones, each of considerable complexity. With Xcerpt we have developed a rule- and pattern based query language that aims to give shield application builders from much of this complexity: In a single query language XML and RDF data can be accessed, processed, combined, and re-published. Though the need for combined access to XML and RDF data has been recognized in previous work (including the W3C’s GRDDL), our approach differs in four main aspects: (1) We provide a single language (rather than two separate or embedded languages), thus minimizing the conceptual overhead of dealing with disparate data formats. (2) Both the declarative (logic-based) and the operational semantics are unified in that they apply for querying XML and RDF in the same way. (3) We show that the resulting query language can be implemented reusing traditional database technology, if desirable. Nevertheless, we also give a unified evaluation approach based on interval labelings of graphs that is at least as fast as existing approaches for tree-shaped XML data, yet provides linear time and space querying also for many RDF graphs. We believe that Web query languages are the right tool for declarative data access in Web applications and that Xcerpt is a significant step towards a more convenient, yet highly efficient data access in a “Web of Data”
    corecore