697 research outputs found

    An XML-Based Approach to Handling Tables in Documents

    We explore the application of XML technology for handling tables in legacy semi-structured documents. Specifically, we analyze annotating heterogeneous documents containing tables to obtain a formalized XML Master document that improves traceability (hence easing verification and update) and enables manipulation using XSLT stylesheets. This approach is useful when table instances far outnumber distinct table types, because the effort required to annotate a table instance is small compared to that of formalizing table processing that respects the table's semantics. This work is also relevant for authoring new documents with tables that should be accessible to both humans and machines.

    Identification of Design Principles

    This report identifies the design principles considered essential for a (possibly new) query and transformation language for the Web supporting inference. Based upon these design principles, an initial strawman is selected. Scenarios for querying the Semantic Web illustrate the design principles and their reflection in the initial strawman, i.e., a first draft of the query language to be designed and implemented by the REWERSE working group I4.

    Semantic Web Technologies

    This study highlights the importance of developing the Semantic Web as a major advance in data management and presentation on the WWW. Since the W3C was founded, the web has moved from classic web content (Web 1.0), in which hyperlinks pointed to a document's location, to Web 2.0, in which web applications use more advanced technologies to connect data, and finally to the Semantic Web, an extension described as Web 3.0 and also known as Linked Data. The results show that, alongside the rapid development of the Semantic Web, demand for its features among data publishers and data readers is expanding rapidly, because it saves the time otherwise spent publishing the same data multiple times on other web pages. Moreover, we present the features of the Semantic Web, its technologies, development history, advantages and weaknesses, and potential benefits, including the standards, frameworks, and programming languages used in its development, such as RDF (Resource Description Framework) and XML.

    Named Entity Extraction for Knowledge Graphs: A Literature Overview

    An enormous amount of digital information is expressed as natural-language (NL) text that is not easily processable by computers. Knowledge Graphs (KG) offer a widely used format for representing information in computer-processable form. Natural Language Processing (NLP) is therefore needed for mining (or lifting) knowledge graphs from NL texts. A central part of the problem is to extract the named entities in the text. The paper presents an overview of recent advances in this area, covering: Named Entity Recognition (NER), Named Entity Disambiguation (NED), and Named Entity Linking (NEL). We comment that many approaches to NED and NEL are based on older approaches to NER and need to leverage the outputs of state-of-the-art NER systems. There is also a need for standard methods to evaluate and compare named-entity extraction approaches. We observe that NEL has recently moved from being stepwise and isolated into an integrated process along two dimensions: the first is that previously sequential steps are now being integrated into end-to-end processes, and the second is that entities that were previously analysed in isolation are now being lifted in each other's context. The current culmination of these trends is the deep-learning approaches that have recently reported promising results.

    RDF, the semantic web, Jordan, Jordan and Jordan

    This collection is addressed to archivists and library professionals, and so has a slight focus on the implications for them. This chapter is nonetheless intended to be a more-or-less generic introduction to the Semantic Web and RDF, which isn't specific to that domain.

    Knowledge representation on the web

    Exploiting the full potential of the World Wide Web will require semantic as well as syntactic interoperability. This can best be achieved by providing a further representation and inference layer that builds on existing and proposed web standards. The OIL language extends the RDF schema standard to provide just such a layer. It combines the most attractive features of frame based languages with the expressive power, formal rigour and reasoning services of a very expressive description logic.

    An Information Extraction Approach to Reorganizing and Summarizing Specifications

    Materials and Process Specifications are complex semi-structured documents containing numeric data, text, and images. This article describes a coarse-grain extraction technique to automatically reorganize and summarize spec content. Specifically, a strategy for semantic markup, to capture content within a semantic ontology, relevant to semi-automatic extraction, has been developed and experimented with. The working prototypes were built in the context of Cohesia's existing software infrastructure, and use techniques from Information Extraction, XML technology, etc.