267 research outputs found

    BlogForever D2.6: Data Extraction Methodology

    This report outlines an inquiry into the area of web data extraction, conducted within the context of blog preservation. The report reviews theoretical advances and practical developments for implementing data extraction. The inquiry is extended through an experiment that demonstrates the effectiveness and feasibility of implementing some of the suggested approaches. More specifically, the report discusses an approach based on unsupervised machine learning that employs the RSS feeds and HTML representations of blogs. It outlines the possibilities of extracting semantics available in blogs and demonstrates the benefits of exploiting available standards such as microformats and microdata. The report proceeds to propose a methodology for extracting and processing blog data to further inform the design and development of the BlogForever platform.

    DRIVER Technology Watch Report

    This report is part of the Discovery Workpackage (WP4) and is the third of four deliverables. Its objective is to give an overview of the latest technical developments in the world of digital repositories, digital libraries and beyond, in order to serve as theoretical and practical input for the technical DRIVER developments, especially those focused on enhanced publications. The report consists of two main parts: one focuses on interoperability standards for enhanced publications; the other consists of three subchapters, which give a landscape picture of current and emerging technologies and communities crucial to DRIVER. These three subchapters cover the GRID, CRIS and LTP communities and technologies. Every chapter contains a theoretical explanation, followed by case studies and the outcomes and opportunities for DRIVER in this field.

    A semantic linking framework to provide critical value-added services for E-journals on classics

    In the field of Classical Studies, the use of e-journals as an effective means for scholarly research still needs to be bootstrapped. This paper proposes a possible implementation of two value-added services, to be provided by e-journals, that are intended to reach this goal: reference linking and reference indexing. Both services would help make more machine-evident the hidden but critically important bond that links primary and secondary sources in this field of study. On a technical level, this paper proposes the use of Microformats and of the Canonical Texts Service (CTS) protocol to build the semantic and nonproprietary linking framework necessary to provide scholars reading e-journals with some advanced and critical features.

    Tecnologías para el manejo de metadatos en artículos científicos

    The use of Semantic Web technologies has been increasing, and it is now common to apply them in many different ways. This article evaluates how these technologies can help improve the indexing of articles in scientific journals. It begins with a conceptual review of metadata and then studies the most important technologies for using metadata on the Web. One of these is chosen and applied to the case study of scientific article indexing: the metadata are determined on the basis of those used by high-impact research journals, and a model for indexing scientific articles is built using a Semantic Web technology.

    WSMO-Lite and hRESTS: lightweight semantic annotations for Web services and RESTful APIs

    Service-oriented computing has brought special attention to service description, especially in connection with semantic technologies. The expected proliferation of publicly accessible services can benefit greatly from tool support and automation, both of which are the focus of Semantic Web Service (SWS) frameworks that especially address service discovery, composition and execution. In 2007 the World Wide Web Consortium produced the first SWS standard, a lightweight bottom-up specification called SAWSDL for adding semantic annotations to WSDL service descriptions. Building on SAWSDL, this article presents WSMO-Lite, a lightweight ontology of Web service semantics that distinguishes four semantic aspects of services: function, behavior, information model, and nonfunctional properties, which together form a basis for semantic automation. With the WSMO-Lite ontology, SAWSDL descriptions enable semantic automation beyond the simple input/output matchmaking that is supported by SAWSDL itself. Further, to broaden the reach of WSMO-Lite and SAWSDL tools to the increasingly common RESTful services, the article adds hRESTS and MicroWSMO, two HTML microformats that mirror WSDL and SAWSDL in the documentation of RESTful services, making it possible to combine RESTful services with WSDL-based ones in a single semantic framework. To demonstrate the feasibility and versatility of this approach, the article presents common algorithms for Web service discovery and composition adapted to WSMO-Lite.