6,492 research outputs found

    The State-of-the-arts in Focused Search

    The continuous influx of various text data on the Web requires search engines to improve their retrieval abilities for more specific information. The need for results relevant to a user’s topic of interest has gone beyond search for domain- or type-specific documents to more focused results (e.g. document fragments or answers to a query). The introduction of XML provides a standard format for data representation, storage, and exchange. It allows focused search to be carried out at different granularities of a structured document with XML markup. This report reviews the state of the art in focused search, particularly techniques for topic-specific document retrieval, passage retrieval, XML retrieval, and entity ranking. It concludes by highlighting open problems.

    Model transformations and Tool Integration

    Model transformations are increasingly recognised as being of significant importance to many areas of software development and integration. Recent attention on model transformations has particularly focused on the OMG's Queries/Views/Transformations (QVT) Request for Proposals (RFP). In this paper I motivate the need for dedicated approaches to model transformations, particularly for the data involved in tool integration, outline the challenges involved, and then present a number of technologies and techniques which allow the construction of flexible, powerful and practical model transformations.

    Use-cases on evolution

    This report presents a set of use cases for evolution and reactivity for data in the Web and Semantic Web. This set is organized around three case study scenarios, each related to one of the three areas of application within Rewerse. The scenarios are: “The Rewerse Information System and Portal”, closely related to the work of A3 – Personalised Information Systems; “Organizing Travels”, which may be related to the work of A1 – Events, Time, and Locations; and “Updates and evolution in bioinformatics data sources”, related to the work of A2 – Towards a Bioinformatics Web.

    High-level Thesaurus (HILT) Phase III [Final Report] : Evaluation Report

    An evaluation stage of the HILT Phase III pilot M2M demonstrator was to be undertaken following completion of the main development work (November/December 2006). The aim was to determine whether the pilot demonstrator operates as specified in the requirements document and, hence, whether it correctly delivers the functionality needed to meet the five use cases (devised during the preceding feasibility study). Outcomes will be used to inform the system refinement process, due to occur in January 2007. Six SOAP functions were designed to meet the functionality required by each of the use cases, either singly or in combination, and the working pilot is best tested by examining whether each part of the system architecture (see Figure 1) operates as specified in the requirements document when any given one of the functions is called. This report documents the use cases being addressed, the nature of the functions designed to meet the use cases, how each part of the system is required to operate when a function is called, methodologies determined to assess the satisfactory performance of functions, and associated results. It is not the intention of this evaluation to study the quality of mappings or retrieval performance. Results presented will enable the identification of issues or errors within the system as it is currently implemented (or requirements as currently specified), and any additional requirements for development beyond Phase III will be noted.

    Reasoning & Querying – State of the Art

    Various query languages for Web and Semantic Web data, both for practical use and as an area of research in the scientific community, have emerged in recent years. At the same time, the broad adoption of the internet, where keyword search is used in many applications (e.g. search engines), has familiarized casual users with keyword queries as a way to retrieve information. Unlike this easy-to-use querying, traditional query languages require knowledge of the language itself as well as of the data to be queried. Keyword-based query languages for XML and RDF bridge the gap between the two, aiming to enable simple querying of semi-structured data, which is relevant, for example, in the context of the emerging Semantic Web. This article presents an overview of the field of keyword querying for XML and RDF.

    Ontology-based specific and exhaustive user profiles for constraint information fusion for multi-agents

    Intelligent agents are an advanced technology utilized in Web Intelligence. When searching for information in a distributed Web environment, information is retrieved by multi-agents on the client site and fused on the broker site. The current information fusion techniques rely on cooperation of agents to provide statistics. Such techniques are computationally expensive and unrealistic in the real world. In this paper, we introduce a model that uses a world ontology constructed from the Dewey Decimal Classification to acquire user profiles. By searching with specific and exhaustive user profiles, information fusion techniques no longer rely on the statistics provided by agents. The model has been successfully evaluated using the large INEX data set simulating the distributed Web environment.

    High-level Thesaurus (HILT) Phase III [Project] : Final Report

    An evaluation stage of the HILT Phase III pilot M2M demonstrator was to be undertaken following completion of the main development work (November/December 2006). The aim was to determine whether the pilot demonstrator operates as specified in the requirements document and, hence, whether it correctly delivers the functionality needed to meet the five use cases (devised during the preceding feasibility study). Outcomes will be used to inform the system refinement process, due to occur in January 2007. Six SOAP functions were designed to meet the functionality required by each of the use cases, either singly or in combination, and the working pilot is best tested by examining whether each part of the system architecture (see Figure 1) operates as specified in the requirements document when any given one of the functions is called. This report documents the use cases being addressed, the nature of the functions designed to meet the use cases, how each part of the system is required to operate when a function is called, methodologies determined to assess the satisfactory performance of functions, and associated results. It is not the intention of this evaluation to study the quality of mappings or retrieval performance. Results presented will enable the identification of issues or errors within the system as it is currently implemented (or requirements as currently specified), and any additional requirements for development beyond Phase III will be noted.

    BlogForever D2.6: Data Extraction Methodology

    This report outlines an inquiry into the area of web data extraction, conducted within the context of blog preservation. The report reviews theoretical advances and practical developments for implementing data extraction. The inquiry is extended through an experiment that demonstrates the effectiveness and feasibility of implementing some of the suggested approaches. More specifically, the report discusses an approach based on unsupervised machine learning that employs the RSS feeds and HTML representations of blogs. It outlines the possibilities of extracting semantics available in blogs and demonstrates the benefits of exploiting available standards such as microformats and microdata. The report proceeds to propose a methodology for extracting and processing blog data to further inform the design and development of the BlogForever platform.