6,273 research outputs found

    Ontology and framework for semantic labelling of document data and software methods

    We present a metadata labelling framework for datasets, software tools, and workflows. An ontology for document image analysis was developed with deep support for historical data. An accompanying open source software framework was implemented to enable ontology editing, data and method annotation, workflow composition, and semantic search. A wide range of examples is used to illustrate real-world application.

    A review of the state of the art in Machine Learning on the Semantic Web: Technical Report CSTR-05-003


    BlogForever D2.6: Data Extraction Methodology

    This report outlines an inquiry into the area of web data extraction, conducted within the context of blog preservation. The report reviews theoretical advances and practical developments for implementing data extraction. The inquiry is extended through an experiment that demonstrates the effectiveness and feasibility of implementing some of the suggested approaches. More specifically, the report discusses an approach based on unsupervised machine learning that employs the RSS feeds and HTML representations of blogs. It outlines the possibilities of extracting semantics available in blogs and demonstrates the benefits of exploiting available standards such as microformats and microdata. The report proceeds to propose a methodology for extracting and processing blog data to further inform the design and development of the BlogForever platform.

    Network service registration based on role-goal-process-service meta-model in a P2P network

    Service composition-based network software customisation is currently a research hotspot in the field of software engineering. A key problem in this area is how to efficiently discover services distributed over the Internet. In the service-oriented architecture, service discovery suffers from the performance bottleneck of centralised universal description discovery and integration (UDDI), and from inaccurate matching of service semantics. In this study, the authors describe a novel method for service labelling, registration and discovery, which is based on the role-goal-process-service meta-model. This approach enables accurate matching of service semantics by extending the web service description language with RGP demand-information. The authors also suggest a peer-to-peer (P2P)-based architecture for service discovery to address the UDDI bottleneck and the complexity of semantic computation. Using the proposed approach, an experimental prototype system has been designed and implemented in the Beijing municipal transportation system. The experimental results show that the proposed approach is effective in addressing the aforementioned problems.

    Towards a Universal Wordnet by Learning from Combined Evidence

    Lexical databases are invaluable sources of knowledge about words and their meanings, with numerous applications in areas like NLP, IR, and AI. We propose a methodology for the automatic construction of a large-scale multilingual lexical database where words of many languages are hierarchically organized in terms of their meanings and their semantic relations to other words. This resource is bootstrapped from WordNet, a well-known English-language resource. Our approach extends WordNet with around 1.5 million meaning links for 800,000 words in over 200 languages, drawing on evidence extracted from a variety of resources including existing (monolingual) wordnets, (mostly bilingual) translation dictionaries, and parallel corpora. Graph-based scoring functions and statistical learning techniques are used to iteratively integrate this information and build an output graph. Experiments show that this wordnet has a high level of precision and coverage, and that it can be useful in applied tasks such as cross-lingual text classification.

    Challenges in Bridging Social Semantics and Formal Semantics on the Web

    This paper describes several results of Wimmics, a research lab whose name stands for: web-instrumented man-machine interactions, communities, and semantics. The approaches introduced here rely on graph-oriented knowledge representation, reasoning and operationalization to model and support actors, actions and interactions in web-based epistemic communities. The research results are applied to support and foster interactions in online communities and manage their resources.