8,620 research outputs found

    CHORUS Deliverable 2.2: Second report - identification of multi-disciplinary key issues for gap analysis toward EU multimedia search engines roadmap

    Get PDF
    After addressing the state-of-the-art during the first year of Chorus and establishing the existing landscape in multimedia search engines, we have identified and analyzed gaps within European research effort during our second year. In this period we focused on three directions, notably technological issues, user-centred issues and use-cases and socio- economic and legal aspects. These were assessed by two central studies: firstly, a concerted vision of functional breakdown of generic multimedia search engine, and secondly, a representative use-cases descriptions with the related discussion on requirement for technological challenges. Both studies have been carried out in cooperation and consultation with the community at large through EC concertation meetings (multimedia search engines cluster), several meetings with our Think-Tank, presentations in international conferences, and surveys addressed to EU projects coordinators as well as National initiatives coordinators. Based on the obtained feedback we identified two types of gaps, namely core technological gaps that involve research challenges, and “enablers”, which are not necessarily technical research challenges, but have impact on innovation progress. New socio-economic trends are presented as well as emerging legal challenges

    An infrastructure for building semantic web portals

    Get PDF
    In this paper, we present our KMi semantic web portal infrastructure, which supports two important tasks of semantic web portals, namely metadata extraction and data querying. Central to our infrastructure are three components: i) an automated metadata extraction tool, ASDI, which supports the extraction of high quality metadata from heterogeneous sources, ii) an ontology-driven question answering tool, AquaLog, which makes use of the domain specific ontology and the semantic metadata extracted by ASDI to answers questions in natural language format, and iii) a semantic search engine, which enhances traditional text-based searching by making use of the underlying ontologies and the extracted metadata. A semantic web portal application has been built, which illustrates the usage of this infrastructure

    BlogForever D2.6: Data Extraction Methodology

    Get PDF
    This report outlines an inquiry into the area of web data extraction, conducted within the context of blog preservation. The report reviews theoretical advances and practical developments for implementing data extraction. The inquiry is extended through an experiment that demonstrates the effectiveness and feasibility of implementing some of the suggested approaches. More specifically, the report discusses an approach based on unsupervised machine learning that employs the RSS feeds and HTML representations of blogs. It outlines the possibilities of extracting semantics available in blogs and demonstrates the benefits of exploiting available standards such as microformats and microdata. The report proceeds to propose a methodology for extracting and processing blog data to further inform the design and development of the BlogForever platform

    Improving Knowledge Retrieval in Digital Libraries Applying Intelligent Techniques

    Get PDF
    Nowadays an enormous quantity of heterogeneous and distributed information is stored in the digital University. Exploring online collections to find knowledge relevant to a user’s interests is a challenging work. The artificial intelligence and Semantic Web provide a common framework that allows knowledge to be shared and reused in an efficient way. In this work we propose a comprehensive approach for discovering E-learning objects in large digital collections based on analysis of recorded semantic metadata in those objects and the application of expert system technologies. We have used Case Based-Reasoning methodology to develop a prototype for supporting efficient retrieval knowledge from online repositories. We suggest a conceptual architecture for a semantic search engine. OntoUS is a collaborative effort that proposes a new form of interaction between users and digital libraries, where the latter are adapted to users and their surroundings

    Applying semantic web technologies to knowledge sharing in aerospace engineering

    Get PDF
    This paper details an integrated methodology to optimise Knowledge reuse and sharing, illustrated with a use case in the aeronautics domain. It uses Ontologies as a central modelling strategy for the Capture of Knowledge from legacy docu-ments via automated means, or directly in systems interfacing with Knowledge workers, via user-defined, web-based forms. The domain ontologies used for Knowledge Capture also guide the retrieval of the Knowledge extracted from the data using a Semantic Search System that provides support for multiple modalities during search. This approach has been applied and evaluated successfully within the aerospace domain, and is currently being extended for use in other domains on an increasingly large scale

    CHORUS Deliverable 3.4: Vision Document

    Get PDF
    The goal of the CHORUS Vision Document is to create a high level vision on audio-visual search engines in order to give guidance to the future R&D work in this area and to highlight trends and challenges in this domain. The vision of CHORUS is strongly connected to the CHORUS Roadmap Document (D2.3). A concise document integrating the outcomes of the two deliverables will be prepared for the end of the project (NEM Summit)

    JISC Final Report: IncReASe (Increasing Repository Content through Automation and Services)

    Get PDF
    The IncReASe (Increasing Repository Content through Automation and Services) was an eighteen month project (subsequently extended to twenty months) to enhance White Rose Research Online (WRRO)1. WRRO is a shared repository of research outputs (primarily publications) from the Universities of Leeds, Sheffield and York; it runs on the EPrints open source repository platform. The repository was created in 2004 and had steady growth but, in common with many other similar repositories, had difficulty in achieving a “critical mass” of content and in becoming truly embedded within researchers’ workflows. The main aim of the IncReASe project was to assess ingestion routes into WRRO with a view to lowering barriers to deposit. We reviewed the feasibility of bulk import of pre-existing metadata and/or full-text research outputs, hoping this activity would have a positive knock-on effect on repository growth and embedding. Prior to the project, we had identified researchers’ reluctance to duplicate effort in metadata creation as a significant barrier to WRRO uptake; we investigated how WRRO might share data with internal and external IT systems. This work included a review of how WRRO, as an institutional based repository, might interact with the subject repository of the Economic and Social Research Council (ESRC). The project addressed four main areas: (i) researcher behaviour: we investigated researcher awareness, motivation and workflow through a survey of archiving activity on the university web sites, a questionnaire and discussions with researchers (ii) bulk import: we imported data from local systems, including York’s submission data for the 2008 Research Assessment Exercise (RAE), and developed an import plug-in for use with the arXiv2 repository (iii) interoperability: we looked at how WRRO might interact with university and departmental publication databases and ESRC’s repository. (iv) metadata: we assessed metadata issues raised by importing publication data from a variety of sources. A number of outputs from the project have been made available from the IncReASe project web site http://eprints.whiterose.ac.uk/increase/. The project highlighted the low levels of researcher awareness of WRRO - and of broader open access issues, including research funders’ deposit requirements. We designed some new publicity materials to start to address this. Departmental publication databases provided a useful jumping off point for advocacy and liaison; this activity was helpful in promoting awareness of WRRO. Bulk import proved time consuming – both in terms of adjusting EPrints plug-ins to incorporate different datasets and in the staff time required to improve publication metadata. A number of deposit scenarios were developed in the context of our work with ESRC; we concentrated on investigating how a local deposit of a research paper and attendant metadata in WRRO might be used to populate ESRC’s repository. This work improved our understanding of researcher workflows and of the SWORD protocol as a potential (if partial) solution to the single deposit, multiple destination model we wish to develop; we think the prospect of institutional repository / ESRC data sharing is now a step closer. IncReASe experienced some staff recruitment difficulties. It was also necessary to adapt the project to the changing IT landscape at the three partner institutions – in particular, the introduction of a centralised publication management system at the University of Leeds. Although these factors had some impact on deliverables, the aims and objectives of the project were largely achieved

    WIKINGER – Wiki Next Generation Enhanced Repositories

    No full text
    The regular indexing of text documents is based on the textual representation and does not evaluate the actual document content. In the semantic web approach, human-authored text documents are transformed into machine-readable content data which can be used to create semantic relations among documents. In this paper, we present on-going work in the WIKINGER project which aims to build a web-based system for semantic indexing of text documents by evaluating manual and semi-automatic annotations. A particular feature is the continuous refinement of the automatically generated semantic network by considering community feedback. The feasibility of the approach will be validated in a pilot application

    Bridging the Gap Between Traditional Metadata and the Requirements of an Academic SDI for Interdisciplinary Research

    Get PDF
    Metadata has long been understood as a fundamental component of any Spatial Data Infrastructure, providing information relating to discovery, evaluation and use of datasets and describing their quality. Having good metadata about a dataset is fundamental to using it correctly and to understanding the implications of issues such as missing data or incorrect attribution on the results obtained for any analysis carried out. Traditionally, spatial data was created by expert users (e.g. national mapping agencies), who created metadata for the data. Increasingly, however, data used in spatial analysis comes from multiple sources and could be captured or used by nonexpert users – for example academic researchers ‐ many of whom are from non‐GIS disciplinary backgrounds, not familiar with metadata and perhaps working in geographically dispersed teams. This paper examines the applicability of metadata in this academic context, using a multi‐national coastal/environmental project as a case study. The work to date highlights a number of suggestions for good practice, issues and research questions relevant to Academic SDI, particularly given the increased levels of research data sharing and reuse required by UK and EU funders
    • 

    corecore