14 research outputs found

Implementation of metadata for OmniPaper RDF prototype

Author: Ariza Ávila Cesar E.
Baptista Ana Alice
Pereira T.
Yaginuma T.
Publication date: 01/01/2004

The Information Society Technologies (IST)-funded OmniPaper project investigates efficient ways to access distributed and heterogeneous digital news archives using state-of-the-art technologies such as RDF, XTM and SOAP. The approach taken is to create small prototypes based on each of them. This paper presents the first stage of prototype development for the RDF approach: an analysis of existing news text format standards and metadata vocabularies, the definition of metadata elements for OmniPaper, the implementation of an application profile and RDF schema, and the development of the RDF prototype in a web-based, RDF-specific application. The analysis shows that the Dublin Core Metadata Element Set should be the principal vocabulary for implementing the OmniPaper application profile, as it provides greater interoperability. The RDF prototype provides an RDF “metadatabase” with a searchable interface for simple and advanced search on the defined metadata elements.

Universidade do Minho: RepositoriUM

The extension of the omnipaper system in the context of scientific publications

Author: Baptista Ana Alice
Pereira T.
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2009

Today the Internet is an important information source that facilitates searching for and accessing content on the Web. It has become a tool that scholars use daily in their work. However, the volume of content published on the Web grows daily, which makes it harder to identify new content across the various information sources. In this context, RSS technology introduces a new dimension in the mechanisms for accessing and distributing new content published by distributed information sources. For scientific content, RSS helps scholars keep up to date with new scientific resources provided by several distributed information sources. An instance of the OmniPaper RDF prototype has been developed to take the distributed information retrieval mechanisms investigated for news published in newspapers and apply them to scientific content. In addition, a central metadatabase was developed using the RSS approach to enable the syndication of scientific content. This paper describes the steps involved in developing this instantiation of the OmniPaper RDF prototype.

Universidade do Minho: RepositoriUM

Crossref

Design of metadata elements for digital news articles in the omnipaper project

Author: Baptista Ana Alice
Pereira T.
Yaginuma T.
Publication venue: Universidade do Minho
Publication date: 01/06/2003

This paper examines and proposes a set of metadata elements for describing digital news articles, to support resource discovery across distributed and heterogeneous news sources. Existing digital news description standards such as NITF and NewsML are analysed and compared with the Dublin Core Metadata Element Set (DCMES); the comparison leads to the recommendation that Dublin Core be used for the interoperability of the resources. The suggested metadata elements are carefully selected and defined with the characteristics of news articles in mind. Some elements are detailed with refinement qualifiers and a recommended encoding scheme. This set of metadata has been developed as part of the tasks in the IST (Information Society Technologies)-funded European project OmniPaper (Smart Access to European Newspapers, IST-2001-32174).

Universidade do Minho: RepositoriUM

The instantiation of Omnipaper RDF prototype in the context of scientific publications

Author: Baptista Ana Alice
Pereira T.
Publication venue: Emerald
Publication date: 01/01/2009

Purpose: The purpose of this paper is to present an instance of the system developed in the OmniPaper project, regarding the mechanisms of distributed information retrieval. These mechanisms were developed for newspaper articles and were then instantiated in the context of scientific publication. Another goal concerns the use of a central metadatabase developed to syndicate content through the RSS approach.
Design/methodology/approach: One of the steps in the system’s development was the definition of the metadata layer that supports the search and navigation functionalities as well as content syndication. Several tasks were performed to define the metadata layer, namely: (1) analysis of several standard metadata vocabularies; (2) selection of the metadata elements; (3) definition of an application profile and the RSS template; (4) development of a metadatabase, using a native RDF database management system to store the RSS descriptions of the scientific publications; (5) implementation of the search and navigation processes developed in the prototype, using the RDFS version of WordNet and the RDFS version of the Association for Computing Machinery Computing Classification System (ACM CCS); and finally (6) testing and validation of all developed functionalities.
Findings and value: The OmniPaper system can be instantiated in domains other than news published in newspapers. RSS technology is well suited to describing scientific content. The RDF records used in the OmniPaper RDF prototype were replaced by RSS. The subject and lexical thesauri were kept. This strong metadata layer allows the creation of several services that facilitate the conceptual search of scientific content.
Originality and value of paper: This paper presents a system that uses a central metadatabase to support conceptual searching mechanisms. The metadatabase consists of RDF triples generated from: (1) RSS files that were, in turn, generated from OAI-PMH harvested metadata records; (2) a controlled vocabulary (ACM CCS) implemented in RDF Schema; and (3) an RDF version of WordNet. This is a solution for a value-added service for the scientific community that is fully based on state-of-the-art standard technologies and is fully open to integration with other systems. Moreover, it could be implemented by journals to improve the current mechanisms used to access, distribute and disseminate scientific research developments.
Research limitations/implications (if applicable): The system implemented was tested but not evaluated in a real environment with specific users.
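
As an illustration of the first triple source named in this abstract, the sketch below turns an RSS 1.0 item carrying Dublin Core elements into (subject, predicate, object) triples. It is a minimal sketch using only the Python standard library, not the authors' implementation: the sample record, the `example.org` URL and the function name `rss_item_triples` are assumptions made for the example, and the abstract does not say which RSS version or tooling the project used.

```python
import xml.etree.ElementTree as ET

# Namespace URIs for RDF, RSS 1.0 and the Dublin Core element set.
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RSS = "http://purl.org/rss/1.0/"
DC = "http://purl.org/dc/elements/1.1/"

# Hypothetical RSS 1.0 description of one scientific article.
SAMPLE = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns="{RSS}" xmlns:dc="{DC}">
  <item rdf:about="http://example.org/article/1">
    <title>A hypothetical article</title>
    <dc:creator>Doe, J.</dc:creator>
    <dc:subject>H.3.3</dc:subject>
  </item>
</rdf:RDF>"""


def rss_item_triples(xml_text):
    """Return (subject, predicate, object) triples for every RSS item."""
    root = ET.fromstring(xml_text)
    triples = []
    for item in root.iter(f"{{{RSS}}}item"):
        subject = item.get(f"{{{RDF}}}about")
        for child in item:
            # ElementTree tags look like "{namespace}local"; joining the
            # two parts gives the predicate URI.
            namespace, local = child.tag[1:].split("}")
            triples.append((subject, namespace + local, (child.text or "").strip()))
    return triples


if __name__ == "__main__":
    for triple in rss_item_triples(SAMPLE):
        print(triple)
```

A native RDF store would ingest such triples directly; the list of tuples here only stands in for that store.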

Universidade do Minho: RepositoriUM

Smart Search in Newspaper Archives Using Topic Maps

Author: B. Paepen
J. Engelen
S. Van Hemel

The OmniPaper project has implemented three information retrieval prototypes in the area of electronic news publishing. One prototype uses SOAP as the communication protocol between the central system and a number of distributed news archives. The second prototype uses an RDF metadata database, enabling direct metadata queries to the central system. Finally, the Topic Map prototype uses query expansion and semantic linking for smart metadata search. The Topic Map prototype enhances the search experience by implementing a knowledge layer that combines the semantic content of a lexical database, consisting of concepts and keywords, with a metadata set of newspaper articles. The linking between the two is currently implemented at the level of keywords but will be developed at the level of concepts in the final prototype. The knowledge layer has been designed from a Topic Map point of view, although the XTM syntax has not been used, to avoid performance issues. The consortium’s adopted view on information publishing and retrieval considers querying and navigation as two closely related actions that can both be captured under the name “search for relevant information”. Navigation forces the user to follow predefined paths, whereas querying enables the user to look freely for a suitable starting point. The query and navigation functionality is provided through a web engine and is built on top of the information structure of the knowledge layer.
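
The keyword-level query expansion described in this abstract can be sketched as follows. This is a toy illustration under stated assumptions, not the Topic Map prototype itself: the concept table, the article keywords and the function names `expand_query` and `search` are all hypothetical.

```python
# Hypothetical lexical database: each concept groups related keywords.
CONCEPTS = {
    "election": {"election", "vote", "ballot"},
    "football": {"football", "soccer"},
}

# Hypothetical article metadata: article id -> keywords.
ARTICLES = {
    "a1": {"ballot", "parliament"},
    "a2": {"soccer", "transfer"},
    "a3": {"budget"},
}


def expand_query(term):
    """Add every keyword that shares a concept with the query term."""
    expanded = {term}
    for keywords in CONCEPTS.values():
        if term in keywords:
            expanded |= keywords
    return expanded


def search(term):
    """Return ids of articles whose keywords overlap the expanded query."""
    expanded = expand_query(term)
    return sorted(
        article_id
        for article_id, keywords in ARTICLES.items()
        if keywords & expanded
    )


if __name__ == "__main__":
    print(search("vote"))
```

Here a query for "vote" reaches the article tagged "ballot" because both keywords belong to the same concept, which is the effect the knowledge layer is meant to provide.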

The Omnipaper metadata RDF/XML prototype implementation

Author: Baptista Ana Alice
Pereira T.
Publication venue: Universidade do Minho
Publication date: 01/01/2003

Omnipaper (Smart Access to European Newspapers, IST-2001-32174) is a project of the European Commission IST (Information Society Technologies) programme that investigates and proposes ways to access different types of distributed information sources. This article introduces the Resource Description Framework (RDF), a metadata-based technology developed by W3C for the Web, and its practical use in the Omnipaper project, in which the authors are involved. We intend to implement a prototype that gives users (professional journalists and occasional users) simultaneous and structured access to the articles of a large number of digital European news providers. Omnipaper is not a project about the digitisation of news, but about bringing together digitised news originating from various sources and in various formats. The article describes the procedure used to describe our newspaper articles in RDF, followed by a detailed account of how the RDF-structured information is manipulated and processed through RDF Gateway.

Universidade do Minho: RepositoriUM

Metadata elements for digital news resource description

Author: Baptista Ana Alice
Pereira T.
Yaginuma T.
Publication date: 01/01/2003

Universidade do Minho: RepositoriUM

Incorporating a semantically enriched navigation layer onto an RDF metadatabase

Author: Baptista Ana Alice
Pereira T.
Publication date: 01/01/2004

The Information Society Technologies (IST)-funded Omnipaper project investigates efficient ways to access distributed and heterogeneous digital news archives through the use of state-of-the-art technologies such as RDF and XTM. In the Omnipaper project we intend to implement a final prototype that gives users (professional journalists and occasional users) simultaneous and structured access to the articles of a large number of digital European news providers. This paper describes the work carried out on the Omnipaper RDF prototype, focusing on the use of the IPTC Subject Codes to incorporate a semantically enriched navigation layer onto the RDF/XML metadata descriptions developed in the RDF prototype.

Universidade do Minho: RepositoriUM

Uso de RDF y bases de datos de metadatos nativas dentro del proyecto Omnipaper

Author: Ariza Ávila Cesar E.
Baptista Ana Alice
Publication venue: Universidade do Porto. Faculdade de Engenharia (FEUP)
Publication date: 13/02/2004

This article describes the work carried out to create a prototype for searching information in distributed digital archives using RDF technology and a native database. It reviews the prerequisites for describing and normalising the information in the distributed archives, then the criteria for selecting the native database, shows the functionality of the resulting prototype, and closes with a summary of the lessons learned and future work. European Commission - IST (OmniPaper - Smart Access to European Newspapers

Universidade do Minho: RepositoriUM

Perspectiva sobre a utilização da tecnologia RSS no contexto da comunicação científica

Author: Pereira T.
Publication date: 16/03/2007

Master's dissertation in Information Systems. At present, the Internet is an important source for finding and accessing information resources on the Web. It has become a tool that researchers and scientists use daily in their work. Its growth has been transforming the processes by which the knowledge produced by scientific communities is distributed and disseminated and, as a result, restructuring the scientific communication system. This dissertation aims to instantiate the system developed in the OmniPaper project, specifically the distributed information retrieval mechanisms developed for news published in newspapers, in the context of scientific publication. Another goal concerns the use of a central metadata layer developed to syndicate scientific content through the RSS approach.
RSS is a standard format for aggregating and distributing Web content; it makes it easier to consult and share information that comes from several sources and is periodically changed or updated. The prototype was designed to meet the goals set out in this dissertation. To that end, a metadata layer was defined that supports the search and navigation functionalities developed and allows content syndication. Defining the metadata layer involved several tasks: surveying and analysing several standard metadata vocabularies widely used in the domain of scientific literature, selecting the metadata elements best suited to describing scientific articles, and then defining the application profile and the RSS template. The RSS descriptions of the scientific publications were stored in a metadatabase managed by a native RDF database management system.
The search and navigation processes were implemented in the prototype using the RDFS version of WordNet and the RDFS version of the Association for Computing Machinery Computing Classification System (ACM CCS). These tasks resulted in a prototype that is an instance of the system developed in the OmniPaper project in the context of scientific publication; it aggregates the metadata of the scientific articles provided by the APSI repository and makes them easier to search. As future work, the dissertation suggests, among other things, implementing metadata harvesting from several repositories into the metadatabase implemented here, thus providing a more complete service with more information.

Universidade do Minho: RepositoriUM