3,740 research outputs found

NERD: Evaluating Named Entity Recognition Tools in the Web of Data

Author: Rizzo G. Troncy R.
Publication date: 01/01/2011

EURECOM Repository

PORTO Publications Open Repository TOrino

Interoperability in IoT through the semantic profiling of objects

Author: Correia Noélia
Martins Jaime
Mazayev Andriy
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2018

The emergence of smarter and broader people-oriented IoT applications and services requires interoperability at both data and knowledge levels. However, although some semantic IoT architectures have been proposed, achieving a high degree of interoperability requires dealing with a sea of non-integrated data scattered across vertical silos. Moreover, these architectures do not meet machine-to-machine requirements, because data annotation has no knowledge of the object interactions behind arriving data. This paper presents a vision of how to overcome these issues. More specifically, the semantic profiling of objects, through CoRE-related standards, is envisaged as the key to data integration, allowing more powerful data annotation, validation, and reasoning. These are the key blocks for the development of intelligent applications. Portuguese Science and Technology Foundation (FCT) [UID/MULTI/00631/2013

Crossref

Sapientia

An evaluation resource for geographic information retrieval

Author: Di Nunzio G.
Ferro N.
Gey F.
Mandl T.
Sanderson M.
Santos D.
Womser-Hacker C.
Publication date: 01/01/2008

In this paper we present an evaluation resource for geographic information retrieval developed within the Cross Language Evaluation Forum (CLEF). The GeoCLEF track is dedicated to the evaluation of geographic information retrieval systems. The resource encompasses more than 600,000 documents, 75 topics so far, and more than 100,000 relevance judgments for these topics. Geographic information retrieval requires an evaluation resource that represents realistic information needs and is geographically challenging. Some experimental results and analysis are reported.

White Rose Research Online

Archivio istituzionale della ricerca - Università di Padova

Enriching the 1758 Portuguese Parish Memories (Alentejo) with Named Entities

Author: Cameron Helena
Olival Fernanda
Santos Ivo
Santos Joaquim
Sequeira Ofelia
Vieira Renata
Publication venue: 'Ubiquity Press, Ltd.'
Publication date: 01/09/2021

This work presents an enriched version of the Parish Memories (1758–1761), an essential, manually transcribed Portuguese historical source. It is enriched with annotations of named entities of the types PERSON, LOCATION, and ORGANIZATION. The annotation was done automatically for the whole collection, and two researchers annotated a portion of it manually for evaluation purposes. In this dataset, we provide the tagged texts, the lists of extracted entities, and frequency counts. The corpus is useful for historians, allowing, for instance, comparative analyses between parishes and regions or calculation of the area of influence of a locality. The paper describes the creation and evaluation of the corpus and discusses its applications and limitations. This first release may be improved by other researchers interested in the historical source itself or in the technology employed in its annotation. FCT CEECIND/01997/2017, UIDB/00057/202

Directory of Open Access Journals

Repositório Científico da Universidade de Évora

Building Portuguese Language Resources for Natural Language Processing Tasks

Author: Rúben Filipe Seabra de Almeida
Publication date: 20/07/2023

Repositório Aberto da Universidade do Porto

Towards Automatic Creation of Annotations to Foster Development of Named Entity Recognizers

Author: Matos Emanuel
Miguel Pedro
Publication venue: OASIcs - OpenAccess Series in Informatics. 10th Symposium on Languages, Applications and Technologies (SLATE 2021)
Publication date: 01/01/2021

Named Entity Recognition (NER) is an essential step for many natural language processing tasks, including Information Extraction. Despite recent advances, particularly using deep learning techniques, the creation of accurate named entity recognizers remains a complex task, highly dependent on the availability of annotated data. To foster the existence of NER systems for new domains, it is crucial to obtain the required large volumes of annotated data with little or no manual labor. This paper proposes a system that creates the annotated data automatically by resorting to a set of existing NERs and information sources (DBpedia). The approach was tested with documents from the Tourism domain. Distinct methods were applied for deciding the final named entities and their respective tags. The results show that this approach can increase confidence in the annotations and/or increase the number of categories that can be annotated. This paper also presents examples of new NERs that can be rapidly created with the obtained annotated data. The annotated data, combined with the possibility of applying both the ensemble of NER systems and the new Gazetteer-based NERs to large corpora, creates the necessary conditions to explore recent state-of-the-art neural deep learning approaches to NER (e.g., BERT) in domains with scarce or nonexistent training data.

Dagstuhl Research Online Publication Server

Sentiment and behaviour annotation in a corpus of dialogue summaries

Author: Alvares Alexandre Rossi
Carvalho Ariadne Maria Brito Rizzoni
Piwek Paul
Roman Norton Trevisan
Publication date: 01/01/2015

This paper proposes a scheme for sentiment annotation. We show how the task can be made tractable by focusing on one of the many aspects of sentiment: sentiment as it is recorded in behaviour reports of people and their interactions. Together with a number of measures for supporting the reliable application of the scheme, this allows us to obtain sufficient to good agreement scores (in terms of Krippendorff's alpha) on three key dimensions: polarity, evaluated party and type of clause. Evaluation of the scheme is carried out through the annotation of an existing corpus of dialogue summaries (in English and Portuguese) by nine annotators. Our contribution to the field is twofold: (i) a reliable multi-dimensional annotation scheme for sentiment in behaviour reports; and (ii) an annotated corpus that was used for testing the reliability of the scheme and which is made available to the research community.

ZENODO

Open Research Online (The Open University)

NEUROSURGERY ENTHUSIASTIC WOMEN SOCIETY

ARPHA OAI-PMH Endpoint

ARPHA Preprints