3,740 research outputs found
Interoperability in IoT through the semantic profiling of objects
The emergence of smarter and broader people-oriented IoT applications and services requires interoperability at both data and knowledge levels. However, although some semantic IoT architectures have been proposed, achieving a high degree of interoperability requires dealing with a sea of non-integrated data, scattered across vertical silos. Also, these architectures do not fit into the machine-to-machine requirements, as data annotation has no knowledge on object interactions behind arriving data. This paper presents a vision of how to overcome these issues. More specifically, the semantic profiling of objects, through CoRE related standards, is envisaged as the key for data integration, allowing more powerful data annotation, validation, and reasoning. These are the key blocks for the development of intelligent applications.Portuguese Science and Technology Foundation (FCT) [UID/MULTI/00631/2013
An evaluation resource for geographic information retrieval
In this paper we present an evaluation resource for geographic information retrieval developed within the Cross Language Evaluation
Forum (CLEF). The GeoCLEF track is dedicated to the evaluation of geographic information retrieval systems. The resource
encompasses more than 600,000 documents, 75 topics so far, and more than 100,000 relevance judgments for these topics. Geographic
information retrieval requires an evaluation resource which represents realistic information needs and which is geographically
challenging. Some experimental results and analysis are reported
Enriching the 1758 Portuguese Parish Memories (Alentejo) with Named Entities
This work presents an enriched version of the Parish Memories (1758–1761), an essential Portuguese historical source manually transcribed. It is enriched with annotations of named entities of the types PERSON, LOCATION, and ORGANIZATION. The annotation was done automatically for the whole collection where two researchers annotated a portion of it manually for evaluation purposes. In this dataset, we provide the tagged texts, the lists of extracted entities, and frequency counts. The corpus is useful for historians, allowing, for instance, comparative analyses between parishes and regions or to calculate the area of influence of a locality. The paper describes the creation and evaluation of the corpus, discusses its applications and limitations. This first release may be improved by other researchers interested in the historical source itself or in the technology employed in its annotation.FCT CEECIND/01997/2017, UIDB/00057/202
Towards Automatic Creation of Annotations to Foster Development of Named Entity Recognizers
Named Entity Recognition (NER) is an essential step for many natural language processing tasks, including Information Extraction. Despite recent advances, particularly using deep learning techniques, the creation of accurate named entity recognizers continues a complex task, highly dependent on annotated data availability. To foster existence of NER systems for new domains it is crucial to obtain the required large volumes of annotated data with low or no manual labor. In this paper it is proposed a system to create the annotated data automatically, by resorting to a set of existing NERs and information sources (DBpedia). The approach was tested with documents of the Tourism domain. Distinct methods were applied for deciding the final named entities and respective tags. The results show that this approach can increase the confidence on annotations and/or augment the number of categories possible to annotate. This paper also presents examples of new NERs that can be rapidly created with the obtained annotated data. The annotated data, combined with the possibility to apply both the ensemble of NER systems and the new Gazetteer-based NERs to large corpora, create the necessary conditions to explore the recent neural deep learning state-of-art approaches to NER (ex: BERT) in domains with scarce or nonexistent data for training
Sentiment and behaviour annotation in a corpus of dialogue summaries
This paper proposes a scheme for sentiment annotation. We show how the task can be made tractable by focusing on one of the many aspects of sentiment: sentiment as it is recorded in behaviour reports of people and their interactions. Together with a number of measures for supporting the reliable application of the scheme, this allows us to obtain sufficient to good agreement scores (in terms of Krippendorf's alpha) on three key dimensions: polarity, evaluated party and type of clause. Evaluation of the scheme is carried out through the annotation of an existing corpus of dialogue summaries (in English and Portuguese) by nine annotators. Our contribution to the field is twofold: (i) a reliable multi-dimensional annotation scheme for sentiment in behaviour reports; and (ii) an annotated corpus that was used for testing the reliability of the scheme and which is made available to the research community
- …