Annotated documents and expanded CIDOC-CRM ontology in the automatic construction of a virtual museum

Abstract

The Museum of the Person (Museu da Pessoa, MP) is a virtual museum with the purpose of exhibit life stories of common people. Its assets are composed of several interviews involving people whose stories we want to perpetuate. So the museum holds an heterogeneous collection of XML (eXtensible Markup Language) documents that constitute the working repository. The main idea is to extract automatically the information included in the repository in order to build the virtual museum’s exhibition rooms. The goal of this paper is to describe an architectural approach to build a system that will create the virtual rooms from the XML repository to enable visitors to lookup individual life stories and also inter-cross information among them. We adopted the standard for museum ontologies CIDOC-CRM (CIDOC Conceptual Reference Model) refined with FOAF (Friend of a Friend) and DBpedia ontologies to represent OntoMP. That ontology is intended to allow a conceptual navigation over the available information. The approach here discussed is based on a TripleStore and uses SPARQL (SPARQL Protocol and RDF Query Language) to extract the information. Aiming at the extraction of meaningful information, we built a text filter that converts the interviews into a RDF triples file that reflects the assets described by the ontology.This work has been supported by COMPETE: POCI-01-0145-FEDER-007043 and FCT – Fundação para a Ciência e Tecnologia within the Project Scope: UID/CEC/00319/2013. The work of Ricardo Martini is supported by CNPq, grant 201772/2014-0

    Similar works