Cultural heritage digital resources: from extraction to querying

By M. Genereux

Abstract

This article presents a method to extract and query Cultural Heritage (CH) textual digital resources. The extraction and querying phases are linked by a common ontological representation (CIDOC-CRM). A transport format (RDF) allows the ontology to be queried in a suitable query language (SPARQL), on top of which an interface makes it possible to formulate queries in Natural Language (NL). The extraction phase exploits the propositional nature of the ontology. The query interface is based on the Generate and Select principle: potentially suitable queries are generated to match the user input, and the candidate most semantically similar to that input is selected. In the process we evaluate data extracted from the description of a medieval city (Wolfenbuttel), transform and develop two methods of computing similarity between sentences based on WordNet. Experiments are described that compare the pros and cons of the similarity measures and evaluate them.
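As an illustration of the Generate and Select principle described in the abstract, the following is a minimal sketch only, not the article's implementation. It assumes a small hand-written table of candidate paraphrases paired with SPARQL queries, and it substitutes a simple word-overlap (Jaccard) score for the WordNet-based sentence similarity measures the article develops. The `ex:` properties, the candidate table, and the function names are all hypothetical.

```python
def tokenize(text):
    """Lowercase a sentence and return its set of words, stripped of punctuation."""
    words = (w.strip(".,?!;:").lower() for w in text.split())
    return {w for w in words if w}


def overlap_similarity(a, b):
    """Jaccard word overlap; a stand-in for a WordNet-based sentence similarity."""
    ta, tb = tokenize(a), tokenize(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


# Hypothetical candidates: (natural-language paraphrase, SPARQL query).
# The ex: namespace and its properties are placeholders, not CIDOC-CRM terms.
CANDIDATES = [
    ("who built the church",
     "SELECT ?actor WHERE { ex:church ex:builtBy ?actor }"),
    ("when was the church built",
     "SELECT ?date WHERE { ex:church ex:builtOn ?date }"),
    ("where is the castle located",
     "SELECT ?place WHERE { ex:castle ex:locatedIn ?place }"),
]


def select_query(user_input, candidates=CANDIDATES, similarity=overlap_similarity):
    """Score every generated candidate against the input and keep the best one."""
    scored = [(similarity(user_input, text), text, query)
              for text, query in candidates]
    score, text, query = max(scored, key=lambda item: item[0])
    return text, query, score


if __name__ == "__main__":
    text, query, score = select_query("Who built the church?")
    print(text, "|", query, "|", round(score, 3))
```

Passing a different `similarity` function to `select_query` is where a WordNet-based measure would plug in; the selection step itself stays the same.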

Topics: Q100 Linguistics, V000 Historical and Philosophical studies, G700 Artificial Intelligence
Year: 2007
OAI identifier: oai:eprints.brighton.ac.uk:3211
