Accurator: Nichesourcing for Cultural Heritage
With more and more cultural heritage data being published online, their
usefulness in this open context depends on the quality and diversity of
descriptive metadata for collection objects. In many cases, existing metadata
is not adequate for a variety of retrieval and research tasks and more specific
annotations are necessary. However, eliciting such annotations is a challenge
since it often requires domain-specific knowledge. While crowdsourcing can be
successfully used for eliciting simple annotations, identifying people with the
required expertise might prove troublesome for tasks requiring more complex or
domain-specific knowledge. Nichesourcing addresses this problem by tapping
into the expert knowledge available in niche communities. This paper presents
Accurator, a methodology for conducting nichesourcing campaigns for cultural
heritage institutions, by addressing communities, organizing events and
tailoring a web-based annotation tool to a domain of choice. The contribution
of this paper is threefold: 1) a nichesourcing methodology, 2) an annotation
tool for experts and 3) validation of the methodology and tool in three case
studies. The three domains of the case studies are birds on art, bible prints
and fashion images. We compare the quality and quantity of obtained annotations
in the three case studies, showing that the nichesourcing methodology in
combination with the image annotation tool can be used to collect high-quality
annotations in a variety of domains and annotation tasks. A user evaluation
indicates that the tool is suitable and usable for domain-specific annotation tasks.
Multimedia Annotations on the Semantic Web
Multimedia in all forms (images, video, graphics, music, speech) is exploding on the Web. The content needs to be annotated and indexed to enable effective search and retrieval. However, recent standards and best practices for multimedia metadata don't provide semantically rich descriptions of multimedia content. On the other hand, the World Wide Web Consortium's (W3C's) Semantic Web effort has been making great progress in advancing techniques for annotating semantics of Web resources. To bridge this gap, a new W3C task force has been created to investigate multimedia annotations on the Semantic Web. This article examines the problems of semantically annotating multimedia and describes the integration of multimedia metadata with the Semantic Web. (Editor's note by John R. Smith)
Thesaurus-based search in large heterogeneous collections
In cultural heritage, large virtual collections are coming into
existence. Such collections contain heterogeneous sets of metadata and
vocabulary concepts, originating from multiple sources. In the context
of the E-Culture demonstrator we have shown earlier that such virtual
collections can be effectively explored with keyword search and semantic
clustering. In this paper we describe the design rationale of ClioPatria,
an open-source system which provides APIs for scalable semantic graph
search. The use of ClioPatria’s search strategies is illustrated with a
realistic use case: searching for “Picasso”. We discuss details of scalable
graph search and the required OWL reasoning functionality, and show why
SPARQL queries are insufficient for solving the search problem.
On the Role of User-generated Metadata in Audio Visual Collections
Recently, various crowdsourcing initiatives showed that targeted efforts of
user communities result in massive amo
Searching in semantically rich linked data: a case study in cultural heritage
Traditionally the relations between concepts from a controlled vocabulary, such as the hierarchical and associative relations in a thesaurus, have been used to support users in their search process. In the context of the Semantic Web, multiple interlinked vocabularies are becoming available, providing a large number of different relations between concepts. However, for a specific search task, only a small fraction of these will be meaningful to the user, and currently we have little understanding of which methods can be used to determine this. In this paper, we describe a case study in the cultural heritage domain that investigates support for the specific task of finding artworks in a data set of multiple linked art collections and vocabularies. In a first experiment a number of use cases from domain experts ar
Trusting Semi-structured Web Data
The growth of the Web brings a vast amount of useful information to everybody who can access it. These data are often crowdsourced or provided by heterogeneous or unknown sources, so they might be maliciously manipulated or unreliable. Moreover, because of their volume it is often impossible to check them extensively, which gives rise to massive and ever-growing trust issues. The research presented in this paper aims to investigate the use of data sources and reasoning techniques to address trust issues about Web data. In particular, these investigations include the use of trusted Web sources, uncertainty reasoning, semantic similarity measures and provenance information as possible bases for trust estimation. The intended result of this thesis is a series of analyses and tools that make it possible to better understand and address the problem of trusting semi-structured Web data.
LCSH, SKOS and Linked Data
A technique for converting Library of Congress Subject Headings MARCXML to
Simple Knowledge Organization System (SKOS) RDF is described. Strengths of the
SKOS vocabulary are highlighted, as well as possible points for extension, and
the integration of other semantic web vocabularies such as Dublin Core. An
application for making the vocabulary available as linked-data on the Web is
also described. Comment: Submission for the Dublin Core 2008 conference in Berli