Accurator: Nichesourcing for Cultural Heritage
With more and more cultural heritage data being published online, their
usefulness in this open context depends on the quality and diversity of
descriptive metadata for collection objects. In many cases, existing metadata
is not adequate for a variety of retrieval and research tasks and more specific
annotations are necessary. However, eliciting such annotations is a challenge
since it often requires domain-specific knowledge. While crowdsourcing can be
successfully used for eliciting simple annotations, identifying people with the
required expertise might prove troublesome for tasks requiring more complex or
domain-specific knowledge. Nichesourcing addresses this problem by tapping
into the expert knowledge available in niche communities. This paper presents
Accurator, a methodology for conducting nichesourcing campaigns for cultural
heritage institutions, by addressing communities, organizing events and
tailoring a web-based annotation tool to a domain of choice. The contribution
of this paper is threefold: 1) a nichesourcing methodology, 2) an annotation
tool for experts and 3) validation of the methodology and tool in three case
studies. The three domains of the case studies are birds on art, bible prints
and fashion images. We compare the quality and quantity of obtained annotations
in the three case studies, showing that the nichesourcing methodology in
combination with the image annotation tool can be used to collect high quality
annotations in a variety of domains and annotation tasks. A user evaluation
indicates the tool is suited and usable for domain-specific annotation tasks.
Extending the 5S Framework of Digital Libraries to support Complex Objects, Superimposed Information, and Content-Based Image Retrieval Services
Advanced services in digital libraries (DLs) have been developed and widely used to address the required capabilities of an assortment of systems as DLs expand into diverse application domains. These systems may require support for images (e.g., Content-Based Image Retrieval), Complex (information) Objects, and use of content at fine grain (e.g., Superimposed Information). Due to the lack of consensus on precise theoretical definitions for those services, implementation efforts often involve ad hoc development, leading to duplication and interoperability problems. This article presents a methodology to address those problems by extending a precisely specified minimal digital library (in the 5S framework) with formal definitions of aforementioned services. The theoretical extensions of digital library functionality presented here are reinforced with practical case studies as well as scenarios for the individual and integrative use of services to balance theory and practice. This methodology has implications that other advanced services can be continuously integrated into our current extended framework whenever they are identified. The theoretical definitions and case study we present may impact future development efforts and a wide range of digital library researchers, designers, and developers.
Methodological considerations concerning manual annotation of musical audio in function of algorithm development
In research on musical audio-mining, annotated music databases are needed which allow the development of computational tools that extract from the musical audiostream the kind of high-level content that users can deal with in Music Information Retrieval (MIR) contexts. The notion of musical content, and therefore the notion of annotation, is ill-defined, however, both in the syntactic and semantic sense. As a consequence, annotation has been approached from a variety of perspectives (but mainly linguistic-symbolic oriented), and a general methodology is lacking. This paper is a step towards the definition of a general framework for manual annotation of musical audio in function of a computational approach to musical audio-mining that is based on algorithms that learn from annotated data.
Annotation evolution: how Web 2.0 technologies are enabling a change in annotation practice
Are Web 2.0 tools and technologies changing how and why scholars annotate their research sources? We begin to answer this question by assessing current technology and tools that support new functions for one of the most common scholarly research activities: taking notes. The results suggest a new approach to personalized information retrieval.
Exploring manuscripts: sharing ancient wisdoms across the semantic web
Recent work in digital humanities has seen researchers increasingly producing online editions of texts and manuscripts, particularly in adoption of the TEI XML format for online publishing. The benefits of semantic web techniques are underexplored in such research, however, with a lack of sharing and communication of research information. The Sharing Ancient Wisdoms (SAWS) project applies linked data practices to enhance and expand on what is possible with these digital text editions. Focussing on Greek and Arabic collections of ancient wise sayings, which are often related to each other, we use RDF to annotate and extract semantic information from the TEI documents as RDF triples. This allows researchers to explore the conceptual networks that arise from these interconnected sayings. The SAWS project advocates a semantic-web-based methodology, enhancing rather than replacing current workflow processes, for digital humanities researchers to share their findings and collectively benefit from each other’s work.
Enriching Existing Test Collections with OXPath
Extending TREC-style test collections by incorporating external resources is
a time consuming and challenging task. Making use of freely available web data
requires technical skills to work with APIs or to create a web scraping program
specifically tailored to the task at hand. We present a light-weight
alternative that employs the web data extraction language OXPath to harvest
data to be added to an existing test collection from web resources. We
demonstrate this by creating an extended version of GIRT4 called GIRT4-XT with
additional metadata fields harvested via OXPath from the social sciences portal
Sowiport. This allows the re-use of this collection for other evaluation
purposes like bibliometrics-enhanced retrieval. The demonstrated method can be
applied to a variety of similar scenarios and is not limited to extending
existing collections but can also be used to create completely new ones with
little effort. Comment: Experimental IR Meets Multilinguality, Multimodality, and Interaction
- 8th International Conference of the CLEF Association, CLEF 2017, Dublin,
Ireland, September 11-14, 201
Introduction
This book brings together for the first time the collected wisdom of international leaders in the theory and practice of the emerging field of cultural heritage crowdsourcing. It features eight accessible case studies of groundbreaking projects from leading cultural heritage and academic institutions, and four thought-provoking essays that reflect on the wider implications of this engagement for participants and on the institutions themselves.
Information seeking retrieval, reading and storing behaviour of library users
In the interest of digital libraries, it is advisable that designers be aware of the potential behaviour of the users of such a system. There are two distinct parts under investigation, the interaction between traditional libraries involving the seeking and retrieval of relevant material, and the reading and storage behaviours ensuing. Through this analysis, the findings could be incorporated into digital library facilities. There have been copious amounts of research on information seeking leading to the development of behavioural models to describe the process. Often research on the information seeking practices of individuals is based on the task and field of study. The information seeking model, presented by Ellis et al. (1993), characterises the format of this study where it is used to compare various research on the information seeking practices of groups of people (from academics to professionals). It is found that, although researchers do make use of library facilities, they tend to rely heavily on their own collections and primarily use the library as a source for previously identified information, browsing and interloan. It was found that there are significant differences in user behaviour between the groups analysed. When looking at the reading and storage of material it was hard to draw conclusions, due to the lack of substantial research and information on the topic. However, through the use of reading strategies, a general idea on how readers behave can be developed. Designers of digital libraries can benefit from the guidelines presented here to better understand their audience.
Are e-readers suitable tools for scholarly work?
This paper aims to offer insights into the usability, acceptance and
limitations of e-readers with regard to the specific requirements of scholarly
text work. To fit into the academic workflow non-linear reading, bookmarking,
commenting, extracting text or the integration of non-textual elements must be
supported. A group of social science students were questioned about their
experiences with electronic publications for study purposes. This same group
executed several text-related tasks with the digitized material presented to
them in two different file formats on four different e-readers. Their
performances were subsequently evaluated by means of frequency analyses in
detail. Findings - e-Publications have made advances in the academic world;
however e-readers do not yet fit seamlessly into the established chain of
scholarly text-processing focusing on how readers use material during and after
reading. Our tests revealed major deficiencies in these techniques. With a
small number of participants (n=26) qualitative insights can be obtained, not
representative results. Further testing with participants from various
disciplines and of varying academic status is required to arrive at more
broadly applicable results. Practical implications - Our test results help to
optimize file conversion routines for scholarly texts. We evaluated our data on
the basis of descriptive statistics and abstained from any statistical
significance test. The usability test of e-readers in a scientific context
aligns with both studies on the prevalence of e-books in the sciences and
technical test reports of portable reading devices. Still, it takes a
distinctive angle in focusing on the characteristics and procedures of textual
work in the social sciences and measures the usability of e-readers and
file-features against these standards. Comment: 22 pages, 6 figures, accepted for publication in Online Information
Revie