
    Accurator: Nichesourcing for Cultural Heritage

    As more and more cultural heritage data is published online, its usefulness in this open context depends on the quality and diversity of descriptive metadata for collection objects. In many cases, existing metadata is not adequate for a variety of retrieval and research tasks, and more specific annotations are necessary. However, eliciting such annotations is a challenge since it often requires domain-specific knowledge. While crowdsourcing can be used successfully to elicit simple annotations, identifying people with the required expertise can prove troublesome for tasks requiring more complex or domain-specific knowledge. Nichesourcing addresses this problem by tapping into the expert knowledge available in niche communities. This paper presents Accurator, a methodology for conducting nichesourcing campaigns for cultural heritage institutions by addressing communities, organizing events and tailoring a web-based annotation tool to a domain of choice. The contribution of this paper is threefold: 1) a nichesourcing methodology, 2) an annotation tool for experts and 3) validation of the methodology and tool in three case studies. The three domains of the case studies are birds on art, bible prints and fashion images. We compare the quality and quantity of obtained annotations in the three case studies, showing that the nichesourcing methodology in combination with the image annotation tool can be used to collect high-quality annotations in a variety of domains and annotation tasks. A user evaluation indicates that the tool is suited and usable for domain-specific annotation tasks.

    Extending the 5S Framework of Digital Libraries to support Complex Objects, Superimposed Information, and Content-Based Image Retrieval Services

    Advanced services in digital libraries (DLs) have been developed and widely used to address the required capabilities of an assortment of systems as DLs expand into diverse application domains. These systems may require support for images (e.g., Content-Based Image Retrieval), Complex (information) Objects, and use of content at fine grain (e.g., Superimposed Information). Due to the lack of consensus on precise theoretical definitions for those services, implementation efforts often involve ad hoc development, leading to duplication and interoperability problems. This article presents a methodology to address those problems by extending a precisely specified minimal digital library (in the 5S framework) with formal definitions of the aforementioned services. The theoretical extensions of digital library functionality presented here are reinforced with practical case studies as well as scenarios for the individual and integrative use of services, to balance theory and practice. The methodology implies that other advanced services can be integrated into the current extended framework as they are identified. The theoretical definitions and case study we present may impact future development efforts and a wide range of digital library researchers, designers, and developers.

    Methodological considerations concerning manual annotation of musical audio in function of algorithm development

    In research on musical audio-mining, annotated music databases are needed to develop computational tools that extract from the musical audio stream the kind of high-level content that users can deal with in Music Information Retrieval (MIR) contexts. However, the notion of musical content, and therefore the notion of annotation, is ill-defined, both in the syntactic and in the semantic sense. As a consequence, annotation has been approached from a variety of perspectives (mainly linguistic-symbolic ones), and a general methodology is lacking. This paper is a step towards the definition of a general framework for the manual annotation of musical audio, in service of a computational approach to musical audio-mining based on algorithms that learn from annotated data.

    Exploring manuscripts: sharing ancient wisdoms across the semantic web

    Recent work in digital humanities has seen researchers increasingly producing online editions of texts and manuscripts, particularly in adoption of the TEI XML format for online publishing. The benefits of semantic web techniques are underexplored in such research, however, with a lack of sharing and communication of research information. The Sharing Ancient Wisdoms (SAWS) project applies linked data practices to enhance and expand on what is possible with these digital text editions. Focussing on Greek and Arabic collections of ancient wise sayings, which are often related to each other, we use RDF to annotate and extract semantic information from the TEI documents as RDF triples. This allows researchers to explore the conceptual networks that arise from these interconnected sayings. The SAWS project advocates a semantic-web-based methodology, enhancing rather than replacing current workflow processes, for digital humanities researchers to share their findings and collectively benefit from each other’s work.
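
    As a rough illustration of extracting relationships from a TEI document as triples, the following minimal sketch reads `relation` elements from a small TEI fragment and emits (subject, predicate, object) tuples. The sample markup, the `ex:isVariantOf` predicate, the base URI and the function name `extract_triples` are all assumptions made for illustration; they are not taken from the SAWS project and do not represent its actual schema or tooling.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"

# Hypothetical TEI fragment: two sayings linked by one relation.
SAMPLE_TEI = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text><body>
    <seg xml:id="saying1">First saying (Greek collection)</seg>
    <seg xml:id="saying2">Second saying (Arabic collection)</seg>
    <relation active="#saying2" ref="ex:isVariantOf" passive="#saying1"/>
  </body></text>
</TEI>"""


def extract_triples(tei_xml, base="http://example.org/doc"):
    """Return one (subject, predicate, object) tuple per TEI relation element.

    The base URI is an assumed placeholder used to turn local
    '#id' pointers into full identifiers.
    """
    root = ET.fromstring(tei_xml)
    triples = []
    for rel in root.iter(f"{{{TEI_NS}}}relation"):
        subject = base + rel.get("active")
        predicate = rel.get("ref")
        obj = base + rel.get("passive")
        triples.append((subject, predicate, obj))
    return triples


triples = extract_triples(SAMPLE_TEI)
for s, p, o in triples:
    print(s, p, o)
```

    A real pipeline would more likely hand such tuples to an RDF library and resolve the predicate against a published ontology; the tuple form here only shows the shape of the extracted data.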

    Enriching Existing Test Collections with OXPath

    Extending TREC-style test collections by incorporating external resources is a time-consuming and challenging task. Making use of freely available web data requires technical skills to work with APIs or to create a web scraping program specifically tailored to the task at hand. We present a lightweight alternative that employs the web data extraction language OXPath to harvest data from web resources and add it to an existing test collection. We demonstrate this by creating an extended version of GIRT4 called GIRT4-XT, with additional metadata fields harvested via OXPath from the social sciences portal Sowiport. This allows the collection to be re-used for other evaluation purposes, such as bibliometrics-enhanced retrieval. The demonstrated method can be applied to a variety of similar scenarios; it is not limited to extending existing collections but can also be used to create completely new ones with little effort. Comment: Experimental IR Meets Multilinguality, Multimodality, and Interaction - 8th International Conference of the CLEF Association, CLEF 2017, Dublin, Ireland, September 11-14, 2017
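
    The enrichment step described above can be pictured as a merge of harvested fields into existing records, keyed by document identifier. The sketch below is plain Python, not OXPath, and it covers only the merge, not the harvesting. The record layout, the field names (`citations`, `keywords`) and the identifiers are hypothetical and do not reflect the actual GIRT4-XT format.

```python
def enrich(collection, harvested):
    """Return copies of the collection records with harvested fields added.

    collection: list of dicts, each with an "id" key.
    harvested:  dict mapping document id -> dict of extra metadata fields.
    Existing fields are never overwritten by harvested values.
    """
    enriched = []
    for doc in collection:
        merged = dict(doc)  # copy so the original record stays untouched
        for field, value in harvested.get(doc["id"], {}).items():
            merged.setdefault(field, value)
        enriched.append(merged)
    return enriched


# Hypothetical sample data for illustration only.
collection = [
    {"id": "DOC-1", "title": "First document"},
    {"id": "DOC-2", "title": "Second document"},
]
harvested = {
    "DOC-1": {"citations": 12, "keywords": ["retrieval"], "title": "ignored"},
}

extended = enrich(collection, harvested)
for doc in extended:
    print(doc)
```

    Using `setdefault` keeps the original collection authoritative: a harvested value fills a missing field but never replaces one that is already present.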

    Information seeking retrieval, reading and storing behaviour of library users

    In the interest of digital libraries, it is advisable that designers be aware of the potential behaviour of the users of such a system. Two distinct parts are under investigation: the interaction with traditional libraries, involving the seeking and retrieval of relevant material, and the ensuing reading and storage behaviours. The findings of this analysis could be incorporated into digital library facilities. There has been a copious amount of research on information seeking, leading to the development of behavioural models that describe the process. Research on the information seeking practices of individuals is often based on the task and field of study. The information seeking model presented by Ellis et al. (1993) shapes the format of this study, where it is used to compare various research on the information seeking practices of groups of people (from academics to professionals). It is found that, although researchers do make use of library facilities, they tend to rely heavily on their own collections and primarily use the library as a source for previously identified information, browsing and interloan. There are also significant differences in user behaviour between the groups analysed. For the reading and storage of material, it was hard to draw conclusions due to the lack of substantial research and information on the topic. However, through the use of reading strategies, a general idea of how readers behave can be developed. Designers of digital libraries can benefit from the guidelines presented here to better understand their audience.

    Are e-readers suitable tools for scholarly work?

    This paper aims to offer insights into the usability, acceptance and limitations of e-readers with regard to the specific requirements of scholarly text work. To fit into the academic workflow, non-linear reading, bookmarking, commenting, extracting text and the integration of non-textual elements must be supported. A group of social science students were questioned about their experiences with electronic publications for study purposes. This same group executed several text-related tasks with the digitized material presented to them in two different file formats on four different e-readers. Their performances were subsequently evaluated in detail by means of frequency analyses. Findings - e-Publications have made advances in the academic world; however, e-readers do not yet fit seamlessly into the established chain of scholarly text-processing, which focuses on how readers use material during and after reading. Our tests revealed major deficiencies in these techniques. With a small number of participants (n=26), qualitative insights can be obtained, but not representative results. Further testing with participants from various disciplines and of varying academic status is required to arrive at more broadly applicable results. Practical implications - Our test results help to optimize file conversion routines for scholarly texts. We evaluated our data on the basis of descriptive statistics and abstained from any statistical significance test. The usability test of e-readers in a scientific context aligns with both studies on the prevalence of e-books in the sciences and technical test reports of portable reading devices. Still, it takes a distinctive angle in focusing on the characteristics and procedures of textual work in the social sciences, and it measures the usability of e-readers and file features against these standards. Comment: 22 pages, 6 figures, accepted for publication in Online Information Review